Exploring Kubernetes Provisioning using Terraform on the Big 3 Clouds


Omer Mishania

Reading time: 9 minutes

Provisioning and managing your environment’s infrastructure can be frustrating when it’s full of components and configurations, especially if it’s done through manual processes and you need to deploy a variety of similar environments with a few changes in each one.

Luckily, Infrastructure as Code (IaC) can help resolve the problem. 

This is especially true when it comes to Kubernetes (K8S), as it is an entire ecosystem on its own. 

Since this article is meant for both beginners and experienced DevOps practitioners, we’ll start with a brief background. If you do not need the intro on IaC, K8S, and Terraform, feel free to skip directly to the technical part. 

What is Infrastructure as Code? 

Infrastructure as Code is the process of managing and provisioning infrastructure through code instead of through manual processes with interactive configuration tools.  

IaC allows you to build, change, manage, and monitor your infrastructure safely, consistently, and repeatably by defining resource configurations that you can version, reuse, and share. With IaC, configuration files contain your infrastructure specifications, making it easier to edit configurations and guaranteeing that the same environment will be provisioned every time.

What is Kubernetes? 

Kubernetes is an open-source platform made to deploy and manage containerized applications on one or more hosts. With K8S, operational tasks of container management like deploying a new application, rolling out a new version, auto-scaling application replicas, or monitoring the application’s activity are automated and much easier to manage. 

What is Terraform? 

Terraform is an IaC tool developed by HashiCorp. It defines the resources and infrastructure in human-readable, declarative configuration files and manages the infrastructure’s lifecycle. 

The configuration files are written in HCL (HashiCorp Configuration Language), a human-readable configuration language that is also JSON-compatible.

HCL’s simple syntax makes it easy for DevOps teams to provision and re-provision infrastructure across multiple cloud providers and on-premises data centers. 
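
To get a feel for the syntax, here is a minimal, illustrative HCL snippet. It is not part of this article’s repository, and the provider, region, and bucket name are placeholders:

provider "aws" {
  region = "us-east-1" # placeholder region
}

# Declaring a resource is enough; Terraform works out how to create, update, or destroy it.
resource "aws_s3_bucket" "example" {
  bucket = "my-example-bucket-name" # placeholder; bucket names must be globally unique
}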

You can read more about Terraform here.

Provisioning a K8S cluster using Terraform on the big 3 cloud providers

Although there are several cloud providers on the market, for this article we’ll focus on the three major vendors – AWS, Google Cloud, and Microsoft Azure. We’ll provision a managed, single-node K8S cluster on each using Terraform.

We’ll explore the complexity of provisioning Kubernetes using Terraform on each of the cloud providers, and we’ll make a short comparison to help you decide.

To be as concise as possible, we’ll use some code prepared in advance, and focus on the key differences. 

Clone this Git repository, which contains configuration files for basic K8S cluster provisioning on AWS, Azure, and GCP:

git clone https://github.com/omermishania/deploy-clusters-using-terraform-on-the-3-big-clouds.git

K8S provisioning using Terraform on AWS 

Walkthrough introduction 

You’ll need five files to provision K8S using Terraform on AWS: 

  1. vpc.tf – defines the virtual private cloud (VPC), subnets, and availability zones for the EKS cluster in a selected region using the AWS VPC Module. 
  2. eks-cluster.tf – defines the resources and basic configurations required to set up an EKS cluster, such as the cluster’s version and worker groups inside a selected VPC, using the AWS EKS Module. 
  3. security-groups.tf – defines the security groups used by the EKS cluster.
  4. outputs.tf – defines the outputs that will be displayed when the provisioning plan is executed. 
  5. versions.tf – sets the required Terraform version and the required provider versions (a minimal sketch follows this list). 
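
For reference, a minimal versions.tf generally looks like the sketch below; the exact versions pinned in the repository may differ:

terraform {
  required_version = ">= 0.14"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 3.20.0" # illustrative constraint, not the repo's pinned version
    }
  }
}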

Some of the resources that will be created are a VPC, subnets, a NAT gateway, a security group, an EKS cluster, and a node group. AWS’s basic Terraform K8S modules offer wide networking configuration options, such as configuring the cluster itself, the node group, security group rules, different IAM policies, the autoscaling group, and more.

If you want to use multiple worker nodes, simply change the number of nodes in your node group in the eks-cluster.tf file. You can deploy up to 30 managed node groups per cluster and 450 nodes per node group (a total of 13,500 nodes per cluster). If you want your worker node count to autoscale when needed, you can use the aws_autoscaling_group resource; a sketch of the underlying node group sizing follows.
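
The repository sizes the node group through the AWS EKS Module, so the exact input names depend on the module version. As a rough illustration of the underlying settings, here is a sketch using the plain aws_eks_node_group resource; the cluster name, IAM role ARN, and subnet IDs are placeholders. The scaling_config block sets the desired, minimum, and maximum node counts, and an autoscaler can then adjust the count within those bounds:

# Sketch only: sizing a managed node group with the plain aws_eks_node_group resource.
resource "aws_eks_node_group" "workers" {
  cluster_name    = "my-eks-cluster"                               # placeholder
  node_group_name = "workers"
  node_role_arn   = "arn:aws:iam::123456789012:role/eks-node-role" # placeholder
  subnet_ids      = ["subnet-aaaa1111", "subnet-bbbb2222"]         # placeholders

  scaling_config {
    desired_size = 3 # number of worker nodes
    min_size     = 1
    max_size     = 5
  }
}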

The total average provisioning time is about 11 minutes, and the total average time to destroy all resources is about 3 minutes. 

Walkthrough prerequisites: 

Walkthrough 

  1. From the Git repo directory, change into the AWS directory



    cd aws 


  2. Edit the vpc.tf file



    vim vpc.tf 


    Change the region variable value to the region where you want to deploy the cluster 

  3. Initialize the project modules

    terraform init 


  4. Plan the provisioning

    terraform plan


  5. Execute the provisioning plan

    terraform apply


  6. Update the local Kube config file with access credentials for your cluster

    aws eks update-kubeconfig --region <region> --name <cluster-name>


K8S provisioning using Terraform on GCP 

Walkthrough introduction 

You need five files to provision K8S using Terraform on GCP: 

  1. vpc.tf defines a VPC and subnet for the GKE cluster using the GCP Provider. 
  2. gke.tf defines a GKE cluster and a managed node pool, including the number and size of the cluster’s worker VMs, using the GCP Provider. 
  3. terraform.tfvars sets the project_id and region variable values so Terraform can use them in its configuration (see the sketch after this list). 
  4. outputs.tf defines the outputs that will be displayed when the provisioning plan is executed. 
  5. versions.tf sets the required Terraform version and the required provider versions. 
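
For reference, terraform.tfvars is just a list of variable values; a minimal sketch with placeholder values looks like this:

# Placeholder values; replace them with your own project ID and region.
project_id = "my-gcp-project-id"
region     = "us-central1"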

Some of the resources you’ll create include a VPC, a subnet, a GKE cluster, and a node pool. GCP’s basic Terraform K8S configuration offers limited networking configuration options, such as configuring the cluster itself, the VPC, the subnet, the cluster autoscaler, and the node pool.

If you want to increase the number of worker nodes, simply change the gke_num_nodes variable value in the gke.tf file to the desired number of workers. You can use up to 5,000 nodes on GKE versions 1.17 and below, and 15,000 nodes on GKE versions 1.18 and above. If you want your worker node count to autoscale when needed, you can add a cluster_autoscaling block under the google_container_cluster resource and configure it to fit your needs. For example:

cluster_autoscaling {
  enabled = true

  # When autoscaling is enabled, GKE node auto-provisioning expects
  # resource limits for both CPU and memory.
  resource_limits {
    resource_type = "cpu"
    minimum       = 1
    maximum       = 10
  }

  resource_limits {
    resource_type = "memory"
    minimum       = 1
    maximum       = 64
  }
}

The total average provisioning time is about 8 minutes, and the total average time to destroy all resources is about 8 minutes. 

Walkthrough prerequisites 

Walkthrough 

  1. In the GCP web console, grant the Compute Admin role to the account you’ll use for provisioning 
  2. From the Git repo directory, change into the GCP directory



    cd gcp


  3. Edit the terraform.tfvars file

    vim terraform.tfvars


    Change the project_id and region variable values to your GCP project ID and the region where you want to deploy the cluster 

  4. Initialize the modules

    terraform init


  5. Plan the provisioning

    terraform plan


  6. Execute the provisioning plan

    terraform apply


  7. Update the local kube config file with the access credentials for your cluster

    gcloud container clusters get-credentials $(terraform output -raw kubernetes_cluster_name) \
      --region $(terraform output -raw region)


K8S provisioning using Terraform on Azure 

Walkthrough introduction 

You’ll need five files to provision K8S using Terraform on Azure: 

  1. aks-cluster.tf defines a resource group, an AKS cluster in that resource group, and the number and size of the cluster’s worker VMs using the Azure AKS Module. 
  2. terraform.tfvars defines the Azure appId and password variables used to authenticate to Azure. 
  3. variables.tf declares the Azure appId and password as variables so Terraform can use them in its configuration. 
  4. outputs.tf defines the outputs that will be displayed when the provisioning plan is executed. 
  5. versions.tf sets the required Terraform version and the required provider versions. 

Some of the resources you’ll create include a resource group and an AKS cluster. Azure’s basic Terraform K8S module offers the fewest networking configuration options of the three vendors we’re looking at (with additional Azure modules, almost everything is configurable). You can configure the cluster itself, the autoscaler profile, and the node pool.

If you want to increase the number of worker nodes, simply edit the aks-cluster.tf file and change the node_count value in the node pool to the desired number of workers, or add another node pool. You can use up to 1,000 nodes per AKS cluster (across all node pools). If you want your worker node count to autoscale when needed, enable autoscaling on the node pool and, optionally, add an auto_scaler_profile block to tune its behavior (if you do so, don’t forget to use Terraform’s ignore_changes functionality so Terraform ignores changes to the node_count field); see the sketch below.
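
Here is a rough sketch of enabling the autoscaler on the default node pool, using the azurerm_kubernetes_cluster resource directly. All values are placeholders, the sketch uses a system-assigned identity rather than the service principal this walkthrough creates, and the attribute is named enable_auto_scaling in the provider versions assumed here (newer azurerm versions rename it to auto_scaling_enabled):

# Sketch only: autoscaling the default node pool between 1 and 5 nodes.
resource "azurerm_kubernetes_cluster" "example" {
  name                = "example-aks" # placeholder values throughout
  location            = "eastus"
  resource_group_name = "example-rg"
  dns_prefix          = "exampleaks"

  default_node_pool {
    name                = "default"
    vm_size             = "Standard_D2_v2"
    node_count          = 1
    enable_auto_scaling = true
    min_count           = 1
    max_count           = 5
  }

  identity {
    type = "SystemAssigned"
  }

  lifecycle {
    # With autoscaling enabled, ignore drift on node_count so Terraform
    # does not fight the autoscaler on subsequent plans.
    ignore_changes = [default_node_pool[0].node_count]
  }
}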

The total average time of the provisioning is about 3 minutes, and the total average time to destroy all resources is about 7 minutes. 

Walkthrough prerequisites 

Walkthrough 

  1. From the Git repo directory, change into the Azure directory



    cd azure


  2. Create an Active Directory service principal account on Azure

    az ad sp create-for-rbac --skip-assignment


    The output should look like this:



    {
      "appId": "your-app-id-should-go-here",
      "displayName": "azure-cli-current-date-hour",
      "name": "http://azure-cli-current-date-hour",
      "password": "your-pwd-should-go-here",
      "tenant": "a1b2c3d4-a1b2-a1b2-a1b2-a1b2c3d4e5f6"
    }


    In this tutorial, we’ll only need the appId and password values. Copy and save yours; you’ll need them for the next step

  3. Edit the terraform.tfvars file

    vim terraform.tfvars


    Update the values of the appId and password variables to the ones you just copied

  4. Initialize the project modules

    terraform init


  5. Plan the provisioning

    terraform plan


  6. Execute the provisioning plan

    terraform apply


  7. Update the local Kube config file

    az aks get-credentials --resource-group $(terraform output -raw resource_group_name) --name $(terraform output -raw kubernetes_cluster_name)


Conclusion

No matter what cloud platform you are using, Terraform can significantly improve the management of your infrastructure’s lifecycle.
You can provision a Kubernetes cluster for various reasons – development environment, production workloads, stateless workloads, stateful workloads, a remote developer’s environment (instead of a local one), or even a temporary environment for testing an application on K8S clusters with different configurations.

Each provider can be a better fit for certain use cases. If it’s important for you to provision and destroy a cluster within a CI process, you might prefer the cloud provider that does so most quickly. If you need to deploy multiple clusters across different environments, a provider that requires no additional actions beyond the Terraform provisioning itself can be a good option. And if what matters most to you is a high-scale production system, check which provider imposes the fewest limits on scale.

Omer Mishania

Omer is a DevOps Engineer at MeteorOps who is obsessed with building elegant and simple DevOps solutions. He is experienced in building platforms for engineering teams and focused on making provisioning, deployment, orchestration, monitoring, and much more accessible to development teams.