
Kubernetes from scratch on AWS

Michael Hannecke

Kubernetes Cluster Setup from Scratch on AWS with Terraform for Learning and Testing


Kubernetes is a powerful container orchestration platform that can be used to deploy, scale, and manage containerized applications. However, setting up Kubernetes can be a complex task. In this blog post, we will walk you through the process of setting up an unmanaged Kubernetes cluster on AWS using Terraform.

Terraform is an Infrastructure as Code (IaC) tool that allows you to automate the creation, deployment, and management of infrastructure resources. In this case, we will use Terraform to create a Kubernetes cluster on AWS.


What are the benefits of setting up Kubernetes from scratch?

There are several benefits to setting up Kubernetes from scratch. First, it gives you complete control over the cluster: you can choose the underlying infrastructure, the networking configuration, and the security settings. Second, it is a good learning experience, especially if you’re planning to take an official Kubernetes certification such as CKA or CKAD. By setting up Kubernetes from scratch, you will gain a deeper understanding of how the platform works.


What are the prerequisites for this tutorial?

To follow this tutorial, you will need the following:


  • An AWS account with sufficient rights to configure VPCs, security groups, route tables, and EC2 instances
  • Terraform CLI installed on your device
  • SSH configured on your device. For this tutorial I assume your public SSH key is in ‘~/.ssh/id_rsa.pub’
  • A basic understanding of Kubernetes, AWS, and Terraform will be helpful, as well as some knowledge of basic Linux shell usage.


This tutorial is divided into several steps. Each step has detailed instructions on how to complete the task. I’d recommend following all steps in the given order to ensure the cluster comes up as planned.


Disclaimer

This tutorial is for learning and testing purposes only. It is not an ideal setup for production workloads. For production or even commercial use cases, it is recommended to use a managed Kubernetes environment.

Furthermore, running the infrastructure on AWS will incur costs on your AWS account, so make sure to destroy everything once you’re done with testing.

For this tutorial we’ll keep all scripts in a dedicated folder, so if you run a ‘terraform destroy’ in this folder, everything you’ve set up following this tutorial will be deleted.


Always check before you go to avoid unnecessary costs.


Let’s dive in


1. Terraform Initialization

To enable Terraform to manage AWS infrastructure for you, you should have the AWS CLI installed and configured on your device by running ‘aws configure’ in a terminal. At least the following environment variables (or the corresponding entries in your AWS credentials file) should be set:
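The values shown below are placeholders; use your own access key and preferred region:

export AWS_ACCESS_KEY_ID="<your-access-key-id>"
export AWS_SECRET_ACCESS_KEY="<your-secret-access-key>"
export AWS_DEFAULT_REGION="eu-central-1"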

Create a new subfolder and place all upcoming files in this folder.


main.tf

At first create a file named main.tf and place the AWS provider information in it:
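A minimal sketch of what this could look like, assuming the hashicorp/aws provider from the Terraform registry and the region variable defined in the next step (the version constraint is an assumption, adapt as needed):

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

# Configure the AWS provider with the region from variable.tf
provider "aws" {
  region = var.k8-region
}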


variable.tf

Next, create a file for all variable definitions. We will use variables for the region the infrastructure will be deployed to (“k8-region”), the CIDRs for our VPC and subnet (“k8-vpc-cidr”, “k8-subnet-cidr”), and the external IP from which we will allow access to the cluster (“external-ip”); this should be the external IP of your environment. You can get this info easily with:


curl ipinfo.io



Furthermore, we use a variable to define the instance type for the master and worker nodes (“instance_type”). Last but not least, a variable for the number of deployed workers (“workers-count”).

The variables file should look like this:
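One possible version, using the variable names described above (the descriptions and types are my additions; the values are set in terraform.tfvars below):

variable "k8-region" {
  description = "AWS region the cluster will be deployed to"
  type        = string
}

variable "k8-vpc-cidr" {
  description = "CIDR block for the VPC"
  type        = string
}

variable "k8-subnet-cidr" {
  description = "CIDR block for the subnet holding all nodes"
  type        = string
}

variable "external-ip" {
  description = "Your external IP address, allowed to reach the nodes via SSH"
  type        = string
}

variable "instance_type" {
  description = "Instance type for the master and worker nodes"
  type        = string
}

variable "workers-count" {
  description = "Number of worker nodes to deploy"
  type        = number
}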



terraform.tfvars

To initialize the variables above, create a file terraform.tfvars and put in values fitting your setup:
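For example (all values here are placeholders for illustration; in particular, replace external-ip with your own address from curl ipinfo.io):

k8-region      = "eu-central-1"
k8-vpc-cidr    = "10.0.0.0/16"
k8-subnet-cidr = "10.0.1.0/24"
external-ip    = "203.0.113.10"   # your external IP
instance_type  = "t3.medium"
workers-count  = 2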



output.tf

To interact with the Kubernetes nodes via SSH, we need to know the assigned IP addresses. Terraform can output these values after the infrastructure has been deployed to AWS. Therefore, create a file named output.tf and place the following code in it:
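A sketch of output.tf, assuming the instance resource names k8-master and k8-worker used later in instances.tf (those names are my choice, keep them consistent with your own definitions):

# Public IP of the master node
output "master_public_ip" {
  value = aws_instance.k8-master.public_ip
}

# Public IPs of all worker nodes
output "worker_public_ips" {
  value = aws_instance.k8-worker[*].public_ip
}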



2. Network Configuration

We will now define the base network environment. For our use case we will place all nodes within one subnet. For ease of setup, the following rules will be implemented:


  • The master and all worker nodes will be within the same subnet
  • All VMs can communicate with each other without restriction. For more security you should limit communication to the ports Kubernetes actually requires, but for this setup we will skip that.
  • All VMs can reach the internet for updates and downloads of required software packages. This should also be locked down in any production environment, but that is out of scope here.
  • The VMs will be reachable via SSH from your external IP address only. For a more secure approach, all VMs should be placed in a private subnet and only be reachable via a bastion host or similar, but again, for the sake of our test environment I’ll keep it simple.


networks.tf

Create a file named networks.tf and copy the following code into it:
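A sketch of what networks.tf can contain, using the resource names described below (attribute values such as the DNS settings are assumptions):

# VPC for the Kubernetes cluster
resource "aws_vpc" "vpc_kubernetes" {
  cidr_block           = var.k8-vpc-cidr
  enable_dns_support   = true
  enable_dns_hostnames = true
  tags = { Name = "vpc-kubernetes" }
}

# Internet gateway so the nodes can reach the internet
resource "aws_internet_gateway" "igw-kubernetes" {
  vpc_id = aws_vpc.vpc_kubernetes.id
}

# List of availability zones in the selected region
data "aws_availability_zones" "available" {
  state = "available"
}

# Single public subnet in the first availability zone
resource "aws_subnet" "subnet1" {
  vpc_id                  = aws_vpc.vpc_kubernetes.id
  cidr_block              = var.k8-subnet-cidr
  availability_zone       = data.aws_availability_zones.available.names[0]
  map_public_ip_on_launch = true
}

# Route table with a default route to the internet gateway
resource "aws_route_table" "kubernetes-internet-route" {
  vpc_id = aws_vpc.vpc_kubernetes.id
  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.igw-kubernetes.id
  }
}

# Make this route table the main route table of the VPC
resource "aws_main_route_table_association" "kubernetes-set-rt-to-vpc" {
  vpc_id         = aws_vpc.vpc_kubernetes.id
  route_table_id = aws_route_table.kubernetes-internet-route.id
}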

This definition will create a couple of resources:

  • Resource “aws_vpc” “vpc_kubernetes”
    The main VPC
  • Resource "aws_internet_gateway" "igw-kubernetes"
    An internet gateway in our VPC to enable communication to the internet
  • Resource “aws_subnet” “subnet1”
    This will create the subnet for all of our workloads. As this subnet will be created in an availability zone within our region, we’re using a data source (“aws_availability_zones”) that provides a list of available AZs; the subnet will then be created in the first AZ.
  • Resource “aws_route_table” “kubernetes-internet-route”
    This route table will tell the nodes how to reach the internet.
  • Resource “aws_main_route_table_association” “kubernetes-set-rt-to-vpc”
    The route has to be associated with the VPC to be active


securitygroups.tf

After the base network definition is done, we have to provide the needed firewall rules, called security groups in AWS. Create a file securitygroups.tf with the following content:
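A sketch implementing the rules described above, with a single security group attached to all nodes (the group name is my choice):

# Security group for all cluster nodes
resource "aws_security_group" "sg-kubernetes" {
  name   = "sg-kubernetes"
  vpc_id = aws_vpc.vpc_kubernetes.id

  # SSH access only from your external IP
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["${var.external-ip}/32"]
  }

  # Unrestricted traffic between all nodes in the subnet
  ingress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = [var.k8-subnet-cidr]
  }

  # Unrestricted outbound traffic (updates, package downloads)
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}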

3. Master and Worker Nodes


instances.tf

The stage is set now, time to bring in the actors. Create a file instances.tf in which the master and worker nodes are defined as follows:
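A sketch of instances.tf, assuming the resource names k8-master and k8-worker referenced in output.tf; replace the placeholder AMI ID with a valid Ubuntu 20.04 LTS image for your region:

# Key pair based on your local public SSH key
resource "aws_key_pair" "kubernetes-key" {
  key_name   = "kubernetes-key"
  public_key = file("~/.ssh/id_rsa.pub")
}

# Master node
resource "aws_instance" "k8-master" {
  ami                         = "ami-xxxxxxxxxxxxxxxxx" # Ubuntu 20.04 LTS AMI for your region
  instance_type               = var.instance_type
  subnet_id                   = aws_subnet.subnet1.id
  vpc_security_group_ids      = [aws_security_group.sg-kubernetes.id]
  key_name                    = aws_key_pair.kubernetes-key.key_name
  associate_public_ip_address = true
  user_data                   = file("startup-master.sh")
  tags = { Name = "k8-master" }
}

# Worker nodes, count taken from terraform.tfvars
resource "aws_instance" "k8-worker" {
  count                       = var.workers-count
  ami                         = "ami-xxxxxxxxxxxxxxxxx" # Ubuntu 20.04 LTS AMI for your region
  instance_type               = var.instance_type
  subnet_id                   = aws_subnet.subnet1.id
  vpc_security_group_ids      = [aws_security_group.sg-kubernetes.id]
  key_name                    = aws_key_pair.kubernetes-key.key_name
  associate_public_ip_address = true
  user_data                   = file("startup-worker.sh")
  tags = { Name = "k8-worker-${count.index}" }
}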

This file consists of three main parts:

  • First, Terraform will create a key pair for the instances based on your public SSH key
  • Second, the definition of the master node. For this test environment the same instance type will be used for both the master and the worker nodes.
  • Third, the deployment of multiple worker nodes, the number defined by the corresponding variable.


Important for the deployment is the AWS AMI used. For this tutorial we’re using Ubuntu 20.04 LTS as the base image. You can get a valid AMI ID for your region from
this website. Be careful to adapt as needed; the AMI ID within the above script may NOT be available in your selected region.


There are other ways to identify a valid AMI ID at runtime; maybe I’ll write a dedicated post about that approach later on. For now we’ll go with the more ‘manual’ approach.

The last part is about the definition of the workers. The count variable here tells Terraform to set up the number of workers you’ve defined in the terraform.tfvars file.


startup-master.sh

To install the designated master node we have to provide a startup script, which will be executed when the VM starts up for the first time. Provide a file called “startup-master.sh” with the following content. Be aware that the name must match the filename provided in the resource definition for the master node above (user_data = file(“startup-master.sh”))!
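A condensed sketch of such a script for Ubuntu 20.04 with containerd and kubeadm; the package repository, the pinned Kubernetes release (v1.28), and the pod network CIDR are assumptions you may need to adapt:

#!/bin/bash
# Startup script for the master node (sketch): Ubuntu 20.04, containerd, kubeadm
set -e

# Disable swap, as required by the kubelet
swapoff -a
sed -i '/ swap / s/^/#/' /etc/fstab

# Kernel modules and sysctl settings for pod networking
modprobe overlay
modprobe br_netfilter
cat <<EOF >/etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF
sysctl --system

# Install and configure containerd as the container runtime
apt-get update
apt-get install -y containerd apt-transport-https ca-certificates curl gpg
mkdir -p /etc/containerd
containerd config default >/etc/containerd/config.toml
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
systemctl restart containerd

# Install kubeadm, kubelet and kubectl
mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.28/deb/Release.key | \
  gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.28/deb/ /" \
  >/etc/apt/sources.list.d/kubernetes.list
apt-get update
apt-get install -y kubelet kubeadm kubectl
apt-mark hold kubelet kubeadm kubectl

# Initialize the control plane; the pod CIDR matches Calico's default
kubeadm init --pod-network-cidr=192.168.0.0/16

# Make kubectl usable for the ubuntu user
mkdir -p /home/ubuntu/.kube
cp /etc/kubernetes/admin.conf /home/ubuntu/.kube/config
chown -R ubuntu:ubuntu /home/ubuntu/.kube

# Install the Calico network plugin
kubectl --kubeconfig /etc/kubernetes/admin.conf apply -f \
  https://raw.githubusercontent.com/projectcalico/calico/v3.25.0/manifests/calico.yaml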


This script will configure the master VM for Kubernetes, initialize the control plane, and install a network plugin. We’ll go with the Calico plugin.


To install the Calico plugin, we call the following inside the setup script:


kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.25.0/manifests/calico.yaml


It would also be possible to run this step manually on the master node as part of the Kubernetes administration, but I would like to automate it a bit further.


The worker nodes require a slightly different startup script. Create a file named “startup-worker.sh”. Again, ensure that the name exactly matches the user_data definition in the worker node resource in instances.tf.


startup-worker.sh

The following script will configure kubernetes on a worker node:
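A sketch mirroring the master script up to the package installation, but without initializing the control plane (the node will be joined to the cluster manually later); the same repository and version assumptions apply:

#!/bin/bash
# Startup script for the worker nodes (sketch): same prerequisites as the master
set -e

# Disable swap, as required by the kubelet
swapoff -a
sed -i '/ swap / s/^/#/' /etc/fstab

# Kernel modules and sysctl settings for pod networking
modprobe overlay
modprobe br_netfilter
cat <<EOF >/etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF
sysctl --system

# Install and configure containerd as the container runtime
apt-get update
apt-get install -y containerd apt-transport-https ca-certificates curl gpg
mkdir -p /etc/containerd
containerd config default >/etc/containerd/config.toml
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
systemctl restart containerd

# Install kubeadm, kubelet and kubectl (same version as on the master)
mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.28/deb/Release.key | \
  gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.28/deb/ /" \
  >/etc/apt/sources.list.d/kubernetes.list
apt-get update
apt-get install -y kubelet kubeadm kubectl
apt-mark hold kubelet kubeadm kubectl

# The node will be joined to the cluster later with 'kubeadm join'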

4. Startup

With everything above in place, we are now ready to ignite the rocket.


Run a ‘terraform init’ in the configuration folder to allow Terraform to download the required providers. For this tutorial we store the state file locally; if you want to have a remote state, please see my post about remote state file definition:


Terraform Remote State on AWS


Next, run “terraform fmt” and “terraform validate” to normalize formatting and ensure that the configuration is valid. If everything is exactly as provided above there should be no errors, and we can now start the deployment:
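The full sequence, carried out in the configuration folder, is:

terraform fmt
terraform validate
terraform apply

Terraform will show its plan and ask for confirmation; answer ‘yes’ to start the deployment.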

This will take a couple of minutes. After Terraform has finished the deployment, it will output the IP addresses of the deployed instances in the terminal.


5. Add the worker nodes to the cluster

You should now have a master node and one or more worker nodes up and running, but there are still a few additional steps required to join the worker nodes to the cluster:

Wait a couple of minutes, as the installation inside the nodes takes a bit (2 to 3 minutes should be sufficient), and then connect via SSH to your master node:
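The default user on the Ubuntu AMIs is ubuntu, so the login could look like this (use the master IP from the Terraform output):

ssh ubuntu@<master-node-public-ip>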

Once logged in to the master node, we can check whether the Calico plugin was deployed successfully. It may take a bit for the pods to start up, but soon there should be two Calico pods, and the CoreDNS pods should be in Running state as well:
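For example:

kubectl get pods -n kube-system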

Now we have to get the parameters to join the nodes to our Kubernetes cluster. On the master node, run:


sudo kubeadm token create --print-join-command


Copy the output to the clipboard; we will need exactly this command, with the keys of your Kubernetes cluster, on each node.


Log in to each node via SSH, using the IP addresses provided by Terraform, and run the copied command:
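On each worker the sequence looks roughly like this; the token and hash are the values printed by the join command on the master, not the placeholders shown here:

ssh ubuntu@<worker-node-public-ip>
sudo kubeadm join <master-ip>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>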

Back on the master node, check that each node has been added successfully:


kubectl get nodes

For more detailed information:

kubectl get pod -A


Now you have a functional Kubernetes cluster, ready for your container deployments.


Well Done!


6. Terraform destroy

Once you’re done with testing, do not forget to run a


terraform destroy


in the same folder where all the above configuration is held. This will destroy all nodes, the network configuration, and the firewall setup, and avoid additional costs!


7. Conclusion

Congratulations on setting up your unmanaged Kubernetes cluster on AWS using Terraform! This gives you full control over your cluster configuration and management. To ensure the security and performance of your cluster, it is important to keep your cluster and its components updated. You may also need to consider additional aspects like logging, monitoring, high availability, and backup in a production environment.


I hope this tutorial has been helpful. For more information on Kubernetes, please visit the following resources:


Kubernetes documentation: https://kubernetes.io/docs/home/

Terraform documentation: https://www.terraform.io/docs/

AWS documentation on Kubernetes: https://docs.aws.amazon.com/eks/latest/userguide/



Happy container orchestration! 🚀

Time to get a coffee.

