One of the first topics I want to touch on is scaling workloads in Kubernetes. This is a very important concept in the world of Kubernetes that addresses the need to handle increased workloads and distribute traffic accordingly.
Before we jump in and start scaling workloads on Kubernetes, it’s best to quickly stop by the basics: Pods. A Pod is the smallest and simplest unit in Kubernetes. It represents a single instance of a running process within the cluster. A Pod bundles one or more containers together with shared storage resources, a unique IP address, and more.
Pods are designed to be ephemeral and disposable. They can be created, scheduled, and destroyed dynamically based on the needs of an application or the workload. You can create a Pod quickly by writing a simple YAML file containing a Pod definition:
apiVersion: v1
kind: Pod
metadata:
  name: pod
spec:
  containers:
    - name: container
      image: image:tag
      ports:
        - containerPort: 80
One of the primary approaches for scaling applications is by utilizing a ReplicaSet. In Kubernetes, a ReplicaSet is a resource that ensures a specified number of identical Pods (replicas) are running at all times. It helps you maintain the desired level of availability and scalability for your application.
Think of a ReplicaSet as a supervisor monitoring your application and making sure that a desired number of replicas are always up and running. If a Pod fails or gets deleted, the ReplicaSet jumps in and creates a new Pod to replace it.
Scaling works the same way. If you want to scale up your application to handle increased traffic, you can update the replica count in the ReplicaSet, and it will create additional Pods accordingly. Lower the count and it removes the surplus Pods.
ReplicaSets use labels and selectors to identify the Pods they manage. Each Pod controlled by the ReplicaSet must have a label that matches the selector specified in the ReplicaSet configuration.
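To make that bookkeeping concrete, here’s a toy model in Python of what a ReplicaSet does on each pass: count the Pods whose labels match the selector and compare that number to the desired replica count. This is only a sketch for illustration. The function names `matches` and `reconcile` and the dict-based Pod shape are my own inventions, not part of any Kubernetes API.

```python
def matches(selector, labels):
    """True when every key/value pair in the selector appears in the Pod's labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def reconcile(desired, selector, pods):
    """Return how many Pods to create (positive) or delete (negative)."""
    owned = [pod for pod in pods if matches(selector, pod["labels"])]
    return desired - len(owned)


selector = {"app": "my-cool-application"}
pods = [
    {"name": "replicaset-2qnm4", "labels": {"app": "my-cool-application"}},
    {"name": "replicaset-6f9jm", "labels": {"app": "my-cool-application"}},
    {"name": "unrelated", "labels": {"app": "something-else"}},
]

# Two Pods match the selector, so one more is needed to reach three replicas.
print(reconcile(3, selector, pods))  # 1
# Lowering the desired count to 1 means one matching Pod has to go.
print(reconcile(1, selector, pods))  # -1
```

Note that the Pod with a different label is simply invisible to the ReplicaSet, which is why the labels in the Pod template have to match the selector.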
Enough theory, let’s build a ReplicaSet for our application and deploy it to our Kubernetes cluster! Create a Kubernetes manifest like the one below and call the file my-super-cool-replicaset.yaml.
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: replicaset
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-cool-application
  template:
    metadata:
      labels:
        app: my-cool-application
    spec:
      containers:
        - name: container
          image: image:tag
          ports:
            - containerPort: 80
Now that we’ve got our Kubernetes manifest, we can apply it using the kubectl apply command.
[vdeborger@node-01 ~]$ kubectl apply -f my-super-cool-replicaset.yaml
replicaset.apps/replicaset created
We should now be able to see our newly created Pods!
[vdeborger@node-01 ~]$ kubectl get pods --selector=app=my-cool-application
NAME               READY   STATUS    RESTARTS   AGE
replicaset-2qnm4   1/1     Running   0          7s
replicaset-6f9jm   1/1     Running   0          7s
replicaset-t62hg   1/1     Running   0          7s
We’ve got 3 Pods running. That’s good, it’s what we defined in our my-super-cool-replicaset.yaml manifest file. Let’s change that and see what happens. In the my-super-cool-replicaset.yaml file, change the value of the “replicas” field to 2 and apply it again using kubectl apply:
[vdeborger@node-01 ~]$ sed -i 's/\(.*replicas:.*\)/  replicas: 2/g' my-super-cool-replicaset.yaml
[vdeborger@node-01 ~]$ cat my-super-cool-replicaset.yaml | grep replicas:
  replicas: 2
[vdeborger@node-01 ~]$ kubectl apply -f my-super-cool-replicaset.yaml
replicaset.apps/replicaset configured
You should now only see 2 replicas when retrieving the pods with the “app=my-cool-application” label.
[vdeborger@node-01 ~]$ kubectl get pods --selector=app=my-cool-application
NAME               READY   STATUS    RESTARTS   AGE
replicaset-6f9jm   1/1     Running   0          17m
replicaset-t62hg   1/1     Running   0          17m
Yay, cool right?!
The second option we can use to scale applications is a Deployment. Think of a Deployment in Kubernetes as a ReplicaSet with a couple of very handy features strapped to it. It helps you ensure that your application is always available, can scale easily, and can be updated without any downtime. That last part is where it differentiates itself from a ReplicaSet. While ReplicaSets can create multiple Pods of the same kind and keep them online, Deployments can also roll out updates gradually (a rolling update) and roll back broken updates if needed.
When working with a Deployment, you define the desired state of your application. This way, you can define the number of replicas you want to run, and the Deployment controller will make sure that these replicas are in a running state. It takes care of creating and managing the necessary ReplicaSets and Pods to match that desired state.
Now, let’s take a look at an example. Let’s say you want to run an application that has 3 replicas. We’ll use the Pod definition from the beginning of this post as a base.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-cool-application
  template:
    metadata:
      labels:
        app: my-cool-application
    spec:
      containers:
        - name: container
          image: image:tag
          ports:
            - containerPort: 80
Maintaining the desired number of replicas
This will set up a Deployment with 3 replicas (pods) of our application. The Deployment controller will try to keep these 3 replicas online. You can test this by deleting one of the Pods.
First, we’ll need a list of the replicas created by the Deployment:
[vdeborger@node-01 ~]$ kubectl get pods --selector=app=my-cool-application
NAME                          READY   STATUS    RESTARTS   AGE
deployment-767b969b79-429n7   1/1     Running   0          6s
deployment-767b969b79-gzzjn   1/1     Running   0          6s
deployment-767b969b79-hrtjd   1/1     Running   0          6s
We can then go ahead and delete one:
[vdeborger@node-01 ~]$ kubectl delete pods deployment-767b969b79-429n7
pod "deployment-767b969b79-429n7" deleted
If you’re quick enough, you should see a new Pod being created (its status is still ContainerCreating) as the Deployment, through its ReplicaSet, tries to maintain the desired state.
[vdeborger@node-01 ~]$ kubectl get pods --selector=app=my-cool-application
NAME                          READY   STATUS              RESTARTS   AGE
deployment-767b969b79-c7d9m   0/1     ContainerCreating   0          2s
deployment-767b969b79-gzzjn   1/1     Running             0          2m7s
deployment-767b969b79-hrtjd   1/1     Running             0          2m7s
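The Pod names in these listings hint at the chain of ownership: a Deployment names its ReplicaSet after itself plus a hash of the Pod template, and the ReplicaSet names each Pod after itself plus a random suffix. Here’s a small Python sketch that pulls such a name apart, assuming it follows that `<deployment>-<template hash>-<suffix>` pattern (very long names get truncated, which this ignores). `split_pod_name` is a made-up helper, not something Kubernetes provides.

```python
def split_pod_name(pod_name):
    """Split '<deployment>-<template hash>-<suffix>' into its parts.

    Splitting from the right keeps hyphens inside the Deployment name intact.
    """
    deployment, template_hash, suffix = pod_name.rsplit("-", 2)
    return {
        "deployment": deployment,
        "replicaset": f"{deployment}-{template_hash}",
        "pod_suffix": suffix,
    }


parts = split_pod_name("deployment-767b969b79-c7d9m")
print(parts["replicaset"])  # deployment-767b969b79
```

The replacement Pod got a new suffix (c7d9m) but kept the 767b969b79 part, because it was created by the same ReplicaSet from the same Pod template.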
One other feature I want to discuss is rolling updates. A rolling update is a way of updating a Deployment in a controlled and gradual manner. A bare ReplicaSet can’t do this: changing its Pod template only affects Pods that are created afterwards. By using a rolling update, your application will (or at least should) not go down.
During a rolling update, the new version of an application is deployed incrementally, with a gradual transition from the old version to the new version. In simple words: Kubernetes will take a Pod from the old version down and replace it with a new version. Once that Pod has started, it will take another Pod down and replace it. It’ll do this until all the Pods are updated. If, for whatever reason, the update fails (the Pods crash, for example), the rollout stalls: Kubernetes stops replacing the remaining old Pods, but it does not roll back on its own. You can return to the previous version yourself with kubectl rollout undo.
Rolling updates are the default update strategy for a Deployment, so you don’t have to enable them. You can tune how aggressive the rollout is by adding a strategy field to your Deployment manifest (under spec), telling Kubernetes what to do when an update occurs.
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 1
    maxSurge: 1
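What do those two numbers mean in practice? maxUnavailable is how many Pods may be missing from the desired count during the update, and maxSurge is how many extra Pods may exist on top of it. Both accept an absolute number or a percentage (the default for each is 25%); a percentage is rounded up for maxSurge and down for maxUnavailable. The Python sketch below works out the resulting limits. The helper names are my own, and it skips the validation Kubernetes performs on these fields.

```python
def resolve(value, replicas, round_up):
    """Turn an absolute number or a percentage string such as '25%' into a Pod count."""
    if isinstance(value, str) and value.endswith("%"):
        scaled = int(value[:-1]) * replicas
        # Integer arithmetic avoids floating-point surprises when rounding.
        return -(-scaled // 100) if round_up else scaled // 100
    return int(value)


def rollout_bounds(replicas, max_surge="25%", max_unavailable="25%"):
    """Return (fewest available Pods, most Pods in total) allowed during a rollout."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    return replicas - unavailable, replicas + surge


# The strategy from the manifest above, applied to 3 replicas.
print(rollout_bounds(3, max_surge=1, max_unavailable=1))  # (2, 4)
# The defaults: 25% of 3 rounds up to 1 for surge, down to 0 for unavailable.
print(rollout_bounds(3))  # (3, 4)
```

So with 3 replicas and both values set to 1, at least 2 Pods stay available and at most 4 Pods exist at any point during the update.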
Once you’ve created your Deployment with the RollingUpdate strategy, you’ll be able to change the Deployment and see the rolling update in action. A rollout is triggered whenever the Pod template changes. For example, after changing the image tag and applying the manifest again, Kubernetes will replace the existing Pods using a rolling update.
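If you don’t have a cluster at hand, the following toy simulation in Python gives a feel for how the old and new Pod counts move during a rolling update. It’s a deliberately simplified model and not how the Deployment controller is implemented: it assumes every new Pod becomes ready instantly, and `simulate_rollout` is a hypothetical function of my own.

```python
def simulate_rollout(replicas, max_surge, max_unavailable):
    """Return the (old Pods, new Pods) counts after each step of a toy rolling update."""
    if max_surge == 0 and max_unavailable == 0:
        raise ValueError("the rollout could never make progress")
    old, new = replicas, 0
    steps = [(old, new)]
    while old > 0 or new < replicas:
        # Start new Pods, but never exceed replicas + max_surge in total.
        room = replicas + max_surge - (old + new)
        new += min(room, replicas - new)
        # Simplification: new Pods are ready immediately, so old Pods can be
        # stopped as long as replicas - max_unavailable Pods remain.
        removable = (old + new) - (replicas - max_unavailable)
        old -= min(old, max(removable, 0))
        steps.append((old, new))
    return steps


for old, new in simulate_rollout(3, max_surge=1, max_unavailable=0):
    print(f"old={old} new={new}")
```

With maxSurge 1 and maxUnavailable 0, the counts go from 3 old and 0 new, to 2 and 1, to 1 and 2, and finally to 0 old and 3 new: one Pod is swapped per step and the application never drops below 3 Pods.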
A DaemonSet is the odd one in this post. It doesn’t let you set a number of replicas. Instead, a DaemonSet is a Kubernetes resource that ensures that a particular Pod is running on each node in the cluster. If a node is added to the cluster or goes away, the DaemonSet controller takes care of creating or cleaning up the Pod for that node to maintain the desired state. Like a Deployment, a DaemonSet does support rolling updates when you change its Pod template, and you can roll back a bad update.
DaemonSets are mainly used for tasks such as collecting logs and monitoring. They provide a convenient way to distribute Pods on every node in a cluster.
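The “one Pod per node” rule can be modelled in a few lines of Python. This is a sketch under loud assumptions: `reconcile_daemonset` is a hypothetical function, the node names are placeholders, and the `schedulable` flag is my stand-in for everything that can keep a Pod off a node in a real cluster (taints, node selectors, and so on).

```python
def reconcile_daemonset(nodes, pods_by_node):
    """Return (nodes that need a Pod, nodes whose Pod should be removed)."""
    eligible = {node["name"] for node in nodes if node.get("schedulable", True)}
    running = set(pods_by_node)
    return sorted(eligible - running), sorted(running - eligible)


nodes = [
    {"name": "node-a", "schedulable": False},  # hypothetical: e.g. a tainted node
    {"name": "node-b"},
    {"name": "node-c"},
]

# Nothing is running yet, so each eligible node needs a Pod.
print(reconcile_daemonset(nodes, {}))  # (['node-b', 'node-c'], [])

# node-c left the cluster: its Pod is no longer wanted.
running = {"node-b": "pod-1", "node-c": "pod-2"}
print(reconcile_daemonset(nodes[:2], running))  # ([], ['node-c'])
```

Compare this with the ReplicaSet, where the target is a number you choose. Here the target follows from the set of nodes, which is why there is no replicas field.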
Let’s take a look at how we can create a DaemonSet and run a Pod on each node of our cluster. First, we need to create a Kubernetes manifest with a DaemonSet definition in it.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: daemonset
spec:
  selector:
    matchLabels:
      app: my-cool-application
  template:
    metadata:
      labels:
        app: my-cool-application
    spec:
      containers:
        - name: container
          image: image:tag
          ports:
            - containerPort: 80
Once we’ve deployed it to the cluster, we should see 2 Pods running, one on each node the DaemonSet can schedule onto (my cluster has 2 nodes on which it’s running).
[vdeborger@node-01 ~]$ kubectl get pods --selector=app=my-cool-application -o wide
NAME              READY   STATUS    RESTARTS   AGE   IP            NODE      NOMINATED NODE   READINESS GATES
daemonset-rgnbm   1/1     Running   0          55s   10.244.2.20   node-02   <none>           <none>
daemonset-xq7rb   1/1     Running   0          55s   10.244.1.20   node-03   <none>           <none>
That’s it for today! I hope you learned something new about scaling workloads in Kubernetes using ReplicaSets, Deployments, and DaemonSets. Remember that each of these resources has its own use case, so choose the one that best fits your needs. Happy scaling!