Controllers

Kubernetes controllers are control loops that continuously monitor the cluster and work to make its actual state match the desired state defined in your manifests. Controllers are a key part of Kubernetes’ self-healing and automation capabilities, making it possible to manage complex, distributed applications efficiently.

What is a Controller?

A controller in Kubernetes is a control loop that watches the state of your cluster and makes changes to ensure that the cluster’s actual state matches the desired state described in your manifests. The desired state is defined by Kubernetes objects such as Deployments, Services, or Pods, and the controller takes action when there is a difference between the current state and the desired state.

How Controllers Work

Controllers follow a common pattern:

  1. Observe: The controller watches the current state of the cluster by querying the Kubernetes API server.
  2. Compare: It compares the current state of the cluster to the desired state as specified by the manifests.
  3. Act: If the current state does not match the desired state, the controller takes corrective action to reconcile the two.

For example, if a Deployment specifies that there should be three replicas of a Pod running, but only two are running due to a node failure, the controller will create a new Pod to replace the missing one.
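The observe/compare/act loop above can be sketched in a few lines of Python. This is an illustrative simulation, not real controller code: the `reconcile` function and the pod names are invented for the example, and a real controller would talk to the API server rather than receive lists directly.

```python
# Illustrative sketch of a controller's reconcile step (not real Kubernetes code).
# The arguments stand in for what the controller would observe via the API server.

def reconcile(desired_replicas, running_pods):
    """Compare desired vs. actual state and return the corrective actions."""
    diff = desired_replicas - len(running_pods)
    if diff > 0:
        return [("create", i) for i in range(diff)]              # too few: create Pods
    if diff < 0:
        return [("delete", pod) for pod in running_pods[diff:]]  # too many: delete extras
    return []                                                    # states match: nothing to do

# A Deployment asks for 3 replicas, but a node failure left only 2 running:
actions = reconcile(3, ["pod-a", "pod-b"])
print(actions)  # [('create', 0)] -- one replacement Pod is created
```

A real controller runs this loop continuously, so a corrective action taken in one pass is observed and re-checked in the next.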

Types of Controllers and Their Examples

Kubernetes includes several built-in controllers, each responsible for a specific type of resource. Below are some of the most important controllers with detailed examples.

1. ReplicationController

Purpose: Ensures that a specified number of pod replicas are running at all times. Note that ReplicationController is the original replication API and has largely been superseded by ReplicaSet, but it remains a clear illustration of the pattern.

  • Scenario: You have a simple web application, and you want to ensure that exactly three instances of this application are always running.
  • Manifest Example:
  apiVersion: v1
  kind: ReplicationController
  metadata:
    name: my-replication-controller
  spec:
    replicas: 3
    selector:
      app: myapp
    template:
      metadata:
        labels:
          app: myapp
      spec:
        containers:
        - name: nginx
          image: nginx:1.21
          ports:
          - containerPort: 80
  • How It Works:
  • The ReplicationController checks the number of Pods that match the selector (app: myapp).
  • If fewer than three Pods are running, it creates new Pods.
  • If more than three are running (perhaps due to a manual intervention), it deletes the extra Pods.

2. ReplicaSet

Purpose: The successor to the ReplicationController, with more advanced features, including support for set-based label selectors (matchExpressions) in addition to equality-based matchLabels.

  • Scenario: You want to maintain three instances of a specific version of your application.
  • Manifest Example:
  apiVersion: apps/v1
  kind: ReplicaSet
  metadata:
    name: my-replicaset
  spec:
    replicas: 3
    selector:
      matchLabels:
        app: myapp
        version: v1
    template:
      metadata:
        labels:
          app: myapp
          version: v1
      spec:
        containers:
        - name: nginx
          image: nginx:1.21
          ports:
          - containerPort: 80
  • How It Works:
  • The ReplicaSet controller monitors the Pods and ensures there are exactly three Pods matching the labels app: myapp and version: v1.
  • If a Pod is deleted or fails, the ReplicaSet controller creates a new one.
  • ReplicaSet is typically used by Deployments to manage Pods.
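The selector matching that drives this reconciliation can be sketched in Python. This is a simplified matcher invented for illustration (real Kubernetes selectors also support the Exists and DoesNotExist operators), but it shows how equality-based matchLabels and set-based matchExpressions are evaluated against a Pod's labels:

```python
# Sketch of how a ReplicaSet-style selector matches Pod labels.
# Supports equality rules (matchLabels) and the In/NotIn set-based rules.

def matches(selector, pod_labels):
    for key, value in selector.get("matchLabels", {}).items():
        if pod_labels.get(key) != value:
            return False
    for expr in selector.get("matchExpressions", []):
        key, op, values = expr["key"], expr["operator"], expr.get("values", [])
        if op == "In" and pod_labels.get(key) not in values:
            return False
        if op == "NotIn" and pod_labels.get(key) in values:
            return False
    return True

selector = {"matchLabels": {"app": "myapp"},
            "matchExpressions": [{"key": "version", "operator": "In", "values": ["v1", "v2"]}]}
print(matches(selector, {"app": "myapp", "version": "v1"}))  # True
print(matches(selector, {"app": "myapp", "version": "v3"}))  # False
```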

3. Deployment Controller

Purpose: Provides declarative updates to Pods and ReplicaSets, and handles rolling updates, rollbacks, and scaling of applications.

  • Scenario: You need to deploy a new version of your application without downtime.
  • Manifest Example:
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: my-deployment
  spec:
    replicas: 3
    selector:
      matchLabels:
        app: myapp
    template:
      metadata:
        labels:
          app: myapp
      spec:
        containers:
        - name: nginx
          image: nginx:1.21
          ports:
          - containerPort: 80
    strategy:
      type: RollingUpdate
      rollingUpdate:
        maxUnavailable: 1
        maxSurge: 1
  • How It Works:
  • The Deployment controller manages the rollout of new versions of an application.
  • It creates a new ReplicaSet for the new version and incrementally replaces the old Pods with new ones, following the strategy specified (e.g., RollingUpdate).
  • If something goes wrong during the update, you can roll back to the previous ReplicaSet (for example, with kubectl rollout undo deployment/my-deployment).
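The effect of maxUnavailable and maxSurge can be made concrete with a small Python sketch. The `rollout_bounds` function is invented for illustration; it computes the limits the Deployment controller respects while swapping old Pods for new ones:

```python
# Sketch of the pod-count bounds a RollingUpdate strategy enforces.
# maxUnavailable caps how far below `replicas` the available count may drop;
# maxSurge caps how far above `replicas` the total count may rise.

def rollout_bounds(replicas, max_unavailable, max_surge):
    min_available = replicas - max_unavailable
    max_total = replicas + max_surge
    return min_available, max_total

# For the Deployment above (replicas: 3, maxUnavailable: 1, maxSurge: 1):
print(rollout_bounds(3, 1, 1))  # (2, 4): at least 2 Pods available, at most 4 Pods in flight
```

Within those bounds, the controller alternates between starting new-version Pods and terminating old-version Pods until the rollout completes.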

4. StatefulSet Controller

Purpose: Manages stateful applications, ensuring the order and uniqueness of Pods.

  • Scenario: You are deploying a distributed database like Cassandra or a messaging system like Kafka, where the order and identity of each Pod are important.
  • Manifest Example:
  apiVersion: apps/v1
  kind: StatefulSet
  metadata:
    name: my-statefulset
  spec:
    serviceName: "my-service"
    replicas: 3
    selector:
      matchLabels:
        app: myapp
    template:
      metadata:
        labels:
          app: myapp
      spec:
        containers:
        - name: nginx
          image: nginx:1.21
          ports:
          - containerPort: 80
    volumeClaimTemplates:
    - metadata:
        name: my-storage
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 1Gi
  • How It Works:
  • The StatefulSet controller ensures that the Pods are created in order (e.g., my-statefulset-0, my-statefulset-1, my-statefulset-2).
  • If a Pod fails, it is recreated with the same name, maintaining its identity.
  • Each Pod has its own persistent storage, claimed through the volumeClaimTemplates, which is preserved across rescheduling.
  • The serviceName field must reference a headless Service, which gives each Pod a stable DNS name (e.g., my-statefulset-0.my-service).

5. DaemonSet Controller

Purpose: Ensures that a copy of a Pod runs on all or some nodes in the cluster.

  • Scenario: You need to deploy a logging or monitoring agent (like Fluentd or Prometheus node exporter) on every node in your cluster.
  • Manifest Example:
  apiVersion: apps/v1
  kind: DaemonSet
  metadata:
    name: my-daemonset
  spec:
    selector:
      matchLabels:
        app: myapp
    template:
      metadata:
        labels:
          app: myapp
      spec:
        containers:
        - name: my-agent
          image: my-monitoring-agent:1.0
          ports:
          - containerPort: 9100
  • How It Works:
  • The DaemonSet controller ensures that a Pod runs on all or a specific subset of nodes.
  • As new nodes are added to the cluster, the DaemonSet automatically schedules the required Pods on them.
  • If a node is removed, the Pods on that node are also removed.
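The DaemonSet's desired state is simply "one Pod per eligible node", so its reconciliation reduces to a set difference. The `daemonset_diff` function and node names below are invented for illustration:

```python
# Sketch of DaemonSet reconciliation: the desired state is one Pod per
# eligible node, so node membership changes drive Pod creation and deletion.

def daemonset_diff(nodes, pods_by_node):
    to_create = [n for n in nodes if n not in pods_by_node]    # new nodes need a Pod
    to_delete = [n for n in pods_by_node if n not in nodes]    # departed nodes lose theirs
    return to_create, to_delete

# node-3 just joined the cluster; node-0 was removed:
print(daemonset_diff(["node-1", "node-2", "node-3"],
                     {"node-0": "pod-a", "node-1": "pod-b", "node-2": "pod-c"}))
# (['node-3'], ['node-0'])
```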

6. Job Controller

Purpose: Manages Jobs, which run tasks to completion (e.g., batch processing tasks).

  • Scenario: You need to run a batch job that processes data and completes when the task is finished.
  • Manifest Example:
  apiVersion: batch/v1
  kind: Job
  metadata:
    name: my-job
  spec:
    template:
      metadata:
        labels:
          app: myjob
      spec:
        containers:
        - name: batch-job
          image: my-batch-image:1.0
        restartPolicy: OnFailure
    backoffLimit: 4
  • How It Works:
  • The Job controller ensures that the specified task runs to completion.
  • If a Pod fails, the Job controller creates a replacement and retries; once the number of failures exceeds the specified backoffLimit, the Job is marked as failed.
  • Once the Job completes successfully, it does not restart the Pods.
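The retry semantics can be sketched in Python. The `run_job` function is invented for illustration; it takes a sequence of attempt outcomes and reports the Job's resulting status under a given backoffLimit:

```python
# Sketch of the Job controller's retry behavior: failed attempts are retried
# until one succeeds or backoffLimit is exhausted, then the Job is marked Failed.

def run_job(attempt_results, backoff_limit):
    failures = 0
    for succeeded in attempt_results:
        if succeeded:
            return "Complete"
        failures += 1
        if failures > backoff_limit:
            return "Failed"
    return "Running"  # still retrying within the limit

print(run_job([False, False, True], backoff_limit=4))  # Complete
print(run_job([False] * 5, backoff_limit=4))           # Failed
```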

7. CronJob Controller

Purpose: Manages time-based jobs, like running a Job on a schedule (similar to cron in Unix/Linux).

  • Scenario: You need to run a backup job every night at midnight.
  • Manifest Example:
  apiVersion: batch/v1
  kind: CronJob
  metadata:
    name: my-cronjob
  spec:
    schedule: "0 0 * * *"
    jobTemplate:
      spec:
        template:
          spec:
            containers:
            - name: cronjob
              image: my-backup-image:1.0
            restartPolicy: OnFailure
  • How It Works:
  • The CronJob controller manages the scheduling and execution of Jobs based on the specified cron schedule.
  • At the specified time (0 0 * * * for midnight), the controller creates a new Job object to execute the task.
  • If a run fails, the created Job retries it according to its restartPolicy and, if set, its backoffLimit.
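How a cron expression like 0 0 * * * selects a time can be sketched with a simplified matcher. The `cron_matches` function is invented for illustration and only handles "*" and plain numbers; real cron syntax also supports ranges, lists, and steps, and numbers day-of-week with 0 = Sunday:

```python
# Simplified sketch of cron-expression matching (fields: minute, hour,
# day-of-month, month, day-of-week). Only "*" and plain numbers are handled.

from datetime import datetime

def cron_matches(expr, when):
    minute, hour, dom, month, dow = expr.split()
    fields = [(minute, when.minute), (hour, when.hour), (dom, when.day),
              (month, when.month), (dow, when.weekday())]
    return all(f == "*" or int(f) == actual for f, actual in fields)

print(cron_matches("0 0 * * *", datetime(2024, 1, 1, 0, 0)))   # True: midnight
print(cron_matches("0 0 * * *", datetime(2024, 1, 1, 9, 30)))  # False
```

Each time a tick matches the schedule, the CronJob controller stamps out a fresh Job object from jobTemplate; the Job controller then takes over as described above.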

Summary

Controllers in Kubernetes are the automation backbone of the system. They ensure that the desired state of your applications, as defined in the manifests, is continuously maintained across the cluster. By abstracting complex operational tasks like scaling, self-healing, and rolling updates behind declarative manifests, they let you focus on what should run rather than on how to keep it running.