Kubernetes StatefulSets Overview

When you run Kubernetes in production, you have a long list of options for deploying containerized applications. One of them is the Kubernetes StatefulSet, which allows your data to persist after your application containers cease to exist. This makes StatefulSets a fit for databases, other data stores, and stateful applications in general.

To learn more, keep reading this overview and tutorial on K8s StatefulSets. We will start by explaining what StatefulSets are and when to use them. Then, we’ll show you how to create, update, and delete them. Finally, we will compare StatefulSets with Deployments, DaemonSets, and ReplicaSets, and explain when to use each one.

For the purposes of this tutorial, we created a sample cluster using Kind.

Let’s get started.

What Are Kubernetes StatefulSets

A Kubernetes StatefulSet represents a set of pods, each with its own state requirements. It specifies the need for dedicated volumes, unique hostname records, and a specific order of deployment. The primary idea behind StatefulSets is to allow developers to deploy applications that store data in a filesystem and need to re-attach to that data if a pod restarts after a failure. Examples include databases like MySQL, PostgreSQL, and Redis, HTTP servers like NGINX and Apache, and persistent brokers like Kafka and Zookeeper.

When you deploy a StatefulSet, K8s will assign each replica its own state (volumes) and guarantee the order of deployment and updates. For example, if you specify 3 replicas for a StatefulSet, it will deploy them in order and assign each one its own PersistentVolumeClaims (PVCs). When you scale a StatefulSet down, the pods are terminated in the reverse of the order in which they were deployed. Neither scaling down nor deleting a StatefulSet deletes any of the PVCs by default, which keeps the data safe.

So, let’s discuss the main reasons why you would use StatefulSets.

When to Use a StatefulSet

You only want to use a StatefulSet when you have specific pod requirements. First, you need to differentiate stateful and stateless applications. A stateful application persists its state in a file system, and its main responsibility is to manage how that state is accessed. Database systems and applications that use the file system to store information internally are examples of stateful applications.

On the other hand, a stateless application does not hold any client data that can be used in the future or survive after a restart. Examples of stateless applications include software agents, web applications, and lambda functions.

After you have determined which applications are stateful, you can plan a role for each replica in the StatefulSet. With N replicas, the pods are created in order from 0 to N-1 and deleted in reverse order from N-1 to 0. This makes it possible to set the first pod as primary, for example, and the others as replica pods. The primary pod could handle both read and write requests from the client, and the other pods could sync with the first pod for data replication. When you introduce a new pod by scaling the StatefulSet up, K8s will reserve a new PVC for that pod.
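
One way to make the primary/replica split concrete is to give the first pod its own Service. The sketch below is only an illustration: it assumes a StatefulSet named db (so its first pod is db-0), and the names db and db-primary and the port 5432 are hypothetical. It relies on the statefulset.kubernetes.io/pod-name label that the StatefulSet controller adds to each pod it creates:

```yaml
# Hypothetical Service that routes traffic only to the first pod (ordinal 0)
# of an assumed StatefulSet named "db".
apiVersion: v1
kind: Service
metadata:
  name: db-primary            # hypothetical name
spec:
  selector:
    # Label set by the StatefulSet controller on every pod it manages
    statefulset.kubernetes.io/pod-name: db-0
  ports:
  - port: 5432                # assumed database port
    name: db
```

Clients that need to write would connect to db-primary, while read traffic could go through a Service that selects all of the pods.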

Next, we’ll show you how to create and update StatefulSets.

How to Create a StatefulSet

For this demonstration, we will use the following StatefulSet manifest:

stateful-set.yml

---
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app: nginx
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-www
spec:
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 1Gi
  hostPath:
    path: /www/
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
  labels:
    app: nginx
spec:
  serviceName: "nginx"
  selector:
    matchLabels:
      app: nginx
  replicas: 3
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.21.6
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
      storageClassName: standard

There are a few things to note:

  • The Service spec defines a headless service (with clusterIP: None), which means that K8s does not allocate a service IP address or load-balance traffic. Instead, the DNS server returns the IPs of the individual pods rather than a single service IP, so a client can connect to any specific pod.
  • Each PVC created from the volumeClaimTemplates needs a PersistentVolume to bind to, either one you provision yourself (like pv-www here) or one provisioned dynamically by the storage class. Otherwise, the PVC and its pod will stay in the Pending state.
  • The StatefulSet spec uses a special volumeClaimTemplates field that defines which template to use for creating a PVC. Each of the replicas in our example will get its own PVC.
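
To see the per-pod DNS records that the headless service produces, you could run a throwaway pod that resolves one of them. This is a minimal sketch rather than part of the original setup: the pod name dns-test and the busybox:1.28 image are assumptions, and the pod is expected to run in the same namespace as the StatefulSet:

```yaml
# Throwaway pod (hypothetical name) that looks up the DNS record of web-0.
# Pod records behind a headless service follow <pod-name>.<service-name>.
apiVersion: v1
kind: Pod
metadata:
  name: dns-test
spec:
  restartPolicy: Never        # run the lookup once and stop
  containers:
  - name: dns-test
    image: busybox:1.28       # assumed image that ships nslookup
    command: ["nslookup", "web-0.nginx"]
```

Once the pod has completed, kubectl logs dns-test should show the address that web-0 resolves to. In the default namespace, and assuming the default cluster domain, the fully qualified form of the name is web-0.nginx.default.svc.cluster.local.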

After you have applied the above specification, you can check the status:

❯ kubectl describe statefulsets
Name:               web
Namespace:          default
CreationTimestamp:  Tue, 08 Mar 2022 13:12:38 +0000
Selector:           app=nginx
Labels:             app=nginx
Annotations:        <none>
Replicas:           3 desired | 3 total
Update Strategy:    RollingUpdate
  Partition:        0
Pods Status:        3 Running / 0 Waiting / 0 Succeeded / 0 Failed
…
Events:
  Type    Reason            Age   From                    Message
  ----    ------            ----  ----                    -------
  Normal  SuccessfulCreate  11m   statefulset-controller  create Pod web-0 in StatefulSet web successful
  Normal  SuccessfulCreate  11m   statefulset-controller  create Pod web-1 in StatefulSet web successful
  Normal  SuccessfulCreate  11m   statefulset-controller  create Pod web-2 in StatefulSet web successful

There are many ways to update a StatefulSet. The simplest is to scale the number of replicas up/down by using the following command:

❯ kubectl scale statefulset web --replicas 4
statefulset.apps/web scaled
❯ kubectl get pods -l app
NAME    READY   STATUS    RESTARTS   AGE
web-0   1/1     Running   0          4h1m
web-1   1/1     Running   0          4h1m
web-2   1/1     Running   0          4h1m
web-3   1/1     Running   0          6s

When you scale a StatefulSet down to 0, you can watch the order of operations. K8s terminates the last pod first, followed by the second-to-last, and so on:

❯ kubectl scale statefulset web --replicas 0
❯ kubectl get pods -l app -w
NAME    READY   STATUS        RESTARTS   AGE
web-0   1/1     Running       0          4h4m
web-1   1/1     Running       0          4h4m
web-2   1/1     Running       0          4h4m
web-3   0/1     Terminating   0          3m15s
web-2   1/1     Terminating   0          4h4m
web-2   0/1     Terminating   0          4h4m
web-1   1/1     Terminating   0          4h4m
web-1   0/1     Terminating   0          4h4m
web-0   1/1     Terminating   0          4h4m
web-0   0/1     Terminating   0          4h4m

K8s allows you to customize the behavior of the update strategy using the updateStrategy spec field. You can customize the behavior of the PVC retention using the persistentVolumeClaimRetentionPolicy spec field.
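
As a rough illustration of these two fields, the fragment below could be merged into the StatefulSet spec from stateful-set.yml. The values are examples, not recommendations. Note that persistentVolumeClaimRetentionPolicy is a newer field, so depending on your Kubernetes version it may require the StatefulSetAutoDeletePVC feature gate or may not be available at all:

```yaml
# Fragment to merge into the StatefulSet spec (example values only)
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 2        # only pods with ordinal >= 2 (here web-2) receive template changes
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Retain   # keep the PVCs when the StatefulSet is deleted (the default)
    whenScaled: Delete    # delete a pod's PVC when that pod is removed by scaling down
```

A partition like this is one way to stage a rollout: you update the highest-ordinal pod first, check it, and then lower the partition to update the rest.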

If you try to update an existing StatefulSet spec by changing anything other than replicas, template, or updateStrategy, the operation will fail:

❯ kubectl apply -f stateful-set.yml
service/nginx unchanged
The StatefulSet "web" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', and 'updateStrategy' are forbidden

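When you delete a StatefulSet, the PVCs bound to it are not deleted by default. This is to keep the data safe. You must delete them separately (for example, with kubectl delete pvc www-web-0) once you no longer need the data. The listing below shows that the PVCs are still present after the StatefulSet is deleted: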

❯ kubectl get statefulsets
NAME   READY   AGE
web    3/3     24m
❯ kubectl delete statefulsets web
statefulset.apps "web" deleted
❯ kubectl get pvc
NAME        STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
www-web-0   Bound    pv-www                                     1Gi        RWO            standard       24m
www-web-1   Bound    pvc-35ceb9b1-74e6-42e3-a74b-bd8548249562   1Gi        RWO            standard       24m
www-web-2   Bound    pvc-b2b45bed-d2f2-4293-a419-8616f736b12b   1Gi        RWO            standard       24m

You can reapply the spec, which will create the same pods in order and attach the respective PVCs:

❯ kubectl apply -f stateful-set.yml
service/nginx unchanged
persistentvolume/pv-www unchanged
statefulset.apps/web created

You can inspect the pods associated with the StatefulSet using the following command:

❯ kubectl get pods -l app
NAME    READY   STATUS    RESTARTS   AGE
web-0   1/1     Running   0          3h55m
web-1   1/1     Running   0          3h55m
web-2   1/1     Running   0          3h55m

Next, we will explain when you should use Deployments instead of StatefulSets.

StatefulSets vs. Deployments

The key reason to use a StatefulSet is to serve a stateful application. For any other case, it’s recommended that you use a Deployment. When you start a Deployment and specify a PVC, that single PVC is shared by all pod replicas, which is only practical if the volume is read-only or otherwise supports shared access. As we’ve seen, each pod in a StatefulSet gets assigned its own PVC.

When you scale a Deployment up or down, K8s does not care about the order and starts or stops the pods all at once. However, the order matters in a StatefulSet, and K8s will maintain that order when scaling up or down to ensure stability.
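
To make the contrast concrete, here is a minimal sketch of a Deployment whose replicas all mount the same claim. The names web-stateless and shared-www are hypothetical, and the PVC is assumed to exist already with an access mode that lets every replica mount it:

```yaml
# Hypothetical Deployment: all 3 replicas mount one pre-existing PVC read-only.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-stateless
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-stateless
  template:
    metadata:
      labels:
        app: web-stateless
    spec:
      containers:
      - name: nginx
        image: nginx:1.21.6
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
          readOnly: true
      volumes:
      - name: www
        persistentVolumeClaim:
          claimName: shared-www   # assumed to exist; shared by every replica
          readOnly: true
```

There is no volumeClaimTemplates field here: a Deployment references one named claim, whereas a StatefulSet stamps out a separate claim per pod.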

StatefulSets vs. DaemonSets

A DaemonSet is a kind of resource that runs one copy of a pod on each Kubernetes node in the cluster. For example, if you have 3 nodes, it will schedule 3 pods, one for each node. You won’t get this behavior by default in a StatefulSet unless you add scheduling constraints, such as affinity rules, to the pod spec. Without them, K8s can place several of the StatefulSet’s pods on the same node as long as the node has enough resources to handle them.
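
One way to approximate one-pod-per-node placement in a StatefulSet is a pod anti-affinity rule. The fragment below is a sketch under the assumption that the pods carry the app: nginx label from the earlier manifest. Unlike a DaemonSet, it only keeps two of these pods off the same node; it does not make the number of replicas follow the number of nodes:

```yaml
# Fragment to merge into the StatefulSet's pod template (illustrative only)
spec:
  template:
    spec:
      affinity:
        podAntiAffinity:
          # Hard rule: never schedule two app=nginx pods on the same node
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: nginx
            topologyKey: kubernetes.io/hostname
```

With a hard rule like this, a replica stays Pending if there are more replicas than eligible nodes.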

You want to use a DaemonSet rather than a StatefulSet for cross-cutting services like log or app monitors and sidecars. Typically, those services are considered to be long-running, non-critical apps that help facilitate introspection or monitoring.
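
For comparison, a DaemonSet manifest for such an agent might look like the following minimal sketch. The name log-agent is hypothetical, and the busybox container is only a placeholder that lists the node's log directory; a real deployment would use an actual log collector image:

```yaml
# Hypothetical node-level agent: one pod per node, reading the node's /var/log.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-agent
spec:
  selector:
    matchLabels:
      app: log-agent
  template:
    metadata:
      labels:
        app: log-agent
    spec:
      containers:
      - name: agent
        image: busybox:1.28     # placeholder image, not a real log collector
        command: ["sh", "-c", "while true; do ls /var/log; sleep 60; done"]
        volumeMounts:
        - name: varlog
          mountPath: /var/log
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          path: /var/log        # the node's own log directory
```

Note that there is no replicas field: the number of pods follows the number of eligible nodes.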

StatefulSets vs. ReplicaSets 

A ReplicaSet represents a simple replicated pod and is very similar to a Deployment. It uses a pod template much like a StatefulSet does. In a ReplicaSet, however, K8s does not handle rolling updates automatically for you. To update your pods to a new version, you have to create a separate ReplicaSet for the new version, then scale the new one up and the old one down step by step. For most use cases, you should be using Deployments, which offer automatic rollouts. Given that ReplicaSets share many commonalities with Deployments, you should consider them only for stateless applications.

Next Steps

This wraps up our deep-dive tutorial on K8s StatefulSets and how to use them in practice. If you liked our content, you can check out our blog to read more upcoming tutorials from Sysdig related to K8s, cloud security, and open source technologies.