Perform a Rolling Restart of Redpanda in Kubernetes

A rolling restart involves restarting one broker at a time while the remaining brokers in your cluster continue running. Rolling restarts help to minimize downtime during a full cluster restart. You should perform a rolling restart during operations such as configuration updates that require a restart, version upgrades, or cluster maintenance.

Prerequisites

You must have the following:

What happens during a rolling restart

When you run Redpanda in Kubernetes, your Redpanda cluster is managed as a StatefulSet where each broker runs inside its own Pod. As a result, you can perform a rolling restart using the Kubernetes API to terminate one Pod at a time, starting from the one with the highest ordinal.

During a rolling restart, the Redpanda Helm chart automates the following procedure on each broker, using the preStop and postStart lifecycle hooks:

  1. The preStop hook is executed immediately before a container is terminated. The preStop hook is responsible for the following:

    1. Place the broker into maintenance mode.

      Placing brokers into maintenance mode reduces the risk of interruption or degradation in service. When a broker is placed into maintenance mode, it reassigns its partition leadership to other brokers for all topics that have a replication factor greater than one (three is the default replication factor for topics). Reassigning partition leadership involves draining leadership from the broker and transferring that leadership to another broker.

    2. Terminate the Pod.

      After the preStop hook completes its tasks, Kubernetes sends a SIGTERM signal to the container, signaling it to shut down.

      Maintenance mode may not have finished when the SIGTERM is sent. As a result, Kubernetes waits for the duration of the terminationGracePeriodSeconds for Redpanda to shut down gracefully. If Redpanda is still running when that period expires, Kubernetes sends a SIGKILL to the container to forcefully terminate it. The Pod is then terminated and restarted according to the default rolling update policy of the StatefulSet.

      The default terminationGracePeriodSeconds is 90 seconds, which should be long enough for maintenance mode to finish in large clusters. You can test different values in a development environment. To configure this value, use the statefulset.terminationGracePeriodSeconds setting.

  2. The postStart hook is executed immediately after a container is created. The postStart hook takes the broker out of maintenance mode. This action re-integrates the broker into the cluster, allowing it to start handling requests and participate in the cluster’s operations again.
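
To see what the chart configured on a running broker, you can read the lifecycle hooks and the grace period from the Pod spec. The following is a minimal sketch, not the chart's own tooling. The function names are hypothetical, the container name redpanda is taken from the commands on this page, and the Helm release name redpanda and chart reference redpanda/redpanda are assumptions that may differ in your deployment.

```shell
# Print the lifecycle hooks and the grace period that apply to one broker Pod.
# Usage: show_shutdown_settings <pod-name> <namespace>
show_shutdown_settings() {
  pod="$1"; namespace="$2"
  # preStop and postStart hooks of the container named "redpanda"
  kubectl get pod "$pod" --namespace "$namespace" \
    -o jsonpath='{.spec.containers[?(@.name=="redpanda")].lifecycle}'
  echo
  # Seconds Kubernetes waits for a graceful shutdown before sending SIGKILL
  kubectl get pod "$pod" --namespace "$namespace" \
    -o jsonpath='{.spec.terminationGracePeriodSeconds}'
  echo
}

# Change the grace period through the Helm chart setting named on this page.
# Assumes the release is "redpanda" and the chart is "redpanda/redpanda".
# Usage: set_grace_period <namespace> <seconds> [release-name]
set_grace_period() {
  namespace="$1"; seconds="$2"; release="${3:-redpanda}"
  helm upgrade "$release" redpanda/redpanda --namespace "$namespace" \
    --reuse-values --set statefulset.terminationGracePeriodSeconds="$seconds"
}
```

For example, show_shutdown_settings redpanda-0 <namespace> prints the hooks followed by the grace period. Because set_grace_period changes the Pod template, expect it to roll the Pods as well, so try it in a development environment first.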

Impact of broker restarts

When brokers restart, clients may experience higher latency, nodes may experience CPU spikes when the broker becomes available again, and you may receive alerts about under-replicated partitions. Topics that don't use replication (that is, topics with replication.factor=1) are unavailable while the broker that hosts them is down.
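
To find the topics that would become unavailable, you can filter the topic list by replication factor. This is a small sketch with a hypothetical helper name. It assumes that rpk topic list prints a header row followed by the columns NAME, PARTITIONS, and REPLICAS, so check the output of your rpk version before relying on it.

```shell
# List topics whose replication factor is 1.
# Reads `rpk topic list` output on stdin.
# Assumes a header row and the columns NAME PARTITIONS REPLICAS.
single_replica_topics() {
  awk 'NR > 1 && $3 == 1 { print $1 }'
}

# Usage (placeholders as in the rest of this page):
#   kubectl exec <pod-name> --namespace <namespace> -c redpanda -- \
#     rpk topic list | single_replica_topics
```

An empty result means that every topic has at least two replicas.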

Temporary increase in latency on clients (producers and consumers)

When you restart one or more brokers in a cluster, clients (consumers and producers) may experience higher latency due to partition leadership reassignment. Because clients must communicate with the leader of a partition, they may send a request to a broker whose leadership has been transferred, and receive a NOT_LEADER_FOR_PARTITION error. In this case, clients must request metadata from the cluster to find out the address of the new leader. Clients refresh their metadata periodically, or when they receive certain retryable errors that indicate that the metadata may be stale. For example:

  1. Broker A shuts down.

  2. Client sends a request to broker A, and receives NOT_LEADER_FOR_PARTITION.

  3. Client requests metadata, and learns that the new leader is broker B.

  4. Client sends the request to broker B.
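
To observe the leadership transfer behind this sequence, you can print the partition leaders of a topic before and during the restart. This is a sketch only: the function name is hypothetical and the topic name is whatever topic you want to inspect.

```shell
# Print the partitions of one topic, including the current leader of each.
# Usage: show_partition_leaders <pod-name> <namespace> <topic>
show_partition_leaders() {
  kubectl exec "$1" --namespace "$2" -c redpanda -- \
    rpk topic describe "$3" -p
}
```

Run it against a Pod that is not currently restarting. The leader of a partition should move away from a broker while that broker is in maintenance mode.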

CPU spikes upon broker restart

When a restarted broker becomes available again, you may see your nodes' CPU usage increase temporarily. This temporary increase in CPU usage is due to the cluster rebalancing the partition replicas.

Under-replicated partitions

When a broker is in maintenance mode, Redpanda continues to replicate updates to that broker. When a broker is taken offline during a restart, partitions with replicas on the broker could become out of sync until it is brought back online. Once the broker is available again, data is copied to its under-replicated replicas until all affected partitions are in sync with the partition leader.
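
To wait for the cluster to catch up, you can poll rpk cluster health until it reports a healthy state. The sketch below uses hypothetical function names and assumes that the health output contains a line beginning with "Healthy:" followed by true or false, as in the example output on this page. The default attempt count and delay are arbitrary.

```shell
# Succeeds when `rpk cluster health` output (on stdin) reports a healthy cluster.
is_healthy() {
  grep -Eq '^Healthy:[[:space:]]+true'
}

# Poll the cluster until it is healthy or the attempts run out.
# Usage: wait_until_healthy <pod-name> <namespace> [attempts] [delay-seconds]
wait_until_healthy() {
  pod="$1"; namespace="$2"; attempts="${3:-30}"; delay="${4:-10}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    if kubectl exec "$pod" --namespace "$namespace" -c redpanda -- \
        rpk cluster health | is_healthy; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "cluster still unhealthy after $attempts checks" >&2
  return 1
}
```

The function returns a non-zero status when the cluster stays unhealthy, so it can gate later steps in a script.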

Perform a rolling restart

  1. Check for topics that have a replication factor of one.

    Partitions that live on only one broker will be offline during the restart. If you have topics with a replication factor of 1, and if you have sufficient disk space, temporarily increase the replication factor to limit outages for these topics during the rolling restart.

  2. Ensure that the cluster is healthy:

    kubectl exec <pod-name> --namespace <namespace> -c redpanda -- \
      rpk cluster health

    The draining process won’t start until the cluster is healthy.

    Example output:

    CLUSTER HEALTH OVERVIEW
    =======================
    Healthy:                     true (1)
    Controller ID:               0
    All nodes:                   [0 1 2] (2)
    Nodes down:                  [] (3)
    Leaderless partitions:       [] (3)
    Under-replicated partitions: [] (3)

    1 The cluster is either healthy (true) or unhealthy (false).
    2 The node IDs of all brokers in the cluster.
    3 These fields contain data only when the cluster is unhealthy.

  3. Trigger a rolling restart of all Pods in the StatefulSet:

    kubectl rollout restart statefulset redpanda --namespace=<namespace>

  4. Wait for all Pods to restart:

    kubectl rollout status statefulset redpanda --namespace=<namespace> --watch
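
Steps 2 to 4 can be combined into one guarded function. This is a sketch under the same assumptions as the commands above (a StatefulSet named redpanda and a container named redpanda). The function name is hypothetical, and the health check assumes a "Healthy:" line in the rpk cluster health output.

```shell
# Restart all brokers one at a time, but only if the cluster is healthy first.
# Usage: rolling_restart <pod-name> <namespace> [statefulset-name]
rolling_restart() {
  pod="$1"; namespace="$2"; statefulset="${3:-redpanda}"

  # Step 2: refuse to start when the cluster is not healthy.
  if ! kubectl exec "$pod" --namespace "$namespace" -c redpanda -- \
      rpk cluster health | grep -Eq '^Healthy:[[:space:]]+true'; then
    echo "cluster is not healthy; aborting" >&2
    return 1
  fi

  # Step 3: trigger the restart. Step 4: block until every Pod is back.
  kubectl rollout restart statefulset "$statefulset" --namespace="$namespace" &&
    kubectl rollout status statefulset "$statefulset" --namespace="$namespace" --watch
}
```

The function does not cover step 1, so check for topics with a single replica before you call it.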

Verify the cluster’s health

To verify that the cluster is running properly, run:

kubectl exec <pod-name> --namespace <namespace> -c redpanda -- \
  rpk cluster health

To view additional information about your brokers, run:

kubectl exec <pod-name> --namespace <namespace> -c redpanda -- \
  rpk redpanda admin brokers list
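
Because the postStart hook takes each broker out of maintenance mode, no broker should remain in maintenance mode after the restart completes. A minimal way to confirm this, assuming that your rpk version provides the rpk cluster maintenance status command (the function name is hypothetical):

```shell
# Show which brokers, if any, are still in maintenance mode.
# Usage: maintenance_status <pod-name> <namespace>
maintenance_status() {
  kubectl exec "$1" --namespace "$2" -c redpanda -- \
    rpk cluster maintenance status
}
```

If a broker is still reported as being in maintenance mode, its postStart hook may not have completed.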

Suggested reading