Enable Rack Awareness in Kubernetes

This is documentation for Self-Managed v23.3. To view the latest available version of the docs, see v24.2.

Rack awareness allows you to distribute replicas of the same partition across different racks to minimize data loss in the event of a rack failure. A rack is a failure zone that has one or more Redpanda brokers assigned to it.

When you create a topic, you specify the number of partitions for the topic and the number of partition replicas. By default, Redpanda determines where to place the replicas on the cluster such that each replica is on a different broker, if possible.

By defining different racks for a Redpanda cluster, you can specify a preference for the way partition replicas are assigned to brokers. When Redpanda places partition replicas, it takes into account whether a replica has already been placed on a broker in a particular rack. If so, Redpanda chooses a broker in a different rack. This way, partition replicas are distributed across different failure zones, which provides a measure of fault tolerance in the event that a broker or an entire rack becomes unavailable.

When rack awareness is enabled, Redpanda places replicas according to these criteria:

Number of racks vs. replicas - If the cluster has more racks than the number of replicas, each replica is placed on a broker in a unique rack. If the cluster has fewer racks than the number of replicas, some replicas are placed on brokers in the same rack.
Number of available CPU cores - Brokers with more available CPU cores are chosen over brokers with fewer available CPU cores.
Broker utilization - Brokers with fewer partitions are chosen over brokers with more partitions.

When you enable rack awareness in the Redpanda Helm chart, Kubernetes failure zones are treated as racks. Redpanda maps each rack to a failure zone and places partition replicas across them.
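The placement criteria listed above can be pictured with a small model. The following is only an illustrative sketch, not Redpanda's allocator: the `Broker` record, its field names, and the `place_replicas` function are hypothetical, and the order in which the criteria are applied as tie-breakers (rack spread first, then CPU cores, then partition count) is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Broker:
    """Hypothetical broker record; field names are illustrative only."""
    broker_id: int
    rack: str
    free_cores: int   # available CPU cores
    partitions: int   # partition replicas already hosted


def place_replicas(brokers: List[Broker], replication_factor: int) -> List[Broker]:
    """Choose one distinct broker per replica, preferring unused racks."""
    chosen: List[Broker] = []
    candidates = list(brokers)
    # A broker hosts at most one replica of a partition, so never pick
    # more replicas than there are brokers.
    for _ in range(min(replication_factor, len(candidates))):
        used_racks = [b.rack for b in chosen]
        best = min(
            candidates,
            key=lambda b: (
                used_racks.count(b.rack),  # fewest replicas already in this rack
                -b.free_cores,             # then more available CPU cores
                b.partitions,              # then fewer hosted partitions
                b.broker_id,               # deterministic tie-break
            ),
        )
        chosen.append(best)
        candidates.remove(best)
    return chosen


brokers = [
    Broker(0, "rack1", free_cores=4, partitions=10),
    Broker(1, "rack1", free_cores=8, partitions=12),
    Broker(2, "rack2", free_cores=4, partitions=3),
    Broker(3, "rack3", free_cores=4, partitions=7),
]
placement = place_replicas(brokers, 3)
print([(b.broker_id, b.rack) for b in placement])
# → [(1, 'rack1'), (2, 'rack2'), (3, 'rack3')]
```

With three racks and three replicas, each replica lands in its own rack. Asking this model for four replicas forces the last one back into rack1, which mirrors the case where the cluster has fewer racks than replicas.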
For more details about Kubernetes failure zones, see the Kubernetes documentation.

Prerequisites

You must have the following:

Kubernetes cluster: Ensure you have a running Kubernetes cluster, either locally, such as with minikube or kind, or remotely.
Kubectl: Ensure you have the kubectl command-line tool installed and configured to communicate with your cluster.

If you use the Redpanda Operator, you must deploy it with the --set rbac.createRPKBundleCRs=true flag to give it the required ClusterRoles to read node labels and annotations.

Annotate or label Node resources

To assign a failure zone to your Kubernetes nodes, ensure that each of your Node resources is annotated or labeled with a key/value pair that corresponds to a failure zone. The Helm chart assigns each Redpanda broker to a particular rack, according to the failure zone of the Kubernetes node on which the broker is running.

Managed Kubernetes platforms usually annotate Node resources with the availability zone in which the node instance is hosted. For example, topology.kubernetes.io/zone=use1-az1.

To check the value of the topology.kubernetes.io/zone key, run the following:

    kubectl get node \
      -o=custom-columns=NODE:.metadata.name,ZONE:.metadata.annotations."topology\.kubernetes\.io/zone"

Example output:

    NODE              ZONE
    example-worker    use1-az1
    example-worker2   use1-az2
    example-worker3   use1-az3

If you don’t see any values in the ZONE column, make sure to annotate or label your Node resources with key/value pairs that correspond to your fault-tolerance requirements.
For example:

    kubectl annotate node example-worker topology.kubernetes.io/zone=rack1
    kubectl annotate node example-worker2 topology.kubernetes.io/zone=rack2
    kubectl annotate node example-worker3 topology.kubernetes.io/zone=rack3

If you’re running Redpanda in AWS, you can use the following DaemonSet to label your Node resources with a zone ID:

Example labeler DaemonSet

    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: labeler
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: labeler
    rules:
      - apiGroups:
          - ""
        resources:
          - nodes
        verbs:
          # kubectl label reads the Node before patching it
          - get
          - patch
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: labeler
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: labeler
    subjects:
      - kind: ServiceAccount
        name: labeler
        namespace: <namespace>
    ---
    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: labeler
    spec:
      selector:
        matchLabels:
          name: labeler
      template:
        metadata:
          labels:
            name: labeler
        spec:
          serviceAccountName: labeler
          initContainers:
            - name: labeler
              image: debian:bullseye-slim
              imagePullPolicy: IfNotPresent
              env:
                # Expose the name of the node running this Pod to the script as HOST
                - name: HOST
                  valueFrom:
                    fieldRef:
                      fieldPath: spec.nodeName
              command:
                - /bin/bash
                - -c
                - --
              args:
                - |
                  apt-get update -y && apt-get install -y curl jq gnupg apt-transport-https ca-certificates
                  # The debian image runs as root by default and does not ship sudo
                  mkdir -p /etc/apt/keyrings
                  curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.28/deb/Release.key | gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
                  # This overwrites any existing configuration in /etc/apt/sources.list.d/kubernetes.list
                  echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.28/deb/ /' | tee /etc/apt/sources.list.d/kubernetes.list
                  apt-get update -y && apt-get install -y kubectl
                  # Get a token to be able to interact with the EC2 instance metadata API v2
                  # https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configuring-instance-metadata-service.html
                  TOKEN=$(curl -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
                  # Get the current node's AZ ID
                  AZ_ID=$(curl -H "X-aws-ec2-metadata-token: $TOKEN" -v "http://169.254.169.254/latest/meta-data/placement/availability-zone-id")
                  kubectl label node/"$HOST" "topology.cloud.redpanda.com/zone-id=$AZ_ID" --overwrite
          containers:
            - name: pause
              image: debian:bullseye-slim
              imagePullPolicy: IfNotPresent
              command:
                - /bin/bash
                - -c
                - --
              args:
                - |
                  trap : TERM INT; sleep infinity & wait

Configure rack awareness

To enable rack awareness in your Redpanda cluster, configure the cluster with the key you used to annotate or label Node resources with the availability zone.

Helm + Operator

redpanda-cluster.yaml

    apiVersion: cluster.redpanda.com/v1alpha1
    kind: Redpanda
    metadata:
      name: redpanda
    spec:
      chartRef: {}
      clusterSpec:
        rackAwareness:
          enabled: true
          nodeAnnotation: '<key>'
        serviceAccount:
          create: true
        rbac:
          enabled: true

    kubectl apply -f redpanda-cluster.yaml --namespace <namespace>

Helm

Using --values:

rack-awareness.yaml

    rackAwareness:
      enabled: true
      nodeAnnotation: '<key>'
    serviceAccount:
      create: true
    rbac:
      enabled: true

    helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
      --values rack-awareness.yaml --reuse-values

Using --set:

    helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
      --set rackAwareness.enabled=true \
      --set rackAwareness.nodeAnnotation='<key>' \
      --set serviceAccount.create=true \
      --set rbac.enabled=true

rackAwareness.enabled (required): Enables rack awareness for your Redpanda cluster.
rackAwareness.nodeAnnotation (required): The label or annotation key to use to define racks. Defaults to the well-known topology.kubernetes.io/zone key.

The serviceAccount and rbac configurations are required. These configurations allow the initialization container to securely read the node annotations using the Kubernetes API.

Verify that rack awareness is enabled

After deploying Redpanda, make sure that rack awareness is enabled and configured on your Redpanda brokers.
Run the following command to check the cluster configuration:

    kubectl --namespace <namespace> exec -i -t redpanda-0 -c redpanda -- \
      rpk cluster config get enable_rack_awareness

Example output:

    true

Next steps

Use rack awareness with Continuous Data Balancing to continually maintain the configured replication level, even after a rack failure. For a given partition, Redpanda tries to move excess replicas from racks that have more than one replica to racks that have no replicas.

Suggested reading

Redpanda Helm Specification
Redpanda CRD Reference
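The rebalancing behavior described under Next steps (moving an excess replica from a rack that holds more than one replica to a rack that holds none) can be sketched as a single planning step. This is a simplified illustration under stated assumptions, not how Continuous Data Balancing is implemented: `plan_rack_move` and its arguments are hypothetical names, and the choice of which replica to move and which empty-rack broker to target is arbitrary here.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple


def plan_rack_move(
    replica_brokers: List[int],
    broker_racks: Dict[int, str],
) -> Optional[Tuple[int, int]]:
    """Return (source_broker, target_broker) for one corrective move, or None.

    replica_brokers: IDs of the brokers that host one partition's replicas.
    broker_racks: rack assignment for every broker in the cluster.
    """
    replicas_per_rack = Counter(broker_racks[b] for b in replica_brokers)
    # Racks that hold more than one replica of this partition.
    crowded = sorted(r for r, n in replicas_per_rack.items() if n > 1)
    # Brokers in racks that hold no replica of this partition.
    targets = sorted(
        b for b, rack in broker_racks.items() if rack not in replicas_per_rack
    )
    if not crowded or not targets:
        return None  # already spread out, or nowhere better to move to
    # Arbitrary choice for the sketch: move the highest-ID replica
    # out of the first crowded rack.
    source = max(b for b in replica_brokers if broker_racks[b] == crowded[0])
    return source, targets[0]


racks = {0: "rack1", 1: "rack1", 2: "rack2", 3: "rack3"}
print(plan_rack_move([0, 1, 2], racks))  # → (1, 3)
print(plan_rack_move([1, 2, 3], racks))  # → None
```

In the first call, rack1 holds two replicas and rack3 holds none, so the sketch proposes moving one replica from broker 1 to broker 3. In the second call, every replica is already in its own rack, so no move is proposed.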