Kubernetes Requirements and Recommendations
This topic provides the requirements and recommendations for provisioning Kubernetes clusters and worker nodes for running Redpanda in production.
Kubernetes cluster requirements
This section provides the requirements for setting up a Kubernetes cluster to run Redpanda.
Number of worker nodes
Provision one dedicated worker node for each Redpanda broker that you plan to deploy in your Redpanda cluster. Each Pod replica that runs a Redpanda broker requires its own dedicated worker node for the following reasons:
- Resource isolation: Redpanda brokers are designed to make full use of available system resources, including CPU and memory. By dedicating a worker node to each broker, you ensure that these resources aren't shared with other applications or processes, avoiding potential performance bottlenecks or contention.
- External networking: External clients must connect directly to the broker that leads the partition they're producing to or consuming from, so each broker must be individually addressable. Assigning each broker to its own dedicated worker node makes this direct addressing feasible, because each worker node has a unique address. See External networking.
- Fault tolerance: Ensuring each broker operates on a separate node enhances fault tolerance. If one node experiences issues, it won't directly impact the other brokers.
The Redpanda Helm chart configures podAntiAffinity rules to make sure that each Redpanda broker runs on its own worker node.
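As a rough illustration of what such a rule looks like (the label names here are illustrative, not the exact labels the chart generates), a hard anti-affinity constraint keyed on the node hostname prevents two broker Pods from being scheduled onto the same worker node:

```yaml
# Illustrative Pod anti-affinity rule: no two Pods carrying this label
# may be scheduled onto worker nodes that share the same hostname.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app.kubernetes.io/name: redpanda   # illustrative label; the chart sets its own
        topologyKey: kubernetes.io/hostname
```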
Recommendations: Deploy at least three Pod replicas.
CPU
- Two physical, not virtual, cores for each worker node.
- x86_64 (Westmere or newer) and AWS Graviton family processors are supported.
Recommendations: Four physical cores are strongly recommended.
Storage
An XFS or ext4 filesystem. The Redpanda data directory (/var/lib/redpanda/data) and the Tiered Storage cache must be mounted on an XFS or ext4 filesystem. For information about supported volume types for different data in Redpanda, see Supported Volume Types for Data in Redpanda.
Avoid using NFS (Network File System) for the Redpanda data directory or the Tiered Storage cache.
Recommendations:
- Use an XFS filesystem for its enhanced performance with Redpanda workloads.
- Use local PersistentVolumes that are backed by locally attached NVMe disks to store the Redpanda data directory.
- For setups with multiple disks, use a RAID-0 (striped) array. It boosts speed but lacks redundancy, so a disk failure can lead to data loss.
Security
Recommendations: If you’re using a cloud platform, use IAM roles to restrict access to resources in your cluster.
External networking
- For external access, each worker node in your cluster must have a static, externally accessible IP address.
- Minimum 10 GigE (10 Gigabit Ethernet) connection to ensure:
  - High data throughput
  - Reduced data transfer latency
  - Scalability for increased network traffic
Recommendations: Use the default NodePort Service to give external clients access to the Redpanda brokers that are running on those nodes.
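As a sketch, external access is typically switched on through the chart's external listener settings. The key names below reflect common Redpanda Helm chart values and are assumptions here, so verify them against the values file of the chart version you deploy:

```yaml
# Sketch of Helm values for external access (verify keys against your chart version).
external:
  enabled: true                         # expose external listeners
  type: NodePort                        # default Service type recommended above
  domain: customredpandadomain.local    # example domain used to advertise broker addresses
```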
Tuning
Before deploying Redpanda to production, each worker node that runs Redpanda must be tuned to optimize the Linux kernel for Redpanda processes. See Tuning Kubernetes Worker Nodes for Production.
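The Redpanda Helm chart also exposes a small amount of in-cluster tuning. For example, the following values sketch enables AIO event tuning from within the broker containers; the tuning.tune_aio_events key is an assumption here, so confirm it against your chart version. Node-level kernel tuning is still required on the worker nodes themselves.

```yaml
# Sketch: enable the chart's built-in AIO event tuning (assumed key; verify
# against your chart version). Node-level kernel tuning is still required.
tuning:
  tune_aio_events: true
```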
Redpanda cluster recommendations
This section provides the recommendations for deploying Redpanda.
Deploy at least three Pod replicas
Redpanda Data recommends at least three Pod replicas (Redpanda brokers) to use as seed servers. Seed servers are used to bootstrap the gossip process for new brokers joining a cluster. When a new broker joins, it connects to the seed servers to find out the topology of the Redpanda cluster. A larger number of seed servers makes consensus more robust and minimizes the chance of unwanted clusters forming when brokers are restarted without any data.
By default, the Redpanda Helm chart deploys a StatefulSet with three Redpanda brokers. You can specify the number of Redpanda brokers in the statefulset.replicas configuration.
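For example, to run five brokers instead of the default three, you might set the following in your Helm values:

```yaml
# Helm values: run five Redpanda brokers (one per dedicated worker node).
statefulset:
  replicas: 5
```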
Set resource requests and limits for memory and CPU
In a production cluster, the resources you allocate to Redpanda should be proportionate to your machine type. Redpanda Data recommends that you determine and set these values before deploying the cluster. For instructions on setting Pod resources, see Manage Pod Resources in Kubernetes.
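As a sketch, the chart accepts CPU and memory settings under a top-level resources block. The exact keys below are assumptions based on common Redpanda Helm chart values, so check them against your chart version and size them to your machine type:

```yaml
# Sketch: reserve 4 cores and 16Gi of memory per broker (assumed keys;
# verify against your chart version and adjust to your machine type).
resources:
  cpu:
    cores: 4
  memory:
    container:
      max: 16Gi
```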
Use local PersistentVolumes backed by NVMe disks
Redpanda Data recommends using PersistentVolumes (PVs) that are backed by locally attached NVMe devices to store the Redpanda data directory. NVMe devices outperform traditional SSDs or HDDs.
When working with local NVMe disks, provisioning can pose challenges. Dynamic provisioners, though highly scalable and automated, may not always support local PVs. You can either:
- Create the PVs manually.
- Use one of the following CSI drivers to automatically create one PV on each node that has local SSDs available:
  - Recommended: Local volume manager (LVM)
  - Local volume static provisioner
LVM is a more advanced tool for local storage management because it allows you to group physical storage devices into a logical volume group. Allocating logical volumes from a logical volume group provides greater flexibility in terms of storage expansion and management. LVM supports features such as resizing, snapshots, and striping, which are not available with the simpler local volume static provisioner.
Configure your PVs with a reference to a StorageClass, then create that StorageClass to provide the Redpanda Helm chart a way of creating PVCs that use your local NVMe disks.
By default, the Redpanda Helm chart uses the default StorageClass in your Kubernetes cluster to create one PersistentVolumeClaim (PVC) for each Redpanda broker. To learn how to configure a different StorageClass, see Store the Redpanda Data Directory in PersistentVolumes.
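For example, once a suitable StorageClass exists (such as the local-xfs-storage class in the next example), you could reference it with a values override along these lines; the storage.persistentVolume.storageClass key is an assumption here, so confirm it against your chart version:

```yaml
# Sketch: point the chart's PVCs at a custom StorageClass (assumed key;
# verify against your chart version).
storage:
  persistentVolume:
    storageClass: local-xfs-storage
```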
This example configures a StorageClass for provisioning locally attached storage with an XFS filesystem.
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-xfs-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
parameters:
  fsType: xfs
```
For details about StorageClasses, see the Kubernetes documentation.
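If you create PVs manually (the first option listed earlier), each PV should reference this StorageClass and pin the volume to the worker node that owns the disk. A minimal sketch, assuming a hypothetical NVMe disk mounted at /mnt/nvme0 on a node named worker-0:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: redpanda-worker-0   # hypothetical name
spec:
  capacity:
    storage: 1Ti            # size of the locally attached NVMe disk
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-xfs-storage
  local:
    path: /mnt/nvme0        # example mount point of the local disk
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - worker-0  # example node name that owns this disk
```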
Use a NodePort Service for external access
The NodePort Service provides the lowest latency of all the Kubernetes Services because it does not include any unnecessary routing or middleware. Client connections go to the Redpanda brokers in the most direct way possible, through the worker nodes.
By default, the Redpanda Helm chart creates a NodePort Service with the following ports:
| Node port | Purpose |
|---|---|
| 30081 | Schema Registry |
| 30082 | HTTP Proxy |
| 31092 | Kafka API |
| 31644 | Admin API |
To change these ports, see Configure Listeners in Kubernetes.
Depending on your deployment and security policies, you may not be able to access worker nodes through a NodePort Service. If you choose to use another Service, consider the impact on the cost and performance of your deployment:
- LoadBalancer Service: To make each Redpanda broker accessible with LoadBalancer Services, you need one LoadBalancer Service for each Redpanda broker so that requests can be routed to specific brokers rather than balanced across all brokers. Load balancers are expensive, add latency and occasional packet loss, and add an unnecessary layer of complexity.
- Ingress: To make each Redpanda broker accessible with Ingress, you must run an Ingress controller and set up routing to each Redpanda broker. Routing adds latency and can become a throughput bottleneck.
For more details, see Networking and Connectivity.
Use ExternalDNS for external access
Redpanda Data recommends using ExternalDNS to manage DNS records for your Pods' domains. ExternalDNS synchronizes exposed Kubernetes Services with various DNS providers, rendering Kubernetes resources accessible through DNS servers.
Benefits of ExternalDNS include:
- Automation: ExternalDNS automatically configures public DNS records when you create, update, or delete Kubernetes Services or Ingresses. This eliminates the need for manual DNS configuration, which can be error-prone.
- Compatibility: ExternalDNS is compatible with a wide range of DNS providers, including major cloud providers such as AWS, Google Cloud, and Azure, and DNS servers such as CoreDNS and PowerDNS.
- Integration with other tools: ExternalDNS can be used in conjunction with other Kubernetes tools, such as Ingress controllers or cert-manager for managing TLS certificates.
You can use ExternalDNS with the default NodePort Service or with LoadBalancer Services.
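As an illustration of how ExternalDNS picks up a Service, a hostname annotation on the Service is enough for ExternalDNS to create the matching DNS record. The Service shown here is a generic hand-written example, not the one the Redpanda Helm chart generates:

```yaml
# Generic example: ExternalDNS watches for this annotation and creates
# a DNS record pointing at the Service's external address.
apiVersion: v1
kind: Service
metadata:
  name: redpanda-external-example          # hypothetical Service name
  annotations:
    external-dns.alpha.kubernetes.io/hostname: redpanda.example.com
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: redpanda       # illustrative label
  ports:
    - name: kafka
      port: 9094
      targetPort: 9094
```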