Requirements and Recommendations

This topic provides the requirements and recommendations for provisioning servers to run Redpanda in production.

Operating system

  • RHEL/CentOS: minimum supported version 8; version 9 or later is recommended.

  • Ubuntu: minimum supported version 20.04 LTS; version 22.04 LTS or later is recommended.

Recommendation: Linux kernel 4.19 or later for better performance.

Number of nodes

Provision one physical node or virtual machine (VM) for each Redpanda broker that you plan to deploy in your Redpanda cluster. Each Redpanda broker requires its own dedicated node for the following reasons:

  • Resource isolation: Redpanda brokers are designed to make full use of available system resources, including CPU and memory. By dedicating a node to each broker, you ensure that these resources aren’t shared with other applications or processes, avoiding potential performance bottlenecks or contention.

  • External networking: External clients must connect directly to the broker that leads the partition they produce to or consume from, so each broker must be individually addressable. Assigning each broker its own dedicated node makes this direct addressing feasible, because each node has a unique address. See External networking.

  • Fault tolerance: Ensuring each broker operates on a separate node enhances fault tolerance. If one node experiences issues, it won’t directly impact the other brokers.

Recommendation: Deploy at least three Redpanda brokers.
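
A minimal sketch of forming a three-broker cluster, one broker per dedicated node, is shown below. The IP addresses are placeholders, and exact rpk flags can vary between versions; run the equivalent command on every node, with --self set to that node's own address.

    # Run on each of the three dedicated nodes; --self is this node's own IP,
    # --ips lists all brokers so the nodes join the same cluster.
    sudo rpk redpanda config bootstrap \
      --self 10.0.0.1 \
      --ips 10.0.0.1,10.0.0.2,10.0.0.3
    sudo systemctl start redpanda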

CPU and memory

Requirements:

  • Two physical, not virtual, cores for each node.

  • x86_64 (Westmere or newer) and AWS Graviton family processors are supported.

  • 2 GB or more of memory per core.

  • 4 MB of memory for each topic partition replica. This per-partition amount is controlled by the tunable topic_memory_per_partition property. See the sizing sketch after the recommendations below.

Recommendations:

  • Four physical cores for each node are strongly recommended.
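
As a rough, illustrative sizing sketch (the core, memory, and partition figures below are examples, not additional requirements), the per-core and per-partition numbers above can be combined as follows; topic_memory_per_partition is the cluster property named in the requirements:

    # Illustrative arithmetic for a broker with 4 physical cores:
    #   per-core minimum:   4 cores x 2 GB = 8 GB of memory
    #   partition capacity: 8 GB / 4 MB per partition replica
    #                       = roughly 2,000 partition replicas on this broker
    # In practice, leave headroom: not all broker memory is available for
    # partition bookkeeping.

    # Inspect or adjust the per-partition reservation (value in bytes):
    rpk cluster config get topic_memory_per_partition
    rpk cluster config set topic_memory_per_partition 4194304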

Storage

Requirements:

  • An XFS or ext4 file system.

    The Redpanda data directory (/var/lib/redpanda/data) and the Tiered Storage cache must be mounted on an XFS or ext4 file system.

    Avoid using NFS (Network File System) for the Redpanda data directory or the Tiered Storage cache.

Recommendations:

  • Use an XFS file system for its enhanced performance with Redpanda workloads.

  • For setups with multiple disks, use a RAID-0 (striped) array, as sketched below. Striping improves throughput but provides no redundancy, so a single disk failure can lead to data loss.
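
A minimal sketch of preparing a data disk follows, assuming the redpanda package (and its redpanda user) is already installed and a hypothetical locally attached NVMe device at /dev/nvme1n1; stripe multiple disks with mdadm first if you use RAID-0.

    # Optional: stripe two disks into a RAID-0 array before formatting.
    # sudo mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/nvme1n1 /dev/nvme2n1

    # Format the disk with XFS and mount it as the Redpanda data directory.
    sudo mkfs.xfs /dev/nvme1n1
    sudo mkdir -p /var/lib/redpanda/data
    sudo mount /dev/nvme1n1 /var/lib/redpanda/data
    sudo chown -R redpanda:redpanda /var/lib/redpanda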

Security

Recommendations:

  • If you’re using a cloud platform, use IAM roles to restrict access to resources in your cluster.

  • Secure your Redpanda cluster with TLS encryption and SASL authentication.
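
A hedged sketch of the SASL part of that recommendation is shown below; TLS certificate setup is environment-specific and omitted here. The username and password are placeholders, and exact rpk flags can vary between versions.

    # Create a SASL/SCRAM user, mark it as a superuser, and require
    # authentication on the Kafka API.
    rpk acl user create admin -p '<strong-password>'
    rpk cluster config set superusers '["admin"]'
    rpk cluster config set enable_sasl true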

External networking

  • For external access, each node in your cluster must have a static, externally accessible IP address (see the sketch after this list).

  • Minimum 10 GigE (10 Gigabit Ethernet) connection to ensure:

    • High data throughput

    • Reduced data transfer latency

    • Scalability for increased network traffic
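
Making each broker individually addressable typically means advertising its externally reachable address on the Kafka API. A minimal per-node sketch follows; broker-0.example.com is a placeholder, the exact rpk syntax for structured values varies by version, and editing /etc/redpanda/redpanda.yaml directly achieves the same result.

    # On broker 0's node: advertise the externally reachable address that
    # clients should use to contact this broker (written into redpanda.yaml).
    sudo rpk redpanda config set redpanda.advertised_kafka_api \
      '[{"address":"broker-0.example.com","port":9092}]'
    sudo systemctl restart redpanda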

Tuning

Before deploying Redpanda to production, tune each node that runs Redpanda to optimize the Linux kernel for Redpanda processes.
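
rpk includes an autotuner for this. A typical invocation on each node, run before starting Redpanda in production, is:

    # Switch to production mode (enables the tuner checks), then run all tuners.
    sudo rpk redpanda mode production
    sudo rpk redpanda tune all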

Object storage providers for Tiered Storage

Redpanda supports the following storage providers for Tiered Storage:

  • Amazon Simple Storage Service (S3)

  • Google Cloud Storage (GCS), using the Google Cloud Platform S3 API

  • Azure Blob Storage (ABS)
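
As a hedged example, enabling Tiered Storage against an S3 bucket amounts to setting a few cluster properties. The bucket name and region below are placeholders, and cloud_storage_credentials_source can point at an instance's IAM role instead of static keys, in line with the IAM recommendation above.

    # Enable Tiered Storage and point it at an S3 bucket (placeholder values).
    rpk cluster config set cloud_storage_enabled true
    rpk cluster config set cloud_storage_bucket '<bucket-name>'
    rpk cluster config set cloud_storage_region us-east-1
    rpk cluster config set cloud_storage_credentials_source aws_instance_metadata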

Cloud instance types

Recommendations:

  • Use a cloud instance type that supports locally attached NVMe devices with an XFS file system. NVMe devices offer high I/O operations per second (IOPS) and minimal latency, while XFS offers enhanced performance with Redpanda workloads.

Amazon

  • General purpose: General-purpose instances provide a balance of compute, memory, and networking resources, and they suit a wide variety of workloads.

  • Memory optimized: Memory-optimized instances are designed to deliver fast performance for workloads that process large data sets in memory.

  • Storage optimized: Storage-optimized instances are designed for workloads that require high, sequential read and write access to very large data sets on local storage. They are optimized to deliver tens of thousands of low-latency, random IOPS to applications.

  • Compute optimized: Compute-optimized instances deliver cost-effective high performance for compute-intensive workloads.

Azure

Google

  • General purpose: The general-purpose machine family has the best price-performance with the most flexible vCPU to memory ratios, and provides features that target most standard and cloud-native workloads.

  • Memory optimized: The memory-optimized machine family provides the most compute and memory resources of any Compute Engine machine family. These machine types are ideal for workloads that require higher memory-to-vCPU ratios than the high-memory machine types in the general-purpose N1 machine series.

  • Compute optimized: Compute-optimized VM instances are ideal for compute-intensive and high-performance computing (HPC) workloads.