
Cluster Balancing

When a topic is created, Redpanda evenly distributes its partitions by sequentially allocating each one to the broker (node) in the cluster with the fewest partitions. By default, Redpanda provides leadership balancing and partition rebalancing when brokers are added or decommissioned.
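
For example, you can observe the even spread by creating a topic and inspecting its partition placement. This is a minimal sketch using standard rpk commands; the topic name and partition counts are illustrative:

rpk topic create orders -p 6 -r 3
rpk topic describe orders -p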

With an Enterprise license, you can additionally enable Continuous Data Balancing to continuously monitor broker availability, rack availability, and disk usage. This enables self-healing clusters that dynamically balance partitions. It also maintains adherence to the rack-aware replica placement policy and self-heals after rack (or availability zone) failure or replacement. See Configure Continuous Data Balancing.

Cluster balancing protects you from unbalanced systems that saturate resources on one or more brokers, which can degrade throughput and latency. Furthermore, a cluster with replicas on a down broker risks availability loss if more brokers fail, and a cluster that keeps losing brokers without healing eventually risks data loss.

Partition leadership balancing

Automatic partition leadership balancing improves cluster performance by transferring the leadership of a broker's partitions to other replicas. Leadership balancing is enabled by default with the enable_leader_balancer property. It changes only where data is read from and written to first; it doesn't move any data. For more information, see Partition leadership elections.
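
For example, you can inspect or change the property at runtime. This is a minimal sketch using the standard rpk cluster config commands:

# Check the current value (enabled by default)
rpk cluster config get enable_leader_balancer

# Disable automatic leadership balancing, for example during maintenance
rpk cluster config set enable_leader_balancer false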

To manually change leadership, use the Admin API:

curl -X POST http://<broker_address>:9644/v1/partitions/kafka/<topic>/<partition>/transfer_leadership?target=<destination-broker-id>

For example, to change leadership to broker 2 for partition 0 on topic foo:

curl -X POST "http://localhost:9644/v1/partitions/kafka/foo/0/transfer_leadership?target=2"
note

In Kubernetes, run the transfer_leadership request on the Pod that is running the current partition leader.
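
For example, a sketch assuming the current leader for partition foo/0 runs on a Pod named redpanda-1 in the redpanda namespace (find the current leader in the LEADER column of rpk topic describe; the Pod and namespace names are illustrative, and curl is assumed to be available in the container):

kubectl exec redpanda-1 -n redpanda -- curl -X POST "http://localhost:9644/v1/partitions/kafka/foo/0/transfer_leadership?target=2"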

Redpanda partition balancing

Partition balancing moves data from the most-loaded brokers to the least-loaded brokers, providing predictable, stable steady-state performance across all brokers in a cluster. Partition balancing runs periodically, at an interval set by the partition_autobalancing_tick_interval_ms property (default: 30 seconds).
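
For example, to shorten the balancing interval to 10 seconds, a minimal sketch using the standard rpk cluster config command:

rpk cluster config set partition_autobalancing_tick_interval_ms 10000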

By default, Redpanda rebalances partition distribution when brokers are added or decommissioned. Continuous Data Balancing additionally rebalances partitions when brokers become unavailable or when disk space usage exceeds a threshold.

  • Monitoring unavailable brokers lets Redpanda self-heal clusters by moving partitions from a failed broker to a healthy broker.
  • Monitoring low disk space lets Redpanda distribute partitions across brokers with enough disk space. If free disk space reaches a critically low level, Redpanda blocks clients from producing. For information about the disk space threshold and alert, see Handle full disks.

Partition balancing settings

Select your partition balancing setting with the partition_autobalancing_mode property.

  • node_add: Partition balancing happens when brokers (nodes) are added. This is the default setting.
  • continuous: Redpanda continuously monitors the cluster for broker failures and high disk usage and uses this information to automatically redistribute partitions across the cluster, maintaining optimal performance and availability. It also monitors rack availability after failures: for a given partition, it tries to move excess replicas from racks that have more than one replica to racks that have none. This option requires an Enterprise license. See Configure Continuous Data Balancing.
  • off: All partition balancing from Redpanda is turned off. This mode is not recommended for production clusters. Set it to off only if you need to move partitions manually.
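
For example, to change the mode at runtime, a minimal sketch using the standard rpk cluster config command (the continuous mode requires an Enterprise license):

rpk cluster config set partition_autobalancing_mode continuous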

Partition balancing with Kafka API

As an alternative to Redpanda partition balancing, you can change partition assignments explicitly with the Kafka API or with any third-party tool in the Kafka ecosystem that controls partition movement through the Kafka API.

To reassign partitions with the Kafka API:

  1. Set the partition_autobalancing_mode property to off. If Redpanda partition balancing is enabled, it may override partition assignments that you make through the Kafka API.
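
    For example, a minimal sketch using the standard rpk cluster config command:

    rpk cluster config set partition_autobalancing_mode off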

  2. Show initial replica sets. For example, for topic foo:

    rpk topic describe foo -p
    PARTITION  LEADER  EPOCH  REPLICAS  LOG-START-OFFSET  HIGH-WATERMARK
    0          1       1      [1 2 3]   0                 645
    1          1       1      [0 1 2]   0                 682
    2          3       1      [0 1 3]   0                 672
  3. Put all partition reassignments in a JSON file. For example, to change the replica set of partition 1 from [0 1 2] to [3 1 2] and change the replica set of partition 2 from [0 1 3] to [2 1 3]:

    {
      "version": 1,
      "partitions": [
        {
          "topic": "foo",
          "partition": 1,
          "replicas": [3, 1, 2]
        },
        {
          "topic": "foo",
          "partition": 2,
          "replicas": [2, 1, 3]
        }
      ]
    }
  4. Execute partition reassignments with the kafka-reassign-partitions.sh script. This example uses example.json as the name of the JSON file:

    kafka-reassign-partitions.sh --bootstrap-server localhost:9092,localhost:9093,localhost:9094,localhost:9095 --reassignment-json-file example.json --execute
    Current partition replica assignment

    {"version":1,"partitions":[{"topic":"foo","partition":1,"replicas":[1,2,0],"log_dirs":["any","any","any"]},{"topic":"foo","partition":2,"replicas":[3,1,0],"log_dirs":["any","any","any"]}]}

    Save this to use as the --reassignment-json-file option during rollback
    Successfully started partition reassignments for foo-1,foo-2
  5. Verify that the reassignment is complete with the flags --verify --preserve-throttles:

    kafka-reassign-partitions.sh --bootstrap-server localhost:9092,localhost:9093,localhost:9094,localhost:9095 --reassignment-json-file example.json --verify --preserve-throttles
    Status of partition reassignment:
    Reassignment of partition foo-1 is complete.
    Reassignment of partition foo-2 is complete.

    Alternatively, run rpk topic describe again to show your reassigned replica sets:

    rpk topic describe foo -p
    PARTITION  LEADER  EPOCH  REPLICAS  LOG-START-OFFSET  HIGH-WATERMARK
    0          3       2      [1 2 3]   0                 0
    1          2       2      [1 2 3]   0                 0
    2          2       1      [1 2 3]   0                 0

To cancel an in-progress partition reassignment with the Kafka API, use the flags --cancel --preserve-throttles:

kafka-reassign-partitions.sh --bootstrap-server localhost:9092,localhost:9093,localhost:9094,localhost:9095 --reassignment-json-file example.json --cancel --preserve-throttles
Successfully cancelled partition reassignments for: foo-1,foo-2

Differences in partition balancing between Redpanda and Kafka

  • Kafka's kafka-reassign-partitions.sh script attempts to use throttle configurations that Redpanda does not support, such as replica.alter.log.dirs.io.max.bytes.per.second. Include the flag --preserve-throttles to avoid errors when verifying or canceling a partition reassignment.

  • Kafka supports increasing and decreasing the topic replication factor through partition reassignments. Redpanda currently doesn't support this.

  • In a partition reassignment, you must provide the broker ID for each replica. Kafka validates the broker ID for any new replica that wasn't in the previous replica set against the list of alive brokers. Redpanda validates all replicas against the list of alive brokers.

  • When there are two identical partition reassignment requests, Kafka cancels the first one without returning an error code, while Redpanda rejects one with unknown_server_error.

  • In Kafka, attempts to add partitions to a topic during in-progress reassignments result in a reassignment_in_progress error, while Redpanda successfully adds partitions to the topic.

  • Kafka doesn't support shard-level partition assignments, but Redpanda does. When resolving a partition reassignment, Redpanda automatically determines the shard placements. If you want a partition on a specific shard, you must assign partitions with the Admin API, as in the sketch below.
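
For example, a hedged sketch of a shard-level assignment through the Admin API, assuming an endpoint of the form /v1/partitions/kafka/<topic>/<partition>/replicas that accepts a JSON array of node_id/core pairs (the broker IDs and core numbers are illustrative; check the Admin API reference for your version before relying on this shape):

curl -X POST "http://localhost:9644/v1/partitions/kafka/foo/0/replicas" \
  -H "Content-Type: application/json" \
  -d '[{"node_id": 1, "core": 0}, {"node_id": 2, "core": 1}, {"node_id": 3, "core": 0}]'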

Assign partitions at topic creation

To manually assign partitions at topic creation, pass the --replica-assignment flag to kafka-topics.sh. Each comma-separated entry assigns one partition's replica set as colon-separated broker IDs. For example, the following creates a topic with three partitions, each replicated on brokers 0, 1, and 2:

kafka-topics.sh --create --bootstrap-server 127.0.0.1:9092 --topic custom-assignment --replica-assignment 0:1:2,0:1:2,0:1:2 
