Benchmark Redpanda

Learn how to measure the performance of a Redpanda cluster deployed on AWS EC2 instances with the Linux Foundation’s OpenMessaging Benchmark. Run the same tests and workloads that Redpanda uses to demonstrate significantly better performance than Apache Kafka.

About OpenMessaging Benchmark

The Linux Foundation’s OpenMessaging Benchmark (OMB) Framework is an open-source, cloud-based benchmark framework that supports several messaging systems, including Kafka, and is configurable for workloads representing real-world use cases.

Redpanda Data provides a fork of OMB on GitHub with some updates:

  • Fixed coalescing of asynchronous consumer offset requests in the OMB Kafka driver.

  • Support for Kafka 3.2.0 clients.

OMB workloads

An OMB workload is a benchmark configuration that sets the producers, consumers, topics, and messages used by a test, as well as the production rate and duration of each test. An OMB workload is specified in a YAML configuration file.

Example workload configuration file

The following is the content of an OMB workload configuration file, copied from Redpanda Data’s fork of OMB:

name: 1 topic / 1 partition / 1Kb

topics: 1
partitionsPerTopic: 1
keyDistributor: "NO_KEY"
messageSize: 1024
payloadFile: "payload/payload-1Kb.data"
subscriptionsPerTopic: 1
consumerPerSubscription: 1
producersPerTopic: 1
producerRate: 50000
consumerBacklogSizeGB: 0
testDurationMinutes: 15

The keyDistributor property configures how keys are distributed and assigned to messages:

  • NO_KEY sets null for all keys.

  • KEY_ROUND_ROBIN cycles through a finite set of keys in round-robin fashion.

  • RANDOM_NANO returns random keys based on System.nanoTime().
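To make the three keyDistributor strategies concrete, here is a minimal Python sketch of their behavior. It is an illustration only, not the driver's implementation: the key pool, its size, and the way the nanosecond clock reading is mapped to a key are assumptions, and time.monotonic_ns() stands in for Java's System.nanoTime().

```python
import itertools
import time

# Hypothetical key pool; the real driver's pool size and key format
# are not shown in this guide.
KEY_POOL = [f"key-{i}" for i in range(8)]


def no_key():
    """NO_KEY: every message gets a null key."""
    while True:
        yield None


def key_round_robin(pool=KEY_POOL):
    """KEY_ROUND_ROBIN: cycle through a finite set of keys in order."""
    yield from itertools.cycle(pool)


def random_nano(pool=KEY_POOL):
    """RANDOM_NANO: pick a key using a nanosecond clock reading.

    Indexing the pool with the clock reading is an assumption of this sketch.
    """
    while True:
        yield pool[time.monotonic_ns() % len(pool)]


if __name__ == "__main__":
    for name, gen in [("NO_KEY", no_key()),
                      ("KEY_ROUND_ROBIN", key_round_robin()),
                      ("RANDOM_NANO", random_nano())]:
        print(name, list(itertools.islice(gen, 4)))
```

Because Kafka clients by default choose a partition from the hash of a non-null key, the key strategy affects how messages spread across partitions.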

Set up benchmark

Running OMB with Redpanda requires setting up your local environment to provision and start a Redpanda cluster in AWS.
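Before working through the steps, it can help to confirm that the command-line tools they call are on your PATH. The following Python sketch is one way to do that. The executable names are taken from the commands used in this guide, the function name missing_tools is hypothetical, and the check does not verify versions (for example, the Ansible v2.11 minimum).

```python
import shutil

# Executables called by the commands in this guide.
REQUIRED = ["git", "mvn", "terraform", "terraform-inventory",
            "ansible-playbook", "ansible-galaxy", "python3",
            "aws", "ssh-keygen"]
# At least one of these is recommended for detachable sessions.
ANY_OF = ["tmux", "screen"]


def missing_tools(which=shutil.which):
    """Return the names of tools that `which` cannot find on PATH."""
    missing = [tool for tool in REQUIRED if which(tool) is None]
    if not any(which(tool) for tool in ANY_OF):
        missing.append(" or ".join(ANY_OF))
    return missing


if __name__ == "__main__":
    absent = missing_tools()
    print("Missing: " + ", ".join(absent) if absent else "All tools found.")
```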

  1. Install CLI tools.

    • Maven

    • Terraform with terraform-inventory plugin

    • Ansible (v2.11 or higher)

    • Python 3 and pip

    • A terminal multiplexer that supports detachable sessions, like tmux or screen.

      Redpanda Data recommends running the benchmark executable in a detachable session, so the benchmark can continue to run in the background even after you disconnect.
  2. Clone the Redpanda Data fork of OMB.

    git clone https://github.com/redpanda-data/openmessaging-benchmark

    The repository contains a directory for the Redpanda driver, openmessaging-benchmark/driver-redpanda. Subsequent steps read and configure files in that directory.

  3. If necessary, change the Kafka client version (currently 3.3.1) in the openmessaging-benchmark/driver-redpanda/pom.xml file:

    pom.xml
    ...
    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka-clients</artifactId>
        <version>3.3.1</version>
    </dependency>
    ...
  4. From the repository root directory, build the benchmark client.

    cd openmessaging-benchmark
    mvn clean install -Dlicense.skip=true
  5. From the Redpanda driver directory, install the Ansible roles required for deploying Redpanda.

    cd driver-redpanda/deploy
    ansible-galaxy install -r requirements.yaml
  6. Configure AWS credentials and SSH keys.

    1. Install and configure AWS CLI.

    2. Generate SSH keys:

      ssh-keygen -f ~/.ssh/redpanda_aws

      When prompted for a passphrase, set a blank passphrase by pressing Enter twice.

    3. Verify the SSH key files were created.

      ls ~/.ssh/redpanda_aws*
  7. Provision a Redpanda cluster to deploy on AWS with Terraform.

    1. Customize the openmessaging-benchmark/driver-redpanda/deploy/terraform.tfvars Terraform configuration file for your environment.

      Default Terraform configuration for Redpanda on AWS

      The default contents of openmessaging-benchmark/driver-redpanda/deploy/terraform.tfvars:

      public_key_path = "~/.ssh/redpanda_aws.pub"
      region          = "us-west-2"
      az              = "us-west-2a"
      ami             = "ami-0d31d7c9fc9503726"
      profile         = "default"
      instance_types = {
      "redpanda"      = "i3en.6xlarge"
      "client"        = "m5n.8xlarge"
      "prometheus"    = "c5.2xlarge"
      }
      num_instances = {
      "client"     = 4
      "redpanda"   = 3
      "prometheus" = 1
      }
    2. From the Redpanda driver deployment directory, initialize the Terraform deployment of Redpanda on AWS.

      cd driver-redpanda/deploy
      terraform init
      terraform apply -auto-approve
      The terraform apply command prompts you for an owner name (var.owner) that is used to tag all the cloud resources that it creates. After provisioning completes, Terraform prints a confirmation message listing the resources that were created.
  8. Run the Ansible playbook to install and start the Redpanda cluster.

    Redpanda can run with or without TLS and SASL enabled.

    • To run Redpanda without TLS and SASL:

      ansible-playbook deploy.yaml
    • To run Redpanda with TLS and SASL:

      ansible-playbook deploy.yaml -e "tls_enabled=true sasl_enabled=true"

      If the path to your SSH private key isn’t ~/.ssh/redpanda_aws, add the --private-key flag to your Ansible command.

      ansible-playbook deploy.yaml --private-key=<private-key-path>
      Beginning with Ansible 2.14, references to args: warn within Ansible tasks cause a fatal error and halt the execution of the playbook. You may find instances of this in the components installed by ansible-galaxy, particularly in the cloudalchemy.grafana task in dashboards.yml. To resolve this issue, remove the warn line from the yml file.
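To locate warn lines in installed roles before running the playbook on Ansible 2.14 or later, a small scan can help. The following Python sketch lists every line whose YAML key is warn in .yml and .yaml files under a roles directory. It assumes the roles were installed to ~/.ansible/roles, the default user path for ansible-galaxy, and the function name find_warn_lines is hypothetical. It reports any warn key, not only those under args, so treat the output as candidates to review by hand.

```python
import os
import re

# Matches a YAML line whose key is `warn`, for example "    warn: false".
WARN_LINE = re.compile(r"^\s*warn\s*:")


def find_warn_lines(roles_dir):
    """Return (path, line_number, text) for each `warn:` line in YAML files."""
    hits = []
    for root, _dirs, files in os.walk(roles_dir):
        for name in sorted(files):
            if not name.endswith((".yml", ".yaml")):
                continue
            path = os.path.join(root, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if WARN_LINE.match(line):
                        hits.append((path, number, line.strip()))
    return hits


if __name__ == "__main__":
    # Assumption: roles live in ansible-galaxy's default user path.
    roles = os.path.expanduser("~/.ansible/roles")
    for path, number, text in find_warn_lines(roles):
        print(f"{path}:{number}: {text}")
```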

Run benchmark

Connect to the benchmark’s client and run the benchmark with a custom workload.

  1. Connect with SSH to the benchmark client, with its IP address retrieved from the client_ssh_host output of Terraform.

    ssh -i ~/.ssh/redpanda_aws ubuntu@$(terraform output --raw client_ssh_host)
  2. On the client, navigate to the /opt/benchmark directory.

    cd /opt/benchmark
  3. Create a workload configuration file. For example, create a .yaml file with one topic, 144 partitions, 500 MBps producer rate, four producers, and four consumers:

    cat > workloads/1-topic-144-partitions-500mb-4p-4c.yaml << EOF
    name: 500mb/sec rate; 4 producers 4 consumers; 1 topic with 144 partitions
    
    topics: 1
    partitionsPerTopic: 144
    
    messageSize: 1024
    useRandomizedPayloads: true
    randomBytesRatio: 0.5
    randomizedPayloadPoolSize: 1000
    
    subscriptionsPerTopic: 1
    consumerPerSubscription: 4
    producersPerTopic: 4
    
    producerRate: 500000
    
    consumerBacklogSizeGB: 0
    testDurationMinutes: 30
    EOF

    Alternatively, you can use an existing workload file from the Redpanda repo, in openmessaging-benchmark/driver-redpanda/deploy/workloads/.

    Workloads from Redpanda vs. Kafka comparison

    The workloads from the Redpanda vs. Kafka benchmark comparison are listed in the chart in the comparison.

    [Chart: Kafka vs. Redpanda performance]
  4. Create or reuse a client configuration file. This file configures the Redpanda producer and consumer clients, as well as topics.

    The rest of the guide uses the openmessaging-benchmark/driver-redpanda/redpanda-ack-all-group-linger-1ms.yaml configuration file.

    Client configuration from Redpanda vs. Kafka comparison

    The client configuration from the Redpanda vs. Kafka benchmark comparison is available in the code listing in the comparison:

    topicConfig: |
        min.insync.replicas=2
        flush.messages=1
        flush.ms=0
    producerConfig: |
        acks=all
        linger.ms=1
        batch.size=131072
    consumerConfig: |
        auto.offset.reset=earliest
        enable.auto.commit=false
        auto.commit.interval.ms=0
        max.partition.fetch.bytes=131072
    Configure reset=false and manually delete the generated topic after the benchmark completes. When reset=true, the benchmark can fail because it erroneously tries to delete the _schemas topic.
  5. Run the benchmark with your workload and client configuration.

    sudo bin/benchmark -d \
    driver-redpanda/redpanda-ack-all-group-linger-1ms.yaml \
    workloads/1-topic-144-partitions-500mb-4p-4c.yaml
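As a sanity check on the example workload in step 3, the target throughput follows from producerRate multiplied by messageSize. The following Python sketch works through that arithmetic. It assumes producerRate is the total number of messages per second across all producers and that load spreads evenly over partitions. It does not account for replication, and the function name workload_summary is hypothetical.

```python
def workload_summary(producer_rate, message_size, partitions, duration_minutes):
    """Derive aggregate figures from OMB workload settings.

    Assumes producer_rate is messages per second across all producers
    and message_size is bytes per message.
    """
    bytes_per_sec = producer_rate * message_size
    return {
        "mb_per_sec": bytes_per_sec / 1_000_000,        # decimal megabytes
        "mib_per_sec": bytes_per_sec / (1024 * 1024),   # binary mebibytes
        "msgs_per_sec_per_partition": producer_rate / partitions,
        "total_gb_produced": bytes_per_sec * duration_minutes * 60 / 1_000_000_000,
    }


if __name__ == "__main__":
    summary = workload_summary(producer_rate=500000, message_size=1024,
                               partitions=144, duration_minutes=30)
    for key, value in summary.items():
        print(f"{key}: {value:.2f}")
```

For the example values, this gives 512 MB/s (about 488 MiB/s), roughly 3,472 messages per second per partition, and about 921.6 GB produced over the 30-minute run, which matches the workload's nominal 500 MBps label.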

View benchmark results

After a run completes, the benchmark generates results as *.json files in /opt/benchmark.

Redpanda provides a Python script, generate_charts.py, to generate charts of benchmark results. To run the script:

  1. Copy the results from the client to your local machine.

    exit; # back to your local machine
    mkdir ~/results
    scp -i ~/.ssh/redpanda_aws ubuntu@$(terraform output --raw client_ssh_host):/opt/benchmark/*.json ~/results/
  2. From the bin directory of the repository, install the prerequisite packages for the Python script.

    cd ../../bin # openmessaging-benchmark/bin
    python3 -m pip install -r requirements.txt
  3. To list all options, run the script with the -h flag.

    ./generate_charts.py -h
  4. To generate charts from your ~/results/ directory, first create an ~/output directory, then run the script with --results and --output options set accordingly.

    mkdir ~/output
    ./generate_charts.py --results ~/results --output ~/output
  5. In ~/output, verify that the script generated an HTML page with charts for throughput, publish latency, end-to-end latency, publish rate, and consume rate.
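If you only need a few headline numbers rather than charts, you can read the result files directly. The following Python sketch summarizes each *.json file in a results directory. The field names (workload, publishRate, consumeRate, aggregatedPublishLatency99pct, aggregatedEndToEndLatency99pct) are assumptions about the result format, so check them against one of your own result files. Fields that are absent are reported as None.

```python
import glob
import json
import os
from statistics import mean

# Assumed field names; verify them against one of your own result files.
RATE_FIELDS = ["publishRate", "consumeRate"]
LATENCY_FIELDS = ["aggregatedPublishLatency99pct",
                  "aggregatedEndToEndLatency99pct"]


def summarize(result):
    """Reduce one parsed result document to a few headline numbers."""
    summary = {"workload": result.get("workload")}
    for field in RATE_FIELDS:
        samples = result.get(field) or []
        summary[field + "Mean"] = mean(samples) if samples else None
    for field in LATENCY_FIELDS:
        summary[field] = result.get(field)
    return summary


def summarize_dir(results_dir):
    """Summarize every *.json file in results_dir, keyed by file name."""
    summaries = {}
    for path in sorted(glob.glob(os.path.join(results_dir, "*.json"))):
        with open(path, encoding="utf-8") as handle:
            summaries[os.path.basename(path)] = summarize(json.load(handle))
    return summaries


if __name__ == "__main__":
    results = summarize_dir(os.path.expanduser("~/results"))
    for name, summary in results.items():
        print(name, summary)
```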

Tear down benchmark

When you finish running the benchmark, tear down the Redpanda cluster. Run the command from the driver-redpanda/deploy directory, where you ran terraform apply.

terraform destroy -auto-approve