# Benchmark Redpanda

Learn how to measure the performance of a Redpanda cluster deployed on AWS EC2 instances with the Linux Foundation's OpenMessaging Benchmark. Run the same tests and workloads that Redpanda uses to demonstrate significantly better performance than Apache Kafka.

## About OpenMessaging Benchmark

The Linux Foundation's OpenMessaging Benchmark (OMB) Framework is an open-source, cloud-based benchmark framework that supports several messaging systems, including Kafka, and is configurable for workloads that represent real-world use cases. Redpanda Data provides a fork of OMB on GitHub with some updates:

- Fixed coalescing of asynchronous consumer offset requests in the OMB Kafka driver.
- Support for Kafka 3.2.0 clients.

## OMB workloads

An OMB workload is a benchmark configuration that sets the producers, consumers, topics, and messages used by a test, as well as the production rate and duration of each test. An OMB workload is specified in a YAML configuration file.

### Example workload configuration file

The following OMB workload configuration file is copied from Redpanda Data's fork of OMB:

```yaml
name: 1 topic / 1 partition / 1Kb
topics: 1
partitionsPerTopic: 1
keyDistributor: "NO_KEY"
messageSize: 1024
payloadFile: "payload/payload-1Kb.data"
subscriptionsPerTopic: 1
consumerPerSubscription: 1
producersPerTopic: 1
producerRate: 50000
consumerBacklogSizeGB: 0
testDurationMinutes: 15
```

The `keyDistributor` property configures how keys are distributed and assigned to messages:

- `NO_KEY` sets null for all keys.
- `KEY_ROUND_ROBIN` cycles through a finite set of keys in round-robin fashion.
- `RANDOM_NANO` returns random keys based on `System.nanoTime()`.

## Set up benchmark

Running OMB with Redpanda requires setting up your local environment to provision and start a Redpanda cluster in AWS.

1. Install the following CLI tools:

   - Maven
   - Terraform with the terraform-inventory plugin
   - Ansible (v2.11 or higher)
   - Python 3 and pip
   - A window manager that supports detachable sessions, such as tmux or screen

   Redpanda Data recommends running the benchmark executable in a detachable session so the benchmark can continue to run in the background even after you disconnect.

2. Clone the Redpanda Data fork of OMB:

   ```bash
   git clone https://github.com/redpanda-data/openmessaging-benchmark
   ```

   The repository contains a directory for the Redpanda driver, `openmessaging-benchmark/driver-redpanda`. Subsequent steps read and configure files in that directory.

3. Customize the `openmessaging-benchmark/driver-redpanda/pom.xml` file with your Kafka client version, if necessary (currently 3.3.1):

   ```xml
   ...
   <dependency>
       <groupId>org.apache.kafka</groupId>
       <artifactId>kafka-clients</artifactId>
       <version>3.3.1</version>
   </dependency>
   ...
   ```

4. From the repository root directory, build the benchmark client:

   ```bash
   cd openmessaging-benchmark
   mvn clean install -Dlicense.skip=true
   ```

5. From the Redpanda driver directory, install the Ansible roles required for deploying Redpanda:

   ```bash
   cd driver-redpanda/deploy
   ansible-galaxy install -r requirements.yaml
   ```

6. Configure AWS credentials and SSH keys.

   Install and configure the AWS CLI, then generate SSH keys:

   ```bash
   ssh-keygen -f ~/.ssh/redpanda_aws
   ```

   When prompted for a passphrase, set a blank passphrase by pressing Enter twice.

   Verify that the SSH key files were created:

   ```bash
   ls ~/.ssh/redpanda_aws*
   ```
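   Before provisioning any resources, you can optionally confirm that the AWS CLI resolves your credentials. The following is a minimal check, assuming the AWS CLI is installed and your credentials are stored under the `default` profile (the profile name used by the Terraform configuration in the next step):

   ```bash
   # Optional: confirm that the AWS CLI can authenticate with the "default" profile.
   # The returned account ID and ARN should match the AWS account you intend to use.
   aws sts get-caller-identity --profile default
   ```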
7. Provision a Redpanda cluster to deploy on AWS with Terraform.

   Customize the `openmessaging-benchmark/driver-redpanda/deploy/terraform.tfvars` Terraform configuration file for your environment. The default contents of the file:

   ```hcl
   public_key_path = "~/.ssh/redpanda_aws.pub"
   region          = "us-west-2"
   az              = "us-west-2a"
   ami             = "ami-0d31d7c9fc9503726"
   profile         = "default"

   instance_types = {
     "redpanda"   = "i3en.6xlarge"
     "client"     = "m5n.8xlarge"
     "prometheus" = "c5.2xlarge"
   }

   num_instances = {
     "client"     = 4
     "redpanda"   = 3
     "prometheus" = 1
   }
   ```

   From the Redpanda driver deployment directory, initialize and apply the Terraform deployment of Redpanda on AWS:

   ```bash
   cd driver-redpanda/deploy
   terraform init
   terraform apply -auto-approve
   ```

   The `terraform apply` command prompts you for an owner name (`var.owner`) that is used to tag all the cloud resources that will be created. When provisioning completes, Terraform displays a confirmation message listing the resources that were created.

8. Run the Ansible playbook to install and start the Redpanda cluster.

   Redpanda can run with or without TLS and SASL enabled. To run Redpanda without TLS and SASL:

   ```bash
   ansible-playbook deploy.yaml
   ```

   To run Redpanda with TLS and SASL:

   ```bash
   ansible-playbook deploy.yaml -e "tls_enabled=true sasl_enabled=true"
   ```

   If the path to your SSH private key isn't `~/.ssh/redpanda_aws`, add the `--private-key` flag to your Ansible command:

   ```bash
   ansible-playbook deploy.yaml --private-key=<private-key-path>
   ```

   Beginning with Ansible 2.14, references to `args: warn` within Ansible tasks cause a fatal error and halt execution of the playbook. You may find instances of this in the components installed by ansible-galaxy, particularly in the cloudalchemy.grafana task in `dashboards.yml`. To resolve this issue, remove the `warn` line from the `.yml` file.

## Run benchmark

Connect to the benchmark client and run the benchmark with a custom workload.

1. Connect with SSH to the benchmark client, using the IP address from the `client_ssh_host` Terraform output:

   ```bash
   ssh -i ~/.ssh/redpanda_aws ubuntu@$(terraform output --raw client_ssh_host)
   ```

2. On the client, navigate to the `/opt/benchmark` directory:

   ```bash
   cd /opt/benchmark
   ```

3. Create a workload configuration file.

   For example, create a `.yaml` file with one topic, 144 partitions, a 500 MBps producer rate, four producers, and four consumers:

   ```bash
   cat > workloads/1-topic-144-partitions-500mb-4p-4c.yaml << EOF
   name: 500mb/sec rate; 4 producers 4 consumers; 1 topic with 144 partitions
   topics: 1
   partitionsPerTopic: 144
   messageSize: 1024
   useRandomizedPayloads: true
   randomBytesRatio: 0.5
   randomizedPayloadPoolSize: 1000
   subscriptionsPerTopic: 1
   consumerPerSubscription: 4
   producersPerTopic: 4
   producerRate: 500000
   consumerBacklogSizeGB: 0
   testDurationMinutes: 30
   EOF
   ```

   Alternatively, you can use an existing workload file from the Redpanda repo, in `openmessaging-benchmark/driver-redpanda/deploy/workloads/`. The workloads used in the Redpanda vs. Kafka benchmark comparison are available from the chart in that comparison.

4. Create or reuse a client configuration file.

   This file configures the Redpanda producer and consumer clients, as well as topics. The rest of the guide uses the `openmessaging-benchmark/driver-redpanda/redpanda-ack-all-group-linger-1ms.yaml` configuration file. The client configuration from the Redpanda vs. Kafka benchmark comparison:

   ```yaml
   topicConfig: |
     min.insync.replicas=2
     flush.messages=1
     flush.ms=0

   producerConfig: |
     acks=all
     linger.ms=1
     batch.size=131072

   consumerConfig: |
     auto.offset.reset=earliest
     enable.auto.commit=false
     auto.commit.interval.ms=0
     max.partition.fetch.bytes=131072
   ```

   Configure `reset=false` and manually delete the generated topic after the benchmark completes. Otherwise, when `reset=true`, the benchmark can fail because it erroneously tries to delete the `_schemas` topic.
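The example workload above runs for 30 minutes (`testDurationMinutes: 30`), so, as recommended in the setup section, consider launching the benchmark from a detachable session so it keeps running even if your SSH connection drops. A minimal sketch using tmux (the session name `omb` is an arbitrary choice):

```bash
# Start a detachable session on the client and run the benchmark command
# from the next step inside it.
tmux new -s omb

# Detach with Ctrl-b d, then reattach later to check on the run:
tmux attach -t omb
```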
5. Run the benchmark with your workload and client configuration:

   ```bash
   sudo bin/benchmark -d \
     driver-redpanda/redpanda-ack-all-group-linger-1ms.yaml \
     workloads/1-topic-144-partitions-500mb-4p-4c.yaml
   ```

## View benchmark results

After a run completes, the benchmark generates results as `*.json` files in `/opt/benchmark`. Redpanda provides a Python script, `generate_charts.py`, to generate charts of benchmark results. To run the script:

1. Copy the results from the client to your local machine:

   ```bash
   exit  # back to your local machine
   mkdir ~/results
   scp -i ~/.ssh/redpanda_aws ubuntu@$(terraform output --raw client_ssh_host):/opt/benchmark/*.json ~/results/
   ```

2. Change to the `bin` directory of the repository and install the prerequisite packages for the Python script:

   ```bash
   cd ../../bin  # openmessaging-benchmark/bin
   python3 -m pip install -r requirements.txt
   ```

3. To list all options, run the script with the `-h` flag:

   ```bash
   ./generate_charts.py -h
   ```

4. To generate charts from your `~/results/` directory, first create an `~/output` directory, then run the script with the `--results` and `--output` options set accordingly:

   ```bash
   mkdir ~/output
   ./generate_charts.py --results ~/results --output ~/output
   ```

5. In `~/output`, verify that the generated output is an HTML page with charts for throughput, publish latency, end-to-end latency, publish rate, and consume rate.

## Tear down benchmark

When you're done running the benchmark, tear down the Redpanda cluster. From the `openmessaging-benchmark/driver-redpanda/deploy` directory on your local machine:

```bash
terraform destroy -auto-approve
```

## Suggested reading

- Redpanda vs. Apache Kafka: A performance comparison (2022 update)
- Performance update: Redpanda vs. Kafka with KRaft
- Why fsync(): Losing unsynced data on a single node leads to global data loss