Optimize I/O

Redpanda relies on its own disk I/O scheduler, and by default, it tells the kernel to use the noop scheduler. Additionally, rpk comes with an embedded database of I/O settings, which are specific combinations of CPUs, SSD types, and VM sizes. Because running software on four VCPUs isn’t the same as running on an EC2 i3.metal with 96 physical cores, Redpanda tries to predict the best known settings for VM cloud types.

rpk iotune is a tool to optimize I/O performance for a specific Redpanda instance and its hardware. It runs benchmarks to capture the read/write IOPS and bandwidth capabilities of a node, then it outputs parameters to an I/O configuration file (io-config.yaml) that Redpanda reads upon startup to optimize itself for the node. rpk iotune by default saves its I/O configuration file to /etc/redpanda/io-config.yaml, and Redpanda by default reads from there at startup.

rpk iotune differs from rpk redpanda tune:

  • rpk iotune runs benchmarks to produce an I/O configuration file that Redpanda reads on startup to optimize its I/O performance.

  • rpk redpanda tune (autotuner) is a suite of tuners that automatically modify Linux kernel settings to optimize performance.

For reference, see rpk iotune and rpk redpanda tune

It isn’t necessary to run rpk iotune each time Redpanda is started, as its I/O output configuration file can be reused in nodes running on the same type of hardware. Reuse an I/O output configuration file by starting Redpanda with the --io-properties-file flag and the path to the file:

rpk redpanda start --io-properties-file '<io-properties-file-path>'

Alternatively, the contents of the I/O configuration file can be converted to a string, and the string can be passed with the --io-properties flag:

rpk redpanda start --io-properties '<io-properties-string>'

Currently in its database of I/O settings, Redpanda has well-known-types for AWS and GCP. On startup, rpk tries to detect the cloud and instance type from the cloud’s metadata API, setting the correct iotune properties.

If access to the metadata API isn’t allowed from the instance, you can hint the desired setup by passing the --well-known-io flag with the cloud vendor, VM type, and storage type:

rpk redpanda start --well-known-io 'aws:i3.xlarge:default'

It can also be specified in the redpanda.yaml configuration file, under the rpk object:

rpk:
  well_known_io: 'gcp:c2-standard-16:nvme'
  • If well-known-io is specified in the config file and also as a flag, then the flag takes precedence.

  • Both --well-known-io and rpk.well_known_io cannot be set at the same time as --io-properties-file or --io-properties.

If a certain cloud vendor, machine type, or storage type isn’t found, or if the metadata isn’t available and no hint is given, then rpk prints a warning and continues to use the default values.