Optimize I/O
Redpanda relies on its own disk I/O scheduler, and by default, it tells the kernel to use the noop scheduler. Additionally, rpk comes with an embedded database of I/O settings for specific combinations of CPUs, SSD types, and VM sizes. Because running software on four vCPUs isn’t the same as running on an EC2 i3.metal with 96 physical cores, Redpanda tries to predict the best known settings for cloud VM types.
rpk iotune is a tool that optimizes I/O performance for a specific Redpanda instance and its hardware. It runs benchmarks to measure the read/write IOPS and bandwidth capabilities of a node, then writes the resulting parameters to an I/O configuration file (io-config.yaml) that Redpanda reads at startup to tune itself for the node. By default, rpk iotune saves this file to /etc/redpanda/io-config.yaml, and Redpanda reads it from there at startup.
For reference, see rpk iotune and rpk redpanda tune.
It isn’t necessary to run rpk iotune each time Redpanda starts, because its output I/O configuration file can be reused on nodes running the same type of hardware. To reuse an I/O configuration file, start Redpanda with the --io-properties-file flag and the path to the file:
rpk redpanda start --io-properties-file '<io-properties-file-path>'
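The reuse step above can be scripted so that the flag is passed only when a saved file is actually present. The following is a minimal sketch, not part of rpk: the helper name io_start_args is hypothetical, and it assumes only the default path and the --io-properties-file flag described above.

```shell
#!/bin/sh
# Hypothetical helper (not part of rpk): print the flag needed to reuse
# a saved I/O configuration file, or print nothing if the file is missing.
io_start_args() {
    # Fall back to the default location that rpk iotune writes to.
    io_file="${1:-/etc/redpanda/io-config.yaml}"
    if [ -f "$io_file" ]; then
        printf '%s %s\n' '--io-properties-file' "$io_file"
    fi
}

# Example usage (commented out so the sketch has no side effects):
# rpk redpanda start $(io_start_args /etc/redpanda/io-config.yaml)
```

The command substitution in the usage line is left unquoted on purpose so that the flag and the path become two separate arguments; this assumes the path contains no spaces.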
Alternatively, the contents of the I/O configuration file can be converted to a string and passed with the --io-properties flag:
rpk redpanda start --io-properties '<io-properties-string>'
Redpanda’s database of I/O settings currently has well-known types for AWS and GCP. On startup, rpk tries to detect the cloud and instance type from the cloud’s metadata API and sets the corresponding iotune properties.
If access to the metadata API isn’t allowed from the instance, you can hint the desired setup by passing the --well-known-io flag with the cloud vendor, VM type, and storage type:
rpk redpanda start --well-known-io 'aws:i3.xlarge:default'
The hint can also be specified in the redpanda.yaml configuration file, under the rpk object:
rpk:
  well_known_io: 'gcp:c2-standard-16:nvme'
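Because the hint is a colon-separated string of cloud vendor, VM type, and storage type, a small wrapper can assemble it and reject incomplete input before it is passed to --well-known-io. The sketch below is illustrative only: well_known_io_hint is a hypothetical helper, not an rpk command, and it does not check whether the combination exists in rpk’s database.

```shell
#!/bin/sh
# Hypothetical helper (not part of rpk): join vendor, VM type, and storage
# type into the colon-separated form that --well-known-io expects.
well_known_io_hint() {
    # Require exactly three non-empty fields.
    if [ "$#" -ne 3 ] || [ -z "$1" ] || [ -z "$2" ] || [ -z "$3" ]; then
        echo "usage: well_known_io_hint <vendor> <vm-type> <storage-type>" >&2
        return 1
    fi
    printf '%s:%s:%s\n' "$1" "$2" "$3"
}

# Example usage (commented out so the sketch has no side effects):
# rpk redpanda start --well-known-io "$(well_known_io_hint aws i3.xlarge default)"
```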
If the cloud vendor, machine type, or storage type isn’t found, or if the metadata isn’t available and no hint is given, then rpk prints a warning and continues with the default values.