Configure Producers

Producers are client applications that write data to Redpanda in the form of events. Producers communicate with Redpanda through the Kafka API.

When a producer publishes a message to a Redpanda cluster, it sends it to a specific partition. Every event consists of a key and value. When selecting which partition to produce to, if the key is blank, then the producer publishes in a round-robin fashion between the topic’s partitions. If a key is provided, then the partition hashes the key using the murmur2 algorithm and modulates across the number of partitions.

Producer acknowledgment settings

The acks property sets the number of acknowledgments the producer requires the leader to have received before considering a request complete. This controls the durability of records that are sent.

Redpanda guarantees data safety with fsync, which means flushing to disk.

  • With acks=all, every write is fsynced by default.

  • With other acks settings, or with write_caching_default=true at the cluster level, Redpanda fsyncs to disk according to raft_replica_max_pending_flush_bytes and raft_replica_max_flush_delay_ms, whichever is reached first.

  • With write.caching enabled at the topic level, Redpanda fsyncs to disk according to flush.ms and flush.bytes, whichever is reached first.

acks=0

The producer doesn’t wait for acknowledgments from the leader and doesn’t retry sending messages. This increases throughput and lowers latency of the system at the expense of durability and data loss.

This option allows a producer to immediately consider a message acknowledged when it is sent to the Redpanda broker. This means that a producer does not have to wait for any response from the Redpanda broker. This is the least safe option, because a leader-broker crash can cause data loss if the data has not yet replicated to the other brokers in the replica set. However, this setting is useful when you want to optimize for the highest throughput and are willing to risk some data loss.

Because of the lack of guarantees, this setting is the most network bandwidth-efficient. This is helpful for use cases like IoT/sensor data collection, where updates are periodic or stateless and you can afford some degree of data loss, but you want to gather as much data as possible in a given time interval.

acks=1

The producer waits for an acknowledgment from the leader, but it doesn’t wait for the leader to get acknowledgments from followers. This setting doesn’t prioritize throughput, latency, or durability. Instead, acks=1 attempts to provide a balance between all of them.

Replication is not guaranteed with this setting because it happens in the background, after the leader broker sends an acknowledgment to the producer. This setting could result in data loss if the leader broker crashes before any followers manage to replicate the message or if a majority of replicas go down at the same time before fsyncing the message to the disk.

acks=all

The producer receives an acknowledgment after the majority of (implicitly, all) replicas acknowledge the message. Redpanda guarantees data safety by fsyncing every message to disk before acknowledgement back to clients. This increases durability at the expense of lower throughput and increased latency.

Sometimes referred to as acks = -1, this option instructs the broker that replication is considered complete when the message has been replicated (and fsynced) to the majority of the brokers responsible for the partition in the cluster. As soon as the fsync call is complete, the message is considered acknowledged and is made visible to readers.

This property has an important distinction compared to Kafka’s behavior. In Kafka, a message is considered acknowledged without the requirement that it has been fsynced. Messages that have not been fsynced to disk may be lost in the event of a broker crash. So when using acks=all, the Redpanda default configuration is more resilient than Kafka’s. You can also consider using write caching, which is a relaxed mode of acks=all that acknowledges a message as soon as it is received and acknowledged on a majority of brokers, without waiting for it to fsync to disk. This provides lower latency while still ensuring that a majority of brokers acknowledge the write.

retries

This property controls the number of times a message is re-sent to the broker if the broker fails to acknowledge it. This is essentially the same as if the client application resends the erroneous message after receiving an error response. The default value of retries in most client libraries is 0. This means that if the send fails, the message is not re-sent at all.

If you increase this to a higher value, check the max.in.flight.requests.per.connection value as well, because leaving that property at its default value can potentially cause ordering issues in the target topic where the messages arrive. This occurs if two batches are sent to a single partition and the first fails and is retired, but the second succeeds so the records in the second batch may appear first.

max.in.flight.requests.per.connection

This property controls how many unacknowledged messages can be sent to the broker simultaneously at any given time. The default value is 5 in most client libraries.

If you set this to 1, then the producer does not send any more messages until the previous one is either acknowledged or an error happens, which can prompt a retry. If you set this to a value higher than 1, then the producer sends more messages at the same time, which can help increase throughput but adds a risk of message reordering if retries are enabled.

When you configure the producer to be idempotent, up to five requests can be guaranteed to be in flight with the order preserved.

enable.idempotence

To enable idempotence, set enable.idempotence to true (the default) in your Redpanda configuration.

When idempotence is enabled, the producer ensures that exactly one copy of every message is written to the broker. When set to false, the producer retries sending a message for any reason (such as transient errors like brokers not being available or not enough replicas exception), and it can lead to duplicates.

In most client libraries enable.idempotence is set to true by default. Internally, this is implemented using a special identifier that is assigned to every producer (the producer ID or PID). This ID, along with a sequence number, is included in every message sent to the broker. The broker checks if the PID/sequence number combination is larger than the previous one and, if not, it discards the message.

To guarantee true idempotent behavior, you must also set acks=all to ensure that all brokers record messages in order, even in the event of node failures. In this configuration, both the producer and the broker prefer safety and durability over throughput.

Idempotence is only guaranteed within a session. A session starts after the producer is instantiated and a connection is established between the client and the Redpanda broker. When the connection is closed, the session ends.

If your application code retries a request, the producer client assigns a new ID to that request, which may lead to duplicate messages.

Message batching

Batching is an efficient way to save on both network bandwidth and disk size, because messages can be compressed easier.

When a producer prepares to send messages to a broker, it first fills up a buffer. When this buffer is full, the producer compresses (if instructed to do so) and sends out this batch of messages to the broker. The number of batches that can be sent in a single request to the broker is limited by the max.request.size property. The number of requests that can simultaneously be in this sending state is controlled by the max.in.flight.requests.per.connection value, which defaults to 5 in most client libraries.

Tune the batching configuration with the following properties:

buffer.memory

This property controls the total amount of memory available to the producer for buffering. If messages are sent faster than they can be delivered to the broker, the producer application may run out of memory, which causes it to either block subsequent send calls or throw an exception. The max.block.ms property controls the amount of time the producer blocks before throwing an exception if it cannot immediately send messages to the broker.

batch.size

This property controls the maximum size of coupled messages that can be batched together in one request. The producer automatically puts messages being sent to the same partition into one batch. This configuration property is given in bytes, as opposed to the number of messages.

When the producer is gathering messages to assign to a batch, at some point it hits this byte-size limit, which triggers it to send the batch to the broker. However, the producer does not necessarily wait (for as much time as set using linger.ms) until the batch is full. Sometimes, it can even send single-message batches. This means that setting the batch size too large is not necessarily undesirable, because it won’t cause throttling when sending messages; rather, it only causes increased memory usage.

Conversely, setting the batch size too small can cause the producer to send batches of messages faster, which can cause network overhead, meaning a reduced throughput. The default value is usually 16384, but you can set this as low as 0, which turns off batching entirely.

linger.ms

This property controls the maximum amount of time the producer waits before sending out a batch of messages, if it is not already full. This means you can somewhat force the producer to make sure that batches are filled as efficiently as possible.

If you’re willing to tolerate some latency, setting this value to a number larger than the default of 0 causes the producer to send fewer, more efficient batches of messages. If you set the value to 0, there is still a high chance messages arrive around the same time to be batched together.

Common producer configurations

compression.type

This property controls how the producer should compress a batch of messages before sending it to the broker. The default is none, which means the batch of messages is not compressed at all. Compression occurs on full batches, so you can improve batching throughput by setting this property to use one of the available compression algorithms (along with increasing batch size). The available options are: zstd, lz4, gzip, and snappy.

Serializers

Serializers are responsible for converting a message to a byte array. You can influence the speed/memory efficiency of your streaming setup by choosing one of the built-in serializers or writing a custom one. The performance consequences of using serializers is not typically significant.

For example, if you opt for the JSON serializer, you have more data to transport with each message because every record contains its schema in a verbose format, which impacts your compression speeds and network throughput. Alternatively, going with AVRO or Protobuf allows you to only define the schema in one place, while also enabling features like schema evolution.

Broker timestamps

Redpanda employs a unique strategy to help ensure the accuracy of retention operations. In this strategy, closed segments are only eligible for deletion when the age of all messages in the segment exceeds a configured threshold. However, when a producer sends a message to a topic, the timestamp set by the producer may not accurately reflect the time the message reaches the broker. To address this time skew, each time a producer sends a message to a topic, Redpanda records the broker’s system date and time in the broker_timestamp property of the message. This property helps maintain accurate retention policies, even when the message’s creation timestamp deviates from the broker’s time.

Configure broker timestamp alerting

Each time a broker receives a message with a skewed timestamp that is outside a configured range, Redpanda increments the vectorized_kafka_rpc_produce_bad_create_time metric. Two cluster properties control this range. The minimum accepted value for both of these properties is five minutes. Any attempt to set a value lower than that is rejected by Redpanda.

  • log_message_timestamp_alert_before_ms: Defines the allowed skew before the broker’s time. This check is effectively disabled when the value is set to null. Minimum: 300000 ms (5 minutes), Default: null.

  • log_message_timestamp_alert_after_ms: Defines the allowed skew after the broker’s time. There is no way to disable this check. Minimum: 300000 ms (5 minutes), Default: 7200000 ms (2 hours).

Disable broker timestamp retention

While not advised for typical use, Redpanda lets you override the use of broker timestamps for retention policy with the Admin API. Use the activate feature API to disable the broker_time_based_retention property.

If you disable this feature, make sure to specify your desired timestamp policy. This is stored in the log_message_timestamp_type cluster property. The timestamp policy defaults to CreateTime (client timestamp set by producer) but may be updated to LogAppendTime (server timestamp set by Redpanda).

Producer optimization strategies

You can optimize for speed (throughput and latency) or safety (durability and availability) by adjusting properties. Finding the optimal configuration depends on your use case.

There are many configuration options within Redpanda. The configuration options mentioned here work best when combined with other broker and consumer configuration options. For details, see Configure Broker Properties and Consumer Offsets.

Optimize for speed

To get data into Redpanda as quickly as possible, you can maximize latency and throughput in a variety of ways:

  • Experiment with acks settings. The quicker a producer receives a reply from the broker that the message has been committed, the sooner it can send the next message, which generally results in higher throughput. Hence, if you set acks=1, then the leader broker does not need to wait for replication to occur, and it can reply as soon as it finishes committing the message. This can result in less durability overall.

  • Enable write caching, which acknowledges a message as soon as it is received and acknowledged on a majority of brokers, without waiting for it to fsync to disk. This provides lower latency while still ensuring that a majority of brokers acknowledge the write.

  • Experiment with other component’s properties, like the topic partition size.

  • Explore how the producer batches messages. Increasing the value of batch.size and linger.ms can increase throughput by making the producer add more messages into one batch before sending it to the broker and waiting until the batches can properly fill up. This approach negatively impacts latency though. By contrast, if you set linger.ms to 0 and batch.size to 1, you can achieve lower latency, but sacrifice throughput.

Optimize for safety

For applications where you must guarantee that there are no lost messages, duplicates, or service downtime, you can use higher durability acks settings. If you set acks=all, then the producer waits for a majority of replicas to acknowledge the message before it can send the next message, resulting in lower latency, because there is more communication required between brokers. This approach can guarantee higher durability because the message is replicated to all brokers.

You can also increase durability by increasing the number of retries the broker can make in case messages are not delivered successfully. The trade-off is that duplicates may enter the system and potentially alter the ordering of messages.