kafka

Available in: Cloud, Self-Managed

Deprecated in 4.68.0

This component is deprecated and will be removed in the next major version release. Please consider moving onto the unified redpanda input and redpanda output components.

The kafka output writes a batch of messages to Kafka brokers and waits for acknowledgement before propagating any acknowledgements back to the input.

Common
Advanced

outputs:
  label: ""
  kafka:
    addresses: [] # No default (required)
    topic: "" # No default (required)
    target_version: "" # No default (optional)
    key: ""
    partitioner: fnv1a_hash
    compression: none
    static_headers: "" # No default (optional)
    metadata:
      exclude_prefixes: []
    max_in_flight: 64
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""
      processors: [] # No default (optional)

outputs:
  label: ""
  kafka:
    addresses: [] # No default (required)
    tls:
      enabled: false
      skip_cert_verify: false
      enable_renegotiation: false
      root_cas: ""
      root_cas_file: ""
      client_certs:
        cert: ""
        key: ""
        cert_file: ""
        key_file: ""
        password: ""
    sasl:
      mechanism: none
      user: ""
      password: ""
      access_token: ""
      token_cache: ""
      token_key: ""
    topic: "" # No default (required)
    client_id: benthos
    target_version: "" # No default (optional)
    rack_id: ""
    key: ""
    partitioner: fnv1a_hash
    partition: ""
    custom_topic_creation:
      enabled: false
      partitions: -1
      replication_factor: -1
    compression: none
    static_headers: "" # No default (optional)
    metadata:
      exclude_prefixes: []
    inject_tracing_map: "" # No default (optional)
    max_in_flight: 64
    idempotent_write: false
    ack_replicas: false
    max_msg_bytes: 1000000
    timeout: 5s
    retry_as_batch: false
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""
      processors: [] # No default (optional)
    max_retries: 0
    backoff:
      initial_interval: 3s
      max_interval: 10s
      max_elapsed_time: 30s
    timestamp_ms: "" # No default (optional)

The configuration field ack_replicas determines whether Redpanda Connect waits for acknowledgement from all replicas or just a single broker.

Both the key and topic fields can be dynamically set using function interpolations described in Bloblang queries.

Metadata will be added to each message sent as headers (version 0.11+), but can be restricted using the field metadata.

Strict ordering and retries

When strict ordering is required for messages written to topic partitions it is important to ensure that both the field max_in_flight is set to 1 and that the field retry_as_batch is set to true.

You must also ensure that failed batches are never rerouted back to the same output. This can be done by setting the field max_retries to 0 and backoff.max_elapsed_time to empty, which will apply back pressure indefinitely until the batch is sent successfully.

However, this also means that manual intervention will eventually be required in cases where the batch cannot be sent due to configuration problems such as an incorrect max_msg_bytes estimate. A less strict but automated alternative would be to route failed batches to a dead letter queue using a fallback broker, but this would allow subsequent batches to be delivered in the meantime whilst those failed batches are dealt with.

Troubleshooting

If you’re seeing issues writing to or reading from Kafka with this component then it’s worth trying out the newer kafka_franz output.

I’m seeing logs that report Failed to connect to kafka: kafka: client has run out of available brokers to talk to (Is your cluster reachable?), but the brokers are definitely reachable.

Unfortunately this error message will appear for a wide range of connection problems even when the broker endpoint can be reached. Double check your authentication configuration and also ensure that you have enabled TLS if applicable.

Performance

This output benefits from sending multiple messages in flight in parallel for improved performance. You can tune the max number of in flight messages (or message batches) with the field max_in_flight.

This output benefits from sending messages as a batch for improved performance. Batches can be formed at both the input and output level. You can find out more in this doc.

Fields

`ack_replicas`

Ensure that messages have been copied across all replicas before acknowledging receipt.

Type: bool

Default: false

`addresses[]`

A list of broker addresses to connect to. If an item of the list contains commas it will be expanded into multiple addresses.

Type: array

# Examples:
addresses:
  - "localhost:9092"

# ---

addresses:
  - "localhost:9041,localhost:9042"

# ---

addresses:
  - "localhost:9041"
  - "localhost:9042"

`backoff`

Control time intervals between retry attempts.

Type: object

`backoff.initial_interval`

The initial period to wait between retry attempts. The retry interval increases for each failed attempt, up to the backoff.max_interval value. This field accepts Go duration format strings such as 100ms, 1s, or 5s.

Type: string

Default: 3s

# Examples:
initial_interval: 50ms

# ---

initial_interval: 1s

`backoff.max_elapsed_time`

The maximum overall period of time to spend on retry attempts before the request is aborted. Setting this value to a zeroed duration (such as 0s) will result in unbounded retries.

Type: string

Default: 30s

# Examples:
max_elapsed_time: 1m

# ---

max_elapsed_time: 1h

`backoff.max_interval`

The maximum period to wait between retry attempts

Type: string

Default: 10s

# Examples:
max_interval: 5s

# ---

max_interval: 1m

`batching`

Allows you to configure a batching policy.

Type: object

# Examples:
batching:
  byte_size: 5000
  count: 0
  period: 1s

# ---

batching:
  count: 10
  period: 1s

# ---

batching:
  check: this.contains("END BATCH")
  count: 0
  period: 1m

`batching.byte_size`

An amount of bytes at which the batch should be flushed. If 0 disables size based batching.

Type: int

Default: 0

`batching.check`

A Bloblang query that should return a boolean value indicating whether a message should end a batch.

Type: string

Default: ""

# Examples:
check: this.type == "end_of_transaction"

`batching.count`

A number of messages at which the batch should be flushed. If 0 disables count based batching.

Type: int

Default: 0

`batching.period`

A period in which an incomplete batch should be flushed regardless of its size.

Type: string

Default: ""

# Examples:
period: 1s

# ---

period: 1m

# ---

period: 500ms

`batching.processors[]`

A list of processors to apply to a batch as it is flushed. This allows you to aggregate and archive the batch however you see fit. Please note that all resulting messages are flushed as a single batch, therefore splitting the batch into smaller batches using these processors is a no-op.

Type: processor

# Examples:
processors:
  - archive:
      format: concatenate


# ---

processors:
  - archive:
      format: lines


# ---

processors:
  - archive:
      format: json_array

`client_id`

An identifier for the client connection.

Type: string

Default: benthos

`compression`

The compression algorithm to use.

Type: string

Default: none

Options: none, snappy, lz4, gzip, zstd

`custom_topic_creation`

If enabled, topics will be created with the specified number of partitions and replication factor if they do not already exist.

Type: object

`custom_topic_creation.enabled`

Whether to enable custom topic creation.

Type: bool

Default: false

`custom_topic_creation.partitions`

The number of partitions to create for new topics. Leave at -1 to use the broker configured default. Must be >= 1.

Type: int

Default: -1

`custom_topic_creation.replication_factor`

The replication factor to use for new topics. Leave at -1 to use the broker configured default. Must be an odd number, and less then or equal to the number of brokers.

Type: int

Default: -1

`idempotent_write`

Enable the idempotent write producer option. This requires the IDEMPOTENT_WRITE permission on CLUSTER and can be disabled if this permission is not available.

Type: bool

Default: false

`inject_tracing_map`

EXPERIMENTAL: A Bloblang mapping used to inject an object containing tracing propagation information into outbound messages. The specification of the injected fields will match the format used by the service wide tracer.

Type: string

# Examples:
inject_tracing_map: meta = @.merge(this)

# ---

inject_tracing_map: root.meta.span = this

`key`

An optional key to populate for each message. This field supports interpolation functions.

Type: string

Default: ""

`max_in_flight`

The maximum number of messages to have in flight at a given time. Increase this to improve throughput.

Type: int

Default: 64

`max_msg_bytes`

The maximum size in bytes of messages sent to the target topic.

Type: int

Default: 1000000

`max_retries`

The maximum number of retries before giving up on the request. If set to zero there is no discrete limit.

Type: int

Default: 0

`metadata`

Specify criteria for which metadata values are sent with messages as headers.

Type: object

`metadata.exclude_prefixes[]`

Provide a list of explicit metadata key prefixes to be excluded when adding metadata to sent messages.

Type: array

Default: []

`partition`

The manually-specified partition to publish messages to, relevant only when the field partitioner is set to manual. Must be able to parse as a 32-bit integer. This field supports interpolation functions.

Type: string

Default: ""

`partitioner`

The partitioning algorithm to use.

Type: string

Default: fnv1a_hash

Options: fnv1a_hash, murmur2_hash, random, round_robin, manual

`rack_id`

A rack identifier for this client.

Type: string

Default: ""

`retry_as_batch`

When enabled forces an entire batch of messages to be retried if any individual message fails on a send, otherwise only the individual messages that failed are retried. Disabling this helps to reduce message duplicates during intermittent errors, but also makes it impossible to guarantee strict ordering of messages.

Type: bool

Default: false

`sasl`

Enables SASL authentication.

Type: object

`sasl.access_token`

A static OAUTHBEARER access token

Type: string

Default: ""

`sasl.mechanism`

The SASL authentication mechanism, if left empty SASL authentication is not used.

Type: string

Default: none

Option Summary

Option	Summary
`OAUTHBEARER`	OAuth Bearer based authentication.
`PLAIN`	Plain text authentication. NOTE: When using plain text auth it is extremely likely that you’ll also need to enable TLS.
`SCRAM-SHA-256`	Authentication using the SCRAM-SHA-256 mechanism.
`SCRAM-SHA-512`	Authentication using the SCRAM-SHA-512 mechanism.
`none`	Default, no SASL authentication.

OAUTHBEARER

OAuth Bearer based authentication.

PLAIN

Plain text authentication. NOTE: When using plain text auth it is extremely likely that you’ll also need to enable TLS.

SCRAM-SHA-256

Authentication using the SCRAM-SHA-256 mechanism.

SCRAM-SHA-512

Authentication using the SCRAM-SHA-512 mechanism.

none

Default, no SASL authentication.

`sasl.password`

A PLAIN password. It is recommended that you use environment variables to populate this field.

This field contains sensitive information that usually shouldn’t be added to a configuration directly. For more information, see Manage Secrets before adding it to your configuration.

Type: string

Default: ""

# Examples:
password: ${PASSWORD}

`sasl.token_cache`

Instead of using a static access_token allows you to query a cache resource to fetch OAUTHBEARER tokens from

Type: string

Default: ""

`sasl.token_key`

Required when using a token_cache, the key to query the cache with for tokens.

Type: string

Default: ""

`sasl.user`

A PLAIN username. It is recommended that you use environment variables to populate this field.

Type: string

Default: ""

# Examples:
user: ${USER}

`static_headers`

An optional map of static headers that should be added to messages in addition to metadata.

Type: string

# Examples:
static_headers:
  first-static-header: value-1
  second-static-header: value-2

`target_version`

The version of the Kafka protocol to use. This limits the capabilities used by the client and should ideally match the version of your brokers. Defaults to the oldest supported stable version.

Type: string

# Examples:
target_version: 2.1.0

# ---

target_version: 3.1.0

`timeout`

The maximum period of time to wait for message sends before abandoning the request and retrying.

Type: string

Default: 5s

`timestamp_ms`

Set a timestamp (in milliseconds) for each message (optional). When left empty, the current timestamp is used. This field supports interpolation functions.

Type: string

# Examples:
timestamp_ms: ${! timestamp_unix_milli() }

# ---

timestamp_ms: ${! metadata("kafka_timestamp_ms") }

`tls`

Custom TLS settings can be used to override system defaults.

Type: object

`tls.client_certs[]`

A list of client certificates to use. For each certificate either the fields cert and key, or cert_file and key_file should be specified, but not both.

Type: object

Default: []

# Examples:
client_certs:
  - cert: foo
    key: bar


# ---

client_certs:
  - cert_file: ./example.pem
    key_file: ./example.key

`tls.client_certs[].cert`

A plain text certificate to use.

Type: string

Default: ""

`tls.client_certs[].cert_file`

The path of a certificate to use.

Type: string

Default: ""

`tls.client_certs[].key`

A plain text certificate key to use.

This field contains sensitive information that usually shouldn’t be added to a configuration directly. For more information, see Manage Secrets before adding it to your configuration.

Type: string

Default: ""

`tls.client_certs[].key_file`

The path of a certificate key to use.

Type: string

Default: ""

`tls.client_certs[].password`

A plain text password for when the private key is password encrypted in PKCS#1 or PKCS#8 format. The obsolete pbeWithMD5AndDES-CBC algorithm is not supported for the PKCS#8 format.

Because the obsolete pbeWithMD5AndDES-CBC algorithm does not authenticate the ciphertext, it is vulnerable to padding oracle attacks that can let an attacker recover the plaintext.

This field contains sensitive information that usually shouldn’t be added to a configuration directly. For more information, see Manage Secrets before adding it to your configuration.

Type: string

Default: ""

# Examples:
password: foo

# ---

password: ${KEY_PASSWORD}

`tls.enable_renegotiation`

Whether to allow the remote server to repeatedly request renegotiation. Enable this option if you’re seeing the error message local error: tls: no renegotiation.

Type: bool

Default: false

`tls.enabled`

Whether custom TLS settings are enabled.

Type: bool

Default: false

`tls.root_cas`

An optional root certificate authority to use. This is a string, representing a certificate chain from the parent trusted root certificate, to possible intermediate signing certificates, to the host certificate.

This field contains sensitive information that usually shouldn’t be added to a configuration directly. For more information, see Manage Secrets before adding it to your configuration.

Type: string

Default: ""

# Examples:
root_cas: |-
  -----BEGIN CERTIFICATE-----
  ...
  -----END CERTIFICATE-----

`tls.root_cas_file`

An optional path of a root certificate authority file to use. This is a file, often with a .pem extension, containing a certificate chain from the parent trusted root certificate, to possible intermediate signing certificates, to the host certificate.

Type: string

Default: ""

# Examples:
root_cas_file: ./root_cas.pem

`tls.skip_cert_verify`

Whether to skip server side certificate verification.

Type: bool

Default: false

`topic`

The topic to publish messages to. This field supports interpolation functions.

Type: string

Was this helpful?

group Ask in the community

mail Share your feedback

group_add Make a contribution

What do you think of this page?

Let us know more:

Let us contact you about your feedback:

kafka

Strict ordering and retries

Troubleshooting

Performance

Fields

ack_replicas

addresses[]

backoff

backoff.initial_interval

backoff.max_elapsed_time

backoff.max_interval

batching

batching.byte_size

batching.check

batching.count

batching.period

batching.processors[]

client_id

compression

custom_topic_creation

custom_topic_creation.enabled

custom_topic_creation.partitions

custom_topic_creation.replication_factor

idempotent_write

inject_tracing_map

key

max_in_flight

max_msg_bytes

max_retries

metadata

metadata.exclude_prefixes[]

partition

partitioner

rack_id

retry_as_batch

sasl

sasl.access_token

sasl.mechanism

sasl.password

sasl.token_cache

sasl.token_key

sasl.user

static_headers

target_version

timeout

timestamp_ms

tls

tls.client_certs[]

tls.client_certs[].cert

tls.client_certs[].cert_file

tls.client_certs[].key

tls.client_certs[].key_file

tls.client_certs[].password

tls.enable_renegotiation

tls.enabled

tls.root_cas

tls.root_cas_file

tls.skip_cert_verify

topic

Simple online edits

Contribution guide

`ack_replicas`

`addresses[]`

`backoff`

`backoff.initial_interval`

`backoff.max_elapsed_time`

`backoff.max_interval`

`batching`

`batching.byte_size`

`batching.check`

`batching.count`

`batching.period`

`batching.processors[]`

`client_id`

`compression`

`custom_topic_creation`

`custom_topic_creation.enabled`

`custom_topic_creation.partitions`

`custom_topic_creation.replication_factor`

`idempotent_write`

`inject_tracing_map`

`key`

`max_in_flight`

`max_msg_bytes`

`max_retries`

`metadata`

`metadata.exclude_prefixes[]`

`partition`

`partitioner`

`rack_id`

`retry_as_batch`

`sasl`

`sasl.access_token`

`sasl.mechanism`

`sasl.password`

`sasl.token_cache`

`sasl.token_key`

`sasl.user`

`static_headers`

`target_version`

`timeout`

`timestamp_ms`

`tls`

`tls.client_certs[]`

`tls.client_certs[].cert`

`tls.client_certs[].cert_file`

`tls.client_certs[].key`

`tls.client_certs[].key_file`

`tls.client_certs[].password`

`tls.enable_renegotiation`

`tls.enabled`

`tls.root_cas`

`tls.root_cas_file`

`tls.skip_cert_verify`

`topic`