Migrating to Version 4

Redpanda Connect has been at major version 3 for more than two years, during which time it has gained a huge amount of functionality without introducing any breaking changes. However, the number of components, APIs and features that have been deprecated in favour of better solutions has grown steadily and the time has finally come to purge them. There are also some areas of functionality that have been improved with breaking changes.

This document outlines the changes made to Redpanda Connect since V3 and tips for how to migrate to V4 in places where those changes are significant.

Deprecated components removed

All components, features and configuration fields that were marked as deprecated in the latest release of V3 have been removed in V4. In order to detect deprecated components or fields within your existing configuration files you can run the linter from a later release of V3 Redpanda Connect with the --deprecated flag:

benthos lint --deprecated ./configs/*.yaml

This should report all remaining deprecated components. Every deprecated component has a favoured alternative available in V3, so it should be possible to gradually eliminate deprecated aspects of your config using V3 before upgrading.

Unit test directories

The benthos test subcommand no longer walks paths when they are directories. Instead use explicit triple-dot syntax (./dir/...) or wildcard patterns.
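
For example, to run all test definitions under a directory and its subdirectories:

benthos test ./configs/...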

New Go module name

For users of the Go plugin APIs the import path of this module needs to be updated to github.com/benthosdev/benthos/v4, like so:

import "github.com/benthosdev/benthos/v4/public/service"

Pulsar components disabled (for now)

There have been multiple issues with the Go Pulsar client libraries. Since some are still outstanding and causing problems with unrelated components, the decision has been made to remove the pulsar input and output from standard builds. However, it is still possible to build custom versions of Redpanda Connect with them included by importing the package ./public/components/pulsar:

package main

import (
	"context"

	"github.com/Jeffail/benthos/v3/public/service"

	// Import all plugins defined within the repo.
	_ "github.com/benthosdev/benthos/v4/public/components/all"
	_ "github.com/benthosdev/benthos/v4/public/components/pulsar"
)

func main() {
	service.RunCLI(context.Background())
}

Pipeline threads behavior change

In V3 the pipeline.threads field defaults to 1, and setting it explicitly to 0 makes it automatically match the number of CPUs on the host machine. In V4 the default value of pipeline.threads is -1, which indicates that the number of host CPUs should be matched. An explicit value of 0 is still considered valid and is functionally equivalent to -1.
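
For example, the following config matches the number of host CPUs, which is now equivalent to omitting the field entirely (the processor is just a placeholder):

pipeline:
  threads: -1
  processors:
    - mapping: root = this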

Old Style Interpolation Functions Removed

The original style of interpolation functions, where you specify a function name followed by a colon and then any arguments (${!json:foo,1}) has been deprecated (and undocumented) for a while now. What we’ve had instead is a subset of Bloblang allowing you to use functions directly (${! json("foo").from(1) }), but with the old style still supported for backwards compatibility.

However, supporting the old style means our parsing capabilities are weakened and so it is now removed in order to allow more powerful interpolations in the future.

Bloblang changes

The functions meta, root_meta, error and env now return null when the target value does not exist. This is in order to improve consistency across different functions and query types. In cases where a default empty string is preferred you can add .or("") onto the function. In cases where you want to throw an error when the value does not exist you can add .not_null() onto the function.
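
As a rough sketch, assuming a metadata key named token (a hypothetical name), the two fallbacks look like this within a mapping:

pipeline:
  processors:
    - mapping: |
        # Empty string when the metadata key is missing
        root.token = meta("token").or("")
        # Throws an error when the metadata key is missing
        root.required_token = meta("token").not_null()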

Root referencing

It is now possible to reference the root of the document being created within a mapping query, i.e. root.hash = root.string().hash("xxhash64").
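
For example, within a mapping processor (the field names are illustrative):

pipeline:
  processors:
    - mapping: |
        root = this
        root.hash = root.string().hash("xxhash64")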

Env var Docker configuration

Docker builds no longer come with a default config that contains generated environment variables. This system doesn't scale at all for complex configuration files and was becoming a challenge to maintain (it was also huge). Instead, the -s flag is now the preferred way to configure Redpanda Connect through arguments and will need to be used exclusively in V4.

It’s worth noting that this does not prevent you from defining your own env var based configuration and adding that to your docker image. It’s entirely possible to copy the config from V3 and have that work, it just won’t be present by default any more.

In order to migrate to the -s flag use the path of the fields you’re setting instead of the generated environment variables, so:

docker run --rm -p 4195:4195 jeffail/benthos \
	-e "INPUT_TYPE=http_server" \
	-e "OUTPUT_TYPE=kafka" \
	-e "OUTPUT_KAFKA_ADDRESSES=kafka-server:9092" \
	-e "OUTPUT_KAFKA_TOPIC=benthos_topic"

Becomes:

docker run --rm -p 4195:4195 jeffail/benthos \
  -s "input.type=http_server" \
  -s "output.type=kafka" \
  -s "output.kafka.addresses=kafka-server:9092" \
  -s "output.kafka.topic=benthos_topic"

Old plugin APIs removed

Any packages from within the lib directory have been removed. Please use only the APIs within the public directory; the API docs can be found on pkg.go.dev, and examples can be found in the benthos-plugin-example repository. These new APIs are also available in V3, so if you have many components you can migrate them incrementally while remaining on V3 until the migration is complete.

Many of the old packages within lib can also still be found within internal; if you're in a pickle you can find some of those APIs and copy/paste them into your own repository.

Caches

All caches that support retries have had their retry/backoff configuration fields modified in order to be more consistent. The new common format is:

retries:
  initial_interval: 1s
  max_interval: 5s
  max_elapsed_time: 30s

In cases where it might be desirable to disable retries altogether (such as with the ristretto cache) there is also an enabled field.
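
As a sketch of the new format in context, assuming a redis cache resource (the URL and intervals are placeholders):

cache_resources:
  - label: my_redis_cache
    redis:
      url: redis://localhost:6379
      retries:
        initial_interval: 1s
        max_interval: 5s
        max_elapsed_time: 30s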

TTL changes

Caches that support TTLs have had their ttl fields renamed to default_ttl in order to make it clearer that their purpose is to provide a fallback. All of these values are now duration string types, i.e. a cache with an integer seconds based field with a previous value of 60 should now be defined as 60s.
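
For example, assuming a memcached cache resource with an integer seconds ttl, a V3 config along the lines of:

cache_resources:
  - label: my_cache
    memcached:
      addresses: [ localhost:11211 ]
      ttl: 60

Becomes:

cache_resources:
  - label: my_cache
    memcached:
      addresses: [ localhost:11211 ]
      default_ttl: 60s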

Field default changes

Lots of fields have had default values removed in cases where they were deemed unlikely to be useful and likely to cause frustration. This specifically applies to any url, urls, address or addresses fields that may have once had a default value containing a common example for the particular service. In most cases this should cause minimal disruption, as these fields are non-optional and therefore omitting them will result in config errors rather than silently targeting the old default.

However, there are the following exceptions that are worth noting:

The http processor and http_client output no longer create multipart requests by default

The http processor and http_client output now execute message batch requests as individual requests by default. This behavior can be disabled by explicitly setting batch_as_multipart to true.
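
For example, to restore the old multipart behavior on the http processor (the URL here is a placeholder):

pipeline:
  processors:
    - http:
        url: http://example.com/ingest
        verb: POST
        batch_as_multipart: true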

Output lines codec no longer adds extra batch newlines

All outputs that traditionally wrote empty newlines at the end of batches with >1 message when using the lines codec (socket, stdout, file, sftp) no longer do this by default. This was originally kept for backwards compatibility but was often seen as an unexpected and annoying behavior.

It is still possible to add these end-of-batch newlines in a more consistent way by either adding an empty message to the end of batches, or by adding a newline to the last message of the batch.
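
For example, one sketch of appending a newline to the last message of each batch using output-level processors (the file path is a placeholder):

output:
  file:
    path: ./output.txt
    codec: lines
  processors:
    - mapping: |
        root = if batch_index() == batch_size() - 1 {
          content().string() + "\n"
        } else {
          content()
        }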

The switch output retry_until_success

In V3 the switch output by default continued retrying switch case outputs until success. This default was sensible at the time as we didn't have a concept of intentionally nacking messages, and therefore a nacked message was likely a recoverable problem, and retrying internally meant that messages matching multiple cases wouldn't produce duplicates.

However, since then Redpanda Connect has evolved and a very common pattern with the switch output is to reject messages that failed during processing using the reject output. But because of the default value of retry_until_success many users end up in a confusing situation where using a reject output results in the pipeline blocking indefinitely until they discover this field.

Therefore the default value of retry_until_success is now false, which means users that aren't using a reject flow in one of their switch cases, and have a configuration where messages could match multiple cases, should explicitly set this field to true in order to avoid potential duplicates during downstream outages.
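
For example, a minimal sketch of opting back in to the old behavior (the case outputs here are placeholders):

output:
  switch:
    retry_until_success: true
    cases:
      - check: this.type == "foo"
        output:
          stdout: {}
      - output:
          drop: {}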

AWS region fields

Any configuration sections containing AWS fields no longer have a default region of eu-west-1. Instead, the field is empty by default, and unless it is explicitly set the environment variable AWS_REGION will be used. This will cause problems for users who expect the region eu-west-1 to be targeted when neither the field nor the environment variable AWS_REGION is set.

Clickhouse driver changes

The clickhouse SQL driver Data Source Name format parameters have been changed due to a client library update (details can be found at https://github.com/ClickHouse/clickhouse-go). A compatibility layer has been added that makes a best attempt to translate the old DSN format to the new one, but some parameters may not carry over exactly.

This update also means placeholders in sql_raw queries should be in dollar syntax.
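
As an illustrative sketch of dollar placeholders (the DSN, table and columns are placeholders):

output:
  sql_raw:
    driver: clickhouse
    dsn: clickhouse://localhost:9000
    query: INSERT INTO footable (foo, bar) VALUES ($1, $2)
    args_mapping: '[ this.foo, this.bar ]'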

Serverless default output

The default output of the serverless distribution of Redpanda Connect is now the following config:

output:
  switch:
    retry_until_success: false
    cases:
      - check: errored()
        output:
          reject: "processing failed due to: ${! error() }"
      - output:
          sync_response: {}

This change was made in order to return processing errors directly to the invoker by default.

Metrics changes

The metrics produced by a Redpanda Connect stream have been greatly simplified and now make better use of labels/tags in order to provide component-specific insights. The configuration and behavior of metrics types has also been made more consistent, with metric names being the same throughout and mapping now being a general top-level field.

For a full overview of the new system check out the metrics about page.

Field prefix is gone

Some metrics components such as prometheus had a prefix field for setting a prefix to all metric names. These fields are now gone; if you want to reintroduce these prefixes you can use the general purpose mapping field. For example, if we previously had a config:

metrics:
  prometheus:
    prefix: ${METRICS_PREFIX:benthos}

We need to delete that prefix and add a mapping that renames metric names:

metrics:
  mapping: 'root = env("METRICS_PREFIX").or("benthos") + "_" + this'
  prometheus: {}

The http_server type renamed to json_api

The name given to the generic JSON API metrics type was http_server, which was confusing as it isn’t the only metrics output type that presents as an HTTP server endpoint. This type was also only originally intended for local debugging, which the prometheus type is also good for.

In order to distinguish this metrics type by its unique feature, which is that it exposes metrics as a JSON object, it has been renamed to json_api.
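
So a config that previously used:

metrics:
  http_server: {}

Becomes:

metrics:
  json_api: {}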

The stdout type renamed to logger

The stdout metrics type now emits metrics using the Redpanda Connect logger, and therefore also matches the logger format. As such, it has been renamed to logger in order to reflect that.

No more dots

In V3 metric names contained dots in order to represent pseudo-paths of the source component. In V4 all metric names produced by Redpanda Connect have been changed to contain only alpha-numeric characters and underscores. It is recommended that any custom metric names produced by your metric processors and custom plugins match this new format for consistency.

Since dots were invalid characters in Prometheus metric names, in V3 the prometheus metrics type made some automatic modifications to all names before registering them. This rewrite first replaced all _ and - characters with a double underscore (__), and then replaced all . characters with a single underscore (_). This was an ugly workaround and has been removed in V4, which means that in cases where custom metrics containing dots were previously converted automatically you will instead see error logs reporting that the names were invalid and therefore ignored.

If you wish to retain the old rewrite behavior you can reproduce it with the new mapping field:

metrics:
  mapping: 'root = this.replace("_", "__").replace("-", "__").replace(".", "_")'
  prometheus: {}

However, it’s recommended to change your metric names instead.

Tracing changes

Distributed tracing within Redpanda Connect is now done via the Open Telemetry client library. Unfortunately, this client library does not support the full breadth of options that we had before. As such, the jaeger tracing type now only supports the const sampling type, and the field service_name has been removed.

This will likely mean tracing output will appear different in this release, and if you were relying on code that extracts and interacts with spans from messages in your custom plugins then it will need to be converted to use the official Open Telemetry APIs.

Logging changes

The logger config section has been simplified, with the new default format set to logfmt and the classic format removed. The default value of add_timestamp has also been changed to false.
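
The new defaults are roughly equivalent to setting the following explicitly:

logger:
  level: INFO
  format: logfmt
  add_timestamp: false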

Automatic max in flight

Outputs that compose other outputs (broker, switch, etc.) no longer require their own max_in_flight settings as they will automatically saturate their composed outputs. This includes outputs that compose resources.

Processor batch behavior changes

Some processors that once executed only once per batch have been updated to execute upon each message individually by default. This change has been made because the individual message case is considerably more common (and intuitive), and because the batch-wide behavior can be satisfied in other, opt-in ways, such as by placing the processors within a branch and having your request_map explicitly select a single batch_index (i.e. request_map: root = if batch_index() != 0 { deleted() }).
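
For example, a sketch of running a sleep only once per batch (the duration is arbitrary):

pipeline:
  processors:
    - branch:
        request_map: root = if batch_index() != 0 { deleted() }
        processors:
          - sleep:
              duration: 500ms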

Processor parts field removed

Many processors previously had a parts field, which allowed you to explicitly list the indexes of a batch to apply the processor to. This field had confusing naming and was rarely used (or even known about). Since that same behavior can be reproduced by placing the processor within a branch (or switch) all parts fields have been removed.

dedupe

The dedupe processor has been reworked so that it now acts upon individual messages by default. It’s now mandatory to specify a key, and the parts and hash fields have been removed. Instead, specify full-content hashing with interpolation functions in the key field, e.g. ${! content().hash("xxhash64") }.

In order to deduplicate an entire batch it is likely easier to use a cache processor with the add operator:

pipeline:
  processors:
    # Try and add one message to a cache that identifies the whole batch
    - branch:
        request_map: |
          root = if batch_index() == 0 {
            this.id
          } else { deleted() }
        processors:
          - cache:
              resource: batch_dedupe_cache # hypothetical name of a cache resource defined elsewhere
              operator: add
              key: ${! content() }
              value: t
    # Delete all messages if we failed
    - mapping: |
        root = if errored().from(0) {
          deleted()
        }

log

The log processor now executes for every message of batches by default.

sleep

The sleep processor now executes for every message of batches by default.

Broker ditto macro gone

The hidden macro ditto for broker configs is now removed. Use the copies field instead. For some edge cases where copies does not satisfy your requirements you may be better served using configuration templates. If all else fails then please reach out and we can look into other solutions.
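
For example, where ditto was previously used to duplicate a broker output, the copies field achieves the same result (the child output here is a placeholder):

output:
  broker:
    copies: 3
    outputs:
      - stdout: {}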