Migrating to Version 3

Redpanda Connect version 3 comes with some breaking service and configuration changes as well as a few breaking API changes.

Service

Memory map file buffer removed

The long deprecated mmap_file buffer has been removed. If you were still relying on this buffer implementation then please raise an issue.

Old metrics paths removed

The following undocumented metrics paths have been removed:

  • input.parts.count

  • input.read.success

  • input.read.error

  • input.send.success

  • input.send.error

  • input.ack.success

  • input.ack.error

  • output.running

  • output.parts.count

  • output.send.success

  • output.parts.send.success

  • output.send.error

All of these paths have remaining equivalents.

Metric path changes

The http_client output client metrics have been renamed from output..output.http_client to output..client.

Configuration

Metrics prefix

The configuration field prefix within metrics has been moved from the root of the config object to individual types. E.g. when using statsd the field metrics.prefix should be replaced with metrics.statsd.prefix.

JSON paths

Many components within Redpanda Connect use an unspecified "JSON dot path" syntax for querying and setting fields within JSON documents. The format of these paths has been formalized to make them clearer and more generally useful, but this potentially breaks your paths when they query against hierarchies that contain arrays.

The formal specification for v3 can be found in this document.

The following components are affected:

  • awk processor (all of the json_* functions)

  • json processor (path field)

  • process_field processor (path field)

  • process_map processor (premap, premap_optional, postmap and postmap_optional fields)

  • check_field condition (path field)

  • json_field function interpolation

  • s3 input (sqs_body_path, sqs_bucket_path and sqs_envelope_path fields)

  • dynamodb output (json_map_columns field values)

Migration guide

In order to replicate the exact same behavior as currently exists your paths should be updated to include the character * wherever an array exists. For example, the default value of sqs_body_path for the s3 input has been updated from Records.s3.object.key to Records.*.s3.object.key.

Process DAG stage names

The process_dag processor now only permits workflow stage names matching the following regular expression: [a-zA-Z0-9_-]+. The reasoning for this restriction is to potentially expand the features of process_dag in the future with custom root fields (e.g. $on_error).

Go API

Modules

Redpanda Connect now fully adheres to Go Modules, import paths must therefore now contain the major version (v3) like so:

import "github.com/Jeffail/benthos/v3/lib/processor"

It should be pretty quick to update your imports, either using a tool or just:

grep "Jeffail/benthos" . -Rl | grep -e "\.go$" | xargs -I{} sed -i 's/Jeffail\/benthos/Jeffail\/benthos\/v3/g' {}

Other

  • Constructors for buffer components now require a types.Manager, giving them parity with other components: buffer.New(conf Config, mgr types.Manager, log log.Modular, stats metrics.Type) (Type, error)