Components

Every Redpanda Connect pipeline has at least one input, an optional buffer, an output and any number of processors:

input:
  kafka:
    addresses: [ TODO ]
    topics: [ foo, bar ]
    consumer_group: foogroup

buffer:
  type: none

pipeline:
  processors:
  - mapping: |
      message = this
      meta.link_count = links.length()

output:
  aws_s3:
    bucket: TODO
    path: '${! meta("kafka_topic") }/${! json("message.id") }.json'

These are the main components within Redpanda Connect and they provide the majority of useful behavior.

Observability components

There are also the observability components: http, logger, metrics, and tracing, which allow you to specify how Redpanda Connect exposes observability data.

http:
  address: 0.0.0.0:4195
  enabled: true
  debug_endpoints: false

logger:
  format: json
  level: WARN

metrics:
  statsd:
    address: localhost:8125
    flush_period: 100ms

tracer:
  jaeger:
    agent_address: localhost:6831

Resource components

Finally, there are caches and rate limits. These are components that are referenced by core components and can be shared.

input:
  http_client: # This is an input
    url: TODO
    rate_limit: foo_ratelimit # This is a reference to a rate limit

pipeline:
  processors:
    - cache: # This is a processor
        resource: baz_cache # This is a reference to a cache
        operator: add
        key: '${! json("id") }'
        value: "x"
    - mapping: root = if errored() { deleted() }

rate_limit_resources:
  - label: foo_ratelimit
    local:
      count: 500
      interval: 1s

cache_resources:
  - label: baz_cache
    memcached:
      addresses: [ localhost:11211 ]

It’s also possible to configure inputs, outputs and processors as resources which allows them to be reused throughout a configuration with the resource input, resource output and resource processor respectively.

For more information about any of these component types check out their sections: