Components

Every Redpanda Connect pipeline has at least one input, an optional buffer, an output, and any number of processors:

input:
  kafka:
    addresses: [ TODO ]
    topics: [ foo, bar ]
    consumer_group: foogroup

buffer:
  type: none

pipeline:
  processors:
  - mapping: |
      message = this
      meta.link_count = links.length()

output:
  aws_s3:
    bucket: TODO
    path: '${! meta("kafka_topic") }/${! json("message.id") }.json'

These are the main components within Redpanda Connect and they provide the majority of useful behavior.
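The example above needs a Kafka cluster and an S3 bucket. To try the same input, processor, and output structure locally, a minimal sketch could look like the following. It swaps in the generate input and the stdout output, and the generated document (an id plus a links array) is invented for illustration:

```yaml
# Sketch only: stands in for the Kafka and S3 components above.
input:
  generate:
    # Hypothetical test document; the field names are assumptions.
    mapping: 'root = {"id": uuid_v4(), "links": ["a", "b"]}'
    interval: 1s
    count: 3 # Stop after three messages

pipeline:
  processors:
    - mapping: |
        root.message = this
        root.meta.link_count = this.links.length()

output:
  stdout: {}
```

Each generated message should be printed as a JSON document in which the original payload sits under message and the number of links sits under meta.link_count.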

Observability components

There are also the observability components http, logger, metrics, and tracer, which allow you to specify how Redpanda Connect exposes observability data:

http:
  address: 0.0.0.0:4195
  enabled: true
  debug_endpoints: false

logger:
  format: json
  level: WARN

metrics:
  statsd:
    address: localhost:8125
    flush_period: 100ms

tracer:
  jaeger:
    agent_address: localhost:6831

Resource components

Finally, there are caches and rate limits. These are components that are referenced by core components and can be shared.

input:
  http_client: # This is an input
    url: TODO
    rate_limit: foo_ratelimit # This is a reference to a rate limit

pipeline:
  processors:
    - cache: # This is a processor
        resource: baz_cache # This is a reference to a cache
        operator: add
        key: '${! json("id") }'
        value: "x"
    - mapping: root = if errored() { deleted() }

rate_limit_resources:
  - label: foo_ratelimit
    local:
      count: 500
      interval: 1s

cache_resources:
  - label: baz_cache
    memcached:
      addresses: [ localhost:11211 ]
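Because a resource is referenced by its label, more than one component can point at the same one. As a rough sketch, assuming two http_client inputs combined with a broker input, both could draw from the single foo_ratelimit budget defined once at the bottom:

```yaml
input:
  broker:
    inputs:
      - http_client:
          url: TODO
          rate_limit: foo_ratelimit # Shared with the input below
      - http_client:
          url: TODO
          rate_limit: foo_ratelimit # Same label, same rate limit

rate_limit_resources:
  - label: foo_ratelimit
    local:
      count: 500
      interval: 1s
```

With this layout the 500 requests per second should apply to both inputs combined rather than to each one separately.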

You can also configure inputs, outputs, and processors as resources, which lets you reuse them throughout a configuration by referencing them with the resource input, resource output, and resource processor, respectively.
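A minimal sketch of that pattern follows. The labels foo_input, drop_errors, and foo_output are hypothetical, and the generate input and stdout output are stand-ins chosen only to keep the example small:

```yaml
input:
  resource: foo_input # Reference to an input resource

pipeline:
  processors:
    - resource: drop_errors # Reference to a processor resource

output:
  resource: foo_output # Reference to an output resource

input_resources:
  - label: foo_input
    generate:
      mapping: 'root.id = uuid_v4()'
      interval: 1s

processor_resources:
  - label: drop_errors
    mapping: root = if errored() { deleted() }

output_resources:
  - label: foo_output
    stdout: {}
```

Defining the components once under the resources sections and referring to them by label keeps the main input, pipeline, and output sections short, and the same label can be referenced from several places in the configuration.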

For more information about any of these component types, check out their sections: