Lambda

You can use either of the following distributions to deploy Redpanda Connect as an AWS Lambda function:

  • The redpanda-connect-lambda distribution is specifically tailored for deployment as an AWS Lambda function on the go1.x runtime, which runs Amazon Linux on the x86_64 architecture.

  • The redpanda-connect-lambda-al2 distribution supports the provided.al2 runtime, which runs Amazon Linux 2 on either the x86_64 or arm64 architecture.

Configuration

The AWS Lambda version of Redpanda Connect uses the same configuration format as a regular instance. You can supply the configuration in one of two ways:

  1. Inline via the BENTHOS_CONFIG environment variable (YAML format).

  2. Via the filesystem using a layer, extension, or container image. By default, the redpanda-connect-lambda distribution looks for a valid configuration file in the locations listed below. Alternatively, you can set the configuration file path explicitly with the BENTHOS_CONFIG_PATH environment variable.

    • ./benthos.yaml

    • ./config.yaml

    • /benthos.yaml

    • /etc/benthos/config.yaml

    • /etc/benthos.yaml

The http, input, and buffer sections are ignored: the service-wide HTTP server is not used, and messages are inserted through function invocations.

If your configuration omits the output section, the result of the processing pipeline is returned to the caller. Otherwise, the resulting data is sent to the output destination.

Run with an output

The flow of a Redpanda Connect Lambda function with an output configured looks like this:

                    redpanda-connect-lambda
           +------------------------------+
           |                              |
       -------> Processors ----> Output -----> Somewhere
invoke     |                              |        |
       <-------------------------------------------/
           |         <Ack/Noack>          |
           |                              |
           +------------------------------+

The call blocks until the output target confirms receipt of the resulting payload. When the message is propagated successfully, the function returns a JSON payload of the form {"message":"request successful"}. Otherwise, it returns an error containing the reason for the failure.

Run without an output

The flow when an output is not configured looks like this:

               redpanda-connect-lambda
           +--------------------+
           |                    |
       -------> Processors --\  |
invoke     |                 |  |
       <---------------------/  |
           |     <Result>       |
           |                    |
           +--------------------+

The function returns the result of processing directly to the caller. The format of the result depends on the number of batches, and the number of messages in each batch, that the invocation produced:

  • Single message of a single batch: {} (JSON object)

  • Multiple messages of a single batch: [{},{}] (Array of JSON objects)

  • Multiple batches: [[{},{}],[{}]] (Array of arrays of JSON objects; in this case, a batch of size one is a single-object array)
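Because the shape of the result varies, a caller may want to normalize it before further processing. The following is a minimal sketch, not part of Redpanda Connect: the helper name to_batches is hypothetical, and it assumes every message is a JSON object, as in the forms listed above. A message whose content is itself a JSON array could not be told apart from a batch with this approach.

```python
import json
from typing import Any, List


def to_batches(result: Any) -> List[List[Any]]:
    """Normalize a decoded Lambda result into a list of batches.

    Assumes each message is a JSON object (a Python dict once decoded).
    """
    if isinstance(result, dict):
        # Single message of a single batch: {}
        return [[result]]
    if isinstance(result, list) and result:
        if all(isinstance(item, dict) for item in result):
            # Multiple messages of a single batch: [{},{}]
            return [result]
        if all(isinstance(item, list) for item in result):
            # Multiple batches: [[{},{}],[{}]]
            return result
    raise ValueError("unrecognized result shape")


if __name__ == "__main__":
    samples = (
        '{"a":1}',
        '[{"a":1},{"b":2}]',
        '[[{"a":1},{"b":2}],[{"c":3}]]',
    )
    for raw in samples:
        print(to_batches(json.loads(raw)))
```

After normalization, every result can be walked with the same two nested loops, regardless of how many batches the invocation produced.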

Process errors

By default, the Redpanda Connect Lambda handler fails if messages encounter an uncaught error during execution.

Run a combination

It’s possible to configure pipelines that send messages to third-party destinations and also return a result to the caller. To do this, configure an output block that includes an output of the type sync_response.

For example, to have the Lambda function send a payload to Kafka and also return the same payload to the caller, you can use a broker:

output:
  broker:
    pattern: fan_out
    outputs:
    - kafka:
        addresses:
        - todo:9092
        client_id: benthos_serverless
        topic: example_topic
    - sync_response: {}

Upload to AWS

go1.x on x86_64

Grab an archive labelled redpanda-connect-lambda from the releases page and then create your function:

LAMBDA_ENV=$(jq -csR '{Variables:{BENTHOS_CONFIG:.}}' yourconfig.yaml)
aws lambda create-function \
  --runtime go1.x \
  --handler redpanda-connect-lambda \
  --role benthos-example-role \
  --zip-file fileb://redpanda-connect-lambda_4.68.0_linux_amd64.zip \
  --environment "$LAMBDA_ENV" \
  --function-name benthos-example
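The jq expression above reads the whole configuration file as one raw string and wraps it in the JSON structure that --environment expects. As a rough illustration, the same transformation can be sketched in Python. The function name build_lambda_env and the size check are assumptions made for this example. The 4 KB figure reflects the Lambda limit on the combined size of a function's environment variables, so a larger configuration would likely need the filesystem option instead.

```python
import json

# Combined size limit that Lambda applies to a function's environment variables.
MAX_ENV_BYTES = 4 * 1024


def build_lambda_env(config_text: str) -> str:
    """Return the JSON string for --environment, like the jq expression."""
    env = {"Variables": {"BENTHOS_CONFIG": config_text}}
    # Rough check only: counts the key and value of this single variable.
    size = len("BENTHOS_CONFIG".encode("utf-8")) + len(config_text.encode("utf-8"))
    if size > MAX_ENV_BYTES:
        raise ValueError(
            f"config is {size} bytes; consider BENTHOS_CONFIG_PATH with a layer"
        )
    # Compact separators mirror jq's -c flag.
    return json.dumps(env, separators=(",", ":"), ensure_ascii=False)


if __name__ == "__main__":
    sample = "pipeline:\n  processors: []\n"
    print(build_lambda_env(sample))
```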

provided.al2 on arm64

Grab an archive labelled redpanda-connect-lambda-al2 for arm64 from the releases page and then create your function (AWS CLI v2 only):

LAMBDA_ENV=$(jq -csR '{Variables:{BENTHOS_CONFIG:.}}' yourconfig.yaml)
aws lambda create-function \
  --runtime provided.al2 \
  --architectures arm64 \
  --handler not.used.for.provided.al2.runtime \
  --role benthos-example-role \
  --zip-file fileb://redpanda-connect-lambda-al2_4.68.0_linux_arm64.zip \
  --environment "$LAMBDA_ENV" \
  --function-name benthos-example

You can also run redpanda-connect-lambda-al2 on x86_64: use the amd64 zip instead and set --architectures to x86_64.

Invoke

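# AWS CLI v2 expects a base64-encoded payload by default; to pass raw JSON
# as shown here, add: --cli-binary-format raw-in-base64-out
aws lambda invoke \
  --function-name benthos-example \
  --payload '{"your":"document"}' \
  out.txt && cat out.txt && rm out.txt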
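The same invocation can also be made from application code. The sketch below assumes the boto3 library, with a Lambda client created separately (for example with boto3.client("lambda")). The helper name invoke_connect is hypothetical. The client is passed in as a parameter so that the helper can be exercised without AWS access.

```python
import json
from typing import Any


def invoke_connect(client: Any, function_name: str, document: dict) -> Any:
    """Invoke the function synchronously and return the decoded JSON result.

    `client` is assumed to be a boto3 Lambda client.
    """
    response = client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # block until the function returns
        Payload=json.dumps(document).encode("utf-8"),
    )
    body = response["Payload"].read()
    # Lambda sets FunctionError when the handler itself returned an error.
    if response.get("FunctionError"):
        raise RuntimeError(f"invocation failed: {body.decode('utf-8')}")
    return json.loads(body)
```

A call such as invoke_connect(client, "benthos-example", {"your": "document"}) would return {"message": "request successful"} when an output is configured and the message is delivered, or the pipeline result when no output is configured.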

Build

You can build and archive the function yourself with:

go build github.com/redpanda-data/connect/v4/cmd/serverless/connect-lambda
zip connect-lambda.zip connect-lambda