Get Started with Redpanda Connect using rpk
This guide explains how to get started with Redpanda Connect using rpk, the Redpanda command-line interface (CLI). You can also install and run rpk in FIPS compliance mode.
Install
The rpk CLI allows you to create and manage data pipelines with Redpanda Connect as well as interact with Redpanda clusters.
The rpk CLI also includes an rpk connect plugin, which manages installations and upgrades of Redpanda Connect. The plugin is installed automatically the first time you run an rpk connect command. The exception is rpk connect --version, which prompts you to install the plugin instead.
Also interacting with a Redpanda cluster?
If you also want to use rpk to communicate with a Redpanda cluster, ensure the version of rpk that you install matches the version of Redpanda running in your cluster.
Linux
To install, or update to, the latest version of rpk for Linux, run the command for your architecture:
amd64:
curl -LO https://github.com/redpanda-data/redpanda/releases/latest/download/rpk-linux-amd64.zip &&
mkdir -p ~/.local/bin &&
export PATH="$HOME/.local/bin:$PATH" &&
unzip rpk-linux-amd64.zip -d ~/.local/bin/
arm64:
curl -LO https://github.com/redpanda-data/redpanda/releases/latest/download/rpk-linux-arm64.zip &&
mkdir -p ~/.local/bin &&
export PATH="$HOME/.local/bin:$PATH" &&
unzip rpk-linux-arm64.zip -d ~/.local/bin/
You can use rpk on Windows only with WSL. However, commands that require Redpanda to be installed on your machine, such as rpk container commands, rpk iotune, and rpk redpanda commands, are not supported.
To install, or update to, a version other than the latest, replace <version> with the release you want and run the command for your architecture:
amd64:
curl -LO https://github.com/redpanda-data/redpanda/releases/download/v<version>/rpk-linux-amd64.zip &&
mkdir -p ~/.local/bin &&
export PATH="$HOME/.local/bin:$PATH" &&
unzip rpk-linux-amd64.zip -d ~/.local/bin/
arm64:
curl -LO https://github.com/redpanda-data/redpanda/releases/download/v<version>/rpk-linux-arm64.zip &&
mkdir -p ~/.local/bin &&
export PATH="$HOME/.local/bin:$PATH" &&
unzip rpk-linux-arm64.zip -d ~/.local/bin/
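The export PATH line in the commands above only changes the current shell session. A minimal sketch of one way to make it permanent follows. It assumes your shell reads a profile file such as ~/.bashrc or ~/.zshrc, and persist_local_bin is a hypothetical helper name, not part of rpk. The line it writes uses $HOME because a tilde inside double quotes is not expanded by the shell.

```shell
# Append the PATH line to a shell profile, but only if it is not already there.
# The body runs in a subshell so the helper's variables do not leak.
persist_local_bin() (
  profile="$1"   # assumption: a profile file your shell reads at startup
  line='export PATH="$HOME/.local/bin:$PATH"'
  touch "$profile"
  # -x matches the whole line, -F treats the pattern as a fixed string
  grep -qxF "$line" "$profile" || printf '%s\n' "$line" >> "$profile"
)
```

For example, run persist_local_bin "$HOME/.bashrc" once, then open a new terminal. Calling it again leaves the file unchanged.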
FIPS compliance
This feature requires an enterprise license. You can either upgrade to an Enterprise Edition license, or generate a trial license key that’s valid for 30 days.
To install rpk to run the latest version of Redpanda Connect in FIPS-compliant mode, you must install the redpanda-rpk-fips and redpanda-connect-fips packages. Both packages are built using the Microsoft GoLang compiler and Microsoft’s Go Crypto OpenSSL package, which uses the FIPS-approved version of OpenSSL.
The packages for FIPS-compliant rpk (redpanda-rpk-fips) and Redpanda rpk (redpanda-rpk) are mutually exclusive, so they cannot be installed in the same environment.
RHEL:
- To make sure your distribution is up to date, run:
  sudo dnf upgrade
- Add redpanda to your dnf list of repositories.
  curl -1sLf 'https://dl.redpanda.com/nzc4ZYQK3WRGd9sy/redpanda/cfg/setup/bash.rpm.sh' | sudo -E bash
- Install Redpanda packages for FIPS compliance.
  sudo dnf install -y redpanda-rpk-fips redpanda-connect-fips
- Verify your installation.
  rpk connect --version
To keep up to date with Redpanda Connect releases, run the following command:
sudo dnf update
Debian/Ubuntu:
- To make sure your distribution is up to date, run:
  sudo apt upgrade
- Add redpanda to your apt list of repositories.
  curl -1sLf 'https://dl.redpanda.com/nzc4ZYQK3WRGd9sy/redpanda/cfg/setup/bash.deb.sh' | sudo -E bash
- Install Redpanda packages for FIPS compliance.
  sudo apt install -y redpanda-rpk-fips redpanda-connect-fips
- Verify your installation.
  rpk connect --version
To keep up to date with Redpanda Connect releases, run the following command:
sudo apt update
macOS
Homebrew:
- If you don’t have Homebrew installed, install it.
- To install or update rpk, run:
  brew install redpanda-data/tap/redpanda
Manual Download:
To install or update rpk through a manual download, choose the option for your system architecture. For example, if you have an M1 or newer chip, select Apple Silicon.
Intel macOS:
To install, or update to, the latest version of rpk for Intel macOS, run:
curl -LO https://github.com/redpanda-data/redpanda/releases/latest/download/rpk-darwin-amd64.zip &&
mkdir -p ~/.local/bin &&
export PATH="$HOME/.local/bin:$PATH" &&
unzip rpk-darwin-amd64.zip -d ~/.local/bin/
To install, or update to, a version other than the latest, run:
curl -LO https://github.com/redpanda-data/redpanda/releases/download/v<version>/rpk-darwin-amd64.zip &&
mkdir -p ~/.local/bin &&
export PATH="$HOME/.local/bin:$PATH" &&
unzip rpk-darwin-amd64.zip -d ~/.local/bin/
Apple Silicon:
To install, or update to, the latest version of rpk for Apple Silicon, run:
curl -LO https://github.com/redpanda-data/redpanda/releases/latest/download/rpk-darwin-arm64.zip &&
mkdir -p ~/.local/bin &&
export PATH="$HOME/.local/bin:$PATH" &&
unzip rpk-darwin-arm64.zip -d ~/.local/bin/
To install, or update to, a version other than the latest, run:
curl -LO https://github.com/redpanda-data/redpanda/releases/download/v<version>/rpk-darwin-arm64.zip &&
mkdir -p ~/.local/bin &&
export PATH="$HOME/.local/bin:$PATH" &&
unzip rpk-darwin-arm64.zip -d ~/.local/bin/
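The manual download commands for Linux and macOS differ only in the operating system and architecture parts of the archive name, plus the optional version segment of the URL. As a rough sketch, the choice can be derived from uname output. The helper name rpk_download_url is hypothetical, and the mapping assumes uname -m reports x86_64 on Intel/AMD machines and aarch64 or arm64 on ARM machines.

```shell
# Build the rpk archive URL from an OS name, a machine type, and an optional version.
# Runs in a subshell so the working variables stay local to the helper.
rpk_download_url() (
  os="$1"; arch="$2"; version="${3:-latest}"
  case "$os" in
    Linux)  os=linux ;;
    Darwin) os=darwin ;;
    *) echo "unsupported OS: $os" >&2; exit 1 ;;
  esac
  case "$arch" in
    x86_64|amd64)  arch=amd64 ;;
    aarch64|arm64) arch=arm64 ;;
    *) echo "unsupported architecture: $arch" >&2; exit 1 ;;
  esac
  base="https://github.com/redpanda-data/redpanda/releases"
  if [ "$version" = latest ]; then
    echo "$base/latest/download/rpk-$os-$arch.zip"
  else
    echo "$base/download/v$version/rpk-$os-$arch.zip"
  fi
)

# Print the URL of the latest archive for the machine running this script.
rpk_download_url "$(uname -s)" "$(uname -m)"
```

Pass a release number as the third argument to get the URL for a specific version, then feed the result to curl -LO as in the commands above.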
Run a pipeline
A Redpanda Connect stream pipeline is configured with a single config file. You can generate a fresh one with:
rpk connect create > connect.yaml
This command may take a few seconds to run. If this is the first rpk connect command you have run, the rpk connect plugin is automatically installed.
For Docker installations:
docker run --rm docker.redpanda.com/redpandadata/connect create > ./connect.yaml
The main sections that make up a config are input, pipeline, and output. When you generate a fresh config it’ll simply pipe stdin to stdout like this:
input:
  stdin: {}
pipeline:
  processors: []
output:
  stdout: {}
Eventually we’ll want to configure a more useful input and output, but for now this is useful for quickly testing processors. You can execute this config with:
rpk connect run connect.yaml
For Docker installations:
docker run --rm -it -v $(pwd)/connect.yaml:/connect.yaml docker.redpanda.com/redpandadata/connect run
Anything you write to stdin gets written unchanged to stdout. Cool! Resist the temptation to play with this for hours; there’s more to try out.
Next, let’s add some processing steps to mutate messages. The most powerful one is the mapping processor, which applies a mapping to each message. Let’s add a mapping to uppercase our messages:
input:
  stdin: {}
pipeline:
  processors:
    - mapping: root = content().uppercase()
output:
  stdout: {}
Now your messages should come out in all caps.
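To try the uppercase mapping without typing into an interactive session, you can pipe a single line through it. The sketch below writes the config to a scratch directory (an arbitrary choice for illustration) and skips the run if rpk is not on your PATH.

```shell
# Write the uppercase example to a scratch directory.
workdir="$(mktemp -d)"
cat > "$workdir/connect.yaml" <<'EOF'
input:
  stdin: {}
pipeline:
  processors:
    - mapping: root = content().uppercase()
output:
  stdout: {}
EOF

# Feed one message through the pipeline; stdin closes after the echo,
# which ends the input.
if command -v rpk >/dev/null 2>&1; then
  echo 'hello world' | rpk connect run "$workdir/connect.yaml"
else
  echo "rpk not found on PATH; install it first" >&2
fi
```

The message should come back in upper case, alongside any log lines Redpanda Connect prints.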
You can add as many processing steps as you like, and since processors are what make Redpanda Connect powerful, they are worth experimenting with. Let’s create a more advanced pipeline that works with JSON documents:
input:
  stdin: {}
pipeline:
  processors:
    - sleep:
        duration: 500ms
    - mapping: |
        root.doc = this
        root.first_name = this.names.index(0).uppercase()
        root.last_name = this.names.index(-1).hash("sha256").encode("base64")
output:
  stdout: {}
First, we sleep for 500 milliseconds just to keep the suspense going. Next, we restructure our input JSON document by nesting it within a field doc, and we map the upper-cased first element of names to a new field first_name. Finally, we map the hashed and base64-encoded value of the last element of names to a new field last_name.
Try running that config with some sample documents:
echo '{"id":"1","names":["celine","dion"]}
{"id":"2","names":["chad","robert","kroeger"]}' | rpk connect run connect.yaml
For Docker installations:
echo '{"id":"1","names":["celine","dion"]}
{"id":"2","names":["chad","robert","kroeger"]}' | docker run --rm -i -v $(pwd)/connect.yaml:/connect.yaml docker.redpanda.com/redpandadata/connect run
You should see this output in the logs:
{"doc":{"id":"1","names":["celine","dion"]},"first_name":"CELINE","last_name":"1VvPgCW9sityz5XAMGdI2BTA7/44Wb3cANKxqhiCo50="}
{"doc":{"id":"2","names":["chad","robert","kroeger"]},"first_name":"CHAD","last_name":"uXXg5wCKPjpyj/qbivPbD9H9CZ5DH/F0Q1Twytnt2hQ="}
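One way to sanity-check the last_name values is to reproduce them outside Redpanda Connect. This sketch assumes that hash("sha256") yields the raw digest bytes, which encode("base64") then encodes, and that the openssl command-line tool is installed. The helper name sha256_b64 is hypothetical.

```shell
# Base64-encode the raw SHA-256 digest of a string.
# printf '%s' avoids the trailing newline that echo would add to the input.
sha256_b64() {
  printf '%s' "$1" | openssl dgst -sha256 -binary | openssl base64
}

# Last elements of the names arrays in the two sample documents.
sha256_b64 dion
sha256_b64 kroeger
```

If those assumptions hold, the two printed values should match the last_name fields in the output above.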