# Labs - Full Markdown Export > This file contains all Labs documentation pages in markdown format for AI agent consumption. > Generated from 32 pages on 2026-04-10T19:52:50.870Z > Component: redpanda-labs | Version: > Site: https://docs.redpanda.com ## About This Export This export includes the **latest version** () of the Labs documentation. ### AI-Friendly Documentation Formats We provide multiple formats optimized for AI consumption: - **https://docs.redpanda.com/llms.txt**: Curated overview of all Redpanda documentation - **https://docs.redpanda.com/llms-full.txt**: Complete documentation export with all components - **https://docs.redpanda.com/redpanda-labs-full.txt**: This file - Labs documentation only - **Individual markdown pages**: Each HTML page has a corresponding .md file --- # Page 1: Build a LangGraph Agent with the Redpanda AI Gateway **URL**: https://docs.redpanda.com/redpanda-labs/ai-agents/langchain-agent.md --- # Build a LangGraph Agent with the Redpanda AI Gateway --- title: Build a LangGraph Agent with the Redpanda AI Gateway latest-operator-version: v26.1.2 latest-console-tag: v3.7.1 latest-connect-version: 4.87.0 latest-redpanda-tag: v26.1.3 docname: langchain-agent page-component-name: redpanda-labs page-version: master page-component-version: master page-component-title: Labs page-relative-src-path: langchain-agent.adoc page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/ai-agents/pages/langchain-agent.adoc description: Build a ReAct agent using LangGraph that connects to the Redpanda AI Gateway for unified LLM access and MCP tool calling. 
page-topic-type: lab learning-objective-1: Build and run a LangGraph ReAct agent that connects to the Redpanda AI Gateway learning-objective-2: Authenticate with the AI Gateway using the OIDC client_credentials flow learning-objective-3: Configure LangGraph to route LLM requests through the AI Gateway's OpenAI-compatible interface page-git-created-date: "2026-03-06" page-git-modified-date: "2026-03-06" --- Build a Python agent using [LangGraph](https://langchain-ai.github.io/langgraph/) that connects to the Redpanda AI Gateway for unified LLM access and MCP tool calling, with optional [LangSmith](https://www.langchain.com/langsmith) tracing. After completing this lab, you will be able to: - Build and run a LangGraph ReAct agent that connects to the Redpanda AI Gateway - Authenticate with the AI Gateway using the OIDC client\_credentials flow - Configure LangGraph to route LLM requests through the AI Gateway’s OpenAI-compatible interface ## [](#what-youll-explore)What you’ll explore - **AI Gateway as a unified LLM interface**: The gateway provides an OpenAI-compatible API that routes to any upstream model provider (such as Google Gemini), handling provider-specific authentication and request translation. - **OIDC authentication**: Authenticate with the gateway using a Redpanda Cloud service account and the OAuth 2.0 `client_credentials` grant. - **Dynamic MCP tool discovery**: Use the gateway’s MCP endpoint to discover and call tools at runtime, without hard-coding tool definitions. 
## [](#prerequisites)Prerequisites

- Python 3.12 or later
- [Poetry](https://python-poetry.org/) for dependency management
- A Redpanda Cloud account with:
  - A cluster that has the AI Gateway enabled
  - A service account with permissions to access the cluster
- (Optional) A [LangSmith](https://www.langchain.com/langsmith) API key for tracing

## [](#get-the-lab-files)Get the lab files

Clone the repository and navigate to the lab directory:

```bash
git clone https://github.com/redpanda-data/redpanda-labs.git
cd redpanda-labs/ai-agents/langchain-agent
```

## [](#set-up-the-project)Set up the project

1. Install the project dependencies:

   ```bash
   poetry install
   ```

2. Copy the example environment file and fill in your credentials:

   ```bash
   cp .env.example .env.local
   ```

3. Edit `.env.local` with your Redpanda Cloud service account credentials:

   ```env
   REDPANDA_CLIENT_ID=
   REDPANDA_CLIENT_SECRET=
   REDPANDA_GATEWAY_ID=
   REDPANDA_GATEWAY_URL=
   ```

   You can find these values in the [Redpanda Cloud console](https://cloud.redpanda.com).

## [](#run-the-agent)Run the agent

Start the agent:

```bash
poetry run redpanda-agent
```

The agent opens a terminal UI where you can interact with it. The agent authenticates with the AI Gateway using your service account credentials, discovers available MCP tools, and displays a chat prompt. From there, your requests are routed through the gateway to the configured LLM provider.

## [](#explore-the-lab)Explore the lab

Key technical components include the agent architecture, authentication flow, and dynamic tool discovery.
### [](#architecture)Architecture

The agent uses the following architecture:

```text
Python Agent (LangGraph)
    |
    |-- OIDC client_credentials flow --> Redpanda Cloud IdP --> Bearer token
    |
    |-- ChatOpenAI (base_url=/v1)
    |       |
    |       +-- OpenAI-compatible API via gateway
    |
    |-- MCP tools via gateway (/mcp/)
    |       |
    |       +-- tool_search --> discovers available tools dynamically
    |       +-- AgentMiddleware --> injects and executes discovered tools
    |
    |-- LangSmith tracing (optional)
```

### [](#how-the-ai-gateway-works)How the AI Gateway works

The AI Gateway is a multi-tenant platform where each user configures their own gateway instance. The gateway translates upstream provider responses into the OpenAI chat completions format, so clients interact with a standard interface regardless of the underlying model provider.

Every request to the gateway requires two headers:

| Header | Value | Purpose |
| --- | --- | --- |
| Authorization | Bearer | OIDC authentication |
| rp-aigw-id | | Identifies the gateway instance |

Because the gateway speaks the OpenAI format, you use `ChatOpenAI` from `langchain-openai`:

```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url=f"{gateway_url}/v1",
    api_key="not-used",  # Auth is via OIDC Bearer token
    model="google/gemini-3-flash-preview",
    default_headers={
        "Authorization": f"Bearer {token}",
        "rp-aigw-id": gateway_id,
    },
)
```

### [](#oidc-authentication)OIDC authentication

Authentication is against the **Redpanda Cloud OIDC identity provider**, not the gateway itself. The gateway validates the resulting tokens.
The `GatewayAuth` class uses OIDC discovery to resolve the token endpoint automatically from the issuer:

```text
https://auth.prd.cloud.redpanda.com/.well-known/openid-configuration
```

It fetches this discovery document on the first token request, then uses a `client_credentials` grant with the audience `cloudv2-production.redpanda.cloud`:

```python
auth = GatewayAuth()  # uses REDPANDA_ISSUER env var or default
token = await auth.get_token()
```

> 📝 **NOTE**
>
> The AI agent is responsible for refreshing tokens before they expire. `GatewayAuth` handles this automatically with a 30-second buffer before expiry.

### [](#dynamic-mcp-tool-discovery)Dynamic MCP tool discovery

The gateway’s MCP endpoint uses a two-level tool discovery pattern:

1. `list_tools()` returns a small set of static tools, including `tool_search`.
2. Calling `tool_search` discovers additional tools available on the gateway (such as `redpanda-docs:ask_redpanda_question`).
3. The set of discovered tools can change at any time because they are not static.

The agent uses LangChain’s `AgentMiddleware` to inject dynamically discovered tools at runtime:

```python
from langchain.agents import create_agent

graph = create_agent(
    model=llm,
    tools=static_tools,  # For example, [tool_search]
    middleware=[middleware],  # MCPGatewayMiddleware
)
```

The `MCPGatewayMiddleware` provides two hooks:

- `awrap_model_call`: Injects discovered tools into the model’s tool list before each LLM call.
- `awrap_tool_call`: Intercepts calls to discovered tools and executes them through the MCP `ClientSession.call_tool()` method.

> 📝 **NOTE**
>
> MCP tool names like `redpanda-docs:ask_redpanda_question` contain colons, which are not valid in OpenAI tool names. The middleware sanitizes names by replacing invalid characters with hyphens and maintains a mapping to the original MCP name.
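The name sanitization described in the note above can be sketched as follows. This is a minimal illustration under stated assumptions, not the lab's actual `MCPGatewayMiddleware` code: the function names `sanitize_tool_name` and `build_name_map` are hypothetical, and the allowed character set (letters, digits, underscores, and hyphens) is an assumption about what OpenAI-style tool names accept.

```python
import re

# Assumption: a valid tool name contains only letters, digits,
# underscores, and hyphens. Anything else is treated as invalid.
_INVALID_CHARS = re.compile(r"[^a-zA-Z0-9_-]")


def sanitize_tool_name(mcp_name: str) -> str:
    """Replace every character outside the allowed set with a hyphen."""
    return _INVALID_CHARS.sub("-", mcp_name)


def build_name_map(mcp_names: list[str]) -> dict[str, str]:
    """Map each sanitized name back to its original MCP name.

    Raises ValueError if two different MCP names collapse to the same
    sanitized name, because the reverse lookup would be ambiguous.
    """
    mapping: dict[str, str] = {}
    for original in mcp_names:
        safe = sanitize_tool_name(original)
        if safe in mapping and mapping[safe] != original:
            raise ValueError(
                f"name collision: {original!r} and {mapping[safe]!r}"
            )
        mapping[safe] = original
    return mapping


# Hypothetical usage: expose the sanitized name to the model, then look up
# the original name when the model asks to call the tool.
name_map = build_name_map(["redpanda-docs:ask_redpanda_question", "tool_search"])
print(name_map["redpanda-docs-ask_redpanda_question"])
```

Keeping the reverse mapping in a dictionary means the tool-call hook only needs a single lookup to recover the MCP name before forwarding the call. The collision check is a design choice of this sketch: replacing several different characters with the same hyphen can make two distinct MCP names indistinguishable.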
### [](#mcp-transport)MCP transport

Use `streamable_http` as the transport:

```python
client = MultiServerMCPClient({
    "gateway": {
        "transport": "streamable_http",
        "url": f"{gateway_url}/mcp/",
        "headers": { ... },
    },
})
```

### [](#enable-langsmith-tracing-optional)Enable LangSmith tracing (optional)

To enable tracing, set these environment variables in your `.env.local` file:

```env
LANGSMITH_API_KEY=
LANGSMITH_PROJECT=redpanda-agent
LANGSMITH_TRACING=true
```

LangGraph auto-detects these variables and traces all LLM and tool calls.

### [](#project-structure)Project structure

| File | Purpose |
| --- | --- |
| src/agent/auth.py | OIDC token management using authlib |
| src/agent/gateway.py | ChatOpenAI configured for the AI Gateway |
| src/agent/tools.py | MCP tool loading and AgentMiddleware for dynamic discovery |
| src/agent/graph.py | LangGraph agent graph |
| src/agent/main.py | Terminal UI entry point |

### [](#key-dependencies)Key dependencies

| Package | Purpose |
| --- | --- |
| langchain-openai | Gateway uses OpenAI format regardless of upstream provider |
| langchain-mcp-adapters | MCP client and tool conversion |
| authlib | OIDC client_credentials with token caching |

## [](#clean-up)Clean up

Stop the agent by pressing Ctrl+C in the terminal.
## [](#next-steps)Next steps - [Learn more about the AI Gateway](https://docs.redpanda.com/redpanda-cloud/ai-agents/ai-gateway/) --- # Page 2: Build a Chat Room Application with Redpanda Cloud and Golang **URL**: https://docs.redpanda.com/redpanda-labs/clients/cloud-go.md --- # Build a Chat Room Application with Redpanda Cloud and Golang --- title: Build a Chat Room Application with Redpanda Cloud and Golang latest-operator-version: v26.1.2 latest-console-tag: v3.7.1 latest-connect-version: 4.87.0 latest-redpanda-tag: v26.1.3 docname: cloud-go page-component-name: redpanda-labs page-version: master page-component-version: master page-component-title: Labs page-relative-src-path: cloud-go.adoc page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/clients/pages/cloud-go.adoc description: Create a basic chat room application with Redpanda Cloud and Kafka clients developed with franz-go. page-git-created-date: "2025-05-06" page-git-modified-date: "2025-05-06" --- Create a basic chat room application with Redpanda Cloud and Kafka clients developed with [franz-go](https://github.com/twmb/franz-go). This example shows you how to: - Write a client application in Go to produce and consume chat room messages. - Build and run multiple clients to exchange chat messages streamed through Redpanda Cloud. ![Demo of the application](../_images/chat-room.gif) ## [](#what-is-a-chat-room-application)What is a chat room application? A chat room application is software that enables users to engage in real-time textual communication with one another. These applications typically allow multiple users to join a chat room, where they can send messages and interact with others in a group conversation. Chat room applications often include features such as private messaging, user profiles, and notifications. Some popular chat room applications include Slack, Discord, and WhatsApp. ## [](#why-use-redpanda)Why use Redpanda? 
Redpanda offers several features that make it ideal for building a fast, scalable, and robust chat room application.

- Scalability: Redpanda can scale horizontally and vertically to accommodate growing chat room usage over time.
- Low-latency: Redpanda is designed for minimal latency to provide a smooth user experience and fast message delivery.
- Fault tolerance: Redpanda is resilient to failures, thanks to its built-in replication and partitioning capabilities. This built-in resilience ensures that the chat room application continues to serve users even if individual brokers in the cluster experience downtime.
- Durability: Redpanda persists messages on disk, maintaining chat history and allowing users to read previous conversations.

## [](#prerequisites)Prerequisites

- Download and install Go from [go.dev](https://go.dev/doc/install).
- Complete the [Redpanda Cloud Quickstart](https://docs.redpanda.com/current/get-started/quick-start-cloud/) before continuing. This example expands on the quickstart.

> 📝 **NOTE**
>
> Redpanda Cloud uses TLS certificates signed by [Let’s Encrypt](https://letsencrypt.org/). Most programming languages will load their root certificate authority (CA), `ISRG Root X1`, by default so you shouldn’t need to provide a custom CA certificate.

## [](#run-the-lab)Run the lab

Build the client chat application, run it from multiple client terminals, and chat between the clients.

1. Clone this repository:

   ```bash
   git clone https://github.com/redpanda-data/redpanda-labs.git
   ```

2. Change into the example directory inside the cloned repository:

   ```bash
   cd redpanda-labs/clients/chat-room/cloud/go
   ```

3. Open the Golang files and replace the placeholders wrapped in angle brackets (`<>`) with the same values that you used in the Redpanda Cloud Quickstart.

4. Verify that the `chat-room` topic exists in your cluster by listing all topics:

   ```bash
   rpk topic list --tls-enabled
   ```

   Output:

       NAME       PARTITIONS  REPLICAS
       chat-room  1           1

5.
   If the topic doesn’t exist yet, use [`rpk`](https://docs.redpanda.com/current/get-started/rpk/) to create a `chat-room` topic:

   ```bash
   rpk topic create chat-room --tls-enabled
   ```

   Output:

       TOPIC      STATUS
       chat-room  OK

6. Open at least two terminals, and for each terminal:

   1. Run the client application:

      ```none
      go run .
      ```

   2. When prompted with `Enter user name:`, enter a unique name for the chat room.

7. Use the chat application: enter a message in a terminal, and verify that the message is received in the other terminals.

   For example:

       Enter user name: Alice
       Connected, press Ctrl+C to exit
       Alice: Hi, I'm Alice
       Bob: Hi Alice, I'm Bob, nice to meet you

## [](#files-in-the-example)Files in the example

This example includes the following files:

- [`admin.go`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/cloud/go/admin.go): Checks whether the `chat-room` topic exists and creates it if not.
- [`producer.go`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/cloud/go/producer.go): A producer that sends strings entered by the user of the terminal to the `chat-room` topic. Messages are sent as JSON encoded strings.
- [`consumer.go`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/cloud/go/consumer.go): A consumer that reads all messages from the `chat-room` topic and prints them to the console. You can start as many consumer groups as you like, but each group reads a message only once, which is why the example uses a generated UUID for the group ID. This way, each time you run the application, you see all previous messages.
- [`main.go`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/cloud/go/main.go): The client application that creates the topic, producer, and consumer and implements the chat logic.

## [](#next-steps)Next steps

This is a basic example of a chat room application.
You can improve this application by implementing additional features and components, such as: - A user interface to make it more interactive and user-friendly. - A user registration and login system to authenticate users before they can access the chat room. - Rate limiting and other measures to prevent spamming and abuse in the chat room. ## [](#suggested-reading)Suggested reading For additional resources to help you build stream processing applications that can aggregate, join, and filter your data streams, see: - [Redpanda University](https://university.redpanda.com/) - [Redpanda Blog](https://redpanda.com/blog) - [Resources](https://redpanda.com/resources) --- # Page 3: Build a Chat Room Application with Redpanda Cloud and Java **URL**: https://docs.redpanda.com/redpanda-labs/clients/cloud-java.md --- # Build a Chat Room Application with Redpanda Cloud and Java --- title: Build a Chat Room Application with Redpanda Cloud and Java latest-operator-version: v26.1.2 latest-console-tag: v3.7.1 latest-connect-version: 4.87.0 latest-redpanda-tag: v26.1.3 docname: cloud-java page-component-name: redpanda-labs page-version: master page-component-version: master page-component-title: Labs page-relative-src-path: cloud-java.adoc page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/clients/pages/cloud-java.adoc description: Create a basic chat room application with Redpanda Cloud and Kafka Java clients. page-git-created-date: "2025-05-06" page-git-modified-date: "2025-05-06" --- Create a basic chat room application with Redpanda Cloud and [Kafka Java clients](https://central.sonatype.com/artifact/org.apache.kafka/kafka-clients). This example shows you how to: - Write a client application in Java to produce and consume chat room messages. - Build and run multiple clients to exchange chat messages streamed through Redpanda Cloud. 
![Demo of the application](../_images/chat-room.gif) ## [](#what-is-a-chat-room-application)What is a chat room application? A chat room application is software that enables users to engage in real-time textual communication with one another. These applications typically allow multiple users to join a chat room, where they can send messages and interact with others in a group conversation. Chat room applications often include features such as private messaging, user profiles, and notifications. Some popular chat room applications include Slack, Discord, and WhatsApp. ## [](#why-use-redpanda)Why use Redpanda? Redpanda offers several features that make it ideal for building a fast, scalable, and robust chat room application. - Scalability: Redpanda can scale horizontally and vertically to accommodate growing chat room usage over time. - Low-latency: Redpanda is designed for minimal latency to provide a smooth user experience and fast message delivery. - Fault tolerance: Redpanda is resilient to failures, thanks to its built-in replication and partitioning capabilities. This built-in resilience ensures that the chat room application continues to serve users even if individual brokers in the cluster experience downtime. - Durability: Redpanda persists messages on disk, maintaining chat history and allowing users to read previous conversations. ## [](#prerequisites)Prerequisites - Complete the [Redpanda Cloud Quickstart](https://docs.redpanda.com/current/get-started/quick-start-cloud/) before continuing. This example expands on the quickstart. - Install the following: - Java 11 or 17 (OpenJDK is recommended) - Maven ### Windows/Linux You can download OpenJDK from [Adoptium](https://adoptium.net/temurin/releases), and can follow the installation instructions for Maven on the [official Maven website](https://maven.apache.org/install.html). 
### macOS

Mac users with Homebrew installed can run the following command to install these dependencies:

```bash
brew install openjdk@11 maven
```

Make sure to follow any symlinking instructions in the Caveats output.

When the prerequisites are installed, the following commands should print the versions of Java and Maven:

```bash
java --version
mvn --version
```

> 📝 **NOTE**
>
> Redpanda Cloud uses TLS certificates signed by [Let’s Encrypt](https://letsencrypt.org/). Most programming languages will load their root certificate authority (CA), `ISRG Root X1`, by default so you shouldn’t need to provide a custom CA certificate.

## [](#run-the-lab)Run the lab

Compile the client chat application, run it from multiple client terminals, and chat between the clients.

1. Clone this repository:

   ```bash
   git clone https://github.com/redpanda-data/redpanda-labs.git
   ```

2. Change into the example directory inside the cloned repository:

   ```bash
   cd redpanda-labs/clients/chat-room/cloud/java
   ```

3. Open the Java files and replace the placeholders wrapped in angle brackets (`<>`) with the same values that you used in the Redpanda Cloud Quickstart.

4. Install the dependencies by building the project:

   ```bash
   mvn package
   ```

   The output is verbose, but you should see a successful build message:

       [INFO] BUILD SUCCESS

5. Verify that the `chat-room` topic exists in your cluster by listing all topics:

   ```bash
   rpk topic list --tls-enabled
   ```

   Output:

       NAME       PARTITIONS  REPLICAS
       chat-room  1           1

6. If the topic doesn’t exist yet, use [`rpk`](https://docs.redpanda.com/current/get-started/rpk/) to create a `chat-room` topic:

   ```bash
   rpk topic create chat-room --tls-enabled
   ```

   Output:

       TOPIC      STATUS
       chat-room  OK

7. From `chat-room/cloud/java`, compile the client application:

   ```bash
   mvn compile
   ```

8. Open at least two terminals, and for each terminal:

   1. Run the client application:

      ```bash
      mvn exec:java -Dexec.mainClass="com.example.Main"
      ```

   2. When prompted with `Enter user name:`, enter a unique name for the chat room.

9.
Use the chat application: enter a message in a terminal, and verify that the message is received in the other terminals. For example: Enter user name: Alice Connected, press Ctrl+C to exit Alice: Hi, I'm Alice Bob: Hi Alice, I'm Bob, nice to meet you ## [](#files-in-the-example)Files in the example This example includes the following files: - [`src/main/java/com/example/Admin.java`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/cloud/java/src/main/java/com/example/Admin.java): Checks whether the `chat-room` topic exists and creates it if not. - [`src/main/java/com/example/ChatProducer.java`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/cloud/java/src/main/java/com/example/ChatProducer.java): A producer that sends strings entered by the user of the terminal to the `chat-room` topic. Messages are sent as JSON encoded strings. - [`src/main/java/com/example/ChatConsumer.java`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/cloud/java/src/main/java/com/example/ChatConsumer.java): A consumer that reads all messages from the `chat-room` topic and prints them to the console. You can start as many consumer groups as you like, but each group reads a message only once, which is why the example is using a generated UUID for the group ID. This way, each time you run the application, you see all previous messages. - [`src/main/java/com/example/Main.java`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/cloud/java/src/main/java/com/example/Main.java): The client application that creates the topic, producer, and consumer and implements the chat logic. ## [](#next-steps)Next steps This is a basic example of a chat room application. You can improve this application by implementing additional features and components, such as: - A user interface to make it more interactive and user-friendly. 
- A user registration and login system to authenticate users before they can access the chat room. - Rate limiting and other measures to prevent spamming and abuse in the chat room. ## [](#suggested-reading)Suggested reading For additional resources to help you build stream processing applications that can aggregate, join, and filter your data streams, see: - [Redpanda University](https://university.redpanda.com/) - [Redpanda Blog](https://redpanda.com/blog) - [Resources](https://redpanda.com/resources) --- # Page 4: Build a Chat Room Application with Redpanda Cloud and Node.js **URL**: https://docs.redpanda.com/redpanda-labs/clients/cloud-nodejs.md --- # Build a Chat Room Application with Redpanda Cloud and Node.js --- title: Build a Chat Room Application with Redpanda Cloud and Node.js latest-operator-version: v26.1.2 latest-console-tag: v3.7.1 latest-connect-version: 4.87.0 latest-redpanda-tag: v26.1.3 docname: cloud-nodejs page-component-name: redpanda-labs page-version: master page-component-version: master page-component-title: Labs page-relative-src-path: cloud-nodejs.adoc page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/clients/pages/cloud-nodejs.adoc description: Create a basic chat room application with Redpanda Cloud and Kafka clients developed with kafkajs. page-git-created-date: "2025-05-06" page-git-modified-date: "2025-05-06" --- Create a basic chat room application with Redpanda Cloud and Kafka clients developed with [kafkajs](https://kafka.js.org/). This tutorial describes how to: - Start a Redpanda cluster to store and stream chat room messages. - Write a client application in TypeScript to produce and consume chat room messages. - Build and run multiple clients to exchange chat messages streamed through Redpanda Cloud. ![Demo of the application](../_images/chat-room.gif) ## [](#what-is-a-chat-room-application)What is a chat room application? 
A chat room application is software that enables users to engage in real-time textual communication with one another. These applications typically allow multiple users to join a chat room, where they can send messages and interact with others in a group conversation. Chat room applications often include features such as private messaging, user profiles, and notifications. Some popular chat room applications include Slack, Discord, and WhatsApp. ## [](#why-use-redpanda)Why use Redpanda? Redpanda offers several features that make it ideal for building a fast, scalable, and robust chat room application. - Scalability: Redpanda can scale horizontally and vertically to accommodate growing chat room usage over time. - Low-latency: Redpanda is designed for minimal latency to provide a smooth user experience and fast message delivery. - Fault tolerance: Redpanda is resilient to failures, thanks to its built-in replication and partitioning capabilities. This built-in resilience ensures that the chat room application continues to serve users even if individual brokers in the cluster experience downtime. - Durability: Redpanda persists messages on disk, maintaining chat history and allowing users to read previous conversations. ## [](#prerequisites)Prerequisites - [Install Node.js for your platform](https://nodejs.org/en/download/package-manager/). - Complete the [Redpanda Cloud Quickstart](https://docs.redpanda.com/current/get-started/quick-start-cloud/) before continuing. This tutorial expands on the quickstart. > 📝 **NOTE** > > Redpanda Cloud uses TLS certificates signed by [Let’s Encrypt](https://letsencrypt.org/). Most programming languages will load their root certificate authority (CA), `ISRG Root X1`, by default so you shouldn’t need to provide a custom CA certificate. ## [](#run-the-lab)Run the lab Build the client chat application, run it from multiple client terminals, and chat between the clients. 1. 
   Clone this repository:

   ```bash
   git clone https://github.com/redpanda-data/redpanda-labs.git
   ```

2. Change into the example directory inside the cloned repository:

   ```bash
   cd redpanda-labs/clients/chat-room/cloud/nodejs
   ```

3. Open the TypeScript files in the `src/` directory (`admin.ts`, `producer.ts`, `consumer.ts`, and `index.ts`) and update the Redpanda connection information:

   - Replace the placeholders wrapped in angle brackets (`<>`) with your Redpanda Cloud connection details (such as `bootstrap-server-address`, `username`, and `password`). You can find these values in your Redpanda Cloud console.

   Make sure to save your changes before running the application.

4. From the `clients/chat-room/cloud/nodejs` directory, install the required dependencies:

   ```bash
   npm i
   ```

5. From the `clients/chat-room/cloud/nodejs` directory, build the client application:

   ```bash
   node_modules/typescript/bin/tsc src/index.ts
   ```

6. Verify that the `chat-room` topic exists in your cluster by listing all topics:

   ```bash
   rpk topic list --tls-enabled
   ```

   Output:

       NAME       PARTITIONS  REPLICAS
       chat-room  1           1

7. If the topic doesn’t exist yet, use [`rpk`](https://docs.redpanda.com/current/get-started/rpk/) to create a `chat-room` topic:

   ```bash
   rpk topic create chat-room --tls-enabled
   ```

   Output:

       TOPIC      STATUS
       chat-room  OK

8. Open at least two terminals, and for each terminal:

   1. Run the client application:

      ```bash
      node src/index.js
      ```

   2. When prompted with `Enter user name:`, enter a unique name for the chat room.

9. Use the chat application: enter a message in a terminal, and verify that the message is received in the other terminals.
   For example:

       Enter user name: Alice
       Connected, press Ctrl+C to exit
       Alice: Hi, I'm Alice
       Bob: Hi Alice, I'm Bob, nice to meet you

## [](#files-in-the-example)Files in the example

This example includes the following files:

- [`src/admin.ts`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/cloud/nodejs/src/admin.ts): Checks whether the `chat-room` topic exists and creates it if not.

  > 📝 **NOTE**
  >
  > The broker settings in this code are from the Redpanda Quickstart, where the external port for broker `redpanda` is set to port 19092.

- [`src/producer.ts`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/cloud/nodejs/src/producer.ts): A producer that sends strings entered by the user of the terminal to the `chat-room` topic. Messages are sent as JSON encoded strings.
- [`src/consumer.ts`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/cloud/nodejs/src/consumer.ts): A consumer that reads all messages from the `chat-room` topic and prints them to the console. You can start as many consumer groups as you like, but each group reads a message only once, which is why the example uses a generated UUID for the group ID. This way, each time you run the application, you see all previous messages.

  > 📝 **NOTE**
  >
  > Because the `eachMessage()` function automatically commits on a heartbeat interval, there is no `commit()` method or auto-commit configuration in the code.

- [`src/index.ts`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/cloud/nodejs/src/index.ts): The client application that creates the topic, producer, and consumer and implements the chat logic.

## [](#next-steps)Next steps

This is a basic example of a chat room application. You can improve this application by implementing additional features and components, such as:

- A user interface to make it more interactive and user-friendly.
- A user registration and login system to authenticate users before they can access the chat room. - Rate limiting and other measures to prevent spamming and abuse in the chat room. ## [](#suggested-reading)Suggested reading For additional resources to help you build stream processing applications that can aggregate, join, and filter your data streams, see: - [Redpanda University](https://university.redpanda.com/) - [Redpanda Blog](https://redpanda.com/blog) - [Resources](https://redpanda.com/resources) --- # Page 5: Build a Chat Room Application with Redpanda Cloud and Python **URL**: https://docs.redpanda.com/redpanda-labs/clients/cloud-python.md --- # Build a Chat Room Application with Redpanda Cloud and Python --- title: Build a Chat Room Application with Redpanda Cloud and Python latest-operator-version: v26.1.2 latest-console-tag: v3.7.1 latest-connect-version: 4.87.0 latest-redpanda-tag: v26.1.3 docname: cloud-python page-component-name: redpanda-labs page-version: master page-component-version: master page-component-title: Labs page-relative-src-path: cloud-python.adoc page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/clients/pages/cloud-python.adoc description: Create a basic chat room application with Redpanda Cloud and Kafka clients developed with kafka-python. page-git-created-date: "2025-05-06" page-git-modified-date: "2025-05-06" --- Create a basic chat room application with Redpanda Cloud and Kafka clients developed with [kafka-python-ng](https://kafka-python.readthedocs.io/en/master/). This example shows you how to: - Write a client application in Python to produce and consume chat room messages. - Build and run multiple clients to exchange chat messages streamed through Redpanda Cloud. ![Demo of the application](../_images/chat-room.gif) ## [](#what-is-a-chat-room-application)What is a chat room application? 
A chat room application is software that enables users to engage in real-time textual communication with one another. These applications typically allow multiple users to join a chat room, where they can send messages and interact with others in a group conversation. Chat room applications often include features such as private messaging, user profiles, and notifications. Some popular chat room applications include Slack, Discord, and WhatsApp. ## [](#why-use-redpanda)Why use Redpanda? Redpanda offers several features that make it ideal for building a fast, scalable, and robust chat room application. - Scalability: Redpanda can scale horizontally and vertically to accommodate growing chat room usage over time. - Low-latency: Redpanda is designed for minimal latency to provide a smooth user experience and fast message delivery. - Fault tolerance: Redpanda is resilient to failures, thanks to its built-in replication and partitioning capabilities. This built-in resilience ensures that the chat room application continues to serve users even if individual brokers in the cluster experience downtime. - Durability: Redpanda persists messages on disk, maintaining chat history and allowing users to read previous conversations. ## [](#prerequisites)Prerequisites - Download and install Python 3 from [python.org](https://www.python.org/downloads). - Complete the [Redpanda Cloud Quickstart](https://docs.redpanda.com/current/get-started/quick-start-cloud/) before continuing. This example expands on the quickstart. > 📝 **NOTE** > > Redpanda Cloud uses TLS certificates signed by [Let’s Encrypt](https://letsencrypt.org/). Most programming languages will load their root certificate authority (CA), `ISRG Root X1`, by default so you shouldn’t need to provide a custom CA certificate. ## [](#run-the-lab)Run the lab Build the client chat application, run it from multiple client terminals, and chat between the clients. 1. 
Clone this repository: ```bash git clone https://github.com/redpanda-data/redpanda-labs.git ``` 2. Change into the example directory: ```bash cd clients/chat-room/cloud/python ``` 3. Create a virtual environment: ```bash python3 -m venv .env source .env/bin/activate ``` 4. Install the required dependencies: ```bash pip3 install --upgrade pip pip3 install -r requirements.txt ``` 5. Open the Python files and replace the placeholders wrapped in angle brackets (`<>`) with the same values that you used in the Redpanda Cloud Quickstart. 6. Verify that the `chat-room` topic exists in your cluster by listing all topics: ```bash rpk topic list --tls-enabled ``` Output: NAME PARTITIONS REPLICAS chat-room 1 1 7. If the topic doesn’t exist yet, use [`rpk`](https://docs.redpanda.com/current/get-started/rpk/) to create a `chat-room` topic: ```bash rpk topic create chat-room --tls-enabled ``` Output: TOPIC STATUS chat-room OK 8. Open at least two terminals, and for each terminal: 1. Run the client application: ```none python app.py ``` 2. When prompted with `Enter user name:`, enter a unique name for the chat room. 9. Use the chat application: enter a message in a terminal, and verify that the message is received in the other terminals. For example: Enter user name: Alice Connected, press Ctrl+C to exit Alice: Hi, I'm Alice Bob: Hi Alice, I'm Bob, nice to meet you ## [](#files-in-the-example)Files in the example This example includes the following files: - [`admin.py`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/cloud/python/admin.py): Checks whether the `chat-room` topic exists and creates it if not. - [`producer.py`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/cloud/python/producer.py): A producer that sends strings entered by the user of the terminal to the `chat-room` topic. Messages are sent as JSON encoded strings.
- [`consumer.py`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/cloud/python/consumer.py): A consumer that reads all messages from the `chat-room` topic and prints them to the console. You can start as many consumer groups as you like, but each group reads a message only once, which is why the example is using a generated UUID for the group ID. This way, each time you run the application, you see all previous messages. - [`app.py`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/cloud/python/app.py): The client application that creates the topic, producer, and consumer and implements the chat logic. ## [](#next-steps)Next steps This is a basic example of a chat room application. You can improve this application by implementing additional features and components, such as: - A user interface to make it more interactive and user-friendly. - A user registration and login system to authenticate users before they can access the chat room. - Rate limiting and other measures to prevent spamming and abuse in the chat room. 
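
The message format and consumer group behavior described in the file list above can be illustrated with a minimal, standard-library-only sketch. It is not the code from the repository: the field names `user` and `message` and the helper names `encode_message`, `decode_message`, and `new_group_id` are hypothetical and chosen only for illustration.

```python
import json
import uuid


def encode_message(user: str, text: str) -> bytes:
    """Serialize a chat message as a UTF-8 encoded JSON string.

    The keys "user" and "message" are assumptions made for this sketch.
    """
    return json.dumps({"user": user, "message": text}).encode("utf-8")


def decode_message(raw: bytes) -> dict:
    """Parse the bytes read from the topic back into a dictionary."""
    return json.loads(raw.decode("utf-8"))


def new_group_id() -> str:
    """Return a fresh consumer group ID.

    A group ID that has never been used has no committed offsets, so a
    consumer configured to start from the earliest offset reads the
    topic from the beginning on every run.
    """
    return str(uuid.uuid4())


if __name__ == "__main__":
    raw = encode_message("Alice", "Hi, I'm Alice")
    msg = decode_message(raw)
    print(f"{msg['user']}: {msg['message']}")
    print("group id:", new_group_id())
```

With kafka-python, helpers like these would typically be wired in through the `value_serializer` argument of `KafkaProducer` and the `group_id` argument of `KafkaConsumer`, together with `auto_offset_reset="earliest"` so that a new group starts from the oldest retained message. Whether the lab's code sets these options in exactly this way is an assumption.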
## [](#suggested-reading)Suggested reading For additional resources to help you build stream processing applications that can aggregate, join, and filter your data streams, see: - [Redpanda University](https://university.redpanda.com/) - [Redpanda Blog](https://redpanda.com/blog) - [Resources](https://redpanda.com/resources) --- # Page 6: Build a Chat Room Application with Redpanda Cloud and Rust **URL**: https://docs.redpanda.com/redpanda-labs/clients/cloud-rust.md --- # Build a Chat Room Application with Redpanda Cloud and Rust --- title: Build a Chat Room Application with Redpanda Cloud and Rust latest-operator-version: v26.1.2 latest-console-tag: v3.7.1 latest-connect-version: 4.87.0 latest-redpanda-tag: v26.1.3 docname: cloud-rust page-component-name: redpanda-labs page-version: master page-component-version: master page-component-title: Labs page-relative-src-path: cloud-rust.adoc page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/clients/pages/cloud-rust.adoc description: Create a basic chat room application with Redpanda Cloud and Kafka clients developed with rust-rdkafka. page-git-created-date: "2025-05-06" page-git-modified-date: "2025-05-06" --- Create a basic chat room application with Redpanda Cloud and Kafka clients developed with [rust-rdkafka](https://github.com/fede1024/rust-rdkafka). This example shows you how to: - Write a client application in Rust to produce and consume chat room messages. - Build and run multiple clients to exchange chat messages streamed through Redpanda Cloud. ![Demo of the application](../_images/chat-room.gif) ## [](#what-is-a-chat-room-application)What is a chat room application? A chat room application is software that enables users to engage in real-time textual communication with one another. These applications typically allow multiple users to join a chat room, where they can send messages and interact with others in a group conversation.
Chat room applications often include features such as private messaging, user profiles, and notifications. Some popular chat room applications include Slack, Discord, and WhatsApp. ## [](#why-use-redpanda)Why use Redpanda? Redpanda offers several features that make it ideal for building a fast, scalable, and robust chat room application. - Scalability: Redpanda can scale horizontally and vertically to accommodate growing chat room usage over time. - Low-latency: Redpanda is designed for minimal latency to provide a smooth user experience and fast message delivery. - Fault tolerance: Redpanda is resilient to failures, thanks to its built-in replication and partitioning capabilities. This built-in resilience ensures that the chat room application continues to serve users even if individual brokers in the cluster experience downtime. - Durability: Redpanda persists messages on disk, maintaining chat history and allowing users to read previous conversations. ## [](#prerequisites)Prerequisites - Download and install Rust from [rust-lang.org](https://rust-lang.org/tools/install). - Complete the [Redpanda Cloud Quickstart](https://docs.redpanda.com/current/get-started/quick-start-cloud/) before continuing. This example expands on the quickstart. > 📝 **NOTE** > > Redpanda Cloud uses TLS certificates signed by [Let’s Encrypt](https://letsencrypt.org/). Most programming languages will load their root certificate authority (CA), `ISRG Root X1`, by default so you shouldn’t need to provide a custom CA certificate. ## [](#run-the-lab)Run the lab Build the client chat application, run it from multiple client terminals, and chat between the clients. 1. Clone this repository: ```bash git clone https://github.com/redpanda-data/redpanda-labs.git ``` 2. Change into the example directory: ```bash cd clients/chat-room/cloud/rust ``` 3. 
Rename the `.env.example` file to `.env` and replace the placeholders wrapped in angle brackets (`<>`) with the same values that you used in the Redpanda Cloud Quickstart. 4. Verify that the `chat-room` topic exists in your cluster by listing all topics: ```bash rpk topic list --tls-enabled ``` Output: NAME PARTITIONS REPLICAS chat-room 1 1 5. If the topic doesn’t exist yet, use [`rpk`](https://docs.redpanda.com/current/get-started/rpk/) to create a `chat-room` topic: ```bash rpk topic create chat-room --tls-enabled ``` Output: TOPIC STATUS chat-room OK 6. Open at least two terminals, and for each terminal: 1. Run the client application: ```none cargo run ``` 2. When prompted with `Enter user name:`, enter a unique name for the chat room. 7. Use the chat application: enter a message in a terminal, and verify that the message is received in the other terminals. For example: Enter user name: Alice Connected, press Ctrl+C to exit Alice: Hi, I'm Alice Bob: Hi Alice, I'm Bob, nice to meet you ## [](#files-in-the-example)Files in the example This example includes the following files: - [`admin.rs`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/cloud/rust/admin.rs): Checks whether the `chat-room` topic exists and creates it if not. - [`producer.rs`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/cloud/rust/producer.rs): A producer that sends strings entered by the user of the terminal to the `chat-room` topic. Messages are sent as JSON encoded strings. - [`consumer.rs`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/cloud/rust/consumer.rs): A consumer that reads all messages from the `chat-room` topic and prints them to the console. You can start as many consumer groups as you like, but each group reads a message only once, which is why the example is using a generated timestamp appended to the group ID. This way, each time you run the application, you see all previous messages.
- [`main.rs`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/cloud/rust/main.rs): The client application that creates the topic, producer, and consumer and implements the chat logic. ## [](#next-steps)Next steps This is a basic example of a chat room application. You can improve this application by implementing additional features and components, such as: - A user interface to make it more interactive and user-friendly. - A user registration and login system to authenticate users before they can access the chat room. - Rate limiting and other measures to prevent spamming and abuse in the chat room. ## [](#suggested-reading)Suggested reading For additional resources to help you build stream processing applications that can aggregate, join, and filter your data streams, see: - [Redpanda University](https://university.redpanda.com/) - [Redpanda Blog](https://redpanda.com/blog) - [Resources](https://redpanda.com/resources) --- # Page 7: Build a Chat Room Application with Redpanda and Golang **URL**: https://docs.redpanda.com/redpanda-labs/clients/docker-go.md --- # Build a Chat Room Application with Redpanda and Golang --- title: Build a Chat Room Application with Redpanda and Golang latest-operator-version: v26.1.2 latest-console-tag: v3.7.1 latest-connect-version: 4.87.0 latest-redpanda-tag: v26.1.3 docname: docker-go page-component-name: redpanda-labs page-version: master page-component-version: master page-component-title: Labs page-relative-src-path: docker-go.adoc page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/clients/pages/docker-go.adoc description: Create a basic chat room application with Redpanda and Kafka clients developed with franz-go. page-git-created-date: "2025-05-06" page-git-modified-date: "2025-05-06" --- Create a basic chat room application with Redpanda and Kafka clients developed with [franz-go](https://github.com/twmb/franz-go).
This example shows you how to: - Write a client application in Go to produce and consume chat room messages. - Build and run multiple clients to exchange chat messages streamed through Redpanda. ![Demo of the application](../_images/chat-room.gif) ## [](#what-is-a-chat-room-application)What is a chat room application? A chat room application is software that enables users to engage in real-time textual communication with one another. These applications typically allow multiple users to join a chat room, where they can send messages and interact with others in a group conversation. Chat room applications often include features such as private messaging, user profiles, and notifications. Some popular chat room applications include Slack, Discord, and WhatsApp. ## [](#why-use-redpanda)Why use Redpanda? Redpanda offers several features that make it ideal for building a fast, scalable, and robust chat room application. - Scalability: Redpanda can scale horizontally and vertically to accommodate growing chat room usage over time. - Low-latency: Redpanda is designed for minimal latency to provide a smooth user experience and fast message delivery. - Fault tolerance: Redpanda is resilient to failures, thanks to its built-in replication and partitioning capabilities. This built-in resilience ensures that the chat room application continues to serve users even if individual brokers in the cluster experience downtime. - Durability: Redpanda persists messages on disk, maintaining chat history and allowing users to read previous conversations. ## [](#prerequisites)Prerequisites - Download and install Go from [go.dev](https://go.dev/doc/install). - Complete the [Redpanda Quickstart](https://docs.redpanda.com/current/get-started/quick-start/) before continuing. This example expands on the quickstart. You can choose to run either one or three brokers. 
## [](#run-the-lab)Run the lab Build the client chat application, run it from multiple client terminals, and chat between the clients. 1. Clone this repository: ```bash git clone https://github.com/redpanda-data/redpanda-labs.git ``` 2. Change into the example directory: ```bash cd clients/chat-room/docker/go ``` 3. Verify that the `chat-room` topic exists in your cluster by listing all topics: ```bash docker exec -it redpanda-0 rpk topic list ``` Output: NAME PARTITIONS REPLICAS chat-room 1 1 4. If the topic doesn’t exist yet, use [`rpk`](https://docs.redpanda.com/current/get-started/rpk/) to create a `chat-room` topic: ```bash docker exec -it redpanda-0 rpk topic create chat-room ``` Output: TOPIC STATUS chat-room OK 5. Open at least two terminals, and for each terminal: 1. Run the client application: ```bash go run . ``` 2. When prompted with `Enter user name:`, enter a unique name for the chat room. 6. Use the chat application: enter a message in a terminal, and verify that the message is received in the other terminals. For example: Enter user name: Alice Connected, press Ctrl+C to exit Alice: Hi, I'm Alice Bob: Hi Alice, I'm Bob, nice to meet you ## [](#files-in-the-example)Files in the example This example includes the following files: - [`admin.go`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/docker/go/admin.go): Checks whether the `chat-room` topic exists and creates it if not. - [`producer.go`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/docker/go/producer.go): A producer that sends strings entered by the user of the terminal to the `chat-room` topic. Messages are sent as JSON encoded strings. - [`consumer.go`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/docker/go/consumer.go): A consumer that reads all messages from the `chat-room` topic and prints them to the console.
You can start as many consumer groups as you like, but each group reads a message only once, which is why the example is using a generated UUID for the group ID. This way, each time you run the application, you see all previous messages. - [`main.go`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/docker/go/main.go): The client application that creates the topic, producer, and consumer and implements the chat logic. > 📝 **NOTE** > > The broker settings in this code are from the Redpanda Quickstart, where the external port for broker `redpanda` is set to port 19092. ## [](#next-steps)Next steps This is a basic example of a chat room application. You can improve this application by implementing additional features and components, such as: - A user interface to make it more interactive and user-friendly. - A user registration and login system to authenticate users before they can access the chat room. - Rate limiting and other measures to prevent spamming and abuse in the chat room. 
## [](#suggested-reading)Suggested reading For additional resources to help you build stream processing applications that can aggregate, join, and filter your data streams, see: - [Redpanda University](https://university.redpanda.com/) - [Redpanda Blog](https://redpanda.com/blog) - [Resources](https://redpanda.com/resources) --- # Page 8: Build a Chat Room Application with Redpanda and Java **URL**: https://docs.redpanda.com/redpanda-labs/clients/docker-java.md --- # Build a Chat Room Application with Redpanda and Java --- title: Build a Chat Room Application with Redpanda and Java latest-operator-version: v26.1.2 latest-console-tag: v3.7.1 latest-connect-version: 4.87.0 latest-redpanda-tag: v26.1.3 docname: docker-java page-component-name: redpanda-labs page-version: master page-component-version: master page-component-title: Labs page-relative-src-path: docker-java.adoc page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/clients/pages/docker-java.adoc description: Create a basic chat room application with Redpanda and Kafka Java clients. page-git-created-date: "2025-05-06" page-git-modified-date: "2025-05-06" --- Create a basic chat room application with Redpanda and [Kafka Java clients](https://central.sonatype.com/artifact/org.apache.kafka/kafka-clients). This example shows you how to: - Write a client application in Java to produce and consume chat room messages. - Build and run multiple clients to exchange chat messages streamed through Redpanda. ![Demo of the application](../_images/chat-room.gif) ## [](#what-is-a-chat-room-application)What is a chat room application? A chat room application is software that enables users to engage in real-time textual communication with one another. These applications typically allow multiple users to join a chat room, where they can send messages and interact with others in a group conversation. 
Chat room applications often include features such as private messaging, user profiles, and notifications. Some popular chat room applications include Slack, Discord, and WhatsApp. ## [](#why-use-redpanda)Why use Redpanda? Redpanda offers several features that make it ideal for building a fast, scalable, and robust chat room application. - Scalability: Redpanda can scale horizontally and vertically to accommodate growing chat room usage over time. - Low-latency: Redpanda is designed for minimal latency to provide a smooth user experience and fast message delivery. - Fault tolerance: Redpanda is resilient to failures, thanks to its built-in replication and partitioning capabilities. This built-in resilience ensures that the chat room application continues to serve users even if individual brokers in the cluster experience downtime. - Durability: Redpanda persists messages on disk, maintaining chat history and allowing users to read previous conversations. ## [](#prerequisites)Prerequisites - Complete the [Redpanda Quickstart](https://docs.redpanda.com/current/get-started/quick-start/) before continuing. This example expands on the quickstart. You can choose to run either one or three brokers. - Install the following: - Java 11 or 17 (OpenJDK is recommended) - Maven ### Windows/Linux You can download OpenJDK from [Adoptium](https://adoptium.net/temurin/releases), and can follow the installation instructions for Maven on the [official Maven website](https://maven.apache.org/install.html). ### macOS Mac users with Homebrew installed can run the following commands to install these dependencies: ```bash brew install openjdk@11 maven ``` Make sure to follow any symlinking instructions in the Caveats output. 
When the prerequisites are installed, the following commands should print the version of both Java and Maven: ```bash java --version mvn --version ``` ## [](#run-the-lab)Run the lab Compile the client chat application, run it from multiple client terminals, and chat between the clients. 1. Clone this repository: ```bash git clone https://github.com/redpanda-data/redpanda-labs.git ``` 2. Change into the example directory: ```bash cd clients/chat-room/docker/java ``` 3. Install the dependencies by building the project: ```bash mvn package ``` The output is verbose, but you should see a successful build message: \[INFO\] BUILD SUCCESS 4. Verify that the `chat-room` topic exists in your cluster by listing all topics: ```bash docker exec -it redpanda-0 rpk topic list ``` Output: NAME PARTITIONS REPLICAS chat-room 1 1 5. If the topic doesn’t exist yet, use [`rpk`](https://docs.redpanda.com/current/get-started/rpk/) to create a `chat-room` topic: ```bash docker exec -it redpanda-0 rpk topic create chat-room ``` Output: TOPIC STATUS chat-room OK 6. From `chat-room/docker/java`, compile the client application: ```bash mvn compile ``` 7. Open at least two terminals, and for each terminal: 1. Run the client application: ```bash mvn exec:java -Dexec.mainClass="com.example.Main" ``` 2. When prompted with `Enter user name:`, enter a unique name for the chat room. 8. Use the chat application: enter a message in a terminal, and verify that the message is received in the other terminals. For example: Enter user name: Alice Connected, press Ctrl+C to exit Alice: Hi, I'm Alice Bob: Hi Alice, I'm Bob, nice to meet you ## [](#files-in-the-example)Files in the example This example includes the following files: - [`src/main/java/com/example/Admin.java`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/docker/java/src/main/java/com/example/Admin.java): Checks whether the `chat-room` topic exists and creates it if not. 
- [`src/main/java/com/example/ChatProducer.java`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/docker/java/src/main/java/com/example/ChatProducer.java): A producer that sends strings entered by the user of the terminal to the `chat-room` topic. Messages are sent as JSON encoded strings. - [`src/main/java/com/example/ChatConsumer.java`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/docker/java/src/main/java/com/example/ChatConsumer.java): A consumer that reads all messages from the `chat-room` topic and prints them to the console. You can start as many consumer groups as you like, but each group reads a message only once, which is why the example is using a generated UUID for the group ID. This way, each time you run the application, you see all previous messages. - [`src/main/java/com/example/Main.java`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/docker/java/src/main/java/com/example/Main.java): The client application that creates the topic, producer, and consumer and implements the chat logic. ## [](#next-steps)Next steps This is a basic example of a chat room application. You can improve this application by implementing additional features and components, such as: - A user interface to make it more interactive and user-friendly. - A user registration and login system to authenticate users before they can access the chat room. - Rate limiting and other measures to prevent spamming and abuse in the chat room. 
## [](#suggested-reading)Suggested reading For additional resources to help you build stream processing applications that can aggregate, join, and filter your data streams, see: - [Redpanda University](https://university.redpanda.com/) - [Redpanda Blog](https://redpanda.com/blog) - [Resources](https://redpanda.com/resources) --- # Page 9: Build a Chat Room Application with Redpanda and Node.js **URL**: https://docs.redpanda.com/redpanda-labs/clients/docker-nodejs.md --- # Build a Chat Room Application with Redpanda and Node.js --- title: Build a Chat Room Application with Redpanda and Node.js latest-operator-version: v26.1.2 latest-console-tag: v3.7.1 latest-connect-version: 4.87.0 latest-redpanda-tag: v26.1.3 docname: docker-nodejs page-component-name: redpanda-labs page-version: master page-component-version: master page-component-title: Labs page-relative-src-path: docker-nodejs.adoc page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/clients/pages/docker-nodejs.adoc description: Create a basic chat room application with Redpanda and Kafka clients developed with kafkajs. page-git-created-date: "2025-05-06" page-git-modified-date: "2025-05-06" --- Create a basic chat room application with Redpanda and Kafka clients developed with [kafkajs](https://kafka.js.org/). This example shows you how to: - Write a client application in TypeScript to produce and consume chat room messages. - Build and run multiple clients to exchange chat messages streamed through Redpanda. ![Demo of the application](../_images/chat-room.gif) ## [](#what-is-a-chat-room-application)What is a chat room application? A chat room application is software that enables users to engage in real-time textual communication with one another. These applications typically allow multiple users to join a chat room, where they can send messages and interact with others in a group conversation. 
Chat room applications often include features such as private messaging, user profiles, and notifications. Some popular chat room applications include Slack, Discord, and WhatsApp. ## [](#why-use-redpanda)Why use Redpanda? Redpanda offers several features that make it ideal for building a fast, scalable, and robust chat room application. - Scalability: Redpanda can scale horizontally and vertically to accommodate growing chat room usage over time. - Low-latency: Redpanda is designed for minimal latency to provide a smooth user experience and fast message delivery. - Fault tolerance: Redpanda is resilient to failures, thanks to its built-in replication and partitioning capabilities. This built-in resilience ensures that the chat room application continues to serve users even if individual brokers in the cluster experience downtime. - Durability: Redpanda persists messages on disk, maintaining chat history and allowing users to read previous conversations. ## [](#prerequisites)Prerequisites - [Install Node.js for your platform](https://nodejs.org/en/download/package-manager/). - Complete the [Redpanda Quickstart](https://docs.redpanda.com/current/get-started/quick-start/). This example expands on the quickstart. You can choose to run either one or three brokers. ## [](#run-the-lab)Run the lab Build the client chat application, run it from multiple client terminals, and chat between the clients. 1. Clone this repository: ```bash git clone https://github.com/redpanda-data/redpanda-labs.git ``` 2. Change into the example directory: ```bash cd clients/chat-room/docker/nodejs ``` 3. Install the required dependencies: ```bash npm i ``` 4. Verify that the `chat-room` topic exists in your cluster by listing all topics: ```bash docker exec -it redpanda-0 rpk topic list ``` Output: NAME PARTITIONS REPLICAS chat-room 1 1 5. 
If the topic doesn’t exist yet, use [`rpk`](https://docs.redpanda.com/current/get-started/rpk/) to create a `chat-room` topic: ```bash docker exec -it redpanda-0 rpk topic create chat-room ``` Output: TOPIC STATUS chat-room OK 6. Open at least two terminals, and for each terminal: 1. Run the client application: ```bash node src/index.js ``` 2. When prompted with `Enter user name:`, enter a unique name for the chat room. 7. Use the chat application: enter a message in a terminal, and verify that the message is received in the other terminals. For example: Enter user name: Alice Connected, press Ctrl+C to exit Alice: Hi, I'm Alice Bob: Hi Alice, I'm Bob, nice to meet you ## [](#files-in-the-example)Files in the example This example includes the following files: - [`src/admin.ts`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/docker/nodejs/src/admin.ts): Checks whether the `chat-room` topic exists and creates it if not. > 📝 **NOTE** > > The broker settings in this code are from the Redpanda Quickstart, where the external port for broker `redpanda` is set to port 19092. - [`src/producer.ts`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/docker/nodejs/src/producer.ts): A producer that sends strings entered by the user of the terminal to the `chat-room` topic. Messages are sent as JSON encoded strings. - [`src/consumer.ts`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/docker/nodejs/src/consumer.ts): A consumer that reads all messages from the `chat-room` topic and prints them to the console. You can start as many consumer groups as you like, but each group reads a message only once, which is why the example is using a generated UUID for the group ID. This way, each time you run the application, you see all previous messages.
> 📝 **NOTE** > > Because the `eachMessage()` function automatically commits on a heartbeat interval, there is no `commit()` method or auto-commit configuration in the code. - [`src/index.ts`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/docker/nodejs/src/index.ts): The client application that creates the topic, producer, and consumer and implements the chat logic. ## [](#next-steps)Next steps This is a basic example of a chat room application. You can improve this application by implementing additional features and components, such as: - A user interface to make it more interactive and user-friendly. - A user registration and login system to authenticate users before they can access the chat room. - Rate limiting and other measures to prevent spamming and abuse in the chat room. ## [](#suggested-reading)Suggested reading For additional resources to help you build stream processing applications that can aggregate, join, and filter your data streams, see: - [Redpanda University](https://university.redpanda.com/) - [Redpanda Blog](https://redpanda.com/blog) - [Resources](https://redpanda.com/resources) --- # Page 10: Build a Chat Room Application with Redpanda and Python **URL**: https://docs.redpanda.com/redpanda-labs/clients/docker-python.md --- # Build a Chat Room Application with Redpanda and Python --- title: Build a Chat Room Application with Redpanda and Python latest-operator-version: v26.1.2 latest-console-tag: v3.7.1 latest-connect-version: 4.87.0 latest-redpanda-tag: v26.1.3 docname: docker-python page-component-name: redpanda-labs page-version: master page-component-version: master page-component-title: Labs page-relative-src-path: docker-python.adoc page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/clients/pages/docker-python.adoc description: Create a basic chat room application with Redpanda and Kafka clients developed with kafka-python. 
page-git-created-date: "2025-05-06" page-git-modified-date: "2025-05-06" --- Create a basic chat room application with Redpanda and Kafka clients developed with [kafka-python-ng](https://kafka-python.readthedocs.io/en/master/). This example shows you how to: - Write a client application in Python to produce and consume chat room messages. - Build and run multiple clients to exchange chat messages streamed through Redpanda. ![Demo of the application](../_images/chat-room.gif) ## [](#what-is-a-chat-room-application)What is a chat room application? A chat room application is software that enables users to engage in real-time textual communication with one another. These applications typically allow multiple users to join a chat room, where they can send messages and interact with others in a group conversation. Chat room applications often include features such as private messaging, user profiles, and notifications. Some popular chat room applications include Slack, Discord, and WhatsApp. ## [](#why-use-redpanda)Why use Redpanda? Redpanda offers several features that make it ideal for building a fast, scalable, and robust chat room application. - Scalability: Redpanda can scale horizontally and vertically to accommodate growing chat room usage over time. - Low-latency: Redpanda is designed for minimal latency to provide a smooth user experience and fast message delivery. - Fault tolerance: Redpanda is resilient to failures, thanks to its built-in replication and partitioning capabilities. This built-in resilience ensures that the chat room application continues to serve users even if individual brokers in the cluster experience downtime. - Durability: Redpanda persists messages on disk, maintaining chat history and allowing users to read previous conversations. ## [](#prerequisites)Prerequisites - Download and install Python 3 from [python.org](https://www.python.org/downloads). 
- Complete the [Redpanda Quickstart](https://docs.redpanda.com/current/get-started/quick-start/). This example expands on the quickstart. You can choose to run either one or three brokers. ## [](#run-the-lab)Run the lab Build the client chat application, run it from multiple client terminals, and chat between the clients. 1. Clone this repository: ```bash git clone https://github.com/redpanda-data/redpanda-labs.git ``` 2. Change into the example directory: ```bash cd redpanda-labs/clients/chat-room/docker/python ``` 3. Create a virtual environment: ```bash python3 -m venv .env source .env/bin/activate ``` 4. Install the required dependencies: ```bash pip3 install --upgrade pip pip3 install -r requirements.txt ``` 5. Verify that the `chat-room` topic exists in your cluster by listing all topics: ```bash docker exec -it redpanda-0 rpk topic list ``` Output: NAME PARTITIONS REPLICAS chat-room 1 1 6. If the topic doesn’t exist yet, use [`rpk`](https://docs.redpanda.com/current/get-started/rpk/) to create a `chat-room` topic: ```bash docker exec -it redpanda-0 rpk topic create chat-room ``` Output: TOPIC STATUS chat-room OK 7. Open at least two terminals, and for each terminal: 1. Run the client application: ```bash python app.py ``` 2. When prompted with `Enter user name:`, enter a unique name for the chat room. 8. Use the chat application: enter a message in a terminal, and verify that the message is received in the other terminals. For example: Enter user name: Alice Connected, press Ctrl+C to exit Alice: Hi, I'm Alice Bob: Hi Alice, I'm Bob, nice to meet you ## [](#files-in-the-example)Files in the example This example includes the following files: - [`admin.py`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/docker/python/admin.py): Checks whether the `chat-room` topic exists and creates it if not. 
- [`producer.py`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/docker/python/producer.py): A producer that sends strings entered by the user of the terminal to the `chat-room` topic. Messages are sent as JSON encoded strings. - [`consumer.py`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/docker/python/consumer.py): A consumer that reads all messages from the `chat-room` topic and prints them to the console. You can start as many consumer groups as you like, but each group reads a message only once, which is why the example is using a generated UUID for the group ID. This way, each time you run the application, you see all previous messages. - [`app.py`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/docker/python/app.py): The client application that creates the topic, producer, and consumer and implements the chat logic. > 📝 **NOTE** > > The broker settings in this code are from the Redpanda Quickstart, where the external port for broker `redpanda` is set to port 19092. ## [](#next-steps)Next steps This is a basic example of a chat room application. You can improve this application by implementing additional features and components, such as: - A user interface to make it more interactive and user-friendly. - A user registration and login system to authenticate users before they can access the chat room. - Rate limiting and other measures to prevent spamming and abuse in the chat room. 
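
The producer and consumer behavior described under "Files in the example" can be sketched as follows. This is a minimal illustration, not the lab's actual code: the `user` and `message` field names and all three helper names are assumptions made for this sketch. The settings dictionary uses keyword arguments that `KafkaConsumer` in kafka-python accepts (`bootstrap_servers`, `group_id`, `auto_offset_reset`). Whether the lab sets `auto_offset_reset` to `earliest` is also an assumption; it is shown here because a brand-new group has no committed offsets, and starting from the earliest offset is what lets it replay the chat history.

```python
import json
import uuid


def encode_chat_message(user: str, text: str) -> bytes:
    """Serialize a chat message as a JSON-encoded UTF-8 string (hypothetical schema)."""
    return json.dumps({"user": user, "message": text}).encode("utf-8")


def decode_chat_message(raw: bytes) -> str:
    """Turn a raw record value back into the 'Alice: Hi' form shown in the chat output."""
    data = json.loads(raw.decode("utf-8"))
    return f"{data['user']}: {data['message']}"


def consumer_settings(brokers: str = "localhost:19092") -> dict:
    """Build consumer settings with a fresh group ID for every run."""
    return {
        "bootstrap_servers": brokers,
        # A new UUID means a new consumer group with no committed offsets.
        "group_id": str(uuid.uuid4()),
        # Assumption: start from the oldest retained message when no offset exists.
        "auto_offset_reset": "earliest",
    }


if __name__ == "__main__":
    raw = encode_chat_message("Alice", "Hi, I'm Alice")
    print(decode_chat_message(raw))
    print(consumer_settings())
```

With kafka-python installed, these settings could be passed as `KafkaConsumer("chat-room", **consumer_settings())`. Reusing a fixed group ID instead would make each run resume from the last committed offset, so earlier messages would not be shown again.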
## [](#suggested-reading)Suggested reading For additional resources to help you build stream processing applications that can aggregate, join, and filter your data streams, see: - [Redpanda University](https://university.redpanda.com/) - [Redpanda Blog](https://redpanda.com/blog) - [Resources](https://redpanda.com/resources) --- # Page 11: Build a Chat Room Application with Redpanda and Rust **URL**: https://docs.redpanda.com/redpanda-labs/clients/docker-rust.md --- # Build a Chat Room Application with Redpanda and Rust --- title: Build a Chat Room Application with Redpanda and Rust latest-operator-version: v26.1.2 latest-console-tag: v3.7.1 latest-connect-version: 4.87.0 latest-redpanda-tag: v26.1.3 docname: docker-rust page-component-name: redpanda-labs page-version: master page-component-version: master page-component-title: Labs page-relative-src-path: docker-rust.adoc page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/clients/pages/docker-rust.adoc description: Create a basic chat room application with Redpanda and Kafka clients developed with rust-rdkafka. page-git-created-date: "2025-05-06" page-git-modified-date: "2025-05-06" --- Create a basic chat room application with Redpanda and Kafka clients developed with [rust-rdkafka](https://github.com/fede1024/rust-rdkafka). This example shows you how to: - Write a client application in Rust to produce and consume chat room messages. - Build and run multiple clients to exchange chat messages streamed through Redpanda. ![Demo of the application](../_images/chat-room.gif) ## [](#what-is-a-chat-room-application)What is a chat room application? A chat room application is software that enables users to engage in real-time textual communication with one another. These applications typically allow multiple users to join a chat room, where they can send messages and interact with others in a group conversation. 
Chat room applications often include features such as private messaging, user profiles, and notifications. Some popular chat room applications include Slack, Discord, and WhatsApp. ## [](#why-use-redpanda)Why use Redpanda? Redpanda offers several features that make it ideal for building a fast, scalable, and robust chat room application. - Scalability: Redpanda can scale horizontally and vertically to accommodate growing chat room usage over time. - Low-latency: Redpanda is designed for minimal latency to provide a smooth user experience and fast message delivery. - Fault tolerance: Redpanda is resilient to failures, thanks to its built-in replication and partitioning capabilities. This built-in resilience ensures that the chat room application continues to serve users even if individual brokers in the cluster experience downtime. - Durability: Redpanda persists messages on disk, maintaining chat history and allowing users to read previous conversations. ## [](#prerequisites)Prerequisites - Download and install Rust from [rustup.rs](https://rustup.rs). - Complete the [Redpanda Quickstart](https://docs.redpanda.com/current/get-started/quick-start/) before continuing. This example expands on the quickstart. You can choose to run either one or three brokers. ## [](#run-the-lab)Run the lab Build the client chat application, run it from multiple client terminals, and chat between the clients. 1. Clone this repository: ```bash git clone https://github.com/redpanda-data/redpanda-labs.git ``` 2. Change into the example directory: ```bash cd redpanda-labs/clients/chat-room/docker/rust ``` 3. Verify that the `chat-room` topic exists in your cluster by listing all topics: ```bash docker exec -it redpanda-0 rpk topic list ``` Output: NAME PARTITIONS REPLICAS chat-room 1 1 4. 
If the topic doesn’t exist yet, use [`rpk`](https://docs.redpanda.com/current/get-started/rpk/) to create a `chat-room` topic: ```bash docker exec -it redpanda-0 rpk topic create chat-room ``` Output: TOPIC STATUS chat-room OK 5. Open at least two terminals, and for each terminal: 1. Run the client application: ```bash cargo run ``` 2. When prompted with `Enter user name:`, enter a unique name for the chat room. 6. Use the chat application: enter a message in a terminal, and verify that the message is received in the other terminals. For example: Enter user name: Alice Connected, press Ctrl+C to exit Alice: Hi, I'm Alice Bob: Hi Alice, I'm Bob, nice to meet you ## [](#files-in-the-example)Files in the example This example includes the following files: - [`admin.rs`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/docker/rust/src/admin.rs): Checks whether the `chat-room` topic exists and creates it if not. - [`producer.rs`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/docker/rust/src/producer.rs): A producer that sends strings entered by the user of the terminal to the `chat-room` topic. Messages are sent as JSON-encoded strings. - [`consumer.rs`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/docker/rust/src/consumer.rs): A consumer that reads all messages from the `chat-room` topic and prints them to the console. You can start as many consumer groups as you like, but each group reads a message only once, so the example appends a generated timestamp to the group ID on every run. This way, each time you run the application, you see all previous messages. - [`main.rs`](https://github.com/redpanda-data/redpanda-labs/blob/main/clients/chat-room/docker/rust/src/main.rs): The client application that creates the topic, producer, and consumer and implements the chat logic. 
> 📝 **NOTE** > > The broker settings in this code are from the Redpanda Quickstart, where the external port for broker `redpanda` is set to port 19092. ## [](#next-steps)Next steps This is a basic example of a chat room application. You can improve this application by implementing additional features and components, such as: - A user interface to make it more interactive and user-friendly. - A user registration and login system to authenticate users before they can access the chat room. - Rate limiting and other measures to prevent spamming and abuse in the chat room. ## [](#suggested-reading)Suggested reading For additional resources to help you build stream processing applications that can aggregate, join, and filter your data streams, see: - [Redpanda University](https://university.redpanda.com/) - [Redpanda Blog](https://redpanda.com/blog) - [Resources](https://redpanda.com/resources) --- # Page 12: Stream Stock Market Data from a CSV file Using Node.js **URL**: https://docs.redpanda.com/redpanda-labs/clients/stock-market-activity-nodejs.md --- # Stream Stock Market Data from a CSV file Using Node.js --- title: Stream Stock Market Data from a CSV file Using Node.js latest-operator-version: v26.1.2 latest-console-tag: v3.7.1 latest-connect-version: 4.87.0 latest-redpanda-tag: v26.1.3 docname: stock-market-activity-nodejs page-component-name: redpanda-labs page-version: master page-component-version: master page-component-title: Labs page-relative-src-path: stock-market-activity-nodejs.adoc page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/clients/pages/stock-market-activity-nodejs.adoc description: Stream data from a CSV file into a Redpanda topic. page-git-created-date: "2025-05-06" page-git-modified-date: "2025-05-06" --- This lab demonstrates how to use a Node.js Kafka producer to stream data from a CSV file into a Redpanda topic. 
The script simulates real-time stock market activity by pushing JSON formatted messages into a topic. ```json {"Date":"10/22/2013","Close/Last":"$40.45","Volume":"8347540","Open":"$39.95","High":"$40.54","Low":"$39.80"} ``` This script allows you to loop through data continuously, reverse the order of data for different viewing perspectives, and manipulate date columns for time-series analysis. In this lab, you will: - Run the producer that streams data from a CSV file directly into a Redpanda topic. - Discover methods to alter the data stream, such as reversing the data sequence or looping through the data continuously for persistent simulations. - Adjust date fields dynamically to represent different time frames for analysis. ## [](#prerequisites)Prerequisites Before running the lab, ensure you have the following installed on your host machine: - [Docker and Docker Compose](https://docs.docker.com/compose/install/) - [Node.js](https://nodejs.org/en/download/package-manager/) ## [](#run-the-lab)Run the lab 1. Clone this repository: ```bash git clone https://github.com/redpanda-data/redpanda-labs.git ``` 2. Change into the `clients/stock-market-activity/nodejs/` directory: ```bash cd redpanda-labs/clients/stock-market-activity/nodejs ``` 3. Install the required dependencies: ```bash npm i ``` 4. Set the `REDPANDA_VERSION` environment variable to the version of Redpanda that you want to run. For all available versions, see the [GitHub releases](https://github.com/redpanda-data/redpanda/releases). For example: ```bash export REDPANDA_VERSION=v26.1.3 ``` 5. Set the `REDPANDA_CONSOLE_VERSION` environment variable to the version of Redpanda Console that you want to run. For all available versions, see the [GitHub releases](https://github.com/redpanda-data/redpanda/releases). For example: ```bash export REDPANDA_CONSOLE_VERSION=v3.7.1 ``` 6. Start a local Redpanda cluster: ```bash docker compose -f ../../../docker-compose/single-broker/docker-compose.yml up -d ``` 7. 
Start the producer: ```bash node producer.js --brokers localhost:19092 ``` You should see the messages that the producer is sending to Redpanda: ```json Produced: {"Date":"10/22/2013","Close/Last":"$40.45","Volume":"8347540","Open":"$39.95","High":"$40.54","Low":"$39.80"} ``` 8. Press Ctrl+C to stop the script. 9. Open Redpanda Console at [localhost:8080](http://localhost:8080/topics/market_activity). The producer sent the stock market data in the CSV file to the `market_activity` topic in Redpanda. ### [](#options)Options The script supports several command-line options to control its behavior: ```bash node producer.js [options] ``` | Option | Description | | --- | --- | | -h, --help | Display the help message and exit. | | -f, --file, --csv | Specify the path to the CSV file to be processed. Defaults to ../data/market_activity.csv. | | -t, --topic | Specify the topic to which events will be published. Defaults to the name of the CSV file (without its extension). | | -b, --broker, --brokers | Comma-separated list of the host and port for each Redpanda broker. Defaults to localhost:9092. | | -d, --date | Specify the column in the CSV file that contains date information. By default, the script converts these dates to ISO 8601 format. If the looping option (-l) is enabled, the script will increment each date by one day for each iteration of the loop, allowing for dynamic time series simulation. | | -r, --reverse | Read the file into memory and reverse the order of data before sending it to Redpanda. When used with the -l option, data is reversed only once before the looping starts, not during each loop iteration. | | -l, --loop | Continuously loop through the file, reading it into memory and sending data to Redpanda in a loop. When combined with the -d option, it modifies the specified date column by incrementally increasing the date with each loop iteration, simulating real-time data flow over days. 
When used with -r, the data order is reversed initially, and then the loop continues with the reversed data set. | ## [](#clean-up)Clean up To shut down and delete the containers along with all your cluster data: ```bash docker compose -f ../../../docker-compose/single-broker/docker-compose.yml down -v ``` --- # Page 13: Stream Stock Market Data from a CSV file Using Python **URL**: https://docs.redpanda.com/redpanda-labs/clients/stock-market-activity-python.md --- # Stream Stock Market Data from a CSV file Using Python --- title: Stream Stock Market Data from a CSV file Using Python latest-operator-version: v26.1.2 latest-console-tag: v3.7.1 latest-connect-version: 4.87.0 latest-redpanda-tag: v26.1.3 docname: stock-market-activity-python page-component-name: redpanda-labs page-version: master page-component-version: master page-component-title: Labs page-relative-src-path: stock-market-activity-python.adoc page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/clients/pages/stock-market-activity-python.adoc description: Stream data from a CSV file into a Redpanda topic. page-git-created-date: "2025-05-06" page-git-modified-date: "2025-05-06" --- This lab demonstrates how to use a Python Kafka producer to stream data from a CSV file into a Redpanda topic. The script simulates real-time stock market activity by pushing JSON formatted messages into a topic. ```json {"Date":"10/22/2013","Close/Last":"$40.45","Volume":"8347540","Open":"$39.95","High":"$40.54","Low":"$39.80"} ``` This script allows you to loop through data continuously, reverse the order of data for different viewing perspectives, and manipulate date columns for time-series analysis. In this lab, you will: - Run the producer that streams data from a CSV file directly into a Redpanda topic. - Discover methods to alter the data stream, such as reversing the data sequence or looping through the data continuously for persistent simulations. 
- Adjust date fields dynamically to represent different time frames for analysis. ## [](#prerequisites)Prerequisites Before running the lab, ensure you have the following installed on your host machine: - [Docker and Docker Compose](https://docs.docker.com/compose/install/) - [Python3](https://www.python.org/downloads) ## [](#run-the-lab)Run the lab 1. Clone this repository: ```bash git clone https://github.com/redpanda-data/redpanda-labs.git ``` 2. Change into the `clients/stock-market-activity/python/` directory: ```bash cd redpanda-labs/clients/stock-market-activity/python ``` 3. Set the `REDPANDA_VERSION` environment variable to the version of Redpanda that you want to run. For all available versions, see the [GitHub releases](https://github.com/redpanda-data/redpanda/releases). For example: ```bash export REDPANDA_VERSION=v26.1.3 ``` 4. Set the `REDPANDA_CONSOLE_VERSION` environment variable to the version of Redpanda Console that you want to run. For all available versions, see the [GitHub releases](https://github.com/redpanda-data/redpanda/releases). For example: ```bash export REDPANDA_CONSOLE_VERSION=v3.7.1 ``` 5. Start a local Redpanda cluster: ```bash docker compose -f ../../../docker-compose/single-broker/docker-compose.yml up -d ``` 6. Create a virtual environment: ```bash python3 -m venv .env source .env/bin/activate ``` 7. Install the required dependencies: ```bash pip3 install --upgrade pip pip3 install -r requirements.txt ``` 8. Start the producer: ```bash python producer.py --brokers localhost:19092 ``` You should see that the producer is sending messages to Redpanda: Message delivered to market\_activity \[0\] offset 0 9. Open Redpanda Console at [localhost:8080](http://localhost:8080/topics/market_activity). The producer sent the stock market data in the CSV file to the `market_activity` topic in Redpanda. 
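
To make the data flow concrete, the following sketch shows one way CSV rows could become the JSON messages shown earlier, using only the Python standard library. It is an illustration under stated assumptions, not the contents of `producer.py`: the function names are hypothetical, the `%m/%d/%Y` input format is inferred from the sample row, and the date-only ISO 8601 output is one possible rendering of the `--date` and `--loop` behavior described in the Options table (the real script's output format may differ).

```python
import csv
import io
import json
from datetime import datetime, timedelta
from pathlib import Path
from typing import List

# One header row and one data row, taken from the sample message in this lab.
SAMPLE_CSV = """Date,Close/Last,Volume,Open,High,Low
10/22/2013,$40.45,8347540,$39.95,$40.54,$39.80
"""


def default_topic(csv_path: str) -> str:
    """Derive the topic name from the CSV file name without its extension."""
    return Path(csv_path).stem


def rows_to_messages(csv_text: str) -> List[str]:
    """Convert each CSV row into a compact JSON string keyed by the header names."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [json.dumps(row, separators=(",", ":")) for row in reader]


def shift_date(value: str, iteration: int) -> str:
    """Convert a date to ISO 8601 and move it forward one day per loop iteration."""
    parsed = datetime.strptime(value, "%m/%d/%Y")  # assumed input format
    return (parsed + timedelta(days=iteration)).date().isoformat()


if __name__ == "__main__":
    print(default_topic("../data/market_activity.csv"))
    for message in rows_to_messages(SAMPLE_CSV):
        print("Produced:", message)
    print(shift_date("10/22/2013", 1))
```

Each string returned by `rows_to_messages` would be the value of one record sent to the topic. All CSV values stay strings, which matches the quoted numbers in the sample message.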
### [](#options)Options The script supports several command-line options to control its behavior: ```bash python producer.py [options] ``` | Option | Description | | --- | --- | | -h, --help | Display the help message and exit. | | -f, --file, --csv | Specify the path to the CSV file to be processed. Defaults to ../data/market_activity.csv. | | -t, --topic | Specify the topic to which events will be published. Defaults to the name of the CSV file (without its extension). | | -b, --broker, --brokers | Comma-separated list of the host and port for each Redpanda broker. Defaults to localhost:9092. | | -d, --date | Specify the column in the CSV file that contains date information. By default, the script converts these dates to ISO 8601 format. If the looping option (-l) is enabled, the script will increment each date by one day for each iteration of the loop, allowing for dynamic time series simulation. | | -r, --reverse | Read the file into memory and reverse the order of data before sending it to Redpanda. When used with the -l option, data is reversed only once before the looping starts, not during each loop iteration. | | -l, --loop | Continuously loop through the file, reading it into memory and sending data to Redpanda in a loop. When combined with the -d option, it modifies the specified date column by incrementally increasing the date with each loop iteration, simulating real-time data flow over days. When used with -r, the data order is reversed initially, and then the loop continues with the reversed data set. 
| ## [](#clean-up)Clean up To exit the virtual environment: ```bash deactivate ``` To shut down and delete the containers along with all your cluster data: ```bash docker compose -f ../../../docker-compose/single-broker/docker-compose.yml down -v ``` --- # Page 14: Stream Text Embeddings with Redpanda, OpenAI, and MongoDB **URL**: https://docs.redpanda.com/redpanda-labs/connect-plugins/openai.md --- # Stream Text Embeddings with Redpanda, OpenAI, and MongoDB --- title: Stream Text Embeddings with Redpanda, OpenAI, and MongoDB latest-operator-version: v26.1.2 latest-console-tag: v3.7.1 latest-connect-version: 4.87.0 latest-redpanda-tag: v26.1.3 docname: openai page-component-name: redpanda-labs page-version: master page-component-version: master page-component-title: Labs page-relative-src-path: openai.adoc page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/connect-plugins/pages/openai.adoc description: Build a streaming RAG pipeline with Redpanda, OpenAI, and MongoDB Atlas page-git-created-date: "2025-05-06" page-git-modified-date: "2025-05-06" --- In this lab, you’ll build a [retrieval augmented generation](https://help.openai.com/en/articles/8868588-retrieval-augmented-generation-rag-and-semantic-search-for-gpts) (RAG) pipeline to enhance natural language understanding and response generation using Redpanda, [OpenAI](https://openai.com/), [MongoDB Atlas](https://www.mongodb.com/products/platform/atlas-vector-search), and [LangChain](https://www.langchain.com/). This RAG pipeline comprises two phases: - **Acquisition and persistence of new information**: In this initial phase, LangChain acquires new information and prepares it for ingestion into Redpanda. The Redpanda Platform adds [OpenAI text embeddings](https://platform.openai.com/docs/guides/embeddings) to messages as they stream through Redpanda on their way to a MongoDB Atlas vector database. 
Redpanda handles real-time data ingestion and storage, while Redpanda Connect ensures efficient communication with MongoDB Atlas. The acquired information, such as documents and webpages, is split into smaller text chunks and stored in MongoDB Atlas along with their vector embeddings. These embeddings, which encode the semantic meaning of text in a multidimensional space, enable efficient semantic search. MongoDB Atlas enables queries based on vector embeddings to retrieve texts with similar semantic meaning. - **Retrieval of relevant contextual information**: In this phase, contextual information relevant to the user’s question (prompt) is retrieved from MongoDB Atlas through semantic search. This contextual information is then passed alongside the user’s question to OpenAI’s large language model. OpenAI’s language model leverages this additional context to improve the quality and relevance of its generated answers. This retrieval and augmentation of contextual information enhance the model’s understanding and enable it to produce more accurate and contextually relevant responses. ## [](#prerequisites)Prerequisites You must have the following: - [Redpanda Cloud account](https://cloud.redpanda.com/sign-up) - [OpenAI developer platform account](https://platform.openai.com/signup/) > 📝 **NOTE** > > Make sure your account has [available credits](https://help.openai.com/en/articles/9038407-how-can-i-set-up-billing-for-my-account). - [MongoDB Atlas account](https://account.mongodb.com/account/register) - [Python 3](https://www.python.org/downloads) - [rpk](https://docs.redpanda.com/current/get-started/rpk-install/) ## [](#set-up-a-local-environment)Set up a local environment 1. Clone this repository: ```bash git clone https://github.com/redpanda-data/redpanda-labs.git ``` 2. 
Change into the `redpanda-labs/connect-plugins/processor/embeddings/openai/` directory: ```bash cd redpanda-labs/connect-plugins/processor/embeddings/openai ``` ## [](#set-up-redpanda-serverless)Set up Redpanda Serverless 1. Log in to your Redpanda Cloud account and create a new [Serverless Standard](https://redpanda.com/redpanda-cloud/serverless) cluster. 2. Make a note of the bootstrap server URL. 3. Create a topic called `documents` with the default settings. 4. Create a new user with permissions (ACLs) to access a topic named `documents` and a consumer group named `connect`. 5. Add the cluster connection information to a local `.env` file: ```bash cat > .env<< EOF REDPANDA_SERVERS="" REDPANDA_USER="" REDPANDA_PASS="" REDPANDA_TOPICS="documents" EOF ``` ## [](#set-up-openai-api)Set up OpenAI API 1. Log in to your OpenAI developer platform account and create a new [Project API key](https://platform.openai.com/api-keys). 2. Add the secret key to the local `.env` file: ```bash cat >> .env<< EOF OPENAI_API_KEY="" OPENAI_EMBEDDING_MODEL="text-embedding-3-small" OPENAI_MODEL="gpt-4o" EOF ``` ## [](#set-up-mongodb-atlas)Set up MongoDB Atlas 1. Log in to your MongoDB Atlas account and deploy a new [free cluster](https://www.mongodb.com/docs/atlas/getting-started) for development purposes. 2. Create a new database named `VectorStore`, a new collection in that database named `Embeddings`, and an Atlas Vector Search index with the following JSON configuration: ```json { "fields": [ { "numDimensions": 1536, "path": "embedding", "similarity": "euclidean", "type": "vector" } ] } ``` 3. 
Add the Atlas connection information to the local `.env` file: ```bash cat >> .env<< EOF # Connection string for MongoDB Driver for Go: ATLAS_CONNECTION_STRING="" ATLAS_DB="VectorStore" ATLAS_COLLECTION="Embeddings" ATLAS_INDEX="vector_index" EOF ``` ## [](#set-the-environment-variables)Set the environment variables Your `.env` file should now look like this: ```bash REDPANDA_SERVERS="" REDPANDA_USER="" REDPANDA_PASS="" REDPANDA_TOPICS="documents" OPENAI_API_KEY="" OPENAI_EMBEDDING_MODEL="text-embedding-3-small" OPENAI_MODEL="gpt-4o" ATLAS_CONNECTION_STRING="" ATLAS_DB="VectorStore" ATLAS_COLLECTION="Embeddings" ATLAS_INDEX="vector_index" ``` To check your `.env` file: ```bash cat .env ``` ## [](#create-a-python-virtual-environment)Create a Python virtual environment Create a Python virtual environment in the current directory, install the dependencies, and then leave the environment: ```bash python3 -m venv env source env/bin/activate pip install -r requirements.txt deactivate ``` ## [](#run-the-lab)Run the lab This lab has three parts: 1. Use **LangChain’s** `WebBaseLoader` and `RecursiveCharacterTextSplitter` to generate chunks of text from the BBC Sport website and send each chunk to a Redpanda topic named `documents`. 2. Use **Redpanda Connect** to consume the messages from the `documents` topic and pass each message through a processor that calls **OpenAI’s embeddings API** to retrieve the vector embeddings for the text. The enriched messages are then inserted into a **MongoDB Atlas** database collection that has a vector search index. 3. Complete the RAG pipeline by using **LangChain** to retrieve similar texts from the **MongoDB Atlas** database and add that context alongside a user question to a prompt that is sent to OpenAI’s `gpt-4o` model. 
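
The three parts above can be pictured with a small, dependency-free sketch. Everything in it is a stand-in chosen for illustration: `split_text` is a naive fixed-size splitter, not LangChain's `RecursiveCharacterTextSplitter`; `nearest` ranks toy three-dimensional vectors by Euclidean distance in memory, whereas the lab uses 1536-dimensional OpenAI embeddings and an Atlas Vector Search index; and the prompt template in `build_prompt` is hypothetical, not the one used by the lab's scripts.

```python
import math
from typing import List, Sequence, Tuple


def split_text(text: str, chunk_size: int = 40, overlap: int = 10) -> List[str]:
    """Part 1 stand-in: cut text into fixed-size chunks that overlap slightly."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += chunk_size - overlap
    return chunks


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two vectors; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(query: Sequence[float],
            store: List[Tuple[str, Sequence[float]]],
            k: int = 2) -> List[str]:
    """Part 3 stand-in: return the k stored texts closest to the query vector."""
    ranked = sorted(store, key=lambda item: euclidean(query, item[1]))
    return [text for text, _ in ranked[:k]]


def build_prompt(question: str, context_chunks: List[str]) -> str:
    """Place retrieved context ahead of the user's question (hypothetical template)."""
    context = "\n\n".join(context_chunks)
    return f"Context:\n{context}\n\nQuestion: {question.strip()}"


if __name__ == "__main__":
    # Toy (text, embedding) pairs standing in for documents stored in Atlas.
    store = [
        ("squad", [1.0, 0.0, 0.0]),
        ("weather", [0.0, 1.0, 0.0]),
        ("transfer", [0.9, 0.1, 0.0]),
    ]
    context = nearest([1.0, 0.0, 0.0], store, k=2)
    print(build_prompt("Who made the squad?", context))
```

Part 2, the embedding step, is intentionally left out of the sketch because it requires a call to the OpenAI API, which Redpanda Connect performs in this lab.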
### [](#start-redpanda-connect)Start Redpanda Connect Start Redpanda Connect with the custom OpenAI processor: ```bash rpk connect run --env-file .env --log.level debug atlas_demo.yaml ``` You should see the following in the output: ```bash INFO Running main config from specified file @service=redpanda-connect redpanda_connect_version=v4.33.0 path=atlas_demo.yaml INFO Listening for HTTP requests at: http://0.0.0.0:4195 @service=redpanda-connect DEBU url: https://api.openai.com/v1/embeddings, model: text-embedding-3-small @service=redpanda-connect label="" path=root.pipeline.processors.0 INFO Launching a Redpanda Connect instance, use CTRL+C to close @service=redpanda-connect INFO Input type kafka is now active @service=redpanda-connect label="" path=root.input DEBU Starting consumer group @service=redpanda-connect label="" path=root.input INFO Output type mongodb is now active @service=redpanda-connect label="" path=root.output ``` ### [](#generate-new-text-documents)Generate new text documents In another terminal window, generate new text documents and send them to Atlas through Redpanda Connect for embeddings: ```bash source env/bin/activate # Single webpage: python produce_documents.py -u "https://www.bbc.co.uk/sport/football/articles/c3gglr8mpzdo" # Entire sitemap: python produce_documents.py -s "https://www.bbc.com/sport/sitemap.xml" ``` You can view the text and embeddings in the [Atlas console](https://cloud.mongodb.com). ### [](#run-the-retrieval-and-generation-chain)Run the retrieval and generation chain Run the retrieval chain and ask OpenAI a question: ```bash source env/bin/activate python retrieve_generate.py -q """ Which football players made the provisional England national squad for the Euro 2024 tournament, and on what date was this announced? 
""" ``` It takes a few seconds for the following response to appear in the output: **Question**: Which football players made the provisional England national squad for the Euro 2024 tournament, and on what date was this announced? **Initial answer**: As of my knowledge cutoff date in October 2023, the provisional England national squad for the Euro 2024 tournament has not been announced. The selection of national teams for major tournaments like the UEFA European Championship typically happens closer to the event, often just a few weeks before the tournament starts. For the most current information, I recommend checking the latest updates from the Football Association (FA) or other reliable sports news sources. **Augmented answer**: The provisional England national squad for the Euro 2024 tournament includes the following players: **Goalkeepers**: - Dean Henderson (Crystal Palace) - Jordan Pickford (Everton) - Aaron Ramsdale (Arsenal) - James Trafford (Burnley) **Defenders**: - Jarrad Branthwaite (Everton) - Lewis Dunk (Brighton) - Joe Gomez (Liverpool) - Marc Guehi (Crystal Palace) - Ezri Konsa (Aston Villa) - Harry Maguire (Manchester United) - Jarell Quansah (Liverpool) - Luke Shaw (Manchester United) - John Stones (Manchester City) - Kieran Trippier (Newcastle) - Kyle Walker (Manchester City) **Midfielders**: - Trent Alexander-Arnold (Liverpool) - Conor Gallagher (Chelsea) - Curtis Jones (Liverpool) - Kobbie Mainoo (Manchester United) - Declan Rice (Arsenal) - Adam Wharton (Crystal Palace) **Forwards**: - Jude Bellingham (Real Madrid) - Jarrod Bowen (West Ham) - Eberechi Eze (Crystal Palace) - Phil Foden (Manchester City) - Jack Grealish (Manchester City) - Anthony Gordon (Newcastle) - Harry Kane (Bayern Munich) - James Maddison (Tottenham) - Cole Palmer (Chelsea) - Bukayo Saka (Arsenal) - Ivan Toney (Brentford) - Ollie Watkins (Aston Villa) This announcement was made on May 21, 2024. 
## [](#next-steps)Next steps Learn more about [Redpanda Connect](../../../redpanda-connect/get-started/about/) and explore the other [available connectors](../../../redpanda-connect/components/about/). --- # Page 15: Contribute Docs for Redpanda Labs **URL**: https://docs.redpanda.com/redpanda-labs/contribute.md --- # Contribute Docs for Redpanda Labs --- title: Contribute Docs for Redpanda Labs latest-operator-version: v26.1.2 latest-console-tag: v3.7.1 latest-connect-version: 4.87.0 latest-redpanda-tag: v26.1.3 docname: contribute page-component-name: redpanda-labs page-version: master page-component-version: master page-component-title: Labs page-relative-src-path: contribute.adoc page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/ROOT/pages/contribute.adoc page-git-created-date: "2024-02-09" page-git-modified-date: "2024-02-26" --- Welcome to the Redpanda Labs documentation guide! Whether you’re new to documenting open-source projects or an experienced contributor, this guide will help you understand how to create and contribute high-quality documentation for Redpanda Labs. Before you begin, familiarize yourself with the basics of [Asciidoc](https://asciidoctor.org/docs/what-is-asciidoc/) and [Antora](https://docs.antora.org/), the tools we use to write and organize our documentation. We recommend using [Visual Studio Code](https://code.visualstudio.com/download) with the [Asciidoc extension](https://marketplace.visualstudio.com/items?itemName=asciidoctor.asciidoctor-vscode) to edit documentation. The Asciidoc extension provides useful features such as folding conditionals and titles to make it easier to work with large documents. ## [](#for-github-users)For GitHub users Every lab directory should feature a `README.adoc` file (README) at its root. The README serves as the primary point of interaction for users on GitHub. 
This document should provide docs for the lab, including:

- **Overview**: A brief introduction to what the lab is about and what it aims to demonstrate or achieve.
- **Prerequisites**: Any requirements or setup steps needed before running the lab.
- **Run the lab**: Step-by-step instructions on how to execute the lab, including commands, configurations, and any specific notes or warnings.
- **Clean up**: Any required steps to stop the lab and uninstall any dependencies.

## [](#publish)Publish on Redpanda docs

Labs can be published on the [official Redpanda docs site](https://docs.redpanda.com/redpanda-labs/) by following a specific directory structure for Antora.

When you publish labs on the Redpanda docs site, they are automatically indexed to make them searchable through Algolia and to enhance discoverability. Documentation metadata, defined in the [Asciidoc header](#attributes) of each page, generates search filters and automates cross-linking between related documents. Example metadata includes:

```yaml
:page-layout: lab
:page-categories: Development, Stream Processing
:env-docker: true
```

If a lab page falls into the same categories as a doc page and the deployment types of both the doc page and the lab page match, those pages are considered related and cross-links are automatically added.

### [](#create-the-documentation-structure)Create the documentation structure

Your lab’s documentation should be placed within the `docs/` directory.

Required directory structure

- 📒 redpanda-labs-repo
  - 📂 docs **(1)**
    - 📄 antora.yml **(2)**
    - 📂 modules
      - 📂 **(3)**
        - 📁 attachments **(4)**
        - 📁 examples **(5)**
        - 📁 images **(6)**
        - 📁 pages **(7)**
        - 📁 partials **(8)**

| Callout | Description |
| --- | --- |
| 1 | (Required) The docs/ directory stores all Antora content for docs. |
| 2 | (Required) A component version descriptor file that indicates to Antora that the contents should be collected and processed. |
| 3 | (Required) This named module directory is where you can place all your documentation. |
| 4 | (Optional) The attachments/ directory stores files to be uploaded as attachments. |
| 5 | (Optional) The examples/ directory stores code files to be included in the documentation. |
| 6 | (Optional) The images/ directory stores images to be included in the documentation. |
| 7 | (Required) The pages/ directory stores your AsciiDoc documentation pages. |
| 8 | (Optional) The partials/ directory is where you can store reusable snippets of AsciiDoc content to be included in the documentation. |

### [](#avoid-duplication-in-the-readme)Avoid duplication in the README

To avoid duplicating content in both the README and the Antora docs you can symlink the README into the `docs/modules/pages` directory.

Symlinks are a powerful tool for managing documentation efficiently, allowing you to maintain a single source of truth while ensuring your content is accessible both on GitHub and within the Redpanda docs site. You can symlink not only README files but also example code, images, and attachments so that all relevant documentation components are seamlessly integrated and accessible.

To create a symlink for your README, execute the following CLI command:

> 📝 **NOTE**
>
> This command is not supported on Windows. If you’re using Windows, create the symlinks manually. See the [Antora documentation](https://docs.antora.org/antora/latest/symlinks/#windows) for instructions.

```bash
npx doc-tools link-readme -s -t
```

Replace `` with the name of the directory where your README is saved. Replace `` with the Asciidoc filename that you want to generate.

To create symlinks for other files, such as images or example code, follow these steps:

1. Change into the desired location in the `docs/` directory.
2. Create relative symlinks to the target content files. This enables you to reference the same content in multiple places without duplication.
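The two steps above can also be scripted. The following is a minimal sketch, not part of `doc-tools`: the helper name `link_into_docs` and the example directory layout are hypothetical, and it assumes a filesystem where creating symlinks is permitted.

```python
import os
import tempfile
from pathlib import Path


def link_into_docs(target: Path, link_dir: Path) -> Path:
    """Create a relative symlink to `target` inside `link_dir`.

    Hypothetical helper for illustration only; not part of doc-tools.
    """
    link_dir.mkdir(parents=True, exist_ok=True)
    link = link_dir / target.name
    # A relative target keeps the link valid wherever the repository is cloned.
    relative_target = os.path.relpath(target, start=link_dir)
    os.symlink(relative_target, link)
    return link


if __name__ == "__main__":
    # Demonstrate in a throwaway directory; the layout below is an assumed example.
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        image = repo / "my-lab" / "images" / "diagram.png"
        image.parent.mkdir(parents=True)
        image.write_bytes(b"placeholder")
        link = link_into_docs(image, repo / "docs" / "modules" / "my-lab" / "images")
        # Prints the relative path stored in the symlink.
        print(os.readlink(link))
```

Computing the target with `os.path.relpath` rather than storing an absolute path is the design choice that matters here: an absolute link would point at one contributor's machine and break for everyone else.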
For guidelines on creating symlinks that comply with Antora’s requirements, see the [Antora docs](https://docs.antora.org/antora/latest/symlinks/). #### [](#use-conditionals-for-platform-specific-content)Use conditionals for platform-specific content When you symlink the README file, the content is shared between GitHub and the Redpanda docs site. This means that any changes you make to the README will be reflected in both locations. However, this can lead to issues if you need to include platform-specific content or images that are only relevant to one of the platforms. For example, if you have images or content that is only relevant to the Redpanda docs site, you can use Asciidoc conditionals to include or exclude that content based on the platform. This is particularly useful when you want to maintain a single source of truth for your documentation while ensuring that the content is tailored to the specific needs of each platform. [AsciiDoc conditionals](https://docs.asciidoctor.org/asciidoc/latest/directives/conditionals/) offer a straightforward solution to this requirement, enabling you to include or exclude specific content based on the environment in which the document is rendered. For example, a common use case for conditionals is adding images. On GitHub, you add images by referencing a relative path to the image such as `image::../../images/some-image.png`. But, for the documentation site, images must be in the Antora structure and you must use [Antora resource IDs](https://docs.antora.org/antora/latest/page/image-resource-id-examples/) to reference images such as `image:::some-image.png`. To handle this difference in referencing image paths, you can keep images in the Antora structure and use conditionals to set the relative path from the source README to the images directory for when the content is rendered on GitHub. 
To conditionally render content based on whether the document is viewed on GitHub or on the Redpanda docs site, use the `env-github` and `env-site` attributes. The `env-github` attribute is automatically set when viewing on GitHub, allowing for easy differentiation. For example, if you have a directory structure like this where the images are in the Antora `images/` directory: 📒 redpanda-labs-repo 📂 📄 README.adoc 📂 docs 📄 antora.yml 📂 modules 📂 📁 images 📄 some-image.png 📁 pages 📄 my-doc.adoc (symlinked) ```asciidoc ifndef::env-site[] :imagesdir: ../docs/modules//images/ endif::[] image::some-image.png[] ``` ### [](#attributes)Add attributes to pages When contributing documentation, make sure to add the following attributes to your pages to categorize and identify your content: - `page-categories`: Assigns [categories](#categories) to your page. Use a comma-separated list for multiple categories. Categories are validated against a [centralized list](https://github.com/redpanda-data/docs/blob/shared/modules/ROOT/partials/valid-categories.yml). These categories are used to generate links to related docs and related labs as well as provide filters on the Redpanda Labs landing page. - `env-kubernetes`, `env-docker`, `page-cloud`: Indicates the deployment environment or platform your lab is designed for. - `page-layout: lab`: Specifies the page layout template to be used, indicating that the page is part of Redpanda Labs. For example: ```asciidoc :page-layout: lab :page-categories: Development, Stream Processing :env-docker: true ``` ### [](#categories)Manage and define categories Documentation categories are a crucial part of organizing content in a way that is intuitive and accessible to users. Categories ensure consistency across the Redpanda docs and labs, facilitating easier navigation and a better understanding of the content structure. 
#### [](#central-repository-for-categories)Central repository for categories The categories for Redpanda docs are centrally managed in a YAML file located in the [Redpanda docs repository](https://github.com/redpanda-data/docs/blob/shared/modules/ROOT/partials/valid-categories.yml). This centralized approach allows the documentation team to maintain a coherent structure across all documentation, ensuring that every topic is appropriately categorized. #### [](#contribute-to-category-definitions)Contribute to category definitions The Redpanda docs team welcomes contributions and suggestions for improving or expanding the category definitions. If you have ideas for new categories or adjustments to existing ones that could enhance the organization and discoverability of content, we encourage you to contribute in the following ways: 1. Open a pull request. If you’re familiar with the structure of the YAML file and have a specific change in mind, the most direct way to propose a category update is by opening a pull request against the [`valid-categories.yml` file](https://github.com/redpanda-data/docs/blob/shared/modules/ROOT/partials/valid-categories.yml). Include a brief explanation of your proposed changes and how they improve the documentation structure. 2. Create an issue. If you’re less comfortable making direct changes or if your suggestion requires broader discussion, you can [open an issue](https://github.com/redpanda-data/documentation-private/issues/new/choose) in the private Redpanda docs repository. In your issue, describe the proposed category addition or modification, providing context on why the change is beneficial and how it fits within the overall documentation strategy. 
#### [](#guidelines-for-proposing-categories)Guidelines for proposing categories When suggesting new categories or modifications to existing ones, consider the following guidelines to ensure your proposal aligns with the documentation goals: - **Relevance**: Categories should be directly relevant to Redpanda and its ecosystem, reflecting topics that users are likely to search for. - **Clarity**: Category names and definitions should be clear and self-explanatory, avoiding jargon where possible. - **Consistency**: Proposals should maintain consistency with existing categories, fitting logically within the overall structure. - **Breadth vs depth**: Aim for categories that are broad enough to encompass multiple related topics but specific enough to be meaningful and useful for navigation. ### [](#build-and-test-your-changes-locally)Build and test your changes locally You should build and preview the docs on your local machine to see your changes before going live. 1. Make sure you have [Node.js](https://nodejs.org/en/download) 16 or higher installed on your machine. ```bash node --version ``` If this command fails, you don’t have Node.js installed. 2. Install dependencies. ```bash npm install && npm update ``` 3. Build the site. ```bash npm run build ``` The `build` script generates the site HTML, CSS and JavaScript files. Now, you can serve them locally using a local web server. 4. Serve the site: ```bash npm run serve ``` The web server’s host URL is printed to the console. 5. Use Ctrl+C to stop the process. ## [](#documentation-guidelines)Documentation guidelines For rules and recommendations as well as help with Asciidoc syntax, see the [Redpanda docs style guide](https://github.com/redpanda-data/docs-site/blob/main/meta-docs/STYLE-GUIDE.adoc). In general: - Keep your language simple and accessible. - Use code blocks and screenshots where applicable to illustrate your points. - Organize content logically, using headings to break up sections for easy navigation. 
- When documenting code examples, explain not just the "how" but also the "why" behind the code. - Review your documentation for clarity and accuracy before submitting. ## [](#community)Community Discussions about Redpanda Labs take place on this repository’s [issues](https://github.com/redpanda-data/redpanda-labs/issues) and the [Redpanda community Slack](https://redpanda.com/slack). --- # Page 16: Flatten JSON Messages **URL**: https://docs.redpanda.com/redpanda-labs/data-transforms/flatten-go.md --- # Flatten JSON Messages --- title: Flatten JSON Messages latest-operator-version: v26.1.2 latest-console-tag: v3.7.1 latest-connect-version: 4.87.0 latest-redpanda-tag: v26.1.3 docname: flatten-go page-component-name: redpanda-labs page-version: master page-component-version: master page-component-title: Labs page-relative-src-path: flatten-go.adoc page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/data-transforms/pages/flatten-go.adoc description: Flatten JSON messages in topics using data transforms. page-git-created-date: "2025-05-06" page-git-modified-date: "2025-05-06" --- This example uses Redpanda data transforms to take JSON messages in an input topic and flatten them using a customizable delimiter. Example input topic ```json { "content": { "id": 123, "name": { "first": "Dave", "middle": null, "last": "Voutila" }, "data": [1, "fish", 2, "fish"] } } ``` Example output topic with flattened JSON ```json { "content.id": 123, "content.name.first": "Dave", "content.name.middle": null, "content.name.last": "Voutila", "content.data": [1, "fish", 2, "fish"] } ``` ## [](#prerequisites)Prerequisites You must have the following: - At least version 1.20 of [Go](https://go.dev/doc/install) installed on your host machine. - [Install `rpk`](../../../current/get-started/rpk-install/) on your host machine. - [Docker and Docker Compose](https://docs.docker.com/compose/install/) installed on your host machine. 
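The flattening shown in the example input and output above can be sketched in a few lines. The lab's transform is written in Go; the Python below is only an illustration of the same idea, and the function name `flatten` and its `delim` parameter are assumptions made for this example. It recurses into nested objects and leaves arrays and scalar values untouched, which matches the example output.

```python
import json


def flatten(obj: dict, delim: str = ".", prefix: str = "") -> dict:
    """Flatten nested JSON objects into one level, joining keys with `delim`."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}{delim}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the joined key as the prefix.
            flat.update(flatten(value, delim, path))
        else:
            # Arrays and scalars (including null) are copied as-is.
            flat[path] = value
    return flat


message = json.loads("""
{
  "content": {
    "id": 123,
    "name": {"first": "Dave", "middle": null, "last": "Voutila"},
    "data": [1, "fish", 2, "fish"]
  }
}
""")

print(json.dumps(flatten(message), indent=2))
```

Passing a different `delim`, for example `flatten(message, delim="_")`, mirrors what the `RP_FLATTEN_DELIM` variable controls in the lab.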
## [](#limitations)Limitations

- Arrays of objects are currently untested.
- Providing a series of objects as input, not an array, may result in a series of flattened objects as output.
- Due to how JSON treats floating point values, values such as `1.0` that can be converted to an integer will lose the decimal point. For example, `1.0` becomes `1`.

## [](#run-the-lab)Run the lab

1. Clone this repository:

    ```bash
    git clone https://github.com/redpanda-data/redpanda-labs.git
    ```

2. Change into the `data-transforms/go/flatten/` directory:

    ```bash
    cd redpanda-labs/data-transforms/go/flatten
    ```

3. Set the `REDPANDA_VERSION` environment variable to at least version v23.3.1. Data transforms was introduced in this version. For all available versions, see the [GitHub releases](https://github.com/redpanda-data/redpanda/releases).

    For example:

    ```bash
    export REDPANDA_VERSION=v26.1.3
    ```

4. Set the `REDPANDA_CONSOLE_VERSION` environment variable to the version of Redpanda Console that you want to run. For all available versions, see the [GitHub releases](https://github.com/redpanda-data/console/releases).

    > 📝 **NOTE**
    >
    > You must use at least version v3.0.0 of Redpanda Console to deploy this lab.

    For example:

    ```bash
    export REDPANDA_CONSOLE_VERSION=v3.7.1
    ```

5. Start Redpanda in Docker by running the following command:

    ```bash
    docker compose up -d --wait
    ```

6. Set up your rpk profile:

    ```bash
    rpk profile create flatten --from-profile profile.yml
    ```

7. Create the required topics `src` and `sink`:

    ```bash
    rpk topic create src sink
    ```

8. Build and deploy the transforms function:

    ```bash
    rpk transform build
    rpk transform deploy --input-topic=src --output-topic=sink
    ```

    This example accepts the following environment variables:

    - `RP_FLATTEN_DELIM`: The delimiter to use when flattening the JSON fields. Defaults to `.`.

        For example:

        ```bash
        rpk transform deploy --var "RP_FLATTEN_DELIM="
        ```

9.
Produce a JSON message to the source topic:

    ```bash
    rpk topic produce src
    ```

10. Paste the following into the prompt and press Ctrl+C to exit:

    ```json
    {"message": "success", "timestamp": 1707743943, "iss_position": {"latitude": "-28.5723", "longitude": "-149.4612"}}
    ```

11. Consume the sink topic to see the flattened result:

    ```bash
    rpk topic consume sink --num 1
    ```

    ```json
    {
      "topic": "sink",
      "value": "{\n \"message\": \"success\" \"timestamp\": 1.707743943e+09 \"iss_position.latitude\": \"-28.5723\",\n \"iss_position.longitude\": \"-149.4612\"\n}\n",
      "timestamp": 1707744765541,
      "partition": 0,
      "offset": 0
    }
    ```

You can also see this in [Redpanda Console](http://localhost:8080/topics/sink?p=-1&s=50&o=-1#messages).

## [](#clean-up)Clean up

To shut down and delete the containers along with all your cluster data:

```bash
docker compose down -v
```

---

# Page 17: Convert JSON Messages into Avro

**URL**: https://docs.redpanda.com/redpanda-labs/data-transforms/issdemo-go.md

---

# Convert JSON Messages into Avro

---
title: Convert JSON Messages into Avro
latest-operator-version: v26.1.2
latest-console-tag: v3.7.1
latest-connect-version: 4.87.0
latest-redpanda-tag: v26.1.3
docname: issdemo-go
page-component-name: redpanda-labs
page-version: master
page-component-version: master
page-component-title: Labs
page-relative-src-path: issdemo-go.adoc
page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/data-transforms/pages/issdemo-go.adoc
description: Query live tracking data from the International Space Station and convert it from JSON to Avro using data transforms.
page-git-created-date: "2025-05-06"
page-git-modified-date: "2025-05-06"
---

This example shows you how to query live tracking data from the International Space Station and convert it from JSON to Avro using Redpanda data transforms.

This example uses cURL to query data from `api.open-notify.org` which is then piped through to Redpanda using the `rpk` command-line client.
When the data is in Redpanda, it’s converted from JSON to Avro using the transforms function. Then, you can see the converted data in Redpanda Console.

![Architectural Overview](../_images/iss_overview.png)

## [](#prerequisites)Prerequisites

You must have the following:

- At least version 1.20 of [Go](https://go.dev/doc/install) installed on your host machine.
- [Install `rpk`](https://docs.redpanda.com/current/get-started/rpk-install/) on your host machine.
- [Docker and Docker Compose](https://docs.docker.com/compose/install/) installed on your host machine.

## [](#run-the-lab)Run the lab

1. Clone this repository:

    ```bash
    git clone https://github.com/redpanda-data/redpanda-labs.git
    ```

2. Change into the `data-transforms/go/iss_demo/` directory:

    ```bash
    cd redpanda-labs/data-transforms/go/iss_demo
    ```

3. Set the `REDPANDA_VERSION` environment variable to at least version v23.3.1. Data transforms was introduced in this version. For all available versions, see the [GitHub releases](https://github.com/redpanda-data/redpanda/releases).

    For example:

    ```bash
    export REDPANDA_VERSION=v26.1.3
    ```

4. Set the `REDPANDA_CONSOLE_VERSION` environment variable to the version of Redpanda Console that you want to run. For all available versions, see the [GitHub releases](https://github.com/redpanda-data/console/releases).

    > 📝 **NOTE**
    >
    > You must use at least version v3.0.0 of Redpanda Console to deploy this lab.

    For example:

    ```bash
    export REDPANDA_CONSOLE_VERSION=v3.7.1
    ```

5. Start Redpanda in Docker by running the following command:

    ```bash
    docker compose up -d --wait
    ```

6. Post the Avro schema to the Schema Registry using a cURL command:

    ```bash
    ./post-schema.sh
    ```

    Take a note of the schema ID that is returned from this command. In a clean environment this will be `1`.

7. Set up your rpk profile:

    ```bash
    rpk profile create iss_demo --from-profile profile.yml
    ```

    Created and switched to new profile "iss\_demo".

8.
Create the required topics `iss_json` and `iss_avro`: ```bash rpk topic create iss_json iss_avro ``` 9. Deploy the transforms function: ```bash rpk transform build rpk transform deploy --var=SCHEMA_ID=1 --input-topic=iss_json --output-topic=iss_avro ``` This example accepts the following environment variables: - `SCHEMA_ID` (**required**): The ID of the Avro schema stored in the Redpanda schema registry. Now, you can test that the data can be converted from JSON to Avro. 1. Get a single record representing the location of the ISS: ```bash curl http://api.open-notify.org/iss-now.json ``` Example output: ```json {"message": "success", "timestamp": 1695753164, "iss_position": {"latitude": "-12.8784", "longitude": "92.2935"}} ``` 2. Run `rpk topic produce`: ```bash rpk topic produce iss_json ``` 3. Paste the output of the cURL command into the prompt and press Ctrl+C to exit the prompt. 4. Consume the Avro topic using `rpk topic consume` and observe that the transforms function has converted it to Avro: ```bash rpk topic consume iss_avro --num 1 ``` Example output: ```json { "topic": "iss_avro", "value": "\u0000\u0000\u0000\u0000\u0001\ufffd\ufffd\u0011\ufffd\ufffd\ufffd)\ufffd\u0010X9\ufffd\ufffd\u0012W@\ufffd\ufffd\ufffd\ufffd\u000c", "timestamp": 1695753212929, "partition": 0, "offset": 0 } ``` 5. Open [Redpanda Console](http://localhost:8080/topics/iss_avro?p=-1&s=50&o=-1#messages) to view the decoded data. ![Redpanda Console showing the decoded message](../_images/iss_console.png) ## [](#files-in-the-example)Files in the example - [`iss.avsc`](https://github.com/redpanda-data/redpanda-labs/blob/main/data-transforms/iss_demo/iss.avsc): Avro schema used for conversion. - [`profile.yml`](https://github.com/redpanda-data/redpanda-labs/blob/main/data-transforms/iss_demo/profile.yml): Used to configure `rpk` with the `rpk profile` command. 
- [`transform.go`](https://github.com/redpanda-data/redpanda-labs/blob/main/data-transforms/iss_demo/transform.go): This is the Golang code that will be compiled to WebAssembly. This code: - Initializes the transform, including getting the schema from the Schema Registry and creating the `goavro` codec object (both stored in global variables). - Registers the callback `toAvro`. - `toAvro` parses the JSON into a struct `iss_now`, converts the struct into a map and then converts the map to Avro binary using the `goavro` codec. - Prepends the schema ID using the magic five bytes `0x0` followed by a BigEndian `uint32`. - This is all appended to the output slice. ## [](#clean-up)Clean up To shut down and delete the containers along with all your cluster data: ```bash docker compose down -v ``` ## [](#next-steps)Next steps You could set up a loop to poll the location of the ISS and produce it to the `iss_json` topic. For example: ```bash while true do line=`curl http://api.open-notify.org/iss-now.json -s` echo $line | rpk topic produce iss_json sleep 1 done ``` --- # Page 18: Redact Information in JSON Messages **URL**: https://docs.redpanda.com/redpanda-labs/data-transforms/redaction-go.md --- # Redact Information in JSON Messages --- title: Redact Information in JSON Messages latest-operator-version: v26.1.2 latest-console-tag: v3.7.1 latest-connect-version: 4.87.0 latest-redpanda-tag: v26.1.3 docname: redaction-go page-component-name: redpanda-labs page-version: master page-component-version: master page-component-title: Labs page-relative-src-path: redaction-go.adoc page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/data-transforms/pages/redaction-go.adoc description: Redact personally identifiable information (PII) in topics using data transforms. page-git-created-date: "2025-05-06" page-git-modified-date: "2025-05-06" --- This example shows you how to use Redpanda data transforms to redact information in JSON messages. 
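As an illustration of what redaction means for a JSON record, the sketch below replaces selected fields with a fixed mask. It is not the lab's Go transform or its configuration format: the function `redact`, the dotted-path convention, the field names, and the mask string are all assumptions chosen for this example.

```python
import copy
import json


def redact(record: dict, paths: list[str], mask: str = "REDACTED") -> dict:
    """Return a copy of `record` with each dotted path in `paths` replaced by `mask`."""
    result = copy.deepcopy(record)
    for path in paths:
        *parents, leaf = path.split(".")
        node = result
        # Walk down to the object that holds the field to redact.
        for key in parents:
            node = node.get(key) if isinstance(node, dict) else None
            if node is None:
                break
        # Paths that do not exist in the record are silently skipped.
        if isinstance(node, dict) and leaf in node:
            node[leaf] = mask
    return result


# Hypothetical order record; the real Owlshop schema may differ.
order = {"orderId": "1234", "customer": {"email": "jane@example.com", "firstName": "Jane"}}
print(json.dumps(redact(order, ["customer.email"])))
```

Working on a deep copy keeps the original record intact, which makes it easy to compare the input and the redacted output side by side.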
The demo runs using Docker Compose, with the following containers:

- `redpanda`: Includes a single Redpanda broker.
- `console`: Includes Redpanda Console.
- `owl-shop`: Includes the Owlshop demo application that produces e-commerce data to the Redpanda broker.
- `transform`: Includes all the requirements for deploying the transforms function to the Redpanda broker, including `rpk`, Go, and the redaction transform code.

The source code for the redaction transform is available in the `redaction` directory.

## [](#prerequisites)Prerequisites

You must have [Docker and Docker Compose](https://docs.docker.com/compose/install/) installed on your host machine.

## [](#run-the-lab)Run the lab

1. Clone the repository:

    ```bash
    git clone https://github.com/redpanda-data/redpanda-labs.git
    ```

2. Change into the `data-transforms/go/redaction/demo/` directory:

    ```bash
    cd redpanda-labs/data-transforms/go/redaction/demo
    ```

3. Set the `REDPANDA_VERSION` environment variable to at least version v23.3.1. Data transforms was introduced in this version. For all available versions, see the [GitHub releases](https://github.com/redpanda-data/redpanda/releases).

    For example:

    ```bash
    export REDPANDA_VERSION=v26.1.3
    ```

4. Set the `REDPANDA_CONSOLE_VERSION` environment variable to the version of Redpanda Console that you want to run. For all available versions, see the [GitHub releases](https://github.com/redpanda-data/console/releases).

    > 📝 **NOTE**
    >
    > You must use at least version v3.0.0 of Redpanda Console to deploy this lab.

    For example:

    ```bash
    export REDPANDA_CONSOLE_VERSION=v3.7.1
    ```

5. Build the data transforms container:

    ```bash
    docker compose build
    ```

6. If you are running on an ARM-based device such as the Apple M1 chip, open the `docker-compose.yml` file and uncomment the `platform: 'linux/amd64'` line.

7. Start the containers:

    ```bash
    docker compose up --detach --wait
    ```

8. Navigate to [http://localhost:8080](http://localhost:8080) to see the Redpanda Console.

9.
Go to **Topics** and select **owlshop-orders-redacted** and see the redacted orders. ## [](#clean-up)Clean up To stop the containers: ```shell docker compose down ``` --- # Page 19: Filter Messages into a New Topic using a Regex **URL**: https://docs.redpanda.com/redpanda-labs/data-transforms/regex-go.md --- # Filter Messages into a New Topic using a Regex --- title: Filter Messages into a New Topic using a Regex latest-operator-version: v26.1.2 latest-console-tag: v3.7.1 latest-connect-version: 4.87.0 latest-redpanda-tag: v26.1.3 docname: regex-go page-component-name: redpanda-labs page-version: master page-component-version: master page-component-title: Labs page-relative-src-path: regex-go.adoc page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/data-transforms/pages/regex-go.adoc description: Filter messages from one topic into another using regular expressions (regex) and data transforms. page-git-created-date: "2025-05-06" page-git-modified-date: "2025-05-06" --- This is an example of how to filter messages from one topic into another using regular expressions (regex) and Redpanda data transforms. If a source topic contains a key or value that matches the regex, it will be produced to the sink topic. Regexes are implemented using Go’s `regexp` library, which uses the same syntax as RE2. See the [RE2 wiki](https://github.com/google/re2/wiki/Syntax) for help with syntax. The regex used in this example matches the typical email address pattern. ## [](#prerequisites)Prerequisites You must have the following: - At least version 1.20 of [Go](https://go.dev/doc/install) installed on your host machine. - [Install `rpk`](https://docs.redpanda.com/current/get-started/rpk-install/) on your host machine. - [Docker and Docker Compose](https://docs.docker.com/compose/install/) installed on your host machine. ## [](#run-the-lab)Run the lab 1. Clone this repository: ```bash git clone https://github.com/redpanda-data/redpanda-labs.git ``` 2. 
Change into the `data-transforms/go/regex/` directory:

    ```bash
    cd redpanda-labs/data-transforms/go/regex
    ```

3. Set the `REDPANDA_VERSION` environment variable to at least version v23.3.1. Data transforms was introduced in this version. For all available versions, see the [GitHub releases](https://github.com/redpanda-data/redpanda/releases).

    For example:

    ```bash
    export REDPANDA_VERSION=v26.1.3
    ```

4. Set the `REDPANDA_CONSOLE_VERSION` environment variable to the version of Redpanda Console that you want to run. For all available versions, see the [GitHub releases](https://github.com/redpanda-data/console/releases).

    > 📝 **NOTE**
    >
    > You must use at least version v3.0.0 of Redpanda Console to deploy this lab.

    For example:

    ```bash
    export REDPANDA_CONSOLE_VERSION=v3.7.1
    ```

5. Start Redpanda in Docker by running the following command:

    ```bash
    docker compose up -d --wait
    ```

6. Set up your rpk profile:

    ```bash
    rpk profile create regex --from-profile profile.yml
    ```

7. Create the required topics:

    ```bash
    rpk topic create src sink
    ```

8. Build the transforms function:

    ```bash
    rpk transform build
    ```

9. Deploy the transforms function:

    ```bash
    ./deploy-transform.sh
    ```

    See the file `deploy-transform.sh` to understand the regex used in the transform. Only input that matches the regular expression will be transformed.

    This example accepts the following environment variables:

    - `PATTERN` (**required**): The regex to match against records. Here, the regex finds messages containing email addresses.
    - `MATCH_VALUE`: By default, the regex matches record keys, but if set to `true`, the regex will match values.

10. Run `rpk topic produce`:

    ```bash
    rpk topic produce src
    ```

11. Paste the following into the prompt and press Ctrl+C to exit:

    ```text
    Hello, please contact us at help@example.com.
    Hello, please contact us at support.example.com.
    Hello, please contact us at help@example.edu.
    ```

12.
Consume the sink topic to see that input lines containing email addresses were extracted and produced to the sink topic: ```bash rpk topic consume sink --num 2 ``` { "topic": "sink", "value": "Hello, please contact us at help@example.com.", "timestamp": 1714525578013, "partition": 0, "offset": 0 } { "topic": "sink", "value": "Hello, please contact us at help@example.edu.", "timestamp": 1714525579192, "partition": 0, "offset": 1 } > 📝 **NOTE** > > The second input line, `Hello, please contact us at support.example.com.`, is not in the sink topic because it did not match the regex that identifies valid email addresses. You can also see the `sink` topic contents in [Redpanda Console](http://localhost:8080/topics/sink?p=-1&s=50&o=-1#messages). Switch to the [`src` topic](http://localhost:8080/topics/src?p=-1&s=50&o=-1#messages) to see all of the events, including the one that does not match the regex and is not in the `sink` topic. ## [](#clean-up)Clean up To shut down and delete the containers along with all your cluster data: ```bash docker compose down -v ``` --- # Page 20: Convert Timestamps using Rust **URL**: https://docs.redpanda.com/redpanda-labs/data-transforms/ts-converter-rust.md --- # Convert Timestamps using Rust --- title: Convert Timestamps using Rust latest-operator-version: v26.1.2 latest-console-tag: v3.7.1 latest-connect-version: 4.87.0 latest-redpanda-tag: v26.1.3 docname: ts-converter-rust page-component-name: redpanda-labs page-version: master page-component-version: master page-component-title: Labs page-relative-src-path: ts-converter-rust.adoc page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/data-transforms/pages/ts-converter-rust.adoc description: Convert timestamps from various forms, such as epochs to strings. 
page-git-created-date: "2025-05-06"
page-git-modified-date: "2025-05-06"
---

This lab uses data transforms with the Redpanda Schema Registry to convert keys or values with timestamps across various formats. Written in Rust, the example shows how to transform numeric epoch values in milliseconds to string-based formats and vice versa.

## [](#prerequisites)Prerequisites

You must have the following:

- At least version 1.75 of [Rust](https://rustup.rs/) installed on your host machine.
- The Wasm target for Rust installed. To install this target, run the following:

  ```bash
  rustup target add wasm32-wasip1
  ```

- [Install `rpk`](https://docs.redpanda.com/current/get-started/rpk-install/) on your host machine.
- [Docker and Docker Compose](https://docs.docker.com/compose/install/) installed on your host machine.

## [](#run-the-lab)Run the lab

1. Clone this repository:

    ```bash
    git clone https://github.com/redpanda-data/redpanda-labs.git
    ```

2. Change into the `data-transforms/rust/ts-converter/` directory:

    ```bash
    cd redpanda-labs/data-transforms/rust/ts-converter
    ```

3. Set the `REDPANDA_VERSION` environment variable to at least version 24.1.2. For all available versions, see the [GitHub releases](https://github.com/redpanda-data/redpanda/releases).

    For example:

    ```bash
    export REDPANDA_VERSION=v26.1.3
    ```

4. Set the `REDPANDA_CONSOLE_VERSION` environment variable to the version of Redpanda Console that you want to run. For all available versions, see the [GitHub releases](https://github.com/redpanda-data/redpanda/releases).

    For example:

    ```bash
    export REDPANDA_CONSOLE_VERSION=v3.7.1
    ```

5. Start Redpanda in Docker by running the following command:

    ```bash
    docker compose up -d --wait
    ```

6. Set up your rpk profile:

    ```bash
    rpk profile create ts-converter --from-profile profile.yml
    ```

7. Create the required topics:

    ```bash
    rpk topic create src sink
    ```

8. Create a source schema:

    ```bash
    echo '{"type": "long", "name": "epoch"}' | tee epoch.avsc
    rpk registry schema create src-value --schema epoch.avsc
    ```

9. Deploy the transforms function:

    ```bash
    rpk transform build
    rpk transform deploy --file ts-converter.wasm -i src -o sink --var "TIMESTAMP_TARGET_TYPE=string[%+]"
    ```

    This example accepts the following environment variables:

    - `TIMESTAMP_TARGET_TYPE` (**required**): The output conversion type. Must be one of:
        - `string[]`: with a valid [chrono format](https://docs.rs/chrono/latest/chrono/format/strftime/index.html) string between the brackets, for example `string[%+]`
        - `unix[]`: with one of `seconds`, `milliseconds`, `microseconds`, or `nanoseconds` between the brackets
        - `date`: which truncates to just the date portion of the timestamp
        - `time`: which truncates to just the 24-hour time of the timestamp
    - `TIMESTAMP_MODE`: one of `key`, `value`, or a fielded version like `value[my-field]`
    - `TIMESTAMP_STRING_FORMAT`: if your input data is a string, provide the chrono format for parsing

10. Produce an example epoch in milliseconds using `rpk topic produce`:

    ```bash
    echo "1704310686988" | rpk topic produce src --schema-id topic
    ```

11. Consume the sink topic with `rpk`, using the new schema, to see the string-based timestamp:

    ```bash
    rpk topic consume sink --use-schema-registry=value -n 1 -o -1
    ```

    { "topic": "sink", "value": "\\"2024-01-03T19:38:06.988+00:00\\"", "timestamp": 1715890281087, "partition": 0, "offset": 0 }

    You can also see this in [Redpanda Console](http://localhost:8080/topics/sink?p=-1&s=50&o=-1#messages).
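To make the conversion in this lab concrete, the following is a minimal Python sketch of the same round trip: epoch milliseconds to an RFC 3339 string and back. It is not the lab's Rust transform and does not touch Redpanda or the Schema Registry. The function names `epoch_ms_to_string` and `string_to_epoch_ms` are hypothetical, chosen only for illustration.

```python
from datetime import datetime, timedelta, timezone

# Reference point for epoch arithmetic, timezone-aware in UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_ms_to_string(epoch_ms: int) -> str:
    """Convert epoch milliseconds to an RFC 3339 string in UTC."""
    # timedelta keeps the arithmetic in integers, so the millisecond
    # part is not affected by floating-point rounding.
    moment = EPOCH + timedelta(milliseconds=epoch_ms)
    return moment.isoformat(timespec="milliseconds")


def string_to_epoch_ms(text: str) -> int:
    """Convert an RFC 3339 string that carries a UTC offset to epoch milliseconds."""
    moment = datetime.fromisoformat(text)
    # Floor-dividing two timedeltas yields an int.
    return (moment - EPOCH) // timedelta(milliseconds=1)


if __name__ == "__main__":
    # Same sample value that the lab produces to the src topic.
    print(epoch_ms_to_string(1704310686988))
    print(string_to_epoch_ms("2024-01-03T19:38:06.988+00:00"))
```

With the lab's sample value `1704310686988`, the first call returns `2024-01-03T19:38:06.988+00:00`, which matches the string shown in the consumed record.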
## [](#clean-up)Clean up

To shut down and delete the containers along with all your cluster data:

```bash
docker compose down -v
```

---

# Page 21: Set Up MySQL CDC with Debezium and Redpanda

**URL**: https://docs.redpanda.com/redpanda-labs/docker-compose/cdc-mysql-json.md

---

# Set Up MySQL CDC with Debezium and Redpanda

---
title: Set Up MySQL CDC with Debezium and Redpanda
latest-operator-version: v26.1.2
latest-console-tag: v3.7.1
latest-connect-version: 4.87.0
latest-redpanda-tag: v26.1.3
docname: cdc-mysql-json
page-component-name: redpanda-labs
page-version: master
page-component-version: master
page-component-title: Labs
page-relative-src-path: cdc-mysql-json.adoc
page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/docker-compose/pages/cdc-mysql-json.adoc
description: Use Debezium to capture the changes made to a MySQL database in real time and stream them to Redpanda.
page-git-created-date: "2025-05-06"
page-git-modified-date: "2025-05-06"
---

This example demonstrates how to use Debezium to capture the changes made to MySQL in real time and stream them to Redpanda.

This ready-to-run Docker Compose setup contains the following containers:

- `mysql` container with the `pandashop` database, containing a single table, `orders`
- `debezium` container capturing changes made to the `orders` table in real time
- `redpanda` container to ingest change data streams produced by `debezium`

For more information about the `pandashop` database schema, see the `/data/mysql_bootstrap.sql` file.

![Example architecture](../_images/mysql-architecture.png)

## [](#prerequisites)Prerequisites

You must have [Docker and Docker Compose](https://docs.docker.com/compose/install/) installed on your host machine.

This lab is intended for Linux and macOS users. If you are using Windows, you must use the Windows Subsystem for Linux (WSL) to run the commands in this lab.

## [](#run-the-lab)Run the lab

1. Clone this repository:

    ```bash
    git clone https://github.com/redpanda-data/redpanda-labs.git
    ```

2. Change into the `docker-compose/cdc/mysql-json/` directory:

    ```bash
    cd redpanda-labs/docker-compose/cdc/mysql-json
    ```

3. Set the `REDPANDA_VERSION` environment variable to the version of Redpanda that you want to run. For all available versions, see the [GitHub releases](https://github.com/redpanda-data/redpanda/releases).

    For example:

    ```bash
    export REDPANDA_VERSION=v26.1.3
    ```

4. Run the following in the directory where you saved the Docker Compose file:

    ```bash
    docker compose up -d
    ```

    When the `mysql` container starts, the `/data/mysql_bootstrap.sql` file creates the `pandashop` database and the `orders` table, and then seeds the `orders` table with a few records.

5. Log in to MySQL:

    ```bash
    docker compose exec mysql mysql -u mysqluser -p
    ```

    Provide `mysqlpw` as the password when prompted.

6. Check the content inside the `orders` table:

    ```sql
    use pandashop;
    show tables;
    select * from orders;
    ```

    This is your source table.

7. Exit MySQL:

    ```bash
    exit
    ```

8. While Debezium is up and running, create a source connector configuration to extract change data feeds from MySQL.
    ```bash
    docker compose exec debezium curl -i -X POST -H "Accept:application/json" -H "Content-Type:application/json" localhost:8083/connectors/ -d '
    {
      "name": "mysql-connector",
      "config": {
        "connector.class": "io.debezium.connector.mysql.MySqlConnector",
        "tasks.max": "1",
        "database.hostname": "mysql",
        "database.port": "3306",
        "database.user": "debezium",
        "database.password": "dbz",
        "database.server.id": "184054",
        "topic.prefix": "dbz",
        "database.include.list": "pandashop",
        "schema.history.internal.kafka.bootstrap.servers": "redpanda:9092",
        "schema.history.internal.kafka.topic": "schemahistory.pandashop"
      }
    }'
    ```

    You should see the following in the output:

    HTTP/1.1 201 Created
    Date: Mon, 12 Feb 2024 16:37:09 GMT
    Location: http://localhost:8083/connectors/mysql-connector
    Content-Type: application/json
    Content-Length: 489
    Server: Jetty(9.4.51.v20230217)

    The `database.*` configurations specify the connectivity details for the `mysql` container. The `schema.history.internal.kafka.bootstrap.servers` parameter points to the `redpanda` broker that the connector uses to write and recover DDL statements in the database schema history topic.

9. Wait a minute or two until the connector is deployed inside Debezium and creates the initial snapshot of change log topics in Redpanda.

10. Check the list of change log topics in `redpanda` by running:

    ```bash
    docker compose exec redpanda rpk topic list
    ```

    The output should contain two topics with the prefix `dbz.*` specified in the connector configuration. The topic `dbz.pandashop.orders` holds the initial snapshot of change log events streamed from the `orders` table.

    NAME PARTITIONS REPLICAS
    connect-status 5 1
    connect\_configs 1 1
    connect\_offsets 25 1
    dbz 1 1
    dbz.pandashop.orders 1 1
    schemahistory.pandashop 1 1

11. Monitor for change events by consuming the `dbz.pandashop.orders` topic:

    ```bash
    docker compose exec redpanda rpk topic consume dbz.pandashop.orders
    ```

12. While the consumer is running, open another terminal to insert a record into the `orders` table.

    ```bash
    export REDPANDA_VERSION=v26.1.3
    docker compose exec mysql mysql -u mysqluser -p
    ```

    Provide `mysqlpw` as the password when prompted.

13. Insert the following record:

    ```sql
    use pandashop;
    INSERT INTO orders (customer_id, total) values (5, 500);
    ```

    This triggers a change event in Debezium, which immediately publishes it to the `dbz.pandashop.orders` Redpanda topic, causing the consumer to display a new event in the console. That proves the end-to-end functionality of your CDC pipeline.

## [](#clean-up)Clean up

To shut down and delete the containers along with all your cluster data:

```bash
docker compose down -v
```

## [](#next-steps)Next steps

Now that you have change log events ingested into Redpanda, you can process them to enable use cases such as:

- Database replication
- Stream processing applications
- Streaming ETL pipelines
- Update caches
- Event-driven microservices

---

# Page 22: Set Up Postgres CDC with Debezium and Redpanda

**URL**: https://docs.redpanda.com/redpanda-labs/docker-compose/cdc-postgres-json.md

---

# Set Up Postgres CDC with Debezium and Redpanda

---
title: Set Up Postgres CDC with Debezium and Redpanda
latest-operator-version: v26.1.2
latest-console-tag: v3.7.1
latest-connect-version: 4.87.0
latest-redpanda-tag: v26.1.3
docname: cdc-postgres-json
page-component-name: redpanda-labs
page-version: master
page-component-version: master
page-component-title: Labs
page-relative-src-path: cdc-postgres-json.adoc
page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/docker-compose/pages/cdc-postgres-json.adoc
description: Use Debezium to capture the changes made to a Postgres database in real time and stream them to Redpanda.
page-git-created-date: "2025-05-06"
page-git-modified-date: "2025-05-06"
---

This example demonstrates using Debezium to capture the changes made to Postgres in real time and stream them to Redpanda.

This ready-to-run `docker-compose` setup contains the following containers:

- `postgres` container with the `pandashop` database, containing a single table, `orders`
- `debezium` container capturing changes made to the `orders` table in real time
- `redpanda` container to ingest change data streams produced by `debezium`

For more information about the `pandashop` schema, see the `/data/postgres_bootstrap.sql` file.

![Example architecture](../_images/postgres-architecture.png)

## [](#prerequisites)Prerequisites

You must have [Docker and Docker Compose](https://docs.docker.com/compose/install/) installed on your host machine.

This lab is intended for Linux and macOS users. If you are using Windows, you must use the Windows Subsystem for Linux (WSL) to run the commands in this lab.

## [](#run-the-lab)Run the lab

1. Clone this repository:

    ```bash
    git clone https://github.com/redpanda-data/redpanda-labs.git
    ```

2. Change into the `docker-compose/cdc/postgres-json/` directory:

    ```bash
    cd redpanda-labs/docker-compose/cdc/postgres-json
    ```

3. Set the `REDPANDA_VERSION` environment variable to the version of Redpanda that you want to run. For all available versions, see the [GitHub releases](https://github.com/redpanda-data/redpanda/releases).

    For example:

    ```bash
    export REDPANDA_VERSION=v26.1.3
    ```

4. Run the following in the directory where you saved the Docker Compose file:

    ```bash
    docker compose up -d
    ```

    When the `postgres` container starts, the `/data/postgres_bootstrap.sql` file creates the `pandashop` database and the `orders` table, and then seeds the `orders` table with a few records.

5. Log in to Postgres:

    ```bash
    docker compose exec postgres psql -U postgresuser -d pandashop
    ```

6. Check the content inside the `orders` table:

    ```sql
    select * from orders;
    ```

    This is the source table.

7. While Debezium is up and running, create a source connector configuration to extract change data feeds from Postgres:

    ```bash
    docker compose exec debezium curl -H 'Content-Type: application/json' debezium:8083/connectors --data '
    {
      "name": "postgres-connector",
      "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "plugin.name": "pgoutput",
        "database.hostname": "postgres",
        "database.port": "5432",
        "database.user": "postgresuser",
        "database.password": "postgrespw",
        "database.dbname" : "pandashop",
        "database.server.name": "postgres",
        "table.include.list": "public.orders",
        "topic.prefix" : "dbz"
      }
    }'
    ```

    Notice the `database.*` configurations, which specify the connectivity details for the `postgres` container. Wait a minute or two until the connector is deployed inside Debezium and creates the initial snapshot of change log topics in Redpanda.

8. Check the list of change log topics in Redpanda:

    ```bash
    docker compose exec redpanda rpk topic list
    ```

    The output should contain a topic with the `dbz` prefix specified in the connector configuration. The topic `dbz.public.orders` holds the initial snapshot of change log events streamed from the `orders` table.

    NAME PARTITIONS REPLICAS
    connect-status 5 1
    connect\_configs 1 1
    connect\_offsets 25 1
    dbz.public.orders 1 1

9. Monitor for change events by consuming the `dbz.public.orders` topic:

    ```bash
    docker compose exec redpanda rpk topic consume dbz.public.orders
    ```

10. While the consumer is running, open another terminal to insert a record into the `orders` table:

    ```bash
    export REDPANDA_VERSION=v26.1.3
    docker compose exec postgres psql -U postgresuser -d pandashop
    ```

11. Insert the following record:

    ```sql
    INSERT INTO orders (customer_id, total) values (5, 500);
    ```

    This triggers a change event in Debezium, which immediately publishes it to the `dbz.public.orders` Redpanda topic, causing the consumer to display a new event in the console. That proves the end-to-end functionality of your CDC pipeline.

## [](#clean-up)Clean up

To shut down and delete the containers along with all your cluster data:

```bash
docker compose down -v
```

## [](#next-steps)Next steps

Now that you have change log events ingested into Redpanda, you can process them to enable use cases such as:

- Database replication
- Stream processing applications
- Streaming ETL pipelines
- Update caches
- Event-driven microservices

---

# Page 23: Disaster Recovery with Envoy and Shadowing

**URL**: https://docs.redpanda.com/redpanda-labs/docker-compose/envoy-shadowing.md

---

# Disaster Recovery with Envoy and Shadowing

---
title: Disaster Recovery with Envoy and Shadowing
latest-operator-version: v26.1.2
latest-console-tag: v3.7.1
latest-connect-version: 4.87.0
latest-redpanda-tag: v26.1.3
docname: envoy-shadowing
page-component-name: redpanda-labs
page-version: master
page-component-version: master
page-component-title: Labs
page-relative-src-path: envoy-shadowing.adoc
page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/docker-compose/pages/envoy-shadowing.adoc
description: Combine Redpanda Shadowing for data replication with Envoy proxy for transparent client routing during disaster recovery.
page-topic-type: lab
personas: platform_operator, streaming_developer
learning-objective-1: Set up Shadowing for offset-preserving data replication
learning-objective-2: Configure Envoy for automatic client routing during failover
learning-objective-3: Execute a complete disaster recovery failover
page-git-created-date: "2026-01-20"
page-git-modified-date: "2026-01-20"
---

This lab demonstrates a disaster recovery setup that combines [Redpanda Shadowing](../../../current/manage/disaster-recovery/shadowing/) with [Envoy proxy](https://www.envoyproxy.io/).

- **Shadowing** provides offset-preserving, byte-for-byte data replication between clusters
- **Envoy** provides transparent client routing without requiring client reconfiguration

[Envoy](https://www.envoyproxy.io/) is a high-performance proxy that can route traffic intelligently based on backend health. In this setup, Envoy routes Kafka clients to the active cluster and automatically fails over to the shadow cluster when the source becomes unavailable. This eliminates the need to reconfigure clients during disaster recovery.

In this lab, you will:

- Set up Shadowing for offset-preserving data replication
- Configure Envoy for automatic client routing during failover
- Execute a complete disaster recovery failover

## [](#prerequisites)Prerequisites

You need [Docker and Docker Compose](https://docs.docker.com/compose/install/).

## [](#run-the-lab)Run the lab

1. Clone this repository:

    ```bash
    git clone https://github.com/redpanda-data/redpanda-labs.git
    cd redpanda-labs/docker-compose/envoy-shadowing
    ```

2. Start the environment:

    ```bash
    docker compose up -d --wait
    ```

3. Verify both clusters are healthy:

    ```bash
    docker exec redpanda-source rpk cluster health
    docker exec redpanda-shadow rpk cluster health
    ```

4. Create a topic on the source cluster:

    ```bash
    docker exec redpanda-source rpk topic create demo-topic --partitions 3 --replicas 1
    ```

5. Create a shadow link to replicate data from source to shadow:

    ```bash
    docker exec redpanda-shadow rpk shadow create \
      --config-file /config/shadow-link.yaml \
      --no-confirm \
      -X admin.hosts=redpanda-shadow:9644
    ```

6. Verify the shadow link is active:

    ```bash
    docker exec redpanda-shadow rpk shadow status demo-shadow-link -X admin.hosts=redpanda-shadow:9644
    ```

7. Produce messages through Envoy (routes to source cluster):

    ```bash
    docker exec python-client python3 /scripts/test-producer.py
    ```

8. Verify data replicated to shadow (lag should be 0):

    ```bash
    docker exec redpanda-shadow rpk shadow status demo-shadow-link -X admin.hosts=redpanda-shadow:9644 | grep -A5 "demo-topic"
    ```

## [](#simulate-disaster-and-failover)Simulate disaster and failover

1. Stop the source cluster to simulate a disaster:

    ```bash
    docker stop redpanda-source
    ```

    Envoy detects the failure in 10-15 seconds and routes traffic to the shadow cluster.

2. Read replicated data from shadow through Envoy:

    ```bash
    docker exec python-client python3 /scripts/test-consumer.py
    ```

    Consumers can read from shadow topics immediately after Envoy fails over.

3. Execute shadow failover to enable writes:

    ```bash
    docker exec redpanda-shadow rpk shadow failover demo-shadow-link --all --no-confirm \
      -X admin.hosts=redpanda-shadow:9644
    ```

    Shadow topics are read-only until you run the failover command. This prevents split-brain scenarios where both clusters accept writes.

4. Produce new messages to the failed-over shadow cluster:

    ```bash
    docker exec python-client python3 /scripts/test-producer.py
    ```

## [](#clean-up)Clean up

Stop and remove the demo environment:

```bash
docker compose down -v
```

## [](#what-you-explored)What you explored

In this lab, you:

- Set up Shadowing between source and shadow clusters with offset-preserving replication
- Configured Envoy for automatic client routing based on cluster health
- Simulated a disaster by stopping the source cluster
- Verified consumers can read replicated data through Envoy immediately after failover
- Executed `rpk shadow failover` to enable writes on the shadow cluster
- Produced new messages to the failed-over cluster without client reconfiguration

The following table summarizes the roles of each component in this disaster recovery setup:

| Component | Role | Automatic? |
| --- | --- | --- |
| Shadowing | Data replication with preserved offsets | Yes |
| Envoy | Client routing to healthy cluster | Yes |
| rpk shadow failover | Enable writes on shadow topics | No (manual) |

## [](#suggested-reading)Suggested reading

- [Shadowing for Disaster Recovery](../../../current/manage/disaster-recovery/shadowing/)
- [Envoy Kafka Broker Filter](https://www.envoyproxy.io/docs/envoy/latest/configuration/listeners/network_filters/kafka_broker_filter)

---

# Page 24: Redpanda Iceberg Docker Compose Example

**URL**: https://docs.redpanda.com/redpanda-labs/docker-compose/iceberg.md

---

# Redpanda Iceberg Docker Compose Example

---
title: Redpanda Iceberg Docker Compose Example
latest-operator-version: v26.1.2
latest-console-tag: v3.7.1
latest-connect-version: 4.87.0
latest-redpanda-tag: v26.1.3
docname: iceberg
page-component-name: redpanda-labs
page-version: master
page-component-version: master
page-component-title: Labs
page-relative-src-path: iceberg.adoc
page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/docker-compose/pages/iceberg.adoc
description: Pair
Redpanda with MinIO for Tiered Storage and write data in the Iceberg format to enable seamless analytics workflows on data in Redpanda topics.
page-git-created-date: "2025-05-06"
page-git-modified-date: "2025-05-06"
---

This lab provides a Docker Compose environment to help you quickly get started with Redpanda and its integration with Apache Iceberg. It showcases how Redpanda, when paired with a Tiered Storage solution like MinIO, can write data in the Iceberg format, enabling seamless analytics workflows. The lab also includes a Spark environment configured for querying the Iceberg tables using SQL within a Jupyter Notebook interface.

In this setup, you will:

- Produce data to Redpanda topics that are Iceberg-enabled.
- Observe how Redpanda writes this data in Iceberg format to MinIO as the Tiered Storage backend.
- Use Spark to query the Iceberg tables, demonstrating a complete pipeline from data production to querying.

This environment is ideal for experimenting with Redpanda’s Iceberg and Tiered Storage capabilities, enabling you to test end-to-end workflows for analytics and data lake architectures.

## [](#prerequisites)Prerequisites

You must have the following installed on your machine:

- [Docker and Docker Compose](https://docs.docker.com/compose/install/)
- [`rpk`](https://docs.redpanda.com/current/get-started/rpk-install/)

This lab is intended for Linux and macOS users. If you are using Windows, you must use the Windows Subsystem for Linux (WSL) to run the commands in this lab.

## [](#run-the-lab)Run the lab

1. Clone this repository:

    ```bash
    git clone https://github.com/redpanda-data/redpanda-labs.git
    ```

2. Change into the `docker-compose/iceberg/` directory:

    ```bash
    cd redpanda-labs/docker-compose/iceberg
    ```

3. Set the `REDPANDA_VERSION` environment variable to at least version 24.3.1. For all available versions, see the [GitHub releases](https://github.com/redpanda-data/redpanda/releases).

    For example:

    ```bash
    export REDPANDA_VERSION=v26.1.3
    ```

4. Set the `REDPANDA_CONSOLE_VERSION` environment variable to the version of Redpanda Console that you want to run. For all available versions, see the [GitHub releases](https://github.com/redpanda-data/redpanda/releases).

    > 📝 **NOTE**
    >
    > You must use at least version v3.0.0 of Redpanda Console to deploy this lab.

    For example:

    ```bash
    export REDPANDA_CONSOLE_VERSION=v3.7.1
    ```

5. Start the Docker Compose environment, which includes Redpanda, MinIO, Spark, and Jupyter Notebook:

    ```bash
    docker compose build && docker compose up
    ```

    The build process may take a few minutes to complete, as it builds the Spark image with the necessary dependencies for Iceberg.

6. Create and switch to a new `rpk` profile that connects to your Redpanda broker:

    ```bash
    rpk profile create docker-compose-iceberg --set=admin_api.addresses=localhost:19644 --set=brokers=localhost:19092 --set=schema_registry.addresses=localhost:18081
    ```

7. Create two topics with Iceberg enabled:

    ```bash
    rpk topic create key_value --topic-config=redpanda.iceberg.mode=key_value
    rpk topic create value_schema_id_prefix --topic-config=redpanda.iceberg.mode=value_schema_id_prefix
    ```

8. Produce data to the `key_value` topic:

    ```bash
    echo "hello world" | rpk topic produce key_value --format='%k %v\n'
    ```

9. Open Redpanda Console at [http://localhost:8081/topics](http://localhost:8081/topics) to see that the topics exist in Redpanda.

10. Open MinIO at [http://localhost:9001/browser](http://localhost:9001/browser) to view your data stored in the S3-compatible object store.

    Login credentials:

    - Username: `minio`
    - Password: `minio123`

11. Open the Jupyter Notebook server at [http://localhost:8888](http://localhost:8888). The notebook guides you through querying Iceberg tables created from Redpanda topics. Complete the next two steps before running the code in the notebook.

12. Create a schema in the Schema Registry:

    ```bash
    rpk registry schema create value_schema_id_prefix-value --schema schema.avsc
    ```

13. Produce data to the `value_schema_id_prefix` topic:

    ```bash
    echo '{"user_id":2324,"event_type":"BUTTON_CLICK","ts":"2024-11-25T20:23:59.380Z"}\n{"user_id":3333,"event_type":"SCROLL","ts":"2024-11-25T20:24:14.774Z"}\n{"user_id":7272,"event_type":"BUTTON_CLICK","ts":"2024-11-25T20:24:34.552Z"}' | rpk topic produce value_schema_id_prefix --format='%v\n' --schema-id=topic
    ```

    When the data is committed, it should be available in Iceberg format and you can query the table `lab.redpanda.value_schema_id_prefix` in the [Jupyter Notebook](http://localhost:8888).

## [](#alternative-query-interfaces)Alternative query interfaces

While the notebook server is running, you can query Iceberg tables directly using Spark’s CLI tools instead of Jupyter Notebook:

Spark Shell

```bash
docker exec -it spark-iceberg spark-shell
```

Spark SQL

```bash
docker exec -it spark-iceberg spark-sql
```

PySpark

```bash
docker exec -it spark-iceberg pyspark
```

## [](#clean-up)Clean up

To shut down and delete the containers along with all your cluster data:

```bash
docker compose down -v
```

---

# Page 25: Stream Jira Issues to Redpanda for Real-Time Metrics

**URL**: https://docs.redpanda.com/redpanda-labs/docker-compose/jira-metrics-pipeline.md

---

# Stream Jira Issues to Redpanda for Real-Time Metrics

---
title: Stream Jira Issues to Redpanda for Real-Time Metrics
latest-operator-version: v26.1.2
latest-console-tag: v3.7.1
latest-connect-version: 4.87.0
latest-redpanda-tag: v26.1.3
docname: jira-metrics-pipeline
page-component-name: redpanda-labs
page-version: master
page-component-version: master
page-component-title: Labs
page-relative-src-path: jira-metrics-pipeline.adoc
page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/docker-compose/pages/jira-metrics-pipeline.adoc
description: Build a real-time Jira metrics pipeline using
Redpanda Connect and Redpanda to track development performance, SLA compliance, and team productivity.
page-git-created-date: "2025-11-21"
page-git-modified-date: "2025-11-21"
---

This lab demonstrates a production-ready pipeline that streams Jira issues to Redpanda topics in real time. The pipeline transforms raw Jira API responses into a normalized, consumer-friendly schema and routes issues to different topics, enabling use cases such as SLA monitoring and team performance analytics.

## [](#architecture)Architecture

Redpanda Connect periodically queries the Jira REST API for recently updated issues in your project. Each issue is transformed to extract key fields, flatten nested objects, and compute flags like `is_high_priority` and `is_completed`. Based on these flags, each issue is routed to one of the following Redpanda topics:

- `jira.issues.high-priority`: Issues with Highest/High priority
- `jira.issues.completed`: Issues marked as Done/Closed/Resolved
- `jira.issues.all`: All other issues

## [](#use-cases)Use cases

### [](#sla-monitoring-and-alerting)SLA monitoring and alerting

- Automatically detect stale high-priority issues.
- Send alerts when issues haven’t been updated in a certain number of days.
- Track response times and prevent SLA breaches.
- Route critical issues to on-call teams.

### [](#team-performance-analytics)Team performance analytics

- Calculate team velocity and throughput.
- Identify bottlenecks in the development process.
- Track individual contributor metrics.
- Generate sprint reports automatically.

### [](#compliance-and-audit-trail)Compliance and audit trail

- Maintain a complete history of issue changes.
- Keep an immutable event log in Redpanda.
- Audit who changed what and when.
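As an illustration of the SLA monitoring use case, the following Python sketch shows how a downstream consumer might flag stale high-priority issues from records that are already decoded into dictionaries. It is a sketch under stated assumptions, not part of the lab: the function name `find_stale_issues`, the three-day threshold, and the timestamp layout in `JIRA_TIME_FORMAT` are assumptions, and the code does not connect to Redpanda. The field names `issue_key`, `updated`, `is_high_priority`, and `is_completed` follow the normalized schema that this lab's pipeline produces.

```python
from datetime import datetime, timedelta, timezone

# Assumed Jira timestamp layout, for example "2025-01-02T09:30:00.000+0000".
JIRA_TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def find_stale_issues(issues, now, max_age_days=3):
    """Return keys of open high-priority issues not updated within max_age_days."""
    cutoff = now - timedelta(days=max_age_days)
    stale = []
    for issue in issues:
        # Skip anything that is not high priority or is already completed.
        if not issue.get("is_high_priority") or issue.get("is_completed"):
            continue
        updated = datetime.strptime(issue["updated"], JIRA_TIME_FORMAT)
        if updated < cutoff:
            stale.append(issue["issue_key"])
    return stale


if __name__ == "__main__":
    # Hypothetical sample records, shaped like the pipeline's output.
    sample = [
        {"issue_key": "DEMO-1", "is_high_priority": True, "is_completed": False,
         "updated": "2025-01-02T09:30:00.000+0000"},
        {"issue_key": "DEMO-2", "is_high_priority": True, "is_completed": False,
         "updated": "2025-01-09T09:30:00.000+0000"},
        {"issue_key": "DEMO-3", "is_high_priority": False, "is_completed": False,
         "updated": "2024-12-01T09:30:00.000+0000"},
    ]
    now = datetime(2025, 1, 10, tzinfo=timezone.utc)
    print(find_stale_issues(sample, now))
```

Passing `now` as a parameter rather than reading the clock inside the function keeps the check deterministic, which makes it easier to test against recorded messages.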
## [](#prerequisites)Prerequisites

You must have the following installed on your host machine:

- [Docker and Docker Compose](https://docs.docker.com/compose/install/)
- [rpk](https://docs.redpanda.com/current/get-started/rpk-install/) (Redpanda CLI), which is required to generate the Enterprise license

This lab also requires:

- **Jira instance** with API access
- **Jira API token** from [Atlassian](https://id.atlassian.com/manage-profile/security/api-tokens)

This lab is intended for Linux and macOS users. If you are using Windows, you must use the Windows Subsystem for Linux (WSL) to run the commands in this lab.

## [](#run-the-lab)Run the lab

1. Clone the repository:

    ```bash
    git clone https://github.com/redpanda-data/redpanda-labs.git
    ```

2. Change into the `docker-compose/jira-metrics-pipeline/` directory:

    ```bash
    cd redpanda-labs/docker-compose/jira-metrics-pipeline
    ```

3. Copy the example environment file:

    ```bash
    cp .env.example .env
    ```

4. Edit the `.env` file and configure:

    ```bash
    # Versions (optional - defaults are provided)
    REDPANDA_VERSION=v26.1.3
    REDPANDA_CONSOLE_VERSION=v3.7.1
    REDPANDA_CONNECT_VERSION=4.70.0

    # Jira credentials (required)
    JIRA_BASE_URL=https://.atlassian.net
    JIRA_USERNAME=
    JIRA_API_TOKEN=
    JIRA_PROJECT=
    ```

5. Generate a Redpanda Enterprise trial license:

    ```bash
    rpk generate license \
      --name "" \
      --last-name "" \
      --email "" \
      --company ""
    ```

    This creates a 30-day trial license in `./redpanda.license`.

6. Export the license as a shell environment variable:

    ```bash
    export REDPANDA_LICENSE=$(cat ./redpanda.license)
    ```

    > 📝 **NOTE**
    >
    > The `REDPANDA_LICENSE` must be exported in your shell. It cannot be loaded from the `.env` file. The `--env-file` flag loads application variables (like `JIRA_*`), but the license must be exported separately.

7. Start the Docker containers:

    ```bash
    docker compose up -d
    ```

8. Create the Kafka topics:

    ```bash
    docker compose exec redpanda rpk topic create jira.issues.all -p 3
    docker compose exec redpanda rpk topic create jira.issues.high-priority -p 3
    docker compose exec redpanda rpk topic create jira.issues.completed -p 3
    ```

9. Open Redpanda Console at [localhost:8080](http://localhost:8080) to view your topics and messages.

10. Monitor the Jira issues streaming into Redpanda:

    ```bash
    # Watch all issues
    docker compose exec redpanda rpk topic consume jira.issues.all --format json

    # Watch high-priority issues
    docker compose exec redpanda rpk topic consume jira.issues.high-priority --format json

    # Watch completed issues
    docker compose exec redpanda rpk topic consume jira.issues.completed --format json
    ```

## [](#how-it-works)How it works

The pipeline performs the following steps:

1. **Query Jira**: Every 30 seconds, Redpanda Connect queries Jira for recently updated issues in your project.

2. **Transform data**: Each Jira issue is transformed into a normalized schema:

    - Flattened fields: status, priority, issue type (extracted from nested Jira objects)
    - Null-safe people fields: assignee and reporter (defaults to "Unassigned"/"Unknown")
    - Raw timestamps: created, updated, resolved (preserved in original Jira format)
    - Extracted component names from component objects
    - Computed boolean flags: `is_completed`, `is_high_priority`
    - Generated issue URL: `{JIRA_BASE_URL}/browse/{key}`
    - Pipeline processing timestamp

3. **Route messages**: Each issue is routed to exactly one topic based on the computed flags:

    - High priority (Highest/High) → `jira.issues.high-priority`
    - Completed (Done/Closed/Resolved) → `jira.issues.completed`
    - All others → `jira.issues.all`

4.
**Consume and analyze**: Multiple downstream applications can consume from these topics independently: - Metrics dashboards (Grafana, Kibana) - Alert systems (Slack, PagerDuty) - Data warehouses (Snowflake, BigQuery) - Workflow automation ## [](#pipeline-configuration)Pipeline configuration The complete Redpanda Connect pipeline configuration: ```yaml # JIRA Metrics Pipeline # # This pipeline queries JIRA issues and streams them to Redpanda topics # for real-time metrics, alerting, and analytics. # # Environment variables required: # - REDPANDA_LICENSE: Enterprise license (must be exported in shell) # - JIRA_BASE_URL: https://your-domain.atlassian.net # - JIRA_USERNAME: your-email@example.com # - JIRA_API_TOKEN: your-api-token # - JIRA_PROJECT: YOUR_PROJECT_KEY # - REDPANDA_BROKERS: redpanda:9092 (set in docker-compose.yml) input: generate: mapping: | # Query JIRA for recent issues (last 7 days for testing) root.jql = "project = ${JIRA_PROJECT} AND updated >= -7d ORDER BY updated DESC" root.maxResults = 10 root.fields = [ "key", "summary", "status", "priority", "assignee", "reporter", "created", "updated", "resolutiondate", "issuetype", "labels", "components" ] # Query every 30 seconds interval: 30s pipeline: processors: # Execute JIRA query - jira: base_url: "${JIRA_BASE_URL}" username: "${JIRA_USERNAME}" api_token: "${JIRA_API_TOKEN}" max_results_per_page: 100 request_timeout: 30s max_retries: 10 # Transform and enrich each issue with metrics - mapping: | # Basic issue info root.issue_key = this.key root.summary = this.fields.summary root.status = this.fields.status.name root.priority = this.fields.priority.name root.issue_type = this.fields.issuetype.name root.url = "${JIRA_BASE_URL}/browse/" + this.key # People root.assignee = if this.fields.assignee != null { this.fields.assignee.displayName } else { "Unassigned" } root.reporter = if this.fields.reporter != null { this.fields.reporter.displayName } else { "Unknown" } # Dates (keep original format for 
downstream processing) root.created = this.fields.created root.updated = this.fields.updated root.resolved = this.fields.resolutiondate # Labels and components root.labels = this.fields.labels root.components = this.fields.components.map_each(c -> c.name) # Flags for routing root.is_completed = this.fields.status.name.lowercase().contains("done") || this.fields.status.name.lowercase().contains("closed") || this.fields.status.name.lowercase().contains("resolved") root.is_high_priority = ["Highest", "High"].contains(this.fields.priority.name) # Note: Timestamp-based metrics (age, staleness, lead time) can be calculated # by downstream consumers using the raw `created`, `updated`, and `resolved` fields. # Add pipeline processing timestamp root.pipeline_timestamp = now() # Route to primary topic based on issue properties - mapping: | # Route based on priority and completion status meta kafka_topic = if this.is_high_priority { "jira.issues.high-priority" } else if this.is_completed { "jira.issues.completed" } else { "jira.issues.all" } output: kafka: addresses: ["${REDPANDA_BROKERS}"] topic: '${! meta("kafka_topic") }' max_in_flight: 1 batching: count: 100 period: 1s compression: snappy ``` ## [](#customize-the-pipeline)Customize the pipeline ### [](#adjust-polling-frequency)Adjust polling frequency Edit `connect-configs/jira-pipeline.yaml`: ```yaml input: generate: interval: 60s # Change from 30s to 1m, 5m, etc. ``` ### [](#modify-jql-query)Modify JQL query Target different issues by changing the JQL query: ```yaml root.jql = "project = YOUR_PROJECT AND status != Closed AND updated >= -1h" ``` ### [](#add-custom-fields)Add custom fields Include Jira custom fields in your pipeline: ```yaml root.fields = [ "key", "summary", "customfield_10001", # Add your custom field ID # ... 
other fields ] ``` ### [](#add-more-routing-rules)Add more routing rules Route issues to additional topics based on labels, components, or other criteria: ```yaml - mapping: | meta kafka_topic = if this.labels.contains("security") { "jira.issues.security" } else { deleted() } ``` ## [](#view-logs-and-metrics)View logs and metrics ### [](#redpanda-connect-logs)Redpanda Connect logs ```bash docker compose logs -f connect ``` ### [](#redpanda-logs)Redpanda logs ```bash docker compose logs -f redpanda ``` ### [](#redpanda-connect-metrics)Redpanda Connect metrics ```bash curl http://localhost:4195/metrics ``` ## [](#troubleshooting)Troubleshooting ### [](#license-errors)License errors If you see license-related errors in the Connect logs: 1. Verify the license is exported: ```bash echo $REDPANDA_LICENSE ``` 2. If the environment variable is empty, regenerate and export the license: ```bash rpk generate license --name "Your Name" \ --last-name "Last" \ --email "you@example.com" \ --company "Company" export REDPANDA_LICENSE=$(cat ./redpanda.license) ``` 3. Restart the containers: ```bash docker compose restart connect ``` ### [](#jira-authentication-errors)Jira authentication errors If you receive 401 Unauthorized errors: 1. Verify your API token is correct. 2. Ensure you’re using your email address as the username. 3. Check that your Jira instance URL is correct (include `https://`). 4. Test your credentials with curl: ```bash curl -u "${JIRA_USERNAME}:${JIRA_API_TOKEN}" \ "${JIRA_BASE_URL}/rest/api/3/myself" ``` ### [](#no-issues-appearing)No issues appearing If no issues are being published: 1. Check the JQL query returns results in Jira directly. 2. Verify the `JIRA_PROJECT` environment variable matches your project key. 3. Check Connect logs for errors: ```bash docker compose logs connect ``` 4. 
Adjust the query timeframe: ```yaml root.jql = "project = YOUR_PROJECT AND updated >= -24h" # Last 24 hours ``` ### [](#rate-limiting)Rate limiting If you hit Jira rate limits (HTTP 429): 1. Increase the polling interval in `connect-configs/jira-pipeline.yaml`: ```yaml interval: 60s # Reduce frequency to 60 seconds ``` 2. Increase retry settings: ```yaml - jira: max_retries: 20 request_timeout: 60s ``` 3. Use more specific JQL queries to reduce result sizes. ## [](#clean-up)Clean up To shut down and delete the containers along with all cluster data: ```bash docker compose down -v ``` ## [](#next-steps)Next steps After you have Jira issues streaming into Redpanda, you can extend this pipeline with Redpanda Connect outputs: ### [](#send-alerts-and-notifications)Send alerts and notifications - **Slack**: Post issue updates to channels using the [`slack_post` output](../../../redpanda-connect/components/outputs/slack_post/) - **Discord**: Send notifications with the [`discord` output](../../../redpanda-connect/components/outputs/discord/) - **HTTP/Webhooks**: Trigger PagerDuty, Opsgenie, or custom webhooks using the [`http_client` output](../../../redpanda-connect/components/outputs/http_client/) ### [](#stream-to-data-warehouses)Stream to data warehouses - **Snowflake**: Load issues into Snowflake with the [`snowflake_streaming` output](../../../redpanda-connect/components/outputs/snowflake_streaming/) - **BigQuery**: Stream to Google BigQuery using the [`gcp_bigquery` output](../../../redpanda-connect/components/outputs/gcp_bigquery/) - **PostgreSQL/MySQL**: Store in relational databases with [`sql_insert`](../../../redpanda-connect/components/outputs/sql_insert/) or [`sql_raw` outputs](../../../redpanda-connect/components/outputs/sql_raw/) ### [](#calculate-metrics)Calculate metrics Use the raw timestamp fields (`created`, `updated`, `resolved`) to calculate: - **Lead time**: Average time from creation to completion - **Cycle time**: Average time from "In Progress" 
to "Done" - **Throughput**: Issues completed per time period - **Aging**: Distribution of issue ages - **SLA compliance**: Percentage of issues resolved within target timeframes ## [](#suggested-reading)Suggested reading - [Jira Processor Reference](../../../redpanda-connect/components/processors/jira/) - [Jira REST API Documentation](https://developer.atlassian.com/cloud/jira/platform/rest/v3/intro/) - [JQL Query Guide](https://www.atlassian.com/software/jira/guides/jql) - [Redpanda Quickstart](../../../current/get-started/quick-start/) --- # Page 26: Enable Unified Identity with Azure Entra ID for Redpanda and Redpanda Console **URL**: https://docs.redpanda.com/redpanda-labs/docker-compose/oidc.md --- # Enable Unified Identity with Azure Entra ID for Redpanda and Redpanda Console --- title: Enable Unified Identity with Azure Entra ID for Redpanda and Redpanda Console latest-operator-version: v26.1.2 latest-console-tag: v3.7.1 latest-connect-version: 4.87.0 latest-redpanda-tag: v26.1.3 docname: oidc page-component-name: redpanda-labs page-version: master page-component-version: master page-component-title: Labs page-relative-src-path: oidc.adoc page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/docker-compose/pages/oidc.adoc description: Integrate Azure Entra ID with Redpanda and Redpanda Console for unified identity using OpenID Connect (OIDC). page-git-created-date: "2025-05-23" page-git-modified-date: "2025-05-23" --- This lab walks you through integrating Azure Entra ID (formerly Azure AD) with Redpanda and Redpanda Console for unified identity using OpenID Connect (OIDC). This integration allows you to use Azure Entra ID for authentication in Redpanda and Redpanda Console. Redpanda does not support mapping Entra ID groups or other token claims to Redpanda roles. As a result, authorization decisions (who can do what) are not made by Entra ID but must be configured directly in Redpanda using ACLs or role bindings. 
Azure Entra ID authenticates the user, but Redpanda enforces access. ## [](#prerequisites)Prerequisites - Access to [Azure Portal](https://portal.azure.com) - [Azure Entra ID tenant](https://learn.microsoft.com/en-us/entra/identity-platform/quickstart-create-new-tenant) - [Install `rpk`](../../../current/get-started/rpk-install/) on your host machine. - [Docker and Docker Compose](https://docs.docker.com/compose/install/) installed on your host machine. This lab is intended for Linux and macOS users. If you are using Windows, you must use the Windows Subsystem for Linux (WSL) to run the commands in this lab. ## [](#set-up-azure-entra-id)Set up Azure Entra ID This section guides you through creating an Azure Entra ID application registration and configuring it for use with Redpanda. You’ll set up the necessary permissions, scopes, and claims to ensure proper authentication and authorization. ### [](#create-an-application-registration)Create an application registration In this section, you’ll register a new application in Entra ID that Redpanda will use to authenticate. 1. Sign in to [Azure Portal](https://portal.azure.com). 2. Navigate to **Microsoft Entra ID** (formerly **Azure Active Directory**) > **App registrations** > **New registration**. 3. Configure: - **Name**: `RedpandaUnifiedIdentity` - **Supported account types**: Choose **Accounts in this organizational directory only** - **Redirect URI**: `http://localhost:8080/auth/callbacks/oidc` 4. Click **Register**. See also: [Register an application](https://learn.microsoft.com/en-us/azure/active-directory/develop/quickstart-register-app). ### [](#set-access-token-version-to-v2)Set access token version to v2 Redpanda and Redpanda Console require that your identity provider (IdP) issues **JWT-encoded access tokens** that can be cryptographically verified. To ensure this, the IdP must support OpenID Connect (OIDC), not just OAuth 2.0.
By default, OAuth 2.0 allows for opaque tokens, which Redpanda does not support. OIDC extends OAuth 2.0 by specifying how tokens should be structured, signed, and validated using a discovery document and public keys. In Azure Entra ID, you must configure the application to issue v2.0 **JWT access tokens**: 1. In the App Registration page, go to **Manage** > **Manifest**. 2. Set `accessTokenAcceptedVersion` to `2`: ```json "accessTokenAcceptedVersion": 2 ``` 3. Click **Save**. > 💡 **TIP** > > Azure Entra ID does not include claims like `email` in access tokens by default. You can add them by going to **Token Configuration** > **\+ Add optional claim**, then selecting **email** for **Access token** type. ### [](#define-a-custom-scope)Define a custom scope Scopes are required to explicitly request JWT access tokens. 1. Go to **Expose an API** > **Set Application ID URI** (accept the default if prompted). 2. Click **Add a scope** and enter: - **Scope name**: `entraid.v2-access-tokens` - **Who can consent**: Admins and users - **Display name**: `Entra ID v2 access tokens` - **Description**: `Allows Redpanda to request v2 access tokens` 3. Click **Add scope**. See also: [Expose an API](https://learn.microsoft.com/en-us/azure/active-directory/develop/quickstart-configure-app-expose-web-apis). ### [](#create-a-client-secret)Create a client secret This secret is used by Redpanda Console to authenticate with Entra ID. 1. Go to **Certificates & secrets** > **New client secret**. 2. Add a description and choose an expiration (12-24 months recommended). 3. Save the generated secret securely. ### [](#gather-configuration-values)Gather configuration values Collect these values to configure Redpanda and Redpanda Console later. 
| Key | Location | Example | | --- | --- | --- | | Client ID | App Registration Overview | ecc74380-7c64-4283-9fa1-03a37b9054b7 | | Tenant ID | App Registration Overview | 9a95fd9e-005d-487a-9a01-d08c1eab2757 | | Client Secret | Certificates & Secrets | GENERATED_SECRET_VALUE | | Issuer URL | Manual: use Tenant ID | `https://login.microsoftonline.com/<tenant-id>/v2.0` | ### [](#verify-token-configuration)Verify the token configuration This step confirms that you receive a token with the expected claims. 1. Run a test OAuth client credentials flow, replacing `<tenant-id>`, `<client-id>`, and `<client-secret>` with the values you gathered: ```bash curl -X POST "https://login.microsoftonline.com/<tenant-id>/oauth2/v2.0/token" \ -d "client_id=<client-id>" \ -d "scope=api://<client-id>/.default" \ -d "client_secret=<client-secret>" \ -d "grant_type=client_credentials" ``` 2. Decode the returned token at [jwt.io](https://jwt.io) and verify: - `iss`: matches the issuer URL - `aud`: matches your client ID - `ver`: is `2.0` - `sub`: is present 3. Save the `sub` claim value for later use. ### [](#application-registration-final-checklist)Application registration final checklist - `accessTokenAcceptedVersion` set to `2` - Custom scope `entraid.v2-access-tokens` created ## [](#run-the-lab)Run the lab 1. Clone this repository: ```bash git clone https://github.com/redpanda-data/redpanda-labs.git ``` 2. Change into the `docker-compose/oidc/` directory: ```bash cd redpanda-labs/docker-compose/oidc ``` 3. Set the `REDPANDA_VERSION` environment variable to a supported version of Redpanda. For all available versions, see the [GitHub releases](https://github.com/redpanda-data/redpanda/releases). For example: ```bash export REDPANDA_VERSION=v26.1.3 ``` 4. Set the `REDPANDA_CONSOLE_VERSION` environment variable to the version of Redpanda Console that you want to run. For all available versions, see the [GitHub releases](https://github.com/redpanda-data/console/releases). > 📝 **NOTE** > > You must use Redpanda Console version v3.1.0 or later to deploy this lab. For example: ```bash export REDPANDA_CONSOLE_VERSION=v3.7.1 ``` 5.
Create a `.env` file and add the following environment variables: `.env` ```bash TENANT_ID= OIDC_CLIENT_ID= OIDC_CLIENT_SECRET= JWT_SIGNING_KEY= ``` Redpanda Console expects configuration to be passed as environment variables, where each YAML key is converted to uppercase and underscores are added for each level of nesting. For example, `authentication.oidc.clientSecret` becomes `AUTHENTICATION_OIDC_CLIENTSECRET`. To simplify this, user-friendly environment variables, such as `OIDC_CLIENT_SECRET`, are set here and then mapped to the required format in the `docker-compose.yml` file. This pattern helps separate sensitive or dynamic values from implementation-specific keys. See [Configure Redpanda Console](../../../current/console/config/configure-console/#environment-variables) for details. 6. Start Redpanda and Redpanda Console in Docker by running the following command: ```bash docker compose up --detach --wait ``` 7. Set up your `rpk` profile on your local machine: ```bash rpk profile create oidc-lab --from-profile profile.yaml ``` 8. Update the Redpanda cluster configuration. Redpanda does not support environment variables for cluster-level properties. To configure OIDC authentication, use the following `rpk` CLI commands: ```bash # Define at least one superuser. This is required for managing ACLs and cluster config. # Add your OIDC user’s `sub` claim here to grant superuser privileges. rpk cluster config set superusers '["superuser", "<sub>"]' # Enable SASL-based authentication on the Kafka API. rpk cluster config set enable_sasl true # Require authentication on the Admin API. Without this, the Admin API remains open. rpk cluster config set admin_api_require_auth true # Enable supported SASL mechanisms for the Kafka API. # SCRAM is for password-based users. OAUTHBEARER is for OIDC tokens. rpk cluster config set sasl_mechanisms '["SCRAM", "OAUTHBEARER"]' # Allow both BASIC (SCRAM) and OIDC authentication for HTTP APIs.
rpk cluster config set http_authentication '["BASIC", "OIDC"]' # Set the expected audience (aud claim) for validating OIDC tokens. rpk cluster config set oidc_token_audience "<client-id>" # Configure the OIDC discovery URL (typically ends in /.well-known/openid-configuration). rpk cluster config set oidc_discovery_url "https://login.microsoftonline.com/<tenant-id>/v2.0/.well-known/openid-configuration" # Map the OIDC user identity from the token’s `sub` claim (default). rpk cluster config set oidc_principal_mapping "$.sub" # Allow for some clock drift between Redpanda and the IdP. rpk cluster config set oidc_clock_skew_tolerance 600 # Optional: Enable automatic topic creation on first access. rpk cluster config set auto_create_topics_enabled true ``` Replace all placeholder values with the values you gathered from the Azure Entra ID app registration: - `<tenant-id>`: Tenant ID - `<client-id>`: Client ID - `<sub>`: Sub value from the token 9. Enable authentication on HTTP Proxy and Schema Registry. To ensure secure access to the HTTP-based APIs (HTTP Proxy and Schema Registry), you must enable authentication on those listeners by setting `authentication_method: http_basic`. 1. Open a terminal and run: ```bash docker exec -it redpanda-0 bash ``` 2. Open the redpanda.yaml config file: ```bash nano /etc/redpanda/redpanda.yaml ``` 3. Add `authentication_method: http_basic` to the HTTP Proxy (`pandaproxy`) and Schema Registry listeners to enable authentication for those API endpoints.
For example: ```yaml pandaproxy: pandaproxy_api: - address: 0.0.0.0 port: 8082 name: internal authentication_method: http_basic - address: 0.0.0.0 port: 18082 name: external authentication_method: http_basic advertised_pandaproxy_api: - address: redpanda-0 port: 8082 name: internal authentication_method: http_basic - address: localhost port: 18082 name: external authentication_method: http_basic schema_registry: schema_registry_api: - address: 0.0.0.0 port: 8081 name: internal authentication_method: http_basic - address: 0.0.0.0 port: 18081 name: external authentication_method: http_basic ``` 4. Save the file and exit the editor. 5. Exit the container: ```bash exit ``` 6. Restart the Redpanda broker: ```bash docker restart redpanda-0 ``` 7. Repeat this process for each broker (`redpanda-1` and `redpanda-2`). 10. Open Redpanda Console in your browser at [http://localhost:8080/login](http://localhost:8080/login). 11. Click **Log in with OIDC** and enter the login details of your OIDC user. You should be redirected to the Redpanda Console dashboard. You are now logged in to Redpanda Console using OIDC authentication with Azure Entra ID. You can now use Redpanda Console to manage your Redpanda cluster and monitor its performance. ## [](#connect-a-kafka-client-with-oidc)Connect a Kafka client with OIDC You can test OIDC client authentication by connecting a Kafka client, such as KafkaJS, to Redpanda. KafkaJS supports OIDC through the `oauthBearerProvider` configuration. This function must return a valid access token that the client uses to authenticate with Redpanda through the SASL/OAUTHBEARER mechanism. The provided `client.js` script demonstrates this by: - Authenticating to Azure Entra ID using the **client credentials flow**. - Retrieving a JWT access token. - Using that token to produce a message to Redpanda. When using the **client credentials flow**, Azure Entra ID issues tokens for **applications**, not users. 
These tokens do not include user-specific claims like `email` or `preferred_username`. Instead, the token includes the `sub` claim (subject), which uniquely identifies the application. 1. Find the sub value by running the client script: ```bash npm install && node client.js get-sub-value ``` Look for the `sub` field, which looks like a UUID: ```json "sub": "ae775e64-5853-42cb-b62a-e092c7c5288b" ``` 2. Use `rpk` to create an ACL that grants the application’s `sub` value access to `test-topic`, replacing `<sub>` with the value from the previous step: ```bash rpk security acl create \ --allow-principal User:<sub> \ --operation write,read,describe,create \ --topic test-topic ``` 3. Run the client script again to start producing messages to `test-topic`: ```bash node client.js ``` Output: \[Kafka\] Connecting producer... \[OIDC\] Fetching new access token... \[Kafka\] Sending message... \[Kafka\] Message sent successfully. \[Kafka\] Producer disconnected. You should now see the message produced to `test-topic` in your Redpanda cluster. 4. Consume the message from `test-topic`: ```bash rpk topic consume test-topic --num 1 ``` Example output: ```json { "topic": "test-topic", "value": "Hello from OIDC client!", "timestamp": 1746623458222, "partition": 0, "offset": 0 } ``` ## [](#token-refresh)Token refresh The `client.js` script implements automatic token refreshing using `simple-oauth2`. When the token is about to expire, the script fetches a new one in the background so that the Kafka client stays authenticated. For long-running apps, this pattern avoids token expiry errors during reconnections. ## [](#clean-up)Clean up To shut down and delete the containers along with all your cluster data: ```bash docker compose down -v ``` ## [](#troubleshoot-oidc-login)Troubleshoot OIDC login If you encounter issues logging in to Redpanda Console, check the following in your Redpanda Console configuration: - Ensure the `issuerUrl` matches the issuer URL from the application registration.
- Verify the `clientId` and `clientSecret` match those from the application registration. - Check the `redirectUrl` matches the redirect URI set in the application registration. - Ensure the `additionalScopes` includes the custom scope you created in the application registration. Also check the following Redpanda cluster properties: - Verify the `oidc_principal_mapping` matches the claim you want to use for user mapping. - Check the `oidc_token_audience` matches the client ID from the application registration. ## [](#oidc-limitations)OIDC limitations - Redpanda requires JWT-formatted access tokens (not ID tokens) for Kafka API authentication using SASL/OAUTHBEARER. Access tokens issued by some IdPs, such as Google, are opaque and not supported. - The `rpk` CLI does not support OIDC login. - Redpanda requires OIDC principals to be set as superusers to access the Admin API. Granular authorization is not supported. - The `rpk` CLI does not support the SASL/OAUTHBEARER mechanism for deploying data transforms. Use SASL/SCRAM instead.
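
Because Redpanda accepts only JWT-formatted access tokens, it can help to confirm locally that a token is a JWT and carries the claims listed in the token verification step (`iss`, `aud`, `ver`, `sub`). The following Python sketch is one way to do that. The function names `decode_claims` and `check_claims` are hypothetical and are not part of Redpanda or `client.js`, and the sketch only decodes the payload: it does not verify the signature, so treat it as a debugging aid rather than as validation.

```python
import base64
import json


def decode_claims(token):
    """Return the payload claims of a JWT WITHOUT verifying its signature.

    Raises ValueError when the token is opaque (not a three-part JWT).
    """
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a JWT: expected three dot-separated segments")
    payload = parts[1]
    # base64url segments omit padding, so restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    try:
        claims = json.loads(base64.urlsafe_b64decode(payload))
    except ValueError as exc:
        raise ValueError("not a JWT: payload is not base64url-encoded JSON") from exc
    if not isinstance(claims, dict):
        raise ValueError("not a JWT: payload is not a JSON object")
    return claims


def check_claims(claims, issuer, audience):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if claims.get("iss") != issuer:
        problems.append("iss does not match the issuer URL")
    aud = claims.get("aud")
    # The aud claim may be a single string or a list of strings.
    audiences = aud if isinstance(aud, list) else [aud]
    if audience not in audiences:
        problems.append("aud does not match the client ID")
    if claims.get("ver") != "2.0":
        problems.append("ver is not 2.0")
    if not claims.get("sub"):
        problems.append("sub is missing")
    return problems
```

To use the sketch, pass the `access_token` value from the token endpoint response to `decode_claims`, then call `check_claims` with your issuer URL and client ID. A `ValueError` from `decode_claims` suggests the IdP returned an opaque token.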
## [](#suggested-reading)Suggested reading - [Authentication in Redpanda Console](../../../current/console/config/security/authentication/) - [KafkaJS OIDC example](https://kafka.js.org/docs/configuration#oauthbearer-example) - [Microsoft: ID Tokens](https://learn.microsoft.com/en-us/entra/identity-platform/id-token-claims#optional-claims) - [Microsoft: Expose an API](https://learn.microsoft.com/en-us/azure/active-directory/develop/quickstart-configure-app-expose-web-apis) - [Auth0: Client Credentials Flow](https://auth0.com/docs/get-started/authentication-and-authorization-flow/client-credentials-flow) --- # Page 27: Owl Shop Example Application in Docker **URL**: https://docs.redpanda.com/redpanda-labs/docker-compose/owl-shop.md --- # Owl Shop Example Application in Docker --- title: Owl Shop Example Application in Docker latest-operator-version: v26.1.2 latest-console-tag: v3.7.1 latest-connect-version: 4.87.0 latest-redpanda-tag: v26.1.3 docname: owl-shop page-component-name: redpanda-labs page-version: master page-component-version: master page-component-title: Labs page-relative-src-path: owl-shop.adoc page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/docker-compose/pages/owl-shop.adoc description: Manage and monitor applications in Redpanda Console using data from an example e-commerce application called owl shop. page-git-created-date: "2025-05-06" page-git-modified-date: "2025-05-06" --- This Docker Compose example starts a single Redpanda broker, Redpanda Console, and an example application called _owl shop_. Owl shop simulates a simple e-commerce shop that uses Redpanda as an asynchronous communication exchange. You can use the sample data to see how to manage and monitor applications in Redpanda Console. Owl shop creates topics, produces sample data to those topics, and consumes from those topics so that you can test Redpanda Console with some sample data.
## [](#prerequisites)Prerequisites You must have [Docker and Docker Compose](https://docs.docker.com/compose/install/) installed on your host machine. This lab is intended for Linux and macOS users. If you are using Windows, you must use the Windows Subsystem for Linux (WSL) to run the commands in this lab. ## [](#run-the-lab)Run the lab 1. [Download](../_attachments/owl-shop/docker-compose.yml) the following Docker Compose file to your local file system. > 📝 **NOTE** > > If you are running on an ARM-based device such as the Apple M1 chip, uncomment the `platform: 'linux/amd64'` lines. `docker-compose.yml` ```yaml name: redpanda-owl-shop networks: redpanda_network: driver: bridge volumes: redpanda: null services: redpanda: image: docker.redpanda.com/redpandadata/redpanda:v26.1.3 command: - redpanda start - --kafka-addr internal://0.0.0.0:9092,external://0.0.0.0:19092 # Address the broker advertises to clients that connect to the Kafka API. # Use the internal addresses to connect to the Redpanda brokers # from inside the same Docker network. # Use the external addresses to connect to the Redpanda brokers # from outside the Docker network. - --advertise-kafka-addr internal://redpanda:9092,external://localhost:19092 - --pandaproxy-addr internal://0.0.0.0:8082,external://0.0.0.0:18082 # Address the broker advertises to clients that connect to the HTTP Proxy. - --advertise-pandaproxy-addr internal://redpanda:8082,external://localhost:18082 - --schema-registry-addr internal://0.0.0.0:8081,external://0.0.0.0:18081 # Redpanda brokers use the RPC API to communicate with each other internally. - --rpc-addr redpanda:33145 - --advertise-rpc-addr redpanda:33145 # Mode dev-container uses well-known configuration properties for development in containers. - --mode dev-container # Tells Seastar (the framework Redpanda uses under the hood) to use 1 core on the system.
- --smp 1 ports: - 18081:18081 - 18082:18082 - 19092:19092 - 19644:9644 volumes: - redpanda:/var/lib/redpanda/data networks: - redpanda_network healthcheck: test: ["CMD-SHELL", "rpk cluster health | grep -E 'Healthy:.+true' || exit 1"] interval: 15s timeout: 3s retries: 5 start_period: 5s console: image: docker.redpanda.com/redpandadata/console:v3.7.1 entrypoint: /bin/sh command: -c "echo \"$$CONSOLE_CONFIG_FILE\" > /tmp/config.yml; /app/console" environment: CONFIG_FILEPATH: /tmp/config.yml CONSOLE_CONFIG_FILE: | kafka: brokers: ["redpanda:9092"] schemaRegistry: enabled: true urls: ["http://redpanda:8081"] redpanda: adminApi: enabled: true urls: ["http://redpanda:9644"] kafkaConnect: enabled: true clusters: - name: local-connect-cluster url: http://connect:8083 ports: - 8080:8080 networks: - redpanda_network depends_on: - redpanda owl-shop: image: quay.io/cloudhut/owl-shop:sha-042112b networks: - redpanda_network #platform: 'linux/amd64' entrypoint: /bin/sh command: -c "echo \"$$OWLSHOP_CONFIG_FILE\" > /tmp/config.yml; /app/owlshop" environment: CONFIG_FILEPATH: /tmp/config.yml OWLSHOP_CONFIG_FILE: | shop: requestRate: 1 interval: 0.1s topicReplicationFactor: 1 topicPartitionCount: 1 kafka: brokers: "redpanda:9092" depends_on: - redpanda - connect connect: image: docker.redpanda.com/redpandadata/connectors:v1.0.23 hostname: connect container_name: connect networks: - redpanda_network #platform: 'linux/amd64' depends_on: - redpanda ports: - "8083:8083" environment: CONNECT_CONFIGURATION: | key.converter=org.apache.kafka.connect.converters.ByteArrayConverter value.converter=org.apache.kafka.connect.converters.ByteArrayConverter group.id=connectors-cluster offset.storage.topic=_internal_connectors_offsets config.storage.topic=_internal_connectors_configs status.storage.topic=_internal_connectors_status config.storage.replication.factor=-1 offset.storage.replication.factor=-1 status.storage.replication.factor=-1 offset.flush.interval.ms=1000 producer.linger.ms=50 
producer.batch.size=131072 CONNECT_BOOTSTRAP_SERVERS: redpanda:9092 CONNECT_GC_LOG_ENABLED: "false" CONNECT_HEAP_OPTS: -Xms512M -Xmx512M CONNECT_LOG_LEVEL: info ``` 2. Set the `REDPANDA_VERSION` environment variable to the version of Redpanda that you want to run. For all available versions, see the [GitHub releases](https://github.com/redpanda-data/redpanda/releases). For example: ```bash export REDPANDA_VERSION=v26.1.3 ``` 3. Set the `REDPANDA_CONSOLE_VERSION` environment variable to the version of Redpanda Console that you want to run. For all available versions, see the [GitHub releases](https://github.com/redpanda-data/console/releases). > 📝 **NOTE** > > You must use at least version v3.0.0 of Redpanda Console to deploy this lab. For example: ```bash export REDPANDA_CONSOLE_VERSION=v3.7.1 ``` 4. Run the following in the directory where you saved the Docker Compose file: ```bash docker compose up -d ``` 5. Open Redpanda Console at [localhost:8080](http://localhost:8080) and go to **Topics** to see the owl shop topics. ## [](#clean-up)Clean up To shut down and delete the containers along with all your cluster data: ```bash docker compose down -v ``` --- # Page 28: Migrate Data with Redpanda Migrator **URL**: https://docs.redpanda.com/redpanda-labs/docker-compose/redpanda-migrator.md --- # Migrate Data with Redpanda Migrator --- title: Migrate Data with Redpanda Migrator latest-operator-version: v26.1.2 latest-console-tag: v3.7.1 latest-connect-version: 4.87.0 latest-redpanda-tag: v26.1.3 docname: redpanda-migrator page-component-name: redpanda-labs page-version: master page-component-version: master page-component-title: Labs page-relative-src-path: redpanda-migrator.adoc page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/docker-compose/pages/redpanda-migrator.adoc description: Migrate data, schemas, and consumer offsets from a source Kafka cluster to a target Redpanda cluster using Redpanda Migrator.
page-topic-type: lab personas: streaming_developer, platform_operator, evaluator learning-objective-1: Run Redpanda Migrator for continuous data replication between clusters learning-objective-2: Explore SASL authentication and ACL-based authorization for secure migration learning-objective-3: Observe topic and schema migration with preserved configurations page-git-created-date: "2026-01-12" page-git-modified-date: "2026-01-12" --- This lab demonstrates how to use [Redpanda Migrator](../../../redpanda-connect/components/inputs/redpanda_migrator/) to migrate data from a source Kafka-compatible cluster to a target Redpanda cluster. Redpanda Migrator is built on Redpanda Connect and provides continuous data replication with automatic topic creation, schema migration, and consumer offset synchronization. ## [](#what-youll-explore)What you’ll explore In this lab, you will: - Run Redpanda Migrator for continuous data replication between clusters - Explore SASL authentication and ACL-based authorization for secure migration - Observe topic and schema migration with preserved configurations ## [](#architecture)Architecture This lab creates a complete migration environment: - **Source cluster**: Simulates a Confluent Kafka cluster with SASL authentication, demo topics, and registered schemas - **Target cluster**: Redpanda cluster with matching security configuration - **Redpanda Migrator**: Continuously replicates data, schemas, and offsets from source to target - **Console instances**: Web UIs for both clusters to observe the migration in real-time The migrator automatically creates topics with matching partition counts and configurations, preserves schema IDs and versions, and migrates consumer offsets. ## [](#prerequisites)Prerequisites - [Docker and Docker Compose](https://docs.docker.com/compose/install/) - At least 4GB RAM available - [GNU Make](https://www.gnu.org/software/make/) (usually pre-installed on Linux/macOS) This lab is for Linux and macOS users. 
If you are using Windows, you must use the Windows Subsystem for Linux (WSL) to run the commands in this lab. ## [](#run-the-lab)Run the lab 1. Clone this repository: ```bash git clone https://github.com/redpanda-data/redpanda-labs.git ``` 2. Change into the `docker-compose/redpanda-migrator-demo/` directory: ```bash cd redpanda-labs/docker-compose/redpanda-migrator-demo ``` 3. (Optional) Set container versions. The `docker-compose.yml` file uses default versions. To use specific versions, set these environment variables: ```bash export REDPANDA_VERSION=26.1.3 export REDPANDA_CONSOLE_VERSION=3.7.1 export REDPANDA_CONNECT_VERSION=4.87.0 ``` For all available Redpanda versions, see the [GitHub releases](https://github.com/redpanda-data/redpanda/releases). 4. Start the demo environment: ```bash make start ``` This command starts the source and target Redpanda clusters, Redpanda Console instances, and the migrator service. See `docker-compose.yml` for port mappings and service details. 5. Set up users, ACLs, topics, and schemas: ```bash make setup ``` This creates users with SASL authentication, configures ACLs for secure migration, creates demo topics with various cleanup policies, and registers schemas. See `scripts/setup.sh` for detailed ACL permissions and topic configurations. 6. (Optional) Verify the ACL configuration: ```bash make verify-acls ``` This confirms that `migrator-user` has correct read-only source access and read-write target access with proper restrictions. See `scripts/verify-acls.sh` for test details. 7. Start continuous data production: ```bash make demo-start ``` This starts a background producer that sends approximately 80 messages every 2 seconds across all demo topics. 8. 
(Optional) View the migration in Redpanda Console: Open your browser to observe the migration in real-time: - Source cluster: [http://localhost:8080](http://localhost:8080) - Target cluster: [http://localhost:8081](http://localhost:8081) You can see topics, messages, and schemas being replicated from source to target. 9. (Optional) Verify continuous schema syncing: ```bash make verify-continuous-schema ``` This command registers a new schema after the migrator starts and confirms it automatically appears in the target Schema Registry within 15 seconds. 10. Monitor migration lag: ```bash make monitor-lag ``` This command displays a real-time dashboard showing lag per topic, message counts, and production status. Watch as the lag decreases to 0, indicating the migrator has caught up with the source cluster. Let it run for 30-60 seconds to observe continuous replication, then press Ctrl+C to exit. 11. Verify data consistency: Run the verification script to confirm successful migration: ```bash make verify ``` This command checks that all topics, partitions, messages, cleanup policies, and schemas have been migrated correctly. 12. Stop continuous production: ```bash make demo-stop ``` 13. Disable Schema Registry import mode: > ❗ **IMPORTANT** > > This step is required to return the target Schema Registry to normal operations. The target Schema Registry must be in import mode for schema migration to work (enabled automatically by `make setup`). After migration completes, disable import mode: ```bash curl -X PUT http://localhost:28081/mode \ -H "Content-Type: application/json" \ -u 'admin-user:admin-secret-password' \ -d '{"mode":"READWRITE"}' ``` Import mode allows the migrator to register schemas with preserved IDs and versions. When migration is done, return to readwrite mode for normal operations. ## [](#configuration)Configuration The migrator uses the principle of least privilege with non-superuser accounts. 
For detailed security configuration including users, ACLs, and permissions, see: - `scripts/setup.sh`: User creation and ACL configuration - `config/migrator-config.yaml`: Migrator settings and Schema Registry sync - `docker-compose.yml`: Service ports and environment variables This lab uses single-node clusters and demo credentials for simplicity. For production migrations, use multi-node clusters with strong, randomly generated passwords stored in a secrets manager such as HashiCorp Vault or AWS Secrets Manager. ## [](#troubleshoot)Troubleshoot ### [](#authentication-errors)Authentication errors If you see "ILLEGAL\_SASL\_STATE" errors, run `make setup` to configure SASL authentication. ### [](#topics-not-appearing-in-target)Topics not appearing in target Check that the migrator is running: ```bash docker compose ps migrator curl http://localhost:4195/ping ``` Check the migrator logs: ```bash docker compose logs migrator ``` ### [](#schemas-not-migrating)Schemas not migrating If you see "Subject is not in import mode" errors, enable import mode: ```bash curl -X PUT http://localhost:28081/mode \ -H "Content-Type: application/json" \ -u 'admin-user:admin-secret-password' \ -d '{"mode":"IMPORT"}' ``` Then restart the migrator: ```bash docker compose restart migrator ``` ## [](#clean-up)Clean up To stop and remove all containers and data: ```bash make clean ``` ## [](#next-steps)Next steps After running this demo: - Read the [Redpanda Migrator documentation](../../../redpanda-connect/cookbooks/redpanda_migrator/) - Learn about [Schema Registry](../../../current/manage/schema-reg/schema-reg-overview/) configuration - Explore [Redpanda authentication](../../../current/manage/security/authentication/) options --- # Page 29: Start a Single Redpanda Broker with Redpanda Console in Docker **URL**: https://docs.redpanda.com/redpanda-labs/docker-compose/single-broker.md --- # Start a Single Redpanda Broker with Redpanda Console in Docker --- title: Start a Single Redpanda 
Broker with Redpanda Console in Docker latest-operator-version: v26.1.2 latest-console-tag: v3.7.1 latest-connect-version: 4.87.0 latest-redpanda-tag: v26.1.3 docname: single-broker page-component-name: redpanda-labs page-version: master page-component-version: master page-component-title: Labs page-relative-src-path: single-broker.adoc page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/docker-compose/pages/single-broker.adoc description: Start a single Redpanda broker and Redpanda Console to start developing your application on Redpanda locally. page-git-created-date: "2025-05-06" page-git-modified-date: "2025-05-06" --- This Docker Compose example starts a single Redpanda broker and Redpanda Console. This Docker Compose file is also used in the [Redpanda Quickstart](https://docs.redpanda.com/current/get-started/quick-start/). ## [](#prerequisites)Prerequisites You must have [Docker and Docker Compose](https://docs.docker.com/compose/install/) installed on your host machine. This lab is intended for Linux and macOS users. If you are using Windows, you must use the Windows Subsystem for Linux (WSL) to run the commands in this lab. ## [](#run-the-lab)Run the lab 1. [Download](../_attachments/single-broker/docker-compose.yml) the following Docker Compose file on your local file system. Reveal the Docker Compose file `docker-compose.yml` ```yaml name: redpanda-quickstart-one-broker networks: redpanda_network: driver: bridge volumes: redpanda-0: null services: redpanda-0: command: - redpanda - start - --kafka-addr internal://0.0.0.0:9092,external://0.0.0.0:19092 # Address the broker advertises to clients that connect to the Kafka API. # Use the internal addresses to connect to the Redpanda brokers' # from inside the same Docker network. # Use the external addresses to connect to the Redpanda brokers' # from outside the Docker network. 
- --advertise-kafka-addr internal://redpanda-0:9092,external://localhost:19092 - --pandaproxy-addr internal://0.0.0.0:8082,external://0.0.0.0:18082 # Address the broker advertises to clients that connect to the HTTP Proxy. - --advertise-pandaproxy-addr internal://redpanda-0:8082,external://localhost:18082 - --schema-registry-addr internal://0.0.0.0:8081,external://0.0.0.0:18081 # Redpanda brokers use the RPC API to communicate with each other internally. - --rpc-addr redpanda-0:33145 - --advertise-rpc-addr redpanda-0:33145 # Mode dev-container uses well-known configuration properties for development in containers. - --mode dev-container # Tells Seastar (the framework Redpanda uses under the hood) to use 1 core on the system. - --smp 1 - --default-log-level=info image: docker.redpanda.com/redpandadata/redpanda:v26.1.3 container_name: redpanda-0 volumes: - redpanda-0:/var/lib/redpanda/data networks: - redpanda_network ports: - 18081:18081 - 18082:18082 - 19092:19092 - 19644:9644 console: container_name: redpanda-console image: docker.redpanda.com/redpandadata/console:v3.7.1 networks: - redpanda_network entrypoint: /bin/sh command: -c 'echo "$$CONSOLE_CONFIG_FILE" > /tmp/config.yml; /app/console' environment: CONFIG_FILEPATH: /tmp/config.yml CONSOLE_CONFIG_FILE: | kafka: brokers: ["redpanda-0:9092"] schemaRegistry: enabled: true urls: ["http://redpanda-0:8081"] redpanda: adminApi: enabled: true urls: ["http://redpanda-0:9644"] ports: - 8080:8080 depends_on: - redpanda-0 ``` 2. Set the `REDPANDA_VERSION` environment variable to the version of Redpanda that you want to run. For all available versions, see the [GitHub releases](https://github.com/redpanda-data/redpanda/releases). For example: ```bash export REDPANDA_VERSION=v26.1.3 ``` 3. Set the `REDPANDA_CONSOLE_VERSION` environment variable to the version of Redpanda Console that you want to run. For all available versions, see the [GitHub releases](https://github.com/redpanda-data/redpanda/releases). 
For example: ```bash export REDPANDA_CONSOLE_VERSION=v3.7.1 ``` 4. Run the following in the directory where you saved the Docker Compose file: ```bash docker compose up -d ``` 5. Open Redpanda Console at [localhost:8080](http://localhost:8080). ## [](#clean-up)Clean up To shut down and delete the containers along with all your cluster data: ```bash docker compose down -v ``` --- # Page 30: Start a Cluster of Redpanda Brokers with Redpanda Console in Docker **URL**: https://docs.redpanda.com/redpanda-labs/docker-compose/three-brokers.md --- # Start a Cluster of Redpanda Brokers with Redpanda Console in Docker --- title: Start a Cluster of Redpanda Brokers with Redpanda Console in Docker latest-operator-version: v26.1.2 latest-console-tag: v3.7.1 latest-connect-version: 4.87.0 latest-redpanda-tag: v26.1.3 docname: three-brokers page-component-name: redpanda-labs page-version: master page-component-version: master page-component-title: Labs page-relative-src-path: three-brokers.adoc page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/docker-compose/pages/three-brokers.adoc description: Start three Redpanda brokers and Redpanda Console to start developing your application on Redpanda locally. page-git-created-date: "2025-05-06" page-git-modified-date: "2025-05-06" --- This Docker Compose example starts three Redpanda brokers and Redpanda Console. This Docker Compose file is also used in the [Redpanda Quickstart](https://docs.redpanda.com/current/get-started/quick-start/). ## [](#prerequisites)Prerequisites You must have [Docker and Docker Compose](https://docs.docker.com/compose/install/) installed on your host machine. This lab is intended for Linux and macOS users. If you are using Windows, you must use the Windows Subsystem for Linux (WSL) to run the commands in this lab. ## [](#run-the-lab)Run the lab 1. [Download](../_attachments/three-brokers/docker-compose.yml) the following Docker Compose file on your local file system. 
Reveal the Docker Compose file `docker-compose.yml` ```yaml networks: redpanda_network: driver: bridge volumes: redpanda-0: null redpanda-1: null redpanda-2: null services: redpanda-0: command: - redpanda - start - --kafka-addr internal://0.0.0.0:9092,external://0.0.0.0:19092 # Address the broker advertises to clients that connect to the Kafka API. # Use the internal addresses to connect to the Redpanda brokers' # from inside the same Docker network. # Use the external addresses to connect to the Redpanda brokers' # from outside the Docker network. - --advertise-kafka-addr internal://redpanda-0:9092,external://localhost:19092 - --pandaproxy-addr internal://0.0.0.0:8082,external://0.0.0.0:18082 # Address the broker advertises to clients that connect to the HTTP Proxy. - --advertise-pandaproxy-addr internal://redpanda-0:8082,external://localhost:18082 - --schema-registry-addr internal://0.0.0.0:8081,external://0.0.0.0:18081 # Redpanda brokers use the RPC API to communicate with each other internally. - --rpc-addr redpanda-0:33145 - --advertise-rpc-addr redpanda-0:33145 # Mode dev-container uses well-known configuration properties for development in containers. - --mode dev-container # Tells Seastar (the framework Redpanda uses under the hood) to use 1 core on the system. 
- --smp 1 - --default-log-level=info image: docker.redpanda.com/redpandadata/redpanda:v26.1.3 container_name: redpanda-0 volumes: - redpanda-0:/var/lib/redpanda/data networks: - redpanda_network ports: - 18081:18081 - 18082:18082 - 19092:19092 - 19644:9644 redpanda-1: command: - redpanda - start - --kafka-addr internal://0.0.0.0:9092,external://0.0.0.0:29092 - --advertise-kafka-addr internal://redpanda-1:9092,external://localhost:29092 - --pandaproxy-addr internal://0.0.0.0:8082,external://0.0.0.0:28082 - --advertise-pandaproxy-addr internal://redpanda-1:8082,external://localhost:28082 - --schema-registry-addr internal://0.0.0.0:8081,external://0.0.0.0:28081 - --rpc-addr redpanda-1:33145 - --advertise-rpc-addr redpanda-1:33145 - --mode dev-container - --smp 1 - --default-log-level=info - --seeds redpanda-0:33145 image: docker.redpanda.com/redpandadata/redpanda:v26.1.3 container_name: redpanda-1 volumes: - redpanda-1:/var/lib/redpanda/data networks: - redpanda_network ports: - 28081:28081 - 28082:28082 - 29092:29092 - 29644:9644 depends_on: - redpanda-0 redpanda-2: command: - redpanda - start - --kafka-addr internal://0.0.0.0:9092,external://0.0.0.0:39092 - --advertise-kafka-addr internal://redpanda-2:9092,external://localhost:39092 - --pandaproxy-addr internal://0.0.0.0:8082,external://0.0.0.0:38082 - --advertise-pandaproxy-addr internal://redpanda-2:8082,external://localhost:38082 - --schema-registry-addr internal://0.0.0.0:8081,external://0.0.0.0:38081 - --rpc-addr redpanda-2:33145 - --advertise-rpc-addr redpanda-2:33145 - --mode dev-container - --smp 1 - --default-log-level=info - --seeds redpanda-0:33145 image: docker.redpanda.com/redpandadata/redpanda:v26.1.3 container_name: redpanda-2 volumes: - redpanda-2:/var/lib/redpanda/data networks: - redpanda_network ports: - 38081:38081 - 38082:38082 - 39092:39092 - 39644:9644 depends_on: - redpanda-0 console: container_name: redpanda-console image: docker.redpanda.com/redpandadata/console:v3.7.1 networks: - 
redpanda_network entrypoint: /bin/sh command: -c 'echo "$$CONSOLE_CONFIG_FILE" > /tmp/config.yml; /app/console' environment: CONFIG_FILEPATH: /tmp/config.yml CONSOLE_CONFIG_FILE: | kafka: brokers: ["redpanda-0:9092"] schemaRegistry: enabled: true urls: ["http://redpanda-0:8081"] redpanda: adminApi: enabled: true urls: ["http://redpanda-0:9644"] ports: - 8080:8080 depends_on: - redpanda-0 ``` 2. Set the `REDPANDA_VERSION` environment variable to the version of Redpanda that you want to run. For all available versions, see the [GitHub releases](https://github.com/redpanda-data/redpanda/releases). For example: ```bash export REDPANDA_VERSION=v26.1.3 ``` 3. Set the `REDPANDA_CONSOLE_VERSION` environment variable to the version of Redpanda Console that you want to run. For all available versions, see the [GitHub releases](https://github.com/redpanda-data/redpanda/releases). > 📝 **NOTE** > > You must use at least version v3.0.0 of Redpanda Console to deploy this lab. For example: ```bash export REDPANDA_CONSOLE_VERSION=v3.7.1 ``` 4. Run the following in the directory where you saved the Docker Compose file: ```bash docker compose up -d ``` 5. Open Redpanda Console at [localhost:8080](http://localhost:8080). 
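After the containers start, it can help to confirm that all three brokers are reachable from the host. The following is a minimal sketch, not part of the lab: it builds a bootstrap list from the external Kafka ports that the Compose file above maps to the host (19092, 29092, and 39092). The function name `external_brokers` is an illustrative assumption.

```bash
#!/usr/bin/env bash
# Sketch: build the external Kafka bootstrap list for the three brokers.
# The ports come from the `external://localhost:<port>` advertised addresses
# in the Compose file; the function name is hypothetical.
external_brokers() {
  local ports=(19092 29092 39092)
  local list=""
  local port
  for port in "${ports[@]}"; do
    # Prefix a comma only when the list already has an entry.
    list+="${list:+,}localhost:${port}"
  done
  printf '%s\n' "$list"
}

# Print the list so it can be pasted into a client configuration.
external_brokers
```

If `rpk` is installed on the host, the list can be passed along as `rpk cluster info -X brokers="$(external_brokers)"`. Alternatively, `docker exec -it redpanda-0 rpk cluster info` runs the same check from inside the first broker's container.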
## [](#clean-up)Clean up To shut down and delete the containers along with all your cluster data: ```bash docker compose down -v ``` --- # Page 31: Set Up GitOps for the Redpanda Helm Chart **URL**: https://docs.redpanda.com/redpanda-labs/kubernetes/gitops-helm.md --- # Set Up GitOps for the Redpanda Helm Chart --- title: Set Up GitOps for the Redpanda Helm Chart latest-operator-version: v26.1.2 latest-console-tag: v3.7.1 latest-connect-version: 4.87.0 latest-redpanda-tag: v26.1.3 docname: gitops-helm page-component-name: redpanda-labs page-version: master page-component-version: master page-component-title: Labs page-relative-src-path: gitops-helm.adoc page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/kubernetes/pages/gitops-helm.adoc description: Use Flux to deploy the Redpanda Helm chart on a local Kubernetes cluster. page-git-created-date: "2025-05-06" page-git-modified-date: "2025-05-06" --- GitOps is a modern approach to managing and automating the deployment and provisioning process using Git as the single source of truth. It involves storing configuration files and deployment scripts in a Git repository, and then using automation tools to continuously monitor the repository for changes. This example uses Flux to deploy the Redpanda Helm chart on a local Kubernetes cluster. Flux is a toolkit for GitOps with Kubernetes clusters that supports the following: - **Version control for configurations**: You can track changes, collaborate, and revert to previous configurations if needed. - **Drift detection and remediation**: Flux continuously monitors the Redpanda cluster’s state. If discrepancies are detected, Flux automatically remediates them to bring the Redpanda cluster back to the desired state. - **Collaboration and auditing**: Multiple team members can propose changes to Redpanda configurations through Git pull requests, enabling code reviews and discussions before changes are applied. 
## [](#prerequisites)Prerequisites

You must have the following:

- [A GitHub account](https://github.com/signup).
- [The Flux CLI](https://fluxcd.io/flux/installation/#install-the-flux-cli).
- An understanding of the [core concepts of Flux](https://fluxcd.io/flux/concepts/).
- At least version 1.24 of [the kubectl CLI](https://kubernetes.io/docs/tasks/tools/).

  ```bash
  kubectl version --client
  ```

- At least version 3.6.0 of [Helm](https://helm.sh/docs/intro/install/).

  ```bash
  helm version
  ```

- [kind](https://kind.sigs.k8s.io/docs/user/quick-start/#installation)
- [Docker](https://docs.docker.com/get-docker/)

## [](#create-a-local-kubernetes-cluster)Create a local Kubernetes cluster

Create one control-plane node and three worker nodes (one worker node for each Redpanda broker).

1. Define a cluster in the `kind.yaml` configuration file:

   ```bash
   cat <<EOF >kind.yaml
   ---
   apiVersion: kind.x-k8s.io/v1alpha4
   kind: Cluster
   nodes:
     - role: control-plane
     - role: worker
     - role: worker
     - role: worker
   EOF
   ```

2. Create the Kubernetes cluster from the configuration file:

   ```bash
   kind create cluster --config kind.yaml
   ```

## [](#run-the-lab)Run the lab

Fork this repository, and configure Flux to connect to your fork and deploy the Redpanda Helm chart.

1. Fork the [`redpanda-data/redpanda-labs`](https://github.com/redpanda-data/redpanda-labs) repository on GitHub.

2. Bootstrap Flux for your forked repository.

   > 📝 **NOTE**
   >
   > Make sure to do the following:
   >
   > - Provide Flux with your [GitHub personal access token (PAT)](https://fluxcd.io/flux/installation/bootstrap/github/#github-pat).
   >
   > - Configure the `path` flag with the value `kubernetes/gitops-helm`. This is the path where the example manifests are stored in the repository.

   Here is an example of the bootstrap command:

   ```bash
   flux bootstrap github \
     --token-auth \
     --owner=<github-username> \
     --repository=redpanda-labs \
     --branch=main \
     --path=./kubernetes/gitops-helm \
     --personal
   ```

   Replace `<github-username>` with your GitHub username.
The bootstrap script does the following: 1. Creates a deploy token and saves it as a Kubernetes Secret 2. Creates an empty GitHub project, if the project specified by `--repository` doesn’t exist 3. Generates Flux definition files for your project 4. Commits the definition files to the specified branch 5. Applies the definition files to your cluster 6. Applies the manifests in `kubernetes/gitops-helm` which deploy Redpanda and cert-manager After you run the script, Flux is ready to manage itself and any other resources you add to the GitHub project at the specified path. ## [](#verify-the-deployment)Verify the deployment To verify that the deployment was successful, check the status of the HelmRelease resource: ```bash kubectl get helmrelease redpanda --namespace redpanda --watch ``` In a few minutes, you should see that the Helm install succeeded: NAME AGE READY STATUS redpanda 3m23s True Helm install succeeded for release redpanda/redpanda.v1 with chart redpanda@5.7.5 ## [](#manage-updates)Manage updates To update Redpanda, modify the `redpanda-helm-release.yaml` manifest in your Git repository. You can configure the Helm chart in the `spec.values` field. For a description of all available configurations, see the [Redpanda Helm Chart Specification](https://docs.redpanda.com/current/reference/k-redpanda-helm-spec/). When you push changes to GitHub, Flux automatically applies the updates to your Kubernetes cluster. ## [](#delete-the-cluster)Delete the cluster To delete the Kubernetes cluster as well as all the Docker resources that kind created, run: ```bash kind delete cluster ``` ## [](#suggested-reading)Suggested reading See the [interactive examples](https://play.instruqt.com/redpanda/invite/l2huksol8qhv) for setting up GitOps with the Redpanda Operator. 
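The check described in "Verify the deployment" can also be scripted. The following is a minimal sketch that reads `kubectl get helmrelease` table output on stdin and succeeds only when the named release reports `True` in the READY column. It assumes the column order shown in the sample output above (NAME, AGE, READY, STATUS); the function name `helmrelease_ready` is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: succeed if the named HelmRelease shows READY=True in
# `kubectl get helmrelease` table output read from stdin.
# Assumes the columns are NAME, AGE, READY, STATUS, as in the sample output.
helmrelease_ready() {
  local name="$1"
  awk -v name="$name" '
    $1 == name && $3 == "True" { ok = 1 }
    END { exit !ok }
  '
}

# Demonstration against the sample output from this page (no cluster needed).
sample='NAME       AGE     READY   STATUS
redpanda   3m23s   True    Helm install succeeded for release redpanda/redpanda.v1 with chart redpanda@5.7.5'

if printf '%s\n' "$sample" | helmrelease_ready redpanda; then
  echo "redpanda HelmRelease is ready"
fi
```

Against a live cluster, the same function could be fed from `kubectl get helmrelease redpanda --namespace redpanda`. If parsing table output feels brittle, `kubectl wait --for=condition=Ready helmrelease/redpanda --namespace redpanda --timeout=5m` is likely a simpler alternative, assuming the HelmRelease exposes a `Ready` condition in its status.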
--- # Page 32: Iceberg Streaming on Kubernetes with Redpanda, MinIO, and Spark **URL**: https://docs.redpanda.com/redpanda-labs/kubernetes/iceberg.md --- # Iceberg Streaming on Kubernetes with Redpanda, MinIO, and Spark --- title: Iceberg Streaming on Kubernetes with Redpanda, MinIO, and Spark latest-operator-version: v26.1.2 latest-console-tag: v3.7.1 latest-connect-version: 4.87.0 latest-redpanda-tag: v26.1.3 docname: iceberg page-component-name: redpanda-labs page-version: master page-component-version: master page-component-title: Labs page-relative-src-path: iceberg.adoc page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/kubernetes/pages/iceberg.adoc description: Pair Redpanda with MinIO for Tiered Storage and write data in the Iceberg format to enable seamless analytics workflows on data in Redpanda topics. page-git-created-date: "2025-07-18" page-git-modified-date: "2025-07-18" --- This lab provides a local Kubernetes environment to help you quickly get started with Redpanda and its integration with Apache Iceberg. It showcases how Redpanda, when paired with a Tiered Storage solution like MinIO, can write data in the Iceberg format, enabling seamless analytics workflows. The lab also includes a Spark environment configured for querying the Iceberg tables using SQL within a Jupyter Notebook interface. In this setup, you will: - Produce data to Redpanda topics that are Iceberg-enabled. - Observe how Redpanda writes this data in Iceberg format to MinIO as the Tiered Storage backend. - Use Spark to query the Iceberg tables, demonstrating a complete pipeline from data production to querying. This environment is ideal for experimenting with Redpanda’s Iceberg and Tiered Storage capabilities, enabling you to test end-to-end workflows for analytics and data lake architectures. 
## [](#prerequisites)Prerequisites ### [](#system-requirements)System requirements Make sure you have the following system requirements: - **Operating System**: macOS, Linux, or Windows with WSL2 - **CPU**: Minimum 4 cores - **Memory**: Minimum 8 GB RAM - **Storage**: Minimum 20 GB free disk space for Docker images and data - **Docker**: At least 6 GB memory allocation for Docker Desktop ### [](#required-tools)Required tools Install the following tools: - [Docker](https://docs.docker.com/get-docker/) (with sufficient memory allocation) - [kind](https://kind.sigs.k8s.io/docs/user/quick-start/) (Kubernetes in Docker) - [kubectl](https://kubernetes.io/docs/tasks/tools/) (Kubernetes CLI) - [Helm](https://helm.sh/docs/intro/install/) (Kubernetes package manager) ## [](#run-the-lab)Run the lab Follow these steps to set up the lab environment: ### [](#clone-the-repository)Clone the repository ```bash git clone https://github.com/redpanda-data/redpanda-labs.git cd redpanda-labs/kubernetes/iceberg ``` ### [](#create-the-kubernetes-cluster-and-namespace)Create the Kubernetes cluster and namespace ```bash kind create cluster --config kind.yaml kubectl create namespace iceberg-lab ``` ### [](#configure-redpanda-for-iceberg)Configure Redpanda for Iceberg Configuration for Redpanda is provided in `configmap.yaml` and `secret.yaml`. This sets up the necessary configurations for Iceberg topics and Tiered Storage. ```bash kubectl apply -f configmap.yaml --namespace iceberg-lab kubectl apply -f secret.yaml --namespace iceberg-lab ``` ### [](#deploy-minio)Deploy MinIO 1. Deploy the MinIO Operator: ```bash kubectl apply -k "github.com/minio/operator?ref=v5.0.18" ``` 2. Wait for the MinIO Operator to be ready: ```bash kubectl wait --for=condition=available deployment --all --namespace minio-operator --timeout=120s ``` 3. 
Create a MinIO instance:

   ```bash
   helm repo add minio-operator https://operator.min.io
   helm repo update
   helm upgrade --install iceberg-minio minio-operator/tenant \
     --namespace iceberg-lab \
     --version 7.1.1 \
     --values minio-tenant-values.yaml
   ```

4. Wait for the MinIO instance to be ready:

   ```bash
   sleep 10 && kubectl wait --for=condition=ready pod -l v1.min.io/tenant=iceberg-minio --namespace iceberg-lab --timeout=300s
   ```

   If you see `error: no matching resources found`, the MinIO Operator has not yet created the Pods. Wait a few moments and try the command again.

### [](#set-up-minio-bucket-and-permissions)Set up MinIO bucket and permissions

This step creates a job that sets up the MinIO bucket and permissions required for Redpanda to access the object store.

```bash
kubectl apply -f minio-setup-job.yaml --namespace iceberg-lab
kubectl wait --for=condition=complete job/minio-setup --namespace iceberg-lab --timeout=60s
```

If the job times out, check the logs with `kubectl logs job/minio-setup --namespace iceberg-lab`. If you see `400 Bad Request` errors, MinIO might have reverted to TLS mode. See the [troubleshooting section](#troubleshoot).

### [](#deploy-the-iceberg-rest-catalog)Deploy the Iceberg REST catalog

This step deploys the Iceberg REST catalog, which allows you to interact with Iceberg tables over HTTP. It uses an init container to automatically resolve DNS issues with bucket-style S3 URLs.

```bash
kubectl apply -f iceberg-rest.yaml
kubectl wait --for=condition=available deployment/iceberg-rest --namespace iceberg-lab --timeout=120s
```

### [](#prepare-spark)Prepare Spark

1. Label and taint your dedicated Spark worker node:

   ```bash
   kubectl label node kind-worker node-role.kubernetes.io/spark-node=true
   kubectl taint nodes kind-worker dedicated=spark:NoSchedule
   ```

2. 
Build and load the Spark Docker image: ```bash docker build -t spark-iceberg-jupyter:latest ./spark kind load docker-image spark-iceberg-jupyter:latest --name kind --nodes kind-worker ``` This step builds the Spark image with Kubernetes-specific configurations for Iceberg and uploads it to the kind cluster. The Dockerfile automatically detects your system architecture (ARM64 or x86\_64) and downloads the appropriate dependencies. Wait for the build to complete and the image to be loaded into the kind cluster. 3. Verify the image is loaded: ```bash docker exec -it kind-worker crictl images | grep spark-iceberg-jupyter ``` You should see output similar to the following: docker.io/library/spark-iceberg-jupyter latest 86f20b1213dd3 3.83GB 4. Deploy Spark: ```bash kubectl apply -f spark.yaml kubectl wait --for=condition=available deployment/spark-iceberg --namespace iceberg-lab --timeout=120s ``` ### [](#deploy-redpanda)Deploy Redpanda 1. Install the Redpanda Operator custom resource definitions (CRDs): ```bash kubectl kustomize "https://github.com/redpanda-data/redpanda-operator//operator/config/crd?ref=v2.4.4" \ | kubectl apply --server-side -f - ``` 2. Deploy the Redpanda Operator: ```bash helm repo add jetstack https://charts.jetstack.io helm repo add redpanda https://charts.redpanda.com helm repo update helm install cert-manager jetstack/cert-manager \ --set crds.enabled=true \ --namespace cert-manager \ --create-namespace \ --version 1.17.4 helm upgrade --install redpanda-controller redpanda/operator \ --namespace iceberg-lab \ --create-namespace \ --version v2.4.4 ``` 3. Ensure that the Deployment is successfully rolled out: ```bash kubectl --namespace iceberg-lab rollout status --watch deployment/redpanda-controller-operator ``` When you see `deployment "redpanda-controller-operator" successfully rolled out`, the operator is ready. 4. Create the Redpanda cluster: ```bash kubectl apply -f redpanda.yaml --namespace iceberg-lab ``` 5. 
Wait for the Redpanda cluster to be ready: ```bash kubectl get redpanda --namespace iceberg-lab --watch ``` When the Redpanda cluster is ready, the output should look similar to the following: NAME READY STATUS redpanda True Redpanda reconciliation succeeded ### [](#expose-services)Expose services In this step, you set up access to the MinIO UI, Spark Jupyter Notebook, and Redpanda Console. #### [](#set-up-minio-console-access)Set up MinIO console access For reliable access to the MinIO console, create a NodePort service: ```bash kubectl apply -f minio-nodeport.yaml --namespace iceberg-lab ``` The NodePort service exposes MinIO console on port 32090 of all cluster nodes. In a kind cluster, you can access it directly at: [http://localhost:32090](http://localhost:32090) > 📝 **NOTE** > > This approach avoids the known port-forwarding issues with MinIO console (see [issue #2539](https://github.com/minio/object-browser/issues/2539)). The MinIO Console UI requires websockets which don’t work reliably through `kubectl port-forward` tunnels. NodePort provides direct access without websocket connectivity issues. #### [](#set-up-port-forwarding-for-other-services)Set up port forwarding for other services For Spark Jupyter Notebook and Redpanda Console, use port forwarding: ```bash kubectl port-forward deploy/spark-iceberg 8888:8888 --namespace iceberg-lab & kubectl port-forward svc/redpanda-console 8080:8080 --namespace iceberg-lab & ``` You can run these commands in separate terminals, or run them in the background by appending `&` as shown above. This way, all port-forward processes will run in the background in the same terminal. You can bring them to the foreground with `fg` or stop them with `kill` if needed. ## [](#create-and-validate-iceberg-topics)Create and validate Iceberg topics You can validate your setup by performing the following steps: 1. 
Alias the Redpanda CLI:

   ```bash
   alias internal-rpk="kubectl --namespace iceberg-lab exec -i -t redpanda-0 -c redpanda -- rpk"
   ```

   This alias lets you run `rpk` commands directly against the Redpanda broker in the `iceberg-lab` namespace. You can also use `kubectl --namespace iceberg-lab exec -i -t redpanda-0 -c redpanda -- rpk` directly if you prefer not to set an alias.

2. Create an Iceberg topic:

   ```bash
   internal-rpk topic create key_value --topic-config=redpanda.iceberg.mode=key_value
   ```

3. Produce sample data:

   ```bash
   echo "hello world" | internal-rpk topic produce key_value --format='%k %v\n'
   ```

4. Open Redpanda Console at [http://localhost:8080/topics](http://localhost:8080/topics) to see that the topic exists in Redpanda.

5. Open MinIO at [http://localhost:32090](http://localhost:32090) to view your data stored in the S3-compatible object store.

   Login credentials:

   - Username: minio
   - Password: minio123

6. Open the Jupyter Notebook server at [http://localhost:8888](http://localhost:8888). The notebook guides you through querying the Iceberg table created from your Redpanda topic.

## [](#clean-up)Clean up

When you’re finished with the lab, you can clean up the resources:

1. Stop all port forwarding processes:

   ```bash
   pkill -f "kubectl port-forward"
   ```

   You can also use Ctrl+C if the port forwarding is running in the foreground.

2. Delete the MinIO NodePort service:

   ```bash
   kubectl delete service minio-nodeport -n iceberg-lab
   ```

3. 
   Delete the kind cluster (this removes everything):

   ```bash
   kind delete cluster
   ```

   Or, if you want to keep the cluster but remove just the lab resources:

   ```bash
   # Delete the namespace (removes all lab resources)
   kubectl delete namespace iceberg-lab

   # Delete the MinIO operator
   kubectl delete -k "github.com/minio/operator?ref=v5.0.18"

   # Delete cert-manager
   helm uninstall cert-manager --namespace cert-manager
   kubectl delete namespace cert-manager
   ```

## [](#troubleshoot)Troubleshoot

### [](#redpanda-bucket-access-errors)Redpanda bucket access errors

If Redpanda logs show `bucket not found` errors after setup:

1. Verify the bucket exists:

   ```bash
   kubectl exec -n iceberg-lab iceberg-minio-pool-0-0 -c minio -- mc ls local/
   ```

2. Check that Redpanda can reach MinIO:

   ```bash
   kubectl exec -n iceberg-lab redpanda-0 -c redpanda -- curl -I http://iceberg-minio-hl.iceberg-lab.svc.cluster.local:9000/redpanda
   ```

### [](#iceberg-rest-catalog-500-errors)Iceberg REST catalog 500 errors

If the Iceberg REST catalog shows `UnknownHostException` errors in the logs:

1. Check the catalog logs for DNS resolution errors:

   ```bash
   kubectl logs -n iceberg-lab deployment/iceberg-rest | grep -i "unknownhost\|resolve"
   ```

2. If you see errors, check the init container logs to see if DNS resolution failed:

   ```bash
   kubectl logs -n iceberg-lab deployment/iceberg-rest -c dns-resolver
   ```

3. The init container automatically resolves MinIO’s IP and configures DNS mappings.
   If MinIO pods restart and get new IPs, restart the Iceberg REST catalog:

   ```bash
   kubectl rollout restart deployment/iceberg-rest -n iceberg-lab
   kubectl rollout status deployment/iceberg-rest -n iceberg-lab
   ```

## [](#suggested-reading)Suggested reading

- [MinIO Kubernetes Operator installation](https://min.io/docs/minio/kubernetes/upstream/operations/installation.html)
- [Deploy MinIO tenant with Helm](https://min.io/docs/minio/kubernetes/upstream/operations/install-deploy-manage/deploy-minio-tenant-helm.html#deploy-tenant-helm)
- [Iceberg Topics in Redpanda](../../../current/manage/iceberg/about-iceberg-topics/)

---