# Redpanda Iceberg Docker Compose Example


---
title: Redpanda Iceberg Docker Compose Example
latest-operator-version: v26.1.4
latest-console-tag: v3.7.3
latest-connect-version: 4.93.0
latest-redpanda-tag: v26.1.9
docname: iceberg
page-component-name: labs
page-version: master
page-component-version: master
page-component-title: Labs
page-relative-src-path: iceberg.adoc
page-edit-url: https://github.com/redpanda-data/redpanda-labs/edit/main/docs/modules/docker-compose/pages/iceberg.adoc
description: Pair Redpanda with MinIO for Tiered Storage and write data in the Iceberg format to enable seamless analytics workflows on data in Redpanda topics.
page-git-created-date: "2025-05-06"
page-git-modified-date: "2025-05-06"
---


This lab provides a Docker Compose environment to help you quickly get started with Redpanda and its integration with Apache Iceberg. It showcases how Redpanda, when paired with a Tiered Storage solution like MinIO, can write data in the Iceberg format, enabling seamless analytics workflows. The lab also includes a Spark environment configured for querying the Iceberg tables using SQL within a Jupyter Notebook interface.

In this setup, you will:

-   Produce data to Redpanda topics that are Iceberg-enabled.

-   Observe how Redpanda writes this data in Iceberg format to MinIO as the Tiered Storage backend.

-   Use Spark to query the Iceberg tables, demonstrating a complete pipeline from data production to querying.


This environment is ideal for experimenting with Redpanda’s Iceberg and Tiered Storage capabilities, enabling you to test end-to-end workflows for analytics and data lake architectures.

## Prerequisites

You must have the following installed on your machine:

-   [Docker and Docker Compose](https://docs.docker.com/compose/install/)

-   [`rpk`](https://docs.redpanda.com/current/get-started/rpk-install/)


This lab is intended for Linux and macOS users. If you are using Windows, you must use the Windows Subsystem for Linux (WSL) to run the commands in this lab.
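
Before starting, it can help to confirm that the prerequisites are actually on your `PATH`. The following is a minimal sketch, not part of the lab: `missing_tools` and `check_prereqs` are hypothetical helper names, and the sketch assumes Docker Compose is installed as the `docker compose` plugin.

```bash
# Hypothetical helper (not part of the lab): print each named tool
# that cannot be found on PATH, one per line.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Hypothetical helper: return non-zero if a prerequisite appears to be missing.
check_prereqs() {
  local missing
  missing="$(missing_tools docker rpk)"
  if [ -n "$missing" ]; then
    printf 'Missing tools: %s\n' "$missing" >&2
    return 1
  fi
  # Assumption: Compose is the Docker CLI plugin, so probe it through docker.
  docker compose version >/dev/null 2>&1 || {
    printf 'Docker Compose plugin not found\n' >&2
    return 1
  }
}
```

Running `check_prereqs` prints nothing and returns zero when both tools and the Compose plugin are found.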

## Run the lab

1.  Clone this repository:

    ```bash
    git clone https://github.com/redpanda-data/redpanda-labs.git
    ```

2.  Change into the `docker-compose/iceberg/` directory:

    ```bash
    cd redpanda-labs/docker-compose/iceberg
    ```

3.  Set the `REDPANDA_VERSION` environment variable to at least version 24.3.1. For all available versions, see the [GitHub releases](https://github.com/redpanda-data/redpanda/releases).

    For example:

    ```bash
    export REDPANDA_VERSION=v26.1.9
    ```

4.  Set the `REDPANDA_CONSOLE_VERSION` environment variable to the version of Redpanda Console that you want to run. For all available versions, see the [GitHub releases](https://github.com/redpanda-data/console/releases).

    > 📝 **NOTE**
    >
    > You must use at least version v3.0.0 of Redpanda Console to deploy this lab.

    For example:

    ```bash
    export REDPANDA_CONSOLE_VERSION=v3.7.3
    ```

5.  Start the Docker Compose environment, which includes Redpanda, MinIO, Spark, and Jupyter Notebook:

    ```bash
    docker compose build && docker compose up
    ```

    The build process may take a few minutes to complete, as it builds the Spark image with the necessary dependencies for Iceberg.

6.  Create and switch to a new `rpk` profile that connects to your Redpanda broker:

    ```bash
    rpk profile create docker-compose-iceberg --set=admin_api.addresses=localhost:19644 --set=brokers=localhost:19092 --set=schema_registry.addresses=localhost:18081
    ```

7.  Create two topics with Iceberg enabled:

    ```bash
    rpk topic create key_value --topic-config=redpanda.iceberg.mode=key_value
    rpk topic create value_schema_id_prefix --topic-config=redpanda.iceberg.mode=value_schema_id_prefix
    ```

8.  Produce a record to the `key_value` topic. With the `%k %v\n` format, `hello` becomes the record key and `world` becomes the value:

    ```bash
    echo "hello world" | rpk topic produce key_value --format='%k %v\n'
    ```

9.  Open Redpanda Console at [http://localhost:8081/topics](http://localhost:8081/topics) to see that the topics exist in Redpanda.

10.  Open MinIO at [http://localhost:9001/browser](http://localhost:9001/browser) to view your data stored in the S3-compatible object store.

     Login credentials:

     -   Username: `minio`

     -   Password: `minio123`


11.  Open the Jupyter Notebook server at [http://localhost:8888](http://localhost:8888). The notebook guides you through querying Iceberg tables created from Redpanda topics. Complete the next two steps before you run the code in the notebook.

12.  Create a schema in the Schema Registry:

     ```bash
     rpk registry schema create value_schema_id_prefix-value --schema schema.avsc
     ```

13.  Produce data to the `value_schema_id_prefix` topic:

     ```bash
     # printf emits one JSON record per line in any shell; echo treats \n differently in bash and zsh.
     printf '%s\n' \
       '{"user_id":2324,"event_type":"BUTTON_CLICK","ts":"2024-11-25T20:23:59.380Z"}' \
       '{"user_id":3333,"event_type":"SCROLL","ts":"2024-11-25T20:24:14.774Z"}' \
       '{"user_id":7272,"event_type":"BUTTON_CLICK","ts":"2024-11-25T20:24:34.552Z"}' \
       | rpk topic produce value_schema_id_prefix --format='%v\n' --schema-id=topic
     ```


After Redpanda commits the data, it should be available in Iceberg format, and you can query the table `lab.redpanda.value_schema_id_prefix` in the [Jupyter Notebook](http://localhost:8888).
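
Steps 3 and 4 require minimum versions (24.3.1 for Redpanda and v3.0.0 for Redpanda Console). If the environment does not start as expected, one thing to rule out is a version variable set too low. The sketch below shows one way to compare versions in the shell. `version_at_least` and `check_lab_versions` are hypothetical helpers, and the comparison assumes your `sort` supports the `-V` (version sort) option, as GNU coreutils and recent macOS releases do.

```bash
# Hypothetical helper: succeed when version $1 is greater than or equal to $2.
# A leading "v" is stripped from both arguments before comparing.
# Assumption: `sort -V` is available on this system.
version_at_least() {
  local have="${1#v}" want="${2#v}"
  # After a version sort, the smaller version comes first; the check
  # passes only if the required minimum is that first line.
  [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}

# Hypothetical helper: warn about lab variables that are unset or too old.
check_lab_versions() {
  version_at_least "${REDPANDA_VERSION:-}" 24.3.1 \
    || printf 'REDPANDA_VERSION must be at least 24.3.1\n' >&2
  version_at_least "${REDPANDA_CONSOLE_VERSION:-}" 3.0.0 \
    || printf 'REDPANDA_CONSOLE_VERSION must be at least v3.0.0\n' >&2
}
```

With the example values from steps 3 and 4 exported, `check_lab_versions` prints nothing.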

## Alternative query interfaces

While the notebook server is running, you can query Iceberg tables directly with Spark's CLI tools instead of Jupyter Notebook:

Spark Shell

```bash
docker exec -it spark-iceberg spark-shell
```

Spark SQL

```bash
docker exec -it spark-iceberg spark-sql
```

PySpark

```bash
docker exec -it spark-iceberg pyspark
```
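
The commands above open interactive sessions. For a one-off query, `spark-sql` also accepts a statement through its `-e` option. The following is a sketch only: `query_iceberg` is a hypothetical wrapper, and it assumes the `spark-iceberg` container from this lab is running and that the table already contains committed data.

```bash
# Hypothetical wrapper (not part of the lab): run a single SQL statement
# in the lab's spark-iceberg container and exit.
# No -it flags are passed because the command is non-interactive.
query_iceberg() {
  docker exec spark-iceberg spark-sql -e "$1"
}

# Example call (requires the lab to be running):
# query_iceberg 'SELECT * FROM lab.redpanda.value_schema_id_prefix LIMIT 10'
```

Keeping the statement in a single quoted argument avoids the host shell expanding characters such as `*` before they reach Spark.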

## Clean up

To shut down and delete the containers along with all your cluster data:

```bash
docker compose down -v
```
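
The command above removes the containers and volumes but leaves behind the `rpk` profile and the environment variables created earlier in the lab. As an optional sketch, the hypothetical `cleanup_lab_shell` helper below removes those as well. It assumes the profile is still named `docker-compose-iceberg`, as created in step 6.

```bash
# Hypothetical helper (not part of the lab): remove the local leftovers
# that `docker compose down -v` does not touch.
cleanup_lab_shell() {
  # Delete the rpk profile created in step 6.
  rpk profile delete docker-compose-iceberg
  # Clear the version variables exported in steps 3 and 4.
  unset REDPANDA_VERSION REDPANDA_CONSOLE_VERSION
}
```

Call `cleanup_lab_shell` in the same shell session where you exported the variables, because `unset` only affects the current shell.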