# Redpanda Migrator

---
title: Redpanda Migrator
latest-operator-version: v26.1.4
latest-console-tag: v3.7.3
latest-connect-version: 4.93.0
latest-redpanda-tag: v26.1.9
docname: connect/cookbooks/redpanda_migrator
page-component-name: cloud-data-platform
page-version: master
page-component-version: master
page-component-title: Cloud
page-relative-src-path: connect/cookbooks/redpanda_migrator.adoc
page-edit-url: https://github.com/redpanda-data/cloud-docs/edit/main/modules/develop/pages/connect/cookbooks/redpanda_migrator.adoc
description: Move your workloads from any Kafka system to Redpanda Cloud using a single command.
page-git-created-date: "2024-10-02"
page-git-modified-date: "2026-05-26"
---

<!-- Source: https://docs.redpanda.com/cloud-data-platform/develop/connect/cookbooks/redpanda_migrator.md -->

With Redpanda Migrator, you can move your workloads from any Apache Kafka system to Redpanda using a single command. It lets you migrate Kafka messages, schemas, and ACLs quickly and efficiently.

Redpanda Connect’s Redpanda Migrator uses the unified migrator components (available in Redpanda Connect 4.67.5+):

-   [`redpanda_migrator` input](https://docs.redpanda.com/cloud-data-platform/develop/connect/components/inputs/redpanda_migrator/) connects to the source Kafka cluster and Schema Registry.

-   [`redpanda_migrator` output](https://docs.redpanda.com/cloud-data-platform/develop/connect/components/outputs/redpanda_migrator/) handles all migration logic including topic creation, schema synchronization, and consumer group offset translation.


> 📝 **NOTE**
>
> If you’re currently using the legacy `redpanda_migrator_bundle` components, see [Migrate to the Unified Redpanda Migrator](https://docs.redpanda.com/cloud-data-platform/develop/connect/guides/migrate-unified-redpanda-migrator/) for migration instructions.

## Create a Kafka cluster and a Redpanda Cloud cluster

First, provision two clusters: a Kafka cluster called `source` and a Redpanda Cloud cluster called `destination`. This cookbook uses the following sample connection details throughout:

**Source**

-   Broker: `source.cloud.kafka.com:9092`

-   Schema Registry: `https://schema-registry-source.cloud.kafka.com:30081`

-   Username: `kafka`

-   Password: `testpass`

**Destination**

-   Broker: `destination.cloud.redpanda.com:9092`

-   Schema Registry: `https://schema-registry-destination.cloud.redpanda.com:30081`

-   Username: `redpanda`

-   Password: `testpass`

Next, create two topics, `foo` and `bar`, in the `source` Kafka cluster, and add an ACL for each topic (an allow rule for `foo` and a deny rule for `bar`):

```bash
cat > ./config.properties <<EOF
security.protocol=SASL_SSL
sasl.mechanism=SCRAM-SHA-256
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="kafka" password="testpass";
EOF

kafka-topics.sh --bootstrap-server source.cloud.kafka.com:9092 --command-config config.properties --create --if-not-exists --topic foo --replication-factor=3 --partitions=2

kafka-topics.sh --bootstrap-server source.cloud.kafka.com:9092 --command-config config.properties --create --if-not-exists --topic bar --replication-factor=3 --partitions=2

kafka-topics.sh --bootstrap-server source.cloud.kafka.com:9092 --command-config config.properties --list --exclude-internal

kafka-acls.sh --bootstrap-server source.cloud.kafka.com:9092 --command-config config.properties --add --allow-principal User:redpanda --operation Read --topic foo

kafka-acls.sh --bootstrap-server source.cloud.kafka.com:9092 --command-config config.properties --add --deny-principal User:redpanda --operation Read --topic bar

kafka-acls.sh --bootstrap-server source.cloud.kafka.com:9092 --command-config config.properties --list
```

## Create schemas

When the demo clusters are up and running, use [curl](https://curl.se) to create a schema for each topic in the `source` cluster.

```bash
curl -X POST -u "kafka:testpass" -H "Content-Type: application/vnd.schemaregistry.v1+json" --data '{"schema": "{\"name\": \"Foo\", \"type\": \"record\", \"fields\": [{\"name\": \"data\", \"type\": \"int\"}]}"}' https://schema-registry-source.cloud.kafka.com:30081/subjects/foo/versions

curl -X POST -u "kafka:testpass" -H "Content-Type: application/vnd.schemaregistry.v1+json" --data '{"schema": "{\"name\": \"Bar\", \"type\": \"record\", \"fields\": [{\"name\": \"data\", \"type\": \"int\"}]}"}' https://schema-registry-source.cloud.kafka.com:30081/subjects/bar/versions
```

## Generate messages

Let’s simulate an application with a producer and a consumer actively publishing and reading messages on the `source` cluster. You can use Redpanda Connect to generate some Avro-encoded messages and push them to the two topics in the `source` cluster.

1.  Go to the **Connect** page on your cluster and click **Create pipeline**.

2.  In **Pipeline name**, enter a name and add a short description.

3.  For **Compute units**, leave the default value of **1**. Compute units are used to allocate server resources to a pipeline. One compute unit is equivalent to 0.1 CPU and 400 MB of memory.

4.  For **Configuration**, paste the following configuration.

    > 📝 **NOTE**
    >
    > The Brave browser does not fully support code snippets.

    `generate_data.yaml`

    ```yaml
    http:
      enabled: false

    input:
      sequence:
        inputs:
          - generate:
              mapping: |
                let msg = counter()
                root.data = $msg

                meta kafka_topic = match $msg % 2 {
                  0 => "foo"
                  1 => "bar"
                }
              interval: 1s
              count: 0
              batch_size: 1

            processors:
              - schema_registry_encode:
                  url: "https://schema-registry-source.cloud.kafka.com:30081"
                  subject: ${! metadata("kafka_topic") }
                  avro_raw_json: true
                  basic_auth:
                    enabled: true
                    username: kafka
                    password: testpass

    output:
      kafka_franz:
        seed_brokers: [ "source.cloud.kafka.com:9092" ]
        topic: ${! @kafka_topic }
        partitioner: manual
        partition: ${! random_int(min:0, max:1) }
        tls:
          enabled: true
        sasl:
          - mechanism: SCRAM-SHA-256
            username: kafka
            password: testpass
    ```

5.  Click **Create**. Your pipeline details are displayed and the pipeline state changes from **Starting** to **Running**, which may take a few minutes. If you don’t see this state change, refresh your page.


Next, add a Redpanda Connect consumer, which reads messages from the `source` cluster topics, and leave it running. This consumer uses the `foobar` consumer group, which is reused in a later step when consuming from the `destination` cluster.

1.  Go to the **Connect** page on your cluster and click **Create pipeline**.

2.  In **Pipeline name**, enter a name and add a short description.

3.  For **Compute units**, leave the default value of **1**.

4.  For **Configuration**, paste the following configuration.

    `read_data_source.yaml`

    ```yaml
    http:
      enabled: false

    input:
      kafka_franz:
        seed_brokers: [ "source.cloud.kafka.com:9092" ]
        topics:
          - '^[^_]' # Skip topics which start with `_`
        regexp_topics: true
        consumer_group: foobar
        tls:
          enabled: true
        sasl:
          - mechanism: SCRAM-SHA-256
            username: kafka
            password: testpass

      processors:
        - schema_registry_decode:
            url: "https://schema-registry-source.cloud.kafka.com:30081"
            avro_raw_json: true
            basic_auth:
              enabled: true
              username: kafka
              password: testpass

    output:
      stdout: {}
      processors:
        - mapping: |
            root = this.merge({"count": counter(), "topic": @kafka_topic, "partition": @kafka_partition})
    ```

5.  Click **Create**. Your pipeline details are displayed and the pipeline state changes from **Starting** to **Running**, which may take a few minutes. If you don’t see this state change, refresh your page.


At this point, the `source` cluster has some data in both `foo` and `bar` topics, and the consumer prints the messages it reads from these topics to `stdout`.
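
The `read_data_source.yaml` consumer selects topics with the regular expression `^[^_]`. As a quick local check of what that pattern matches, the following sketch filters a list of topic names with `grep -E`. The helper name `filter_user_topics` and the sample names `_schemas` and `__consumer_offsets` are illustrative assumptions, and Redpanda Connect evaluates the pattern with its own regular expression engine, which may differ from `grep` in edge cases.

```bash
#!/usr/bin/env bash
# Hypothetical helper: keep only topic names that match '^[^_]',
# that is, names whose first character is not an underscore.
filter_user_topics() {
  grep -E '^[^_]'
}

# Sample topic names (illustrative), one per line.
printf '%s\n' foo bar _schemas __consumer_offsets | filter_user_topics
```

With this sample input, only `foo` and `bar` are printed, because the other two names start with `_`.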

## Configure and start Redpanda Migrator

The unified Redpanda Migrator does the following:

-   The `redpanda_migrator` input connects to the source Kafka cluster and Schema Registry to consume messages and schema information.

-   The `redpanda_migrator` output handles all migration logic:

    -   Schema migration: reads schemas from the source Schema Registry and synchronizes them to the destination.

    -   Topic creation: automatically creates destination topics that don’t exist with proper configurations.

    -   ACL migration: migrates access control lists according to the migration rules.

    -   Message streaming: processes and routes messages from source to destination topics.

    -   Consumer group offset translation: maps source consumer group offsets to equivalent destination positions.


-   If new topics are created in the source cluster while the migrator is running, they are migrated when messages are written to them.


ACL migration for topics adheres to the following principles:

-   `ALLOW WRITE` ACLs for topics are not migrated

-   `ALLOW ALL` ACLs for topics are downgraded to `ALLOW READ`

-   Group ACLs are not migrated
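
The rules above can be read as a small decision function. The following is a minimal sketch of that reading, not the migrator's actual implementation: the function name `translate_acl` and its argument order are hypothetical, and passing every other topic ACL through unchanged is an assumption based on the rules listed here.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the topic ACL rules listed above.
# Usage: translate_acl RESOURCE_TYPE PERMISSION OPERATION
# Prints the "PERMISSION OPERATION" pair to create on the destination,
# or prints nothing when the ACL is skipped.
translate_acl() {
  local resource_type=$1 permission=$2 operation=$3

  # Group ACLs are not migrated. This sketch only models topic ACLs,
  # so any other resource type is skipped here as well.
  if [ "$resource_type" != "TOPIC" ]; then
    return 0
  fi

  case "$permission $operation" in
    "ALLOW WRITE") ;;                                # not migrated
    "ALLOW ALL")   echo "ALLOW READ" ;;              # downgraded to read-only
    *)             echo "$permission $operation" ;;  # assumption: unchanged
  esac
}

translate_acl TOPIC ALLOW ALL    # prints "ALLOW READ"
translate_acl TOPIC DENY READ    # prints "DENY READ"
translate_acl TOPIC ALLOW WRITE  # prints nothing
translate_acl GROUP ALLOW READ   # prints nothing
```

Under these assumptions, the `DENY READ` rule on `bar` and the `ALLOW READ` rule on `foo` created earlier both pass through unchanged.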


> 📝 **NOTE**
>
> Changing topic configurations, such as partition count, isn’t currently supported.

Now, use the following unified Redpanda Migrator configuration. See the [`redpanda_migrator` input](https://docs.redpanda.com/cloud-data-platform/develop/connect/components/inputs/redpanda_migrator/) and [`redpanda_migrator` output](https://docs.redpanda.com/cloud-data-platform/develop/connect/components/outputs/redpanda_migrator/) docs for details.

1.  Go to the **Connect** page on your cluster and click **Create pipeline**.

2.  In **Pipeline name**, enter a name and add a short description.

3.  For **Compute units**, leave the default value of **1**.

4.  For **Configuration**, paste the following configuration.

    `redpanda_migrator.yaml`

    ```yaml
    input:
      label: "migration_pipeline"
      redpanda_migrator:
        # Source Kafka settings
        seed_brokers: [ "source.cloud.kafka.com:9092" ]
        topics:
          - '^[^_]' # Skip internal topics which start with `_`
        regexp_topics: true
        consumer_group: migrator
        tls:
          enabled: true
        sasl:
          - mechanism: SCRAM-SHA-256
            username: kafka
            password: testpass

        # Source Schema Registry settings
        schema_registry:
          url: "https://schema-registry-source.cloud.kafka.com:30081"
          basic_auth:
            enabled: true
            username: kafka
            password: testpass

    output:
      label: "migration_pipeline"
      redpanda_migrator:
        # Destination Redpanda settings
        seed_brokers: [ "destination.cloud.redpanda.com:9092" ]
        tls:
          enabled: true
        sasl:
          - mechanism: SCRAM-SHA-256
            username: redpanda
            password: testpass

        # Destination Schema Registry and migration settings
        schema_registry:
          url: https://schema-registry-destination.cloud.redpanda.com:30081
          include_deleted: true
          translate_ids: true
          basic_auth:
            enabled: true
            username: redpanda
            password: testpass

        # Consumer group migration settings
        consumer_groups:
          enabled: true
          interval: 30s

        serverless: false
    ```

5.  Click **Create**. Your pipeline details are displayed and the pipeline state changes from **Starting** to **Running**, which may take a few minutes. If you don’t see this state change, refresh your page.


> 💡 **TIP**
>
> Label names must be between 3 and 128 characters and can only contain alphanumeric characters, hyphens, and underscores (`A-Za-z0-9-_`).

## Check the status of migrated topics

You can use the Redpanda [`rpk` CLI tool](https://docs.redpanda.com/streaming/current/get-started/rpk/) to check which topics and ACLs have been migrated to the `destination` cluster. You can quickly [install `rpk`](https://docs.redpanda.com/streaming/current/get-started/rpk-install/) if you don’t already have it.

> 📝 **NOTE**
>
> Redpanda Migrator doesn’t currently migrate users, so you must migrate them manually. This step isn’t required for the current demo. Similarly, roles are specific to Redpanda and, for now, must also be migrated manually if the `source` cluster is based on Redpanda.

```bash
rpk -X brokers=destination.cloud.redpanda.com:9092 -X tls.enabled=true -X sasl.mechanism=SCRAM-SHA-256 -X user=redpanda -X pass=testpass topic list
NAME      PARTITIONS  REPLICAS
_schemas  1           1
bar       2           1
foo       2           1

rpk -X brokers=destination.cloud.redpanda.com:9092 -X tls.enabled=true -X sasl.mechanism=SCRAM-SHA-256 -X user=redpanda -X pass=testpass security acl list
PRINCIPAL      HOST  RESOURCE-TYPE  RESOURCE-NAME  RESOURCE-PATTERN-TYPE  OPERATION  PERMISSION  ERROR
User:redpanda  *     TOPIC          bar            LITERAL                READ       DENY
User:redpanda  *     TOPIC          foo            LITERAL                READ       ALLOW
```

## Check metrics to monitor progress

Redpanda Connect provides a comprehensive suite of metrics in various formats, such as Prometheus, which you can use to monitor its performance in your observability stack. Besides the [standard Redpanda Connect metrics](https://docs.redpanda.com/cloud-data-platform/develop/connect/components/metrics/about/#metric-names), the `redpanda_migrator` input also emits an `input_redpanda_migrator_lag` metric for monitoring the migration progress of each topic and partition.

To monitor the migration progress, use the Redpanda Cloud OpenMetrics endpoint, which exposes all Redpanda and connector metrics for your cluster. You can integrate this endpoint with Prometheus, Datadog, or other observability platforms.

For step-by-step instructions on configuring monitoring and connecting your observability tool, see [Monitor Redpanda Cloud](https://docs.redpanda.com/cloud-data-platform/manage/monitor-cloud/).

After ingesting the metrics, search for the `input_redpanda_migrator_lag` metric in your monitoring tool and filter by `topic` and `partition` as needed to track migration lag for each topic and partition.
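
If you only have the raw metrics text at hand, the same per-topic view can be approximated on the command line. The following sketch sums `input_redpanda_migrator_lag` per topic from Prometheus-style text on standard input. The helper names `lag_by_topic` and `sample_metrics`, the sample values, and the exact label layout (only `topic` and `partition`, no trailing timestamp) are assumptions for illustration; the real endpoint may expose additional labels.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read Prometheus/OpenMetrics text on stdin and print
# the summed input_redpanda_migrator_lag value for each topic.
lag_by_topic() {
  awk '
    index($0, "input_redpanda_migrator_lag{") == 1 {
      # Extract the value of the topic="..." label.
      if (match($0, /topic="[^"]*"/)) {
        topic = substr($0, RSTART + 7, RLENGTH - 8)
        lag[topic] += $NF   # assumes the sample value is the last field
      }
    }
    END { for (t in lag) print t, lag[t] }
  ' | sort
}

# Made-up sample data in the assumed format.
sample_metrics() {
  cat <<'EOF'
# TYPE input_redpanda_migrator_lag gauge
input_redpanda_migrator_lag{topic="foo",partition="0"} 3
input_redpanda_migrator_lag{topic="foo",partition="1"} 2
input_redpanda_migrator_lag{topic="bar",partition="0"} 0
input_redpanda_migrator_lag{topic="bar",partition="1"} 7
EOF
}

sample_metrics | lag_by_topic
```

For the sample data, this prints `bar 7` and `foo 5`. A lag that trends toward zero for every topic suggests the migrator has caught up with the source.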

## Read from the migrated topics

Stop the `read_data_source.yaml` consumer you started earlier, then start a similar consumer for the `destination` cluster. Before starting the consumer on the `destination` cluster, give Redpanda Migrator some time to replicate the translated offsets.

1.  On the **Connect** page, stop the `read_data_source` pipeline you created earlier.

2.  Go to the **Connect** page on your cluster and click **Create pipeline**.

3.  In **Pipeline name**, enter a name and add a short description.

4.  For **Compute units**, leave the default value of **1**.

5.  For **Configuration**, paste the following configuration.

    `read_data_destination.yaml`

    ```yaml
    http:
      enabled: false

    input:
      kafka_franz:
        seed_brokers: [ "destination.cloud.redpanda.com:9092" ]
        topics:
          - '^[^_]' # Skip topics which start with `_`
        regexp_topics: true
        consumer_group: foobar
        tls:
          enabled: true
        sasl:
          - mechanism: SCRAM-SHA-256
            username: redpanda
            password: testpass

      processors:
        - schema_registry_decode:
            url: "https://schema-registry-destination.cloud.redpanda.com:30081"
            avro_raw_json: true
            basic_auth:
              enabled: true
              username: redpanda
              password: testpass

    output:
      stdout: {}
      processors:
        - mapping: |
            root = this.merge({"count": counter(), "topic": @kafka_topic, "partition": @kafka_partition})
    ```

6.  Click **Create**. Your pipeline details are displayed and the pipeline state changes from **Starting** to **Running**, which may take a few minutes. If you don’t see this state change, refresh your page.


The `destination` cluster consumer uses the same `foobar` consumer group as the `source` consumer, so it resumes reading messages from where the `source` consumer left off.

Redpanda Migrator performs offset remapping when migrating consumer group offsets to the `destination` cluster. While more sophisticated approaches are possible, Redpanda chose a simple timestamp-based approach: for each migrated offset, Redpanda Migrator queries the `destination` cluster for the latest offset before the timestamp of the received offset. It then writes this offset as the `destination` consumer group offset for the corresponding topic and partition pair.

Although the timestamp-based approach doesn’t guarantee exactly-once delivery, it minimizes the likelihood of message duplication and avoids the need for complex and error-prone offset remapping logic.
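
To make the timestamp-based lookup concrete, the following sketch models one destination partition as a list of `offset:timestamp` pairs and picks the latest offset whose timestamp is earlier than the timestamp of the source consumer group's committed offset. The function name `translate_offset`, the input format, and the sample numbers are hypothetical; this illustrates the idea only and is not how Redpanda Migrator queries the cluster.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of timestamp-based offset translation.
# Usage: translate_offset COMMITTED_TIMESTAMP OFFSET:TIMESTAMP...
# Pairs describe destination records in offset order. Prints the latest
# destination offset whose timestamp is earlier than COMMITTED_TIMESTAMP,
# or an empty line if no such record exists.
translate_offset() {
  local committed_ts=$1
  shift
  local result="" pair offset ts
  for pair in "$@"; do
    offset=${pair%%:*}   # text before the colon
    ts=${pair##*:}       # text after the colon
    if [ "$ts" -lt "$committed_ts" ]; then
      result=$offset
    fi
  done
  echo "$result"
}

# Destination records at offsets 10 to 13 with timestamps 100 to 400.
translate_offset 250 10:100 11:200 12:300 13:400   # prints 11
```

Because the match is made on timestamps rather than on an exact record identity, the translated position can land slightly before the point the source consumer had reached, which is why some messages may be read again after the switch.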