# Migrate Data to Redpanda with MirrorMaker 2

> For the complete documentation index, see [llms.txt](https://docs.redpanda.com/llms.txt). Component-specific: [streaming-full.txt](https://docs.redpanda.com/streaming-full.txt)

---
title: Migrate Data to Redpanda with MirrorMaker 2
latest-operator-version: v26.1.4
# EOL = End-of-Life (support lifecycle status)
page-is-nearing-eol: "false"
page-is-past-eol: "true"
page-eol-date: July 31, 2025
latest-console-tag: v3.7.3
latest-connect-version: 4.93.0
docname: migrate/data-migration
page-component-name: streaming
page-version: "24.2"
page-component-version: "24.2"
page-component-title: Streaming
page-relative-src-path: migrate/data-migration.adoc
page-edit-url: https://github.com/redpanda-data/docs/edit/v/24.2/modules/upgrade/pages/migrate/data-migration.adoc
description: Use MirrorMaker 2 to replicate data between Redpanda clusters.
page-git-created-date: "2023-08-17"
page-git-modified-date: "2024-07-12"
support-status: past end-of-life
---

<!-- Source: https://docs.redpanda.com/streaming/24.2/upgrade/migrate/data-migration.md -->

When you want to migrate data from Kafka or replicate data between Redpanda clusters, you can use the MirrorMaker 2 (MM2) tool bundled in the Kafka download package. This section describes the process of configuring MirrorMaker 2 to replicate data from one Redpanda cluster to another.

MirrorMaker 2 reads a configuration file that specifies the connection details for the Redpanda clusters, as well as other options. After you start MirrorMaker 2, the data migration continues according to the configuration at launch time until you shutdown the MirrorMaker 2 process.

See the [Kafka downloads page](https://kafka.apache.org/downloads) for download instructions, and the [Kafka documentation](https://kafka.apache.org/documentation/#georeplication) for details on cross-cluster data mirroring with MirrorMaker 2.

## [](#prerequisites)Prerequisites

To set up the replication, you need:

-   2 Redpanda clusters - You can set up these clusters in any deployment you like, including [Kubernetes](https://docs.redpanda.com/streaming/24.2/deploy/deployment-option/self-hosted/kubernetes/get-started-dev/), [Docker](https://docs.redpanda.com/streaming/24.2/get-started/quick-start/), and [Linux](https://docs.redpanda.com/streaming/24.2/deploy/deployment-option/self-hosted/manual/).

-   MirrorMaker 2 host - You install MirrorMaker 2 on a separate system or on one of the Redpanda clusters, as long as the IP addresses and ports on each cluster are accessible from the MirrorMaker 2 host. You must install the [Java Runtime Engine (JRE)](https://docs.oracle.com/javase/10/install/toc.htm) on the MirrorMaker 2 host.


## [](#install-mirrormaker-2)Install MirrorMaker 2

MirrorMaker 2 is run by a shell script that is part of the Kafka download package. To install MirrorMaker 2 on the machine that you want to run the replication between the clusters:

1.  Download the latest [Kafka download package](https://kafka.apache.org/downloads). For example:

    ```bash
    curl -O https://dlcdn.apache.org/kafka/3.3.1/kafka_2.13-3.3.1.tgz
    ```

2.  Extract the files from the archive:

    ```bash
    tar -xvf kafka_2.13-3.3.1.tgz
    ```


## [](#create-the-mirrormaker-2-configuration-files)Create the MirrorMaker 2 configuration files

MirrorMaker 2 uses configuration files to get the connection details for the clusters. You can find the MirrorMaker 2 script and configuration files in the expanded Kafka directory.

.
└── kafka\_2.13-3.3.1
    └── bin
        └── connect-mirror-maker.sh
    └── config
        └── connect-mirror-maker.properties

The sample configuration describes a number of the settings for MirrorMaker 2.

To create a basic configuration file, go to the `config` and run:

```bash
cat << EOF > mm2.properties
// Name the clusters
clusters = <cluster_name_1>, <cluster_name_2>

// Assign IP addresses to the cluster names
<cluster_name_1>.bootstrap.servers = <cluster_1_ip>:9092
<cluster_name_2>.bootstrap.servers = <cluster_2_ip>:9092

// Set replication for all topics from Redpanda 1 to Redpanda 2
<cluster_name_1>-><cluster_name_2>.enabled = true
<cluster_name_1>-><cluster_name_2>.topics = .*

// Setting replication factor of newly created remote topics
replication.factor = 1

//Make sure that your target cluster can accept larger message sizes. For example, 30MB messages
<cluster_name_2>.producer.max.request.size=31457280

//Set the frequency at which to check the source cluster for new topics (default is 10 minutes)
refresh.topics.interval.seconds = 5

//Make sure topics are created without a prefix
replication.policy.class=org.apache.kafka.connect.mirror.IdentityReplicationPolicy
EOF
```

> 📝 **NOTE**
>
> Remember to edit the variable names on the examples for your environment. For example, `<cluster_name_1>` can become `redpanda_1`.

## [](#run-mirrormaker-2-to-start-replication)Run MirrorMaker 2 to start replication

To start MirrorMaker 2 in the `kafka_2.13-3.3.1/bin/` directory, run:

```bash
./kafka_2.13-3.3.1/bin/connect-mirror-maker.sh mm2.properties
```

With this command, MirrorMaker 2 consumes all topics from the `<cluster_name_1>` cluster and replicates them into the `<cluster_name_2>` cluster. MirrorMaker 2 adds the prefix `<cluster_name_1>` to the names of replicated topics.

## [](#change-topic-replication-factor)Change topic replication factor

The `replication.factor` property is a topic property set at topic creation. A replication factor of three generally provides a good balance between broker loss and replication overhead.

You can increase or decrease the replication factor on an existing topic. For example, if topics were initially created in a test environment with a replication factor of `1` (no replication), to change the topic replication factor to `3`, run:

```bash
rpk topic alter-config [TOPICS...] --set replication.factor=3
```

## [](#see-migration-in-action)See migration in action

Here are the basic commands to produce and consume streams:

1.  Create a topic in the source cluster called "twitch\_chat":

    ```bash
     rpk topic create twitch_chat -X brokers=<node IP>:<kafka API port>
    ```

2.  Produce messages to the topic:

    ```bash
     rpk topic produce twitch_chat -X brokers=<node IP>:<kafka API port>
    ```

    Type text into the topic and press Ctrl + D to separate between messages.

    Press Ctrl + C to exit the produce command.

3.  Consume (read) the messages from the destination cluster:

    ```bash
     rpk topic consume twitch_chat -X brokers=<node IP>:<kafka API port>
    ```

    Each message is shown with its metadata like the following:

    ```json
     {
        "message": "How do you stream with Redpanda?\n",
        "partition": 0,
        "offset": 1,
        "timestamp": "2021-02-10T15:52:35.251+02:00"
     }
    ```


Now you know the replication is working.

## [](#stop-mirrormaker-2)Stop MirrorMaker 2

To stop the MirrorMaker 2 process, use `top` to find its process ID, and then run: `kill <MirrorMaker 2 pid>`

## [](#message-size)Message size

When replicating larger message sizes with MirrorMaker 2 on the target cluster, you may get blocked with an error:

org.apache.kafka.common.errors.RecordTooLargeException: The message is xxxx bytes when serialized which is larger than 1048576, which is the value of the max.request.size configuration.

To address this issue, make sure that your `mm2.properties` configuration file on the target cluster allows bigger messages sizes. For example, for 30MB messages, you’d have the following line in the configuration file:

```bash
<cluster_name_2>.producer.max.request.size=31457280
```

## [](#running-mirrormaker-2-as-a-service)Running MirrorMaker 2 as a service

For production usage Redpanda recommends that you run MirrorMaker 2 as a systemd unit file.

To run MirrorMaker 2 as a systemd unit file:

1.  Edit `/etc/systemd/system/multi-user.target.wants/mm2.service` and add the following:

    ```ini
    [Unit]
    Description=Mirror Maker 2 service
    After=network.target
    #StartLimitIntervalSec=0
    [Service]
    Type=simple
    Restart=always
    LimitNOFILE=49152
    RestartSec=1
    User=root
    Environment=JAVA_HOME=/usr/lib/jvm/java-11-amazon-corretto
    ExecStart=/home/ec2-user/kafka_2.13-3.3.1/bin/connect-mirror-maker.sh /home/ec2-user/mm2.properties

    # Output to syslog
    StandardOutput=syslog
    StandardError=syslog
    SyslogIdentifier=mm2

    [Install]
    WantedBy=multi-user.target
    ```

    > 📝 **NOTE**
    >
    > The home directory and where you are running MirrorMaker2 from may vary. Note the Kafka folder location, as it may vary by version.

2.  Run:

    ```bash
    sudo systemctl daemon-reload
    ```

3.  Run:

    ```bash
    sudo systemctl start mm2.service
    ```


You can follow the progress with the `tail` command:

### Fedora/RedHat

```bash
tail -f /var/log/messages | grep mm2
```

### Debian/Ubuntu

```bash
tail -f /var/log/syslog | grep mm2
```

## [](#troubleshooting)Troubleshooting

If you run into any difficulty with data migration, you can request help in the Redpanda [Slack](https://redpanda.com/slack) community.

## Suggested labs

-   [Migrate Data with Redpanda Migrator](https://docs.redpanda.com/labs/docker-compose/redpanda-migrator/)

[Search all labs](https://docs.redpanda.com/labs)