Version: 23.1

Remote Read Replicas in Kubernetes

important

This feature requires an Enterprise license. To upgrade, contact Redpanda sales.

A Remote Read Replica topic is a read-only topic that mirrors a topic on a different cluster. Remote Read Replicas work with both Tiered Storage and archival storage.

When a topic has object storage enabled, you can create a separate remote cluster just for consumers of this topic, and populate its topics from object storage. A read-only topic on a remote cluster can serve any consumer, without increasing the load on the origin cluster. Use cases for Remote Read Replicas include data analytics, offline model training, and development clusters.

You can create Remote Read Replica topics in a Redpanda cluster that directly accesses data stored in cloud object storage. Because these read-only topics read from cloud object storage instead of from the topics' origin cluster, they have no impact on the performance of the origin cluster. Topic data can also be consumed in a region of your choice, regardless of the region where it was produced.

tip

To create a Remote Read Replica topic in another region, consider using a multi-region bucket to simplify deployment and optimize performance.

Prerequisites

You need the following:

  • An origin cluster with Tiered Storage set up.
  • A topic on the origin cluster, which you can use as a Remote Read Replica topic on the remote cluster.
  • A separate remote cluster in the same region as the bucket or container used for the origin cluster.
    • If you use a multi-region bucket/container, you can create the read replica cluster in any region that has that bucket/container.
    • If you use a single-region bucket/container, the remote cluster must be in the same region as the bucket/container.

To check if you already have a license key applied to your cluster:

kubectl exec redpanda-0 -c redpanda -n redpanda -- rpk cluster license info
note

For default values and documentation for configuration options, see the values.yaml file.

Configure object storage for the remote cluster

You must configure access to the same object storage as the origin cluster.

You can configure access to Amazon S3 with either an IAM role attached to the instance or with access keys.

To configure access to an S3 bucket with an IAM role:

  1. Configure an IAM role with read permissions for the S3 bucket.

  2. Override the following required cluster properties in the Helm chart:

    Replace the following placeholders:

    • <region>: The region of your S3 bucket.
    • <redpanda-bucket-name>: The name of your S3 bucket.
    cloud-storage.yaml
    storage:
      tieredConfig:
        cloud_storage_enabled: true
        cloud_storage_credentials_source: aws_instance_metadata
        cloud_storage_region: <region>
        cloud_storage_bucket: "none"

    helm upgrade --install redpanda redpanda/redpanda -n redpanda --create-namespace \
      --values cloud-storage.yaml

To configure access to an S3 bucket with access keys instead of an IAM role:

  1. Grant a user the following permissions to read objects on the bucket to be used with the cluster (or on all buckets):

    • GetObject
    • ListBucket
  2. Copy the user's access key and secret key. You use them as the values of the cloud_storage_access_key and cloud_storage_secret_key cluster properties.

  3. Override the following required cluster properties in the Helm chart:

    Replace the following placeholders:

    • <access-key>: The access key for your S3 bucket.
    • <secret-key>: The secret key for your S3 bucket.
    • <region>: The region of your S3 bucket.
    cloud-storage.yaml
    storage:
      tieredConfig:
        cloud_storage_enabled: true
        cloud_storage_credentials_source: config_file
        cloud_storage_access_key: <access-key>
        cloud_storage_secret_key: <secret-key>
        cloud_storage_region: <region>
        cloud_storage_bucket: "none"

    helm upgrade --install redpanda redpanda/redpanda -n redpanda --create-namespace \
      --values cloud-storage.yaml
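
The read permissions named in the steps above (GetObject and ListBucket) can be written as an IAM policy document. The following is a minimal sketch, not an official Redpanda policy: the bucket name my-redpanda-bucket and the file name read-replica-policy.json are hypothetical, and your environment may need additional statements.

```shell
# Write a read-only IAM policy for the bucket shared with the origin cluster.
# BUCKET is a hypothetical example value; substitute your own bucket name.
BUCKET="my-redpanda-bucket"

cat > read-replica-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": "arn:aws:s3:::${BUCKET}/*"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::${BUCKET}"
    }
  ]
}
EOF
```

GetObject applies to the objects in the bucket (the /* resource), while ListBucket applies to the bucket itself, which is why the sketch uses two statements. Attach the resulting policy to the IAM role or user that the remote cluster uses.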

Create a Remote Read Replica topic

To create the Remote Read Replica topic, run:

rpk topic create <topic_name> -c redpanda.remote.readreplica=<bucket_name>

  • For <topic_name>, use the same name as the original topic.
  • For <bucket_name>, use the bucket/container specified in the storage.tieredConfig.cloud_storage_bucket or storage.tieredConfig.cloud_storage_azure_container properties for the origin cluster.
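
In a Kubernetes deployment, the command above can be run inside a broker Pod, in the same way as the license check earlier on this page. The following sketch assumes the Pod redpanda-0, the container redpanda, and the namespace redpanda from that command. The function names are hypothetical helpers, not part of rpk.

```shell
# Hypothetical helper: create a Remote Read Replica topic on the remote cluster.
# Assumes the Pod, container, and namespace names used in the license check.
create_read_replica_topic() {
  topic_name="$1"    # same name as the topic on the origin cluster
  bucket_name="$2"   # bucket/container configured on the origin cluster
  kubectl exec redpanda-0 -c redpanda -n redpanda -- \
    rpk topic create "$topic_name" \
    -c "redpanda.remote.readreplica=$bucket_name"
}

# Hypothetical helper: show the topic so you can inspect its configuration.
describe_read_replica_topic() {
  kubectl exec redpanda-0 -c redpanda -n redpanda -- \
    rpk topic describe "$1"
}
```

For example, `create_read_replica_topic orders my-redpanda-bucket` followed by `describe_read_replica_topic orders`, where orders and my-redpanda-bucket are placeholder names.
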
note

Combining redpanda.remote.readreplica with redpanda.remote.read or redpanda.remote.write results in an error. For example, the following command fails because it sets all three properties on the same topic:

rpk topic create topic1 -p 32 -r 3 -c redpanda.remote.read=true -c redpanda.remote.write=false -c redpanda.remote.readreplica=bucket

Reduce lag in data availability

When object storage is enabled on a topic, Redpanda copies closed log segments to the configured object store. A log segment is closed when it reaches the configured segment size. A topic's object store therefore lags behind the local copy by up to one segment: the log_segment_size or, if set, the topic's segment.bytes value. To reduce this lag in data availability for the Remote Read Replica:

  • You can lower the value of segment.bytes. This lets Redpanda archive smaller log segments more frequently, at the cost of increasing I/O and file count.
  • Self-hosted implementations running version 22.3 or higher can set an idle timeout with storage.tieredConfig.cloud_storage_segment_max_upload_interval_sec to force Redpanda to periodically archive the contents of open log segments to object storage. This is useful if a topic's write rate is low and log segments stay open for long periods of time. The appropriate interval may depend on your total partition count: a system with fewer partitions can handle a higher number of segments per partition.
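
The two options above can be sketched as follows. All numbers are illustrative assumptions rather than recommendations, the function names are hypothetical helpers, and the Pod, container, and namespace names are taken from the license check earlier on this page. Both settings belong on the origin cluster, because that is where segments are uploaded.

```shell
# Rough time for one segment to fill and close (illustrative numbers only).
estimate_segment_fill_seconds() {
  segment_bytes="$1"   # value of segment.bytes (or log_segment_size)
  write_rate="$2"      # average bytes per second written to one partition (non-zero)
  echo $(( segment_bytes / write_rate ))
}

# 128 MiB segments at 1 MiB/s: a segment closes about every 128 seconds.
estimate_segment_fill_seconds 134217728 1048576

# Hypothetical helper: lower segment.bytes on a topic of the origin cluster.
set_segment_bytes() {
  kubectl exec redpanda-0 -c redpanda -n redpanda -- \
    rpk topic alter-config "$1" --set "segment.bytes=$2"
}

# Values fragment for the upload interval; 3600 seconds is an example value.
cat > upload-interval.yaml <<'EOF'
storage:
  tieredConfig:
    cloud_storage_segment_max_upload_interval_sec: 3600
EOF
```

To apply the values fragment, pass upload-interval.yaml as an additional --values flag to the helm upgrade command, alongside your existing values file. The estimate shows why low write rates matter: at a tenth of the write rate, the same segment takes ten times as long to close and upload.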
