Data Archiving

This feature requires an Enterprise license for self-hosted deployments. To upgrade, contact Redpanda sales.

With data archiving, you can enable remote write to back up topics to cloud storage. In the event of a data center failure, data corruption, or cluster migration, you can recover your archived data from the cloud back to your cluster. Data archiving is a use case of Tiered Storage.

Redpanda natively supports Tiered Storage with Amazon S3, Google Cloud Storage (GCS), and Microsoft Azure Blob Storage (ABS). Migrating topics from one cloud provider to another is not supported.

Prerequisites

To check if you already have a license key applied to your cluster:

rpk cluster license info

Configure data archiving

Data archiving requires a Tiered Storage configuration.

  1. Set up Tiered Storage for the cluster or for specific topics.

  2. If you want to read data from cloud storage, enable remote read. When remote read is disabled, data can only be read from local storage.

  3. Set retention limits.
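For a single topic, the steps above might look like the following sketch. It assumes Tiered Storage is already enabled and configured for the cluster, and it uses the topic properties redpanda.remote.write and redpanda.remote.read. The topic name my-topic, the seven-day retention value, and the run helper are illustrative assumptions, not required names. The helper only prints each command unless DRY_RUN=0 is set, so you can review the commands before applying them.

```shell
#!/usr/bin/env sh
# Sketch: enable archiving for one topic (hypothetical topic name).
TOPIC="${TOPIC:-my-topic}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command by default; execute it only when DRY_RUN=0.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Step 1: back up the topic to cloud storage (remote write).
run rpk topic alter-config "$TOPIC" --set redpanda.remote.write=true

# Step 2 (optional): allow consumers to read data from cloud storage.
run rpk topic alter-config "$TOPIC" --set redpanda.remote.read=true

# Step 3: set a retention limit; 604800000 ms is 7 days (example value).
run rpk topic alter-config "$TOPIC" --set retention.ms=604800000
```

To apply the changes to a real cluster, run the script with DRY_RUN=0 and TOPIC set to your topic name.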

To recover a topic from cloud storage, use remote recovery.
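As a rough sketch, remote recovery is requested when the topic is created, by setting the redpanda.remote.recovery topic property. This example assumes the topic no longer exists on the cluster and that the name passed to rpk matches the archived topic in the bucket. The name my-topic and the run helper are illustrative assumptions. The helper prints the command unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env sh
# Sketch: recreate a topic from its archived data (hypothetical topic name).
TOPIC="${TOPIC:-my-topic}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command by default; execute it only when DRY_RUN=0.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Create the topic with remote recovery enabled, and keep remote write and
# remote read on so the restored topic continues to be archived and readable.
run rpk topic create "$TOPIC" \
  -c redpanda.remote.recovery=true \
  -c redpanda.remote.write=true \
  -c redpanda.remote.read=true
```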

Stop data archiving

To cancel archiving jobs, disable remote write.

To delete archived data, lower the topic's retention.ms or retention.bytes so that data beyond the new limit becomes eligible for deletion.
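A minimal sketch of both actions for one topic follows. It assumes remote write is controlled through the topic-level redpanda.remote.write property. The topic name my-topic, the one-day retention value, and the run helper are illustrative assumptions. The helper prints each command unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env sh
# Sketch: stop archiving a topic and shorten its retention (hypothetical names).
TOPIC="${TOPIC:-my-topic}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command by default; execute it only when DRY_RUN=0.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Stop uploading new data for this topic.
run rpk topic alter-config "$TOPIC" --set redpanda.remote.write=false

# Shorten retention; 86400000 ms is 1 day (example value).
run rpk topic alter-config "$TOPIC" --set retention.ms=86400000
```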