This is documentation for Self-Managed v23.3. To view the latest available version of the docs, see v24.3.

Whole Cluster Restore for Disaster Recovery

With Tiered Storage enabled, you can use whole cluster restore to restore data from a failed cluster (source cluster), including its metadata, onto a new cluster (target cluster). This is a simpler and cheaper alternative to active-active replication, for example with MirrorMaker 2. Use this recovery method to restore your application to the latest functional state as quickly as possible.

You cannot use whole cluster restore if the target cluster is in recovery mode.

Whole cluster restore is not a fully functional disaster recovery solution. It does not provide snapshot-style consistency: some partitions in some topics will be more up to date than others, and committed transactions are not guaranteed to be atomic.

If you need to restore only a subset of topic data, consider using remote recovery instead of a whole cluster restore.

The following metadata is included in a whole cluster restore:

- Topic definitions. If you have enabled Tiered Storage only for specific topics, topics without Tiered Storage enabled are restored empty.
- Users and access control lists (ACLs).
- Schemas. To ensure that your schemas are also archived and restored, you must also enable Tiered Storage for the _schemas topic.
- The consumer offsets topic. Some restored committed consumer offsets may be truncated to a lower value than in the original cluster, to keep offsets at or below the highest restored offset in the partition.
- Transaction metadata, up to the highest committed transaction. In-flight transactions are treated as aborted and are not included in the restore.
- Cluster configurations, including your Redpanda license key, with the exception of the following properties:
    cloud_storage_cache_size
    cluster_id
    cloud_storage_access_key
    cloud_storage_secret_key
    cloud_storage_region
    cloud_storage_bucket
    cloud_storage_api_endpoint
    cloud_storage_credentials_source
    cloud_storage_trust_file
    cloud_storage_backend
    cloud_storage_credentials_host
    cloud_storage_azure_storage_account
    cloud_storage_azure_container
    cloud_storage_azure_shared_key
    cloud_storage_azure_adls_endpoint
    cloud_storage_azure_adls_port

Manage source metadata uploads

By default, Redpanda uploads cluster metadata to object storage periodically. You can manage metadata uploads for your source cluster, or disable them entirely, with the following cluster configuration properties:

- enable_cluster_metadata_upload_loop: Enable metadata uploads. This property is enabled by default and is required for whole cluster restore.
- cloud_storage_cluster_metadata_upload_interval_ms: Set the time interval to wait between metadata uploads.
- controller_snapshot_max_age_sec: Maximum amount of time that can pass before Redpanda attempts to take a controller snapshot after a new controller command appears. This property affects how current the uploaded metadata can be.

Restore data from a source cluster

To restore data from a source cluster:

1. Start a target cluster (new cluster).
2. Restore data from a failed source cluster to the new cluster.

Prerequisites

You must have the following:

- Tiered Storage enabled on the source cluster.
- Physical or virtual machines on which to deploy the target cluster.

Limitations

- Whole cluster restore supports only one source cluster. It is not possible to consolidate multiple clusters onto the target cluster.
- If a duplicate cluster configuration is found in the target cluster, it will be overwritten by the restore.
- The target cluster should not contain user-managed or application-managed topic data, schemas, users, ACLs, or ongoing transactions.
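As a rough sketch, the metadata upload properties described under "Manage source metadata uploads" could be set on the source cluster with rpk cluster config set. The values below are illustrative assumptions rather than recommendations, and the RPK dry-run variable and the configure_metadata_uploads function name are hypothetical helpers introduced only for this example. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Dry run by default: each command is printed, not executed.
# Set RPK=rpk in the environment to run the commands against the source cluster.
RPK="${RPK:-echo rpk}"

# Hypothetical helper that groups the three settings named in this section.
configure_metadata_uploads() {
  # Keep periodic metadata uploads on (required for whole cluster restore).
  $RPK cluster config set enable_cluster_metadata_upload_loop true

  # Assumed example value: wait 10 minutes between uploads (milliseconds).
  $RPK cluster config set cloud_storage_cluster_metadata_upload_interval_ms 600000

  # Assumed example value: snapshot the controller at most 60 seconds
  # after a new controller command appears.
  $RPK cluster config set controller_snapshot_max_age_sec 60
}

configure_metadata_uploads
```

Shorter intervals keep the uploaded metadata more current at the cost of more frequent uploads, so the right values depend on how much metadata staleness your recovery plan can tolerate.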
Start a target cluster

Follow the steps to deploy a new cluster. Make sure to configure the target cluster with the same Tiered Storage settings as the source cluster.

Restore to target cluster

You can restore data from a source cluster to a target cluster using the rpk cluster storage restore command.

1. Restore data from the source cluster:

    rpk cluster storage restore start -w

   The wait flag (-w) tells the command to poll the status of the restore process and exit when the restore completes.

2. Check if a rolling restart is required:

    rpk cluster config status

   Example output when a restart is required:

    NODE  CONFIG-VERSION  NEEDS-RESTART  INVALID  UNKNOWN
    1     4               true           []       []

3. If a restart is required, perform a rolling restart.

When the cluster restore completes successfully, you can redirect your application workload to the new cluster. Make sure to update your application code to use the new addresses of your brokers.
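The restart check in the restore steps could be scripted along these lines. This is a minimal sketch that assumes the column layout shown in the example output, with NEEDS-RESTART as the third column. The needs_restart function name is hypothetical, and the sample text stands in for real rpk cluster config status output so that the script runs without a cluster.

```shell
#!/bin/sh
# Reads `rpk cluster config status` output on stdin.
# Exits 0 if any broker row has NEEDS-RESTART set to true, 1 otherwise.
# Assumes the header row comes first and NEEDS-RESTART is column 3.
needs_restart() {
  awk 'NR > 1 && $3 == "true" { found = 1 } END { exit (found ? 0 : 1) }'
}

# Sample output copied from this page, used here instead of a live cluster.
sample='NODE  CONFIG-VERSION  NEEDS-RESTART  INVALID  UNKNOWN
1     4               true           []       []'

if printf '%s\n' "$sample" | needs_restart; then
  echo "rolling restart required"
else
  echo "no restart required"
fi
```

Against a live cluster, the same function could read from a pipe instead, for example rpk cluster config status | needs_restart.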