Configure Failover

You are viewing the Self-Managed v25.3 beta documentation. We welcome your feedback at the Redpanda Community Slack #beta-feedback channel. To view the latest available version of the docs, see v25.2.

This feature requires an enterprise license. To get a trial license key or extend your trial period, generate a new trial license key. To purchase a license, contact Redpanda Sales. If Redpanda has enterprise features enabled and it cannot find a valid license, restrictions apply.

See Failover Runbook for immediate step-by-step disaster procedures.

Failover is the process of converting shadow topics, or an entire shadow cluster, from read-only replicas into fully writable resources and stopping replication from the source cluster. You can fail over individual topics for selective workload migration, or fail over the entire cluster for comprehensive disaster recovery. After failover, your shadow resources serve as production resources, so you can redirect application traffic to them when the source cluster becomes unavailable.

Failover behavior

When you initiate failover, Redpanda performs the following operations:

Stops replication: Halts all data fetching from the source cluster for the specified topics or the entire shadow link.
Fails over topics: Converts read-only shadow topics into regular, writable topics.
Updates topic state: Changes the topic status from ACTIVE to FAILING_OVER, then to FAILED_OVER.

Topic failover is irreversible. Once failed over, topics cannot return to shadow mode, and automatic fallback to the original source cluster is not supported.
Failover commands

You can perform failover at different levels of granularity to match your disaster recovery needs.

Individual topic failover

To fail over a specific shadow topic while the other topics in the shadow link continue replicating:

    rpk shadow failover <shadow-link-name> --topic <topic-name>

Use this approach when you need to selectively fail over specific workloads or when testing failover procedures.

Complete shadow link failover (cluster failover)

To fail over all shadow topics associated with the shadow link simultaneously:

    rpk shadow failover <shadow-link-name> --all

Use this approach during a complete regional disaster when you need to activate the entire shadow cluster as your new production environment.

Force delete shadow link (emergency failover)

To force delete a shadow link:

    rpk shadow delete <shadow-link-name> --force

Force deleting a shadow link is irreversible and immediately fails over all topics in the link, bypassing the normal failover state transitions. Use this action only as a last resort, when topics are stuck in transitional states and you need immediate access to all replicated data.

Failover states

Shadow link states

The shadow link itself has a simple state model:

ACTIVE: Shadow link is operating normally and replicating data.

Shadow links do not have dedicated failover states. Instead, the link's operational status is determined by the collective state of its shadow topics.

Shadow topic states

Individual shadow topics progress through specific states during failover:

ACTIVE: Normal replication state before failover.
FAULTED: Shadow topic has encountered an error and is not replicating.
FAILING_OVER: Failover initiated, replication stopping.
FAILED_OVER: Failover completed successfully, topic fully writable.

Monitor failover progress

Monitor failover progress using the status command:

    rpk shadow status <shadow-link-name>

The output shows individual topic states and any issues encountered during the failover process.
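For selective workload migration, the individual topic failover command and the status command above can be combined in a small wrapper. The following is a minimal sketch, not an official procedure. The function name failover_topics is hypothetical, and the sketch assumes that rpk shadow failover exits with a non-zero status on error and does not wait for interactive confirmation.

```shell
#!/usr/bin/env bash
# Sketch: fail over several shadow topics one at a time, then show link status.
# failover_topics is a hypothetical helper, not part of rpk.

failover_topics() {
  local link="$1"
  shift
  local failed=0
  local topic

  for topic in "$@"; do
    echo "Failing over topic: ${topic}"
    # Command form taken from this page; assumed to exit non-zero on error.
    if ! rpk shadow failover "${link}" --topic "${topic}"; then
      echo "Failover request failed for topic: ${topic}" >&2
      failed=$((failed + 1))
    fi
  done

  # Print the current topic states so the operator can review progress.
  rpk shadow status "${link}"

  # Succeed only if every failover request was accepted.
  [ "${failed}" -eq 0 ]
}
```

A call such as failover_topics my-shadow-link orders payments (placeholder names) would request failover for both topics and leave every other topic in the link replicating. Because topic failover is irreversible, review the topic list before running anything like this against a real cluster.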
Task states during monitoring:

ACTIVE: Task is operating normally and replicating data.
FAULTED: Task encountered an error and requires attention.
NOT_RUNNING: Task is not currently executing.
LINK_UNAVAILABLE: Task cannot communicate with the source cluster.

Post-failover cluster behavior

After successful failover, your shadow cluster exhibits the following characteristics:

Topic accessibility: Failed-over topics become fully writable and readable. Applications can produce and consume messages normally. All Kafka APIs are available for failed-over topics. Original offsets and timestamps are preserved.

Shadow link status: The shadow link remains but stops replicating data. The link status shows topics in the FAILED_OVER state. You can safely delete the shadow link after successful failover.

Operational limitations: There is no automatic fallback mechanism to the original source cluster. Data transforms remain disabled until you manually re-enable them. Audit log history from the source cluster is not available (new audit logs begin immediately).

Failover considerations and limitations

Data consistency: Some data loss may occur due to replication lag at the time of failover. Consumer group offsets are preserved, allowing applications to resume from their last committed position. In-flight transactions at the source cluster are not replicated and will be lost.

Recovery point objective (RPO): The amount of potential data loss depends on the replication lag when the disaster occurs. Monitor lag metrics to understand your effective RPO.

Network partitions: If the source cluster becomes accessible again after failover, do not write to both clusters simultaneously. Writing to both causes their data and metadata to diverge, which can leave the clusters inconsistent.

Testing requirements: Regularly test failover procedures in non-production environments to validate your disaster recovery processes and measure your recovery time objective (RTO).
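As one way to measure failover time during a drill, the sketch below starts a topic failover and polls the status command until the topic reports FAILED_OVER, then prints the elapsed seconds. This is an illustration under stated assumptions, not a supported tool. The helper names topic_failed_over and measure_failover_seconds are hypothetical, and the sketch assumes that rpk shadow status prints each topic's name and state on the same line. The real output format may differ, so adjust the matching logic to what your rpk version prints.

```shell
#!/usr/bin/env bash
# Sketch: time how long one topic takes to reach FAILED_OVER during a drill.
# ASSUMPTION: `rpk shadow status` prints each topic's name and state on the
# same line. The real output format may differ.

# Hypothetical helper: succeeds when the status text shows the topic as FAILED_OVER.
topic_failed_over() {
  local status_text="$1"
  local topic="$2"
  printf '%s\n' "${status_text}" | grep -F -w -- "${topic}" | grep -q -w 'FAILED_OVER'
}

# Hypothetical helper: start failover, poll status, print elapsed seconds.
measure_failover_seconds() {
  local link="$1"
  local topic="$2"
  local timeout="${3:-300}"
  local interval="${4:-5}"
  local start now
  start=$(date +%s)

  # Send rpk's own output to stderr so stdout carries only the elapsed time.
  rpk shadow failover "${link}" --topic "${topic}" >&2 || return 1

  while :; do
    if topic_failed_over "$(rpk shadow status "${link}")" "${topic}"; then
      now=$(date +%s)
      echo "$((now - start))"
      return 0
    fi
    now=$(date +%s)
    if [ "$((now - start))" -ge "${timeout}" ]; then
      echo "Timed out waiting for ${topic} to reach FAILED_OVER" >&2
      return 2
    fi
    sleep "${interval}"
  done
}
```

In a non-production environment, measure_failover_seconds my-shadow-link orders 120 2 (placeholder names and timings) would wait up to 120 seconds, checking every 2 seconds. The number it prints covers only the topic state transition. A full RTO measurement also includes the time needed to redirect applications to the shadow cluster.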
Next steps

After completing failover:

Update your application connection strings to point to the shadow cluster.
Verify that applications can produce and consume messages normally.
Consider deleting the shadow link if failover was successful and permanent.

For emergency situations, see Failover Runbook.
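The verification step can be sketched as a quick smoke test that writes one record to a failed-over topic and reads one record back. This is a minimal sketch under assumptions: smoke_test_topic is a hypothetical helper, the broker address and topic name are supplied by the caller, and the -X brokers override and -n flag reflect general rpk usage that you should confirm against your rpk version.

```shell
#!/usr/bin/env bash
# Sketch: confirm a failed-over topic accepts writes and serves reads.
# smoke_test_topic is a hypothetical helper, not part of rpk.

smoke_test_topic() {
  local brokers="$1"
  local topic="$2"
  local marker
  marker="failover-smoke-test-$(date +%s)"

  # Produce one record; rpk topic produce reads records from standard input.
  if ! printf '%s\n' "${marker}" | rpk topic produce "${topic}" -X brokers="${brokers}" >/dev/null; then
    echo "Produce to ${topic} failed" >&2
    return 1
  fi

  # Consume a single record to confirm the topic is readable.
  if ! rpk topic consume "${topic}" -n 1 -X brokers="${brokers}" >/dev/null; then
    echo "Consume from ${topic} failed" >&2
    return 1
  fi

  echo "OK: ${topic} is writable and readable"
}
```

For example, smoke_test_topic shadow-broker:9092 orders uses a placeholder address and topic. The test leaves one marker record in the topic, so downstream consumers must tolerate it, or you can run the check against a dedicated test topic instead.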