Monitor Shadowing

Monitor your shadow links to ensure proper replication performance and understand your disaster recovery readiness. Use rpk commands, metrics, and status information to track shadow link health and troubleshoot issues.

See Failover Runbook for immediate step-by-step disaster procedures.

Status commands

To list existing shadow links:

Cloud UI:

At the organization level of the Cloud UI, navigate to Shadow Link.

rpk:

    rpk shadow list

Control Plane API:

    curl 'https://api.redpanda.com/v1/shadow-links' \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer ${RP_CLOUD_TOKEN}"

To view shadow link configuration details:

Cloud UI:

From the Shadow Link page, select the shadow link you want to view. Click the Tasks tab to view all tasks and their status.

rpk:

    rpk shadow describe <shadow-link-name>

This command shows the complete configuration of the shadow link, including connection settings, filters, and synchronization options. For detailed command options, see rpk shadow list and rpk shadow describe.

Control Plane API:

    curl 'https://api.redpanda.com/v1/shadow-links/<shadow-link-id>' \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer ${RP_CLOUD_TOKEN}"

To check your shadow link status and ensure proper operation:

Cloud UI:

From the Shadow Link page, select the shadow link you want to view. Click the Tasks tab to view all tasks and their status.

rpk:

    rpk shadow status <shadow-link-name>

For troubleshooting specific issues, you can use command options to show individual status sections. See rpk shadow status for available status options.
Cloud API:

    # Get the Data Plane API URL of the shadow cluster.
    # jq -r prints a string result without the surrounding JSON quotes.
    export DATAPLANE_API_URL=$(curl -s 'https://api.cloud.redpanda.com/v1/clusters/<shadow-cluster-id>' \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer ${RP_CLOUD_TOKEN}" | jq -r .cluster.dataplane_api)

    # View shadow link status
    curl "https://$DATAPLANE_API_URL/v1/shadowlinks/<shadow-link-name>" \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer ${RP_CLOUD_TOKEN}"

    # View topic state
    curl "https://$DATAPLANE_API_URL/v1/shadowlinks/<shadow-link-name>/topic" \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer ${RP_CLOUD_TOKEN}"

The status includes the following:

- Shadow link state: Overall operational state (ACTIVE, PAUSED).
- Individual topic states: Current state of each replicated topic (ACTIVE, FAULTED, FAILING_OVER, FAILED_OVER, PAUSED).
- Task status: Health of replication tasks across brokers (ACTIVE, FAULTED, NOT_RUNNING, LINK_UNAVAILABLE). For details about shadow link tasks, see Shadow link tasks.
- Lag information: Replication lag per partition, showing the source and shadow high watermarks (HWM).

Metrics

Shadowing exposes metrics on the public_metrics endpoint so that you can track replication performance and health.

| Metric | Type | Description |
| --- | --- | --- |
| redpanda_shadow_link_client_errors | Counter | Total number of errors encountered by the Kafka client during shadow link operations. Monitor by shadow_link_name to identify connection issues, authentication failures, or other client-side problems. |
| redpanda_shadow_link_shadow_lag | Gauge | The lag of the shadow partition against the source partition, calculated as source partition LSO (Last Stable Offset) minus shadow partition HWM (High Watermark). Monitor by shadow_link_name, topic, and partition to understand replication lag for each partition. |
| redpanda_shadow_link_total_bytes_fetched | Counter | The total number of bytes fetched by a sharded replicator (bytes received by the client). Labeled by shadow_link_name and shard to track data transfer volume from the source cluster. |
| redpanda_shadow_link_total_bytes_written | Counter | The total number of bytes written by a sharded replicator (bytes written to the write_at_offset_stm). Labeled by shadow_link_name and shard to monitor data written to the shadow cluster. |
| redpanda_shadow_link_shadow_topic_state | Gauge | Number of shadow topics in each state. Labeled by shadow_link_name and state to monitor topic state distribution across your shadow links. |
| redpanda_shadow_link_total_records_fetched | Counter | The total number of records fetched by a sharded replicator (records received by the client). Labeled by shadow_link_name and shard to track message throughput from the source. |
| redpanda_shadow_link_total_records_written | Counter | The total number of records written by a sharded replicator (records written to the write_at_offset_stm). Labeled by shadow_link_name and shard to monitor message throughput to the shadow cluster. |

For detailed descriptions of each metric, including usage examples and label definitions, see Shadow Link metrics reference.

Monitoring best practices

Health check procedures

Establish regular monitoring workflows to ensure shadow link health:

Cloud UI:

From the Shadow Link page, select the shadow link you want to view. Click the Tasks tab to view all tasks and their status.
rpk:

    # Check all shadow links are active
    rpk shadow list | grep -v "ACTIVE" || echo "All shadow links healthy"

    # Monitor lag for critical topics
    rpk shadow status <shadow-link-name> | grep -E "LAG|Lag"

Cloud API:

    # Check all shadow links are active
    curl 'https://api.redpanda.com/v1/shadow-links' \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer ${RP_CLOUD_TOKEN}" | \
      jq -r 'if all(.state == "SHADOW_LINK_STATE_ACTIVE") then "All shadow links healthy" else .[] | select(.state != "SHADOW_LINK_STATE_ACTIVE") end'

    # Monitor lag for critical topics
    curl "https://$DATAPLANE_API_URL/v1/shadowlinks/<shadow-link-name>/topic" \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer ${RP_CLOUD_TOKEN}"

Alert conditions

Configure monitoring alerts for the following conditions, which indicate problems with Shadowing:

- High replication lag: redpanda_shadow_link_shadow_lag exceeds your recovery point objective (RPO) requirements.
- Topic state changes: Topics move to the FAULTED state.
- Task failures: Replication tasks enter the FAULTED or NOT_RUNNING state.
- Throughput drops: Bytes or records fetched drop significantly.
- Link unavailability: Tasks show LINK_UNAVAILABLE, which indicates source cluster connectivity issues.

For more information about shadow link tasks and their states, see Shadow link tasks.
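
As an illustration of the high replication lag condition, the following is a minimal sketch that scans metrics text for redpanda_shadow_link_shadow_lag samples above a threshold. It assumes the metrics are in Prometheus text format without trailing timestamps, so the sample value is the last field on each line. The function name lag_over_threshold, the METRICS_URL variable, and the threshold of 10000 are hypothetical; choose a threshold that matches your own RPO.

```shell
#!/usr/bin/env bash
# Sketch only: print every redpanda_shadow_link_shadow_lag sample whose value
# exceeds the threshold given as $1, reading Prometheus text format on stdin.
# Returns 1 if any partition is over the threshold, 0 otherwise.
# Assumption: samples carry no trailing timestamp, so the value is the last field.
lag_over_threshold() {
  awk -v max="$1" '
    # Match only lag gauge samples; "# HELP" and "# TYPE" lines do not match.
    /^redpanda_shadow_link_shadow_lag[{ ]/ {
      value = $NF + 0
      if (value > max) {
        print
        found = 1
      }
    }
    END { exit (found ? 1 : 0) }
  '
}

# Hypothetical usage: METRICS_URL is a placeholder for your metrics endpoint.
# curl -s "$METRICS_URL" | lag_over_threshold 10000 || echo "Lag exceeds threshold"
```

The non-zero exit status on a breach lets a scheduler or wrapper script raise the alert. In a production setup, the same condition is usually better expressed as an alerting rule in your monitoring system.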