# Monitor Kubernetes Shadow Links

This feature requires an enterprise license. To get a trial license key or extend your trial period, generate a new trial license key. To purchase a license, contact Redpanda Sales. If Redpanda has enterprise features enabled and it cannot find a valid license, restrictions apply.

Monitor your shadow links to ensure proper replication performance and understand your disaster recovery readiness. For Kubernetes deployments, you can monitor shadow links using the Redpanda Operator’s ShadowLink resource status or by running rpk commands directly.

See Kubernetes Failover Runbook for immediate step-by-step disaster procedures.

## Status commands

### Operator

To list existing shadow links:

```bash
kubectl get shadowlink --namespace <shadow-namespace>
```

Example output

```
NAME   SYNCED
link   True
```

A healthy shadow link shows `True` for `SYNCED`. If `SYNCED` is `False`, use `kubectl describe` to investigate the issue.
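For scripted checks, the `SYNCED` column can be filtered rather than read by eye. The following is a minimal sketch, assuming the two-column layout (`NAME`, `SYNCED`) from the example output; the function name `unsynced_links` is hypothetical and is not part of kubectl or the Redpanda Operator.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `kubectl get shadowlink` table output on stdin
# and prints the NAME of every shadow link whose SYNCED column is not "True".
# Assumes NAME is column 1 and SYNCED is column 2, as in the example output.
# A link with an empty SYNCED column (no status yet) is also reported.
unsynced_links() {
  awk 'NR > 1 && $2 != "True" { print $1 }'
}

# Usage against a live cluster (namespace is a placeholder):
#   kubectl get shadowlink --namespace <shadow-namespace> | unsynced_links
```

If the function prints nothing, every listed shadow link reported `True`. The column positions would shift if extra columns are requested (for example with `--all-namespaces`), so the sketch applies only to the single-namespace form of the command.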
To view detailed shadow link status and configuration:

```bash
kubectl describe shadowlink --namespace <shadow-namespace> <shadowlink-name>
```

Example output

```
Name:         link
Namespace:    redpanda-system
API Version:  cluster.redpanda.com/v1alpha2
Kind:         ShadowLink
Status:
  Conditions:
    Status:   True
    Type:     Synced
    Message:  Shadow link is synced
  Shadow Topics:
    Name:   orders
    State:  active
    Name:   inventory
    State:  active
  Tasks:
    Name:   Source Topic Sync
    State:  active
    Name:   Consumer Group Shadowing
    State:  active
    Name:   Security Migrator
    State:  active
```

The `kubectl describe` output shows:

- Shadow link state: Overall operational state in the Status section
- Individual topic states: Current state of each replicated topic under Shadow Topics
- Task status: Health of replication tasks under Tasks
- Sync status: Whether the resource is properly synced (`Synced: True` in conditions)
- Configuration: Complete shadow link configuration including connection settings and filters

Look for `Synced: True` in Conditions and the `active` state for topics and tasks.

For more detailed monitoring or troubleshooting, you can also use rpk commands as shown in the Helm tab.

### Helm

To list existing shadow links:

```bash
kubectl exec --namespace <shadow-namespace> <shadow-pod-name> --container redpanda -- \
  rpk shadow list
```

Example output

```
NAME                    UID                                   STATE
disaster-recovery-link  70f25b41-9bad-4e31-9f81-d302c8676397  ACTIVE
```

To view shadow link configuration details:

```bash
kubectl exec --namespace <shadow-namespace> <shadow-pod-name> --container redpanda -- \
  rpk shadow describe <shadow-link-name>
```

This command shows the complete configuration of the shadow link, including connection settings, filters, and synchronization options.

For detailed command options, see rpk shadow list and rpk shadow describe.
To check your shadow link status and ensure proper operation:

```bash
kubectl exec --namespace <shadow-namespace> <shadow-pod-name> --container redpanda -- \
  rpk shadow status <shadow-link-name>
```

Example output

```
OVERVIEW
===
NAME   disaster-recovery-link
UID    70f25b41-9bad-4e31-9f81-d302c8676397
STATE  ACTIVE

TASKS
===
NAME                      BROKER_ID  SHARD  STATE   REASON
Source Topic Sync         0          0      ACTIVE  Source Topic Sync has started
Consumer Group Shadowing  0          0      ACTIVE  Group mirroring task finished successfully
Security Migrator Task    0          0      ACTIVE  Security Migrator Task has started

TOPICS
===
Name: orders, State: ACTIVE
PARTITION  SRC_LSO  SRC_HWM  DST_HWM  LAG
0          1000     1234     1230     4
1          2000     2456     2450     6

Name: inventory, State: ACTIVE
PARTITION  SRC_LSO  SRC_HWM  DST_HWM  LAG
0          500      789      789      0
```

Key indicators:

- `STATE: ACTIVE`: The shadow link is replicating
- Tasks `ACTIVE`: All replication tasks are running
- `LAG`: Difference in message count between the source and shadow partitions (lower is better)

For troubleshooting specific issues, you can use command options to show individual status sections. See rpk shadow status for available status options.

The status output includes the following:

- Shadow link state: Overall operational state (`ACTIVE`, `PAUSED`).
- Individual topic states: Current state of each replicated topic (`ACTIVE`, `FAULTED`, `FAILING_OVER`, `FAILED_OVER`, `PAUSED`).
- Task status: Health of replication tasks across brokers (`ACTIVE`, `FAULTED`, `NOT_RUNNING`, `LINK_UNAVAILABLE`). For details about shadow link tasks, see Shadow link tasks.
- Lag information: Replication lag per partition showing source vs shadow high watermarks (HWM).

## Troubleshoot

### Topics in FAULTED state

When monitoring shadow links, you may see topics showing `FAULTED` state in status output.
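To pick such topics out of a long status listing, the output can be filtered for topic lines whose state is not `ACTIVE`. This is a minimal sketch, assuming topic lines have the form `Name: <topic>, State: <STATE>` as in the example output; the function name `non_active_topics` is hypothetical and is not an rpk feature.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `rpk shadow status` output on stdin and prints
# "<topic> <state>" for every topic whose state is not ACTIVE.
# Assumes topic lines look like "Name: orders, State: ACTIVE".
non_active_topics() {
  sed -n 's/^[[:space:]]*Name: \(.*\), State: \([A-Z_]*\).*$/\1 \2/p' |
    awk '$NF != "ACTIVE"'
}

# Usage against a live cluster (all names in angle brackets are placeholders):
#   kubectl exec --namespace <shadow-namespace> <shadow-pod-name> --container redpanda -- \
#     rpk shadow status <shadow-link-name> | non_active_topics
```

Empty output means every topic in the listing was `ACTIVE`. Task rows and partition rows do not match the `Name: ..., State: ...` pattern, so only topic lines are considered.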
Check shadow cluster logs for specific error messages:

```bash
kubectl logs --namespace <shadow-namespace> <shadow-pod-name> --container redpanda | grep -iE "shadow|error"
```

Common causes include:

- Source topic deleted: The topic no longer exists on the source cluster
- Permission denied: The shadow link service account lacks required permissions
- Network interruption: Temporary connectivity issues

If the source topic still exists and should be replicated, delete and recreate the shadow link to reset the faulted state.

### High replication lag

When monitoring shadow links, you may see `LAG` values continuously increasing in `rpk shadow status`.

Check the following:

- Source cluster load: A high produce rate may exceed replication capacity
- Shadow cluster resources: CPU, memory, or disk constraints
- Network bandwidth: Verify sufficient bandwidth between clusters

To resolve:

- Scale shadow cluster resources if constrained
- Verify network connectivity and bandwidth
- Review topic configuration for optimization opportunities

### Task shows LINK_UNAVAILABLE

When monitoring shadow links, you may see tasks showing `LINK_UNAVAILABLE` state with a "No brokers available" message.

Common causes include:

- The source cluster requires SASL authentication but the shadow link is not configured for it
- The source cluster is unreachable from the shadow cluster
- A network policy is blocking traffic between clusters

To resolve:

- Verify SASL configuration if the source cluster requires authentication
- Test network connectivity: `kubectl exec` into a shadow pod and try connecting to the source cluster
- Check Kubernetes NetworkPolicies and firewall rules

## Metrics

Shadowing exposes metrics for tracking replication performance and health on the `public_metrics` endpoint.

| Metric | Type | Description |
|---|---|---|
| `redpanda_shadow_link_shadow_lag` | Gauge | The lag of the shadow partition against the source partition, calculated as source partition LSO (Last Stable Offset) minus shadow partition HWM (High Watermark). Monitor by `shadow_link_name`, `topic`, and `partition` to understand replication lag for each partition. |
| `redpanda_shadow_link_total_bytes_fetched` | Counter | The total number of bytes fetched by a sharded replicator (bytes received by the client). Labeled by `shadow_link_name` and `shard` to track data transfer volume from the source cluster. |
| `redpanda_shadow_link_total_bytes_written` | Counter | The total number of bytes written by a sharded replicator (bytes written to the `write_at_offset_stm`). Labeled by `shadow_link_name` and `shard` to monitor data written to the shadow cluster. |
| `redpanda_shadow_link_client_errors` | Counter | The number of errors seen by the client. Track by `shadow_link_name` and `shard` to identify connection or protocol issues between clusters. |
| `redpanda_shadow_link_shadow_topic_state` | Gauge | The number of shadow topics in each state. Labeled by `shadow_link_name` and `state` to monitor topic state distribution across your shadow links. |
| `redpanda_shadow_link_total_records_fetched` | Counter | The total number of records fetched by a sharded replicator (records received by the client). Monitor by `shadow_link_name` and `shard` to track message throughput from the source. |
| `redpanda_shadow_link_total_records_written` | Counter | The total number of records written by a sharded replicator (records written to the `write_at_offset_stm`). Labeled by `shadow_link_name` and `shard` to monitor message throughput to the shadow cluster. |
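For a quick spot check without a full monitoring stack, the lag gauge can be filtered straight from the metrics text. The following is a minimal sketch under several assumptions: samples use the Prometheus text format with the value as the last field (no trailing timestamp), and the function name `lag_over_threshold` and the threshold value are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads Prometheus text-format metrics on stdin and prints
# every redpanda_shadow_link_shadow_lag sample whose value exceeds a threshold.
# $1 is the maximum acceptable lag (an assumption standing in for your RPO).
# Assumes the sample value is the last whitespace-separated field on the line.
lag_over_threshold() {
  awk -v max="$1" '
    /^redpanda_shadow_link_shadow_lag[{ ]/ && ($NF + 0) > (max + 0) { print }
  '
}

# Usage (assumes curl exists in the container and the Admin API serves
# /public_metrics on port 9644 without TLS; adjust for your deployment):
#   kubectl exec --namespace <shadow-namespace> <shadow-pod-name> --container redpanda -- \
#     curl -s http://localhost:9644/public_metrics | lag_over_threshold 1000
```

Each printed line keeps its labels, so the affected shadow link, topic, and partition can be read directly from the output. For continuous monitoring, an alerting rule in your metrics system is a better fit than polling with a script.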
See also: Public Metrics

## Monitoring best practices

### Health check procedures

Establish regular monitoring workflows to ensure shadow link health:

#### Operator

```bash
# Check that all shadow links are synced and healthy
kubectl get shadowlink --namespace <shadow-namespace>

# View detailed status for a specific shadow link
kubectl describe shadowlink --namespace <shadow-namespace> <shadowlink-name>

# List shadow links that do not report Synced=True,
# including links that have no status conditions yet
kubectl get shadowlink --namespace <shadow-namespace> -o json | \
  jq '.items[] | select(any(.status.conditions[]?; .type=="Synced" and .status=="True") | not) | .metadata.name'
```

#### Helm

```bash
# Check that all shadow links are active
# (skip the header row so that only shadow link rows are matched)
kubectl exec --namespace <shadow-namespace> <shadow-pod-name> --container redpanda -- \
  rpk shadow list | tail -n +2 | grep -v "ACTIVE" || echo "All shadow links healthy"

# Monitor lag for critical topics
kubectl exec --namespace <shadow-namespace> <shadow-pod-name> --container redpanda -- \
  rpk shadow status <shadow-link-name> | grep -E "LAG|Lag"
```

### Alert conditions

Configure monitoring alerts for the following conditions, which indicate problems with Shadowing:

- High replication lag: When `redpanda_shadow_link_shadow_lag` exceeds your RPO requirements
- Connection errors: When `redpanda_shadow_link_client_errors` increases rapidly
- Topic state changes: When topics move to `FAULTED` state
- Task failures: When replication tasks enter `FAULTED` or `NOT_RUNNING` states
- Throughput drops: When the rate of bytes or records fetched drops significantly
- Link unavailability: When tasks show `LINK_UNAVAILABLE`, indicating source cluster connectivity issues

For more information about shadow link tasks and their states, see Shadow link tasks.