Internal Metrics Reference
This section describes the internal metrics exported from Redpanda’s /metrics endpoint.
Use /public_metrics for your primary system health dashboards. Use /metrics for detailed analysis and debugging.
In a live system, Redpanda metrics are exported only for features that are in use. For example, a metric for consumer groups is not exported when no groups are registered. To see the available internal metrics in your system, query the /metrics endpoint.
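As a rough illustration of checking which internal metrics a broker currently exports, the sketch below extracts metric names from the `# HELP` lines of Prometheus-style exposition text. The helper names, the default URL (`localhost` on port 9644), and the sample text with its help strings are assumptions made for this example, not values taken from this reference. Adjust them to your deployment.

```python
import urllib.request


def metric_names(exposition_text: str) -> list[str]:
    """Return the sorted, unique metric names declared in `# HELP` lines."""
    names = set()
    for line in exposition_text.splitlines():
        parts = line.split(maxsplit=3)
        # A HELP line looks like: "# HELP <metric_name> <description>"
        if len(parts) >= 3 and parts[0] == "#" and parts[1] == "HELP":
            names.add(parts[2])
    return sorted(names)


def fetch_metrics(url: str = "http://localhost:9644/metrics") -> str:
    """Fetch raw exposition text. The default URL is an assumption."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")


# Illustrative sample only; real output contains many more series and labels.
SAMPLE = """\
# HELP vectorized_kafka_rpc_active_connections Currently active connections
# TYPE vectorized_kafka_rpc_active_connections gauge
vectorized_kafka_rpc_active_connections{shard="0"} 3
# HELP vectorized_raft_leadership_changes Number of leadership changes
# TYPE vectorized_raft_leadership_changes counter
vectorized_raft_leadership_changes{shard="0"} 7
"""

if __name__ == "__main__":
    for name in metric_names(SAMPLE):
        print(name)
```

Against a running broker, `metric_names(fetch_metrics())` would list whatever that broker exports at the moment. A metric missing from the list may simply mean its feature is not in use.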
Internal metrics
Most internal metrics are useful for debugging. The following subset of internal metrics can be useful to monitor system health.
vectorized_cluster_partition_last_stable_offset
Last stable offset.
If this offset matches the last record received by the cluster, the cluster is up-to-date and ready for maintenance.
vectorized_io_queue_delay
Total delay time in the queue, in seconds.
Can indicate latency caused by disk operations.
vectorized_io_queue_queue_length
Number of requests in the queue.
Can indicate latency caused by disk operations.
vectorized_kafka_rpc_active_connections
Number of currently active Kafka RPC connections, or clients.
vectorized_kafka_rpc_connects
Number of accepted Kafka RPC connections.
Compare to the value at a previous time to derive the rate of accepted connections.
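Comparing a counter to its value at a previous time can be sketched as a simple delta over elapsed time. The function name and the reset handling below are assumptions for illustration; a monitoring system such as Prometheus normally performs this calculation for you.

```python
def counter_rate(prev_value: float, prev_time: float,
                 curr_value: float, curr_time: float) -> float:
    """Per-second rate of an increasing counter between two samples.

    Times are in seconds (for example, Unix timestamps of the two scrapes).
    """
    elapsed = curr_time - prev_time
    if elapsed <= 0:
        raise ValueError("current sample must be later than the previous one")
    delta = curr_value - prev_value
    if delta < 0:
        # The counter went backwards, for example after a broker restart.
        # Treat the current value as the increase since the reset.
        delta = curr_value
    return delta / elapsed
```

For example, if vectorized_kafka_rpc_connects reads 1000 at one scrape and 1600 a minute later, `counter_rate(1000, 0.0, 1600, 60.0)` gives 10 accepted connections per second.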
vectorized_kafka_rpc_received_bytes
Number of bytes received from Kafka RPC clients in valid requests.
Compare to the value at a previous time to derive the Kafka-layer receive throughput in bytes per second.
vectorized_kafka_rpc_requests_completed
Number of successful Kafka RPC requests.
Compare to the value at a previous time to derive the rate of completed requests per second per shard.
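A per-shard rate needs the two scrapes grouped by shard before subtracting. The following is a minimal sketch that assumes each sample line carries a `shard` label; the function names and the sample values are hypothetical, and the regular expressions handle only simple label sets.

```python
import re
from collections import defaultdict

# Matches sample lines such as: metric_name{shard="1"} 42
SAMPLE_LINE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)'
)
SHARD_LABEL = re.compile(r'(?:^|,)\s*shard="(?P<shard>[^"]*)"')


def per_shard_totals(exposition_text: str, metric: str) -> dict[str, float]:
    """Sum one metric's sample values per `shard` label within a single scrape."""
    totals: dict[str, float] = defaultdict(float)
    for line in exposition_text.splitlines():
        match = SAMPLE_LINE.match(line)
        if not match or match.group("name") != metric:
            continue
        shard = SHARD_LABEL.search(match.group("labels"))
        if shard:
            totals[shard.group("shard")] += float(match.group("value"))
    return dict(totals)


def per_shard_rates(prev: dict[str, float], curr: dict[str, float],
                    elapsed_seconds: float) -> dict[str, float]:
    """Per-second rate for each shard that appears in both scrapes."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return {shard: (curr[shard] - prev[shard]) / elapsed_seconds
            for shard in curr if shard in prev}


METRIC = "vectorized_kafka_rpc_requests_completed"

# Two illustrative scrapes taken 30 seconds apart (values are made up).
SCRAPE_T0 = """\
vectorized_kafka_rpc_requests_completed{shard="0"} 1000
vectorized_kafka_rpc_requests_completed{shard="1"} 400
"""
SCRAPE_T1 = """\
vectorized_kafka_rpc_requests_completed{shard="0"} 1600
vectorized_kafka_rpc_requests_completed{shard="1"} 700
"""

if __name__ == "__main__":
    rates = per_shard_rates(per_shard_totals(SCRAPE_T0, METRIC),
                            per_shard_totals(SCRAPE_T1, METRIC), 30.0)
    print(rates)
```

Keeping the result keyed by shard, rather than summing across shards, makes it easier to spot a single shard that handles a disproportionate share of the load.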
vectorized_raft_leadership_changes
Number of leadership changes.
A high value can indicate that nodes are failing and triggering leadership changes.