Internal Metrics Reference
This section describes the internal metrics exported from Redpanda's /metrics endpoint.
Use /public_metrics for your primary dashboards for system health.
Use /metrics for detailed analysis and debugging.
In a live system, Redpanda metrics are exported only for features that are in use. For example, a metric for consumer groups is not exported when no groups are registered.
To see the available internal metrics in your system, query the /metrics endpoint:
curl -s <node-addr>:9644/metrics | grep -E "^# (HELP|TYPE)"
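The HELP and TYPE lines returned by the endpoint can also be collected programmatically. The following is a minimal Python sketch, assuming the endpoint returns the standard Prometheus text exposition format. The function names `parse_metric_metadata` and `fetch_metrics` and the metric name in the sample payload are hypothetical and used only for illustration.

```python
import re
from urllib.request import urlopen

# Matches Prometheus text-format metadata lines such as
# "# HELP <name> <description>" or "# TYPE <name> <type>".
_META = re.compile(r"^# (HELP|TYPE) (\S+)\s*(.*)$")


def parse_metric_metadata(text):
    """Return {metric_name: {"help": ..., "type": ...}} from exposition text."""
    metadata = {}
    for line in text.splitlines():
        match = _META.match(line.strip())
        if match:
            kind, name, value = match.groups()
            metadata.setdefault(name, {})[kind.lower()] = value
    return metadata


def fetch_metrics(node_addr, port=9644):
    """Fetch the raw /metrics payload; node_addr is a placeholder you supply."""
    with urlopen(f"http://{node_addr}:{port}/metrics") as response:
        return response.read().decode("utf-8")


# Hypothetical sample payload; the metric name is illustrative only
# and is not a real Redpanda metric.
SAMPLE = """\
# HELP example_requests_total Illustrative counter
# TYPE example_requests_total counter
example_requests_total{shard="0"} 42
"""

meta = parse_metric_metadata(SAMPLE)
for name, info in sorted(meta.items()):
    print(name, info.get("type"), "-", info.get("help"))
```

Against a live node, `parse_metric_metadata(fetch_metrics("<node-addr>"))` would list only the metrics for features currently in use.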
Most internal metrics are useful for debugging. The following subset of internal metrics can be useful to monitor system health.
Redpanda uptime in milliseconds.
Last stable offset.
If this offset matches the last record received by the cluster, then the cluster is up-to-date and ready for maintenance.
Total delay time in the queue, in seconds.
Can indicate latency caused by disk operations.
Number of requests in the queue.
Can indicate latency caused by disk operations.
Number of currently active Kafka RPC connections, or clients.
Number of accepted Kafka RPC connections.
Compare to the value at a previous time to derive the rate of accepted connections.
Number of bytes received from Kafka RPC clients in valid requests.
Compare to the value at a previous time to derive the receive throughput of the Kafka layer in bytes per second.
Number of successful Kafka RPC requests.
Compare to the value at a previous time to derive the messages per second per shard.
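Deriving a rate from a cumulative counter means dividing the increase between two samples by the time between them. The following Python sketch illustrates this for per-shard counters. The function name `counter_rates`, the shard keys, and the sample values are hypothetical. Treating a decrease as a counter reset is an assumption borrowed from common Prometheus practice, not something this reference specifies.

```python
def counter_rates(previous, current, elapsed_seconds):
    """Per-key rate of increase between two counter snapshots.

    previous and current map a key (for example a shard id) to a
    cumulative counter value sampled elapsed_seconds apart.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    rates = {}
    for key, now in current.items():
        before = previous.get(key)
        if before is None:
            continue  # no earlier sample for this key, so no rate yet
        # Assumption: a counter that went down was reset (for example
        # after a restart), so count the increase from zero.
        delta = now - before if now >= before else now
        rates[key] = delta / elapsed_seconds
    return rates


# Hypothetical counter samples taken 60 seconds apart, keyed by shard.
earlier = {"0": 1000, "1": 4000}
later = {"0": 1600, "1": 4300, "2": 50}

rates = counter_rates(earlier, later, elapsed_seconds=60)
print(rates)
```

The same calculation applies to accepted connections, received bytes, and successful requests. Only the unit of the result changes.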
Number of Kafka RPC requests being processed by a server.
Number of bytes sent to Kafka RPC clients.
Number of Kafka RPC service errors.
Number of leadership changes.
A high value can indicate that nodes are failing and causing leadership changes.
Shows the true CPU utilization of the Redpanda process.
Number of compacted segments.
Number of created log segments.
Current size of partition in bytes.
Total number of bytes read.
Total number of bytes written.