# Public Metrics Reference

> For the complete documentation index, see [llms.txt](https://docs.redpanda.com/llms.txt). Component-specific: [streaming-full.txt](https://docs.redpanda.com/streaming-full.txt)

---
title: Public Metrics Reference
latest-operator-version: v26.1.4
# EOL = End-of-Life (support lifecycle status)
page-is-nearing-eol: "false"
page-is-past-eol: "true"
page-eol-date: December 22, 2024
latest-console-tag: v3.7.3
latest-connect-version: 4.93.0
docname: public-metrics-reference
page-component-name: streaming
page-version: "23.3"
page-component-version: "23.3"
page-component-title: Streaming
page-relative-src-path: public-metrics-reference.adoc
page-edit-url: https://github.com/redpanda-data/docs/edit/v/23.3/modules/reference/pages/public-metrics-reference.adoc
description: Reference of public metrics to create your system dashboard.
page-git-created-date: "2023-05-30"
page-git-modified-date: "2024-03-26"
support-status: past end-of-life
---

<!-- Source: https://docs.redpanda.com/streaming/23.3/reference/public-metrics-reference.md -->

This section provides reference descriptions about the public metrics exported from Redpanda’s `/public_metrics` endpoint.

> 💡 **TIP**
>
> Use [/public\_metrics](./) for your primary dashboards for system health.
>
> Use [/metrics](https://docs.redpanda.com/streaming/23.3/reference/internal-metrics-reference/) for detailed analysis and debugging.

> ❗ **IMPORTANT**
>
> In a live system, Redpanda metrics are exported only for features that are in use. For example, a metric for consumer groups is not exported when no groups are registered.
>
> To see the available public metrics in your system, query the `/public_metrics` endpoint:
>
> ```bash
> curl http://<node-addr>:9644/public_metrics | grep "[HELP|TYPE]"
> ```

## [](#cluster-metrics)Cluster metrics

### [](#redpanda_cluster_brokers)redpanda_cluster_brokers

Number of configured, fully commissioned brokers in a cluster.

**Type**: gauge

**How to monitor**: create an alert for when this gauge dips below a steady-state threshold, as a node(s) has become unresponsive.

* * *

### [](#redpanda_cluster_controller_log_limit_requests_available_rps)redpanda_cluster_controller_log_limit_requests_available_rps

Limit on the available requests per second (RPS) for a cluster controller log.

**Type**: gauge

**Labels**:

-   `redpanda_cmd_group=("move_operations" | "topic_operations" | "configuration_operations" | "node_management_operations" | "acls_and_users_operations")`


* * *

### [](#redpanda_cluster_controller_log_limit_requests_dropped)redpanda_cluster_controller_log_limit_requests_dropped

Number of requests dropped by a cluster controller log due to exceeding [redpanda\_cluster\_controller\_log\_limit\_requests\_available\_rps](#redpanda_cluster_controller_log_limit_requests_available_rps).

**Type**: counter

**Labels**:

-   `redpanda_cmd_group=("move_operations" | "topic_operations" | "configuration_operations" | "node_management_operations" | "acls_and_users_operations")`


**Usage**: when this counter increases, it indicates that the controller is dropping requests.

* * *

### [](#redpanda_cluster_partition_moving_from_node)redpanda_cluster_partition_moving_from_node

Number of partition replicas in the cluster that are currently being removed from a node.

**Type**: gauge

**Usage**: when this gauge is non-zero, determine whether there is an expected or unexpected reassignment of partitions that is causing movement of partition replicas.

* * *

### [](#redpanda_cluster_partition_moving_to_node)redpanda_cluster_partition_moving_to_node

Number of partition replicas in the cluster that are currently being added or moved to a node.

**Type**: gauge

**Usage**: when this gauge is non-zero, determine whether there is an expected or unexpected reassignment of partitions that is causing movement of partition replicas.

* * *

### [](#redpanda_cluster_partition_node_cancelling_movements)redpanda_cluster_partition_node_cancelling_movements

During a partition movement cancellation operation, the number of partition replicas that were being moved that now need to be cancelled.

**Type**: gauge

**How to monitor**: when this gauge is non-zero, determine whether there is an expected or unexpected reassignment of partitions that is causing movement of partition replicas.

* * *

### [](#redpanda_cluster_partitions)redpanda_cluster_partitions

Number of partitions managed by a cluster. Includes partitions of the controller topic, but not replicas.

**Type**: gauge

* * *

### [](#redpanda_cluster_topics)redpanda_cluster_topics

Number of topics in a cluster.

**Type**: gauge

* * *

### [](#redpanda_cluster_unavailable_partitions)redpanda_cluster_unavailable_partitions

Number of unavailable partitions (the partitions that lack quorum among their replica group) in the cluster.

**Type**: gauge

**Usage**: when this gauge is non-zero, it indicates that a partition(s) doesn’t have quorum and thus doesn’t have an active leader. To mitigate, consider increasing the number of brokers and the partition replication factor.

* * *

## [](#infrastructure-metrics)Infrastructure metrics

### [](#redpanda_cpu_busy_seconds_total)redpanda_cpu_busy_seconds_total

Total CPU busy time in seconds.

**Type**: counter

* * *

### [](#redpanda_io_queue_total_read_ops)redpanda_io_queue_total_read_ops

Total read operations passed in the I/O queue.

**Type**: counter

**Labels**:

-   `class=("default" | "compaction" | "raft")`

-   `ioshard`

-   `mountpoint`

-   `shard`


* * *

### [](#redpanda_io_queue_total_write_ops)redpanda_io_queue_total_write_ops

Total write operations passed in the I/O queue.

**Type**: counter

**Labels**:

-   `class=("default" | "compaction" | "raft")`

-   `ioshard`

-   `mountpoint`

-   `shard`


* * *

### [](#redpanda_memory_allocated_memory)redpanda_memory_allocated_memory

Total allocated memory in bytes.

**Type**: gauge

**Labels**:

-   `shard`


* * *

### [](#redpanda_memory_available_memory)redpanda_memory_available_memory

Total memory potentially available (free plus reclaimable memory) to a CPU shard (core), in bytes.

**Type**: gauge

**Labels**:

-   `shard`


* * *

### [](#redpanda_memory_available_memory_low_water_mark)redpanda_memory_available_memory_low_water_mark

The low watermark for available memory at process start.

**Type**: gauge

**Labels**:

-   `shard`


* * *

### [](#redpanda_memory_free_memory)redpanda_memory_free_memory

Available memory in bytes.

**Type**: gauge

**Labels**:

-   `shard`


* * *

### [](#redpanda_rpc_active_connections)redpanda_rpc_active_connections

The total number of currently-active clients the RPC server on a given shard has connections to.

**Type**: gauge

**Labels**:

-   `redpanda_server=("kafka" | "internal")`


* * *

### [](#redpanda_rpc_request_errors_total)redpanda_rpc_request_errors_total

Number of RPC errors.

**Type**: counter

**Labels**:

-   `redpanda_server=("kafka" | "internal")`


**Usage**: when this counter increases, analyze the logged errors.

* * *

### [](#redpanda_rpc_request_latency_seconds)redpanda_rpc_request_latency_seconds

RPC latency in seconds.

**Type**: histogram

**Labels**:

-   `redpanda_server=("kafka" | "internal")`


* * *

### [](#redpanda_scheduler_runtime_seconds_total)redpanda_scheduler_runtime_seconds_total

Accumulated runtime of the task queue associated with a scheduling group.

**Type**: counter

**Labels**:

-   `redpanda_scheduling_group=("admin" | "archival_upload" | "cache_background_reclaim" | "cluster" | "coproc" | "kafka" | "log_compaction" | "main" | "node_status" | "raft" | "raft_learner_recovery")`

-   `shard`


* * *

### [](#redpanda_storage_disk_free_bytes)redpanda_storage_disk_free_bytes

Available disk storage in bytes.

**Type**: gauge

* * *

### [](#redpanda_storage_disk_free_space_alert)redpanda_storage_disk_free_space_alert

Alert for low disk storage: `0-OK`, `1-low space`, `2-degraded`.

**Type**: gauge

* * *

### [](#redpanda_storage_disk_total_bytes)redpanda_storage_disk_total_bytes

Total size in bytes of attached storage.

**Type**: gauge

* * *

### [](#redpanda_uptime_seconds_total)redpanda_uptime_seconds_total

Total CPU runtime (uptime) in seconds.

**Type**: gauge

* * *

## [](#raft-metrics)Raft metrics

### [](#redpanda_node_status_rpcs_received)redpanda_node_status_rpcs_received

Number of node status RPCs received by a node.

**Type**: gauge

* * *

### [](#redpanda_node_status_rpcs_sent)redpanda_node_status_rpcs_sent

Number of node status RPCs sent by a node.

**Type**: gauge

* * *

### [](#redpanda_node_status_rpcs_timed_out)redpanda_node_status_rpcs_timed_out

Number of timed out node status RPCs from a node.

**Type**: gauge

* * *

## [](#service-metrics)Service metrics

### [](#redpanda_pandaproxy_request_latency_seconds)redpanda_pandaproxy_request_latency_seconds

Latency in seconds of the request indicated by the label in HTTP Proxy. The measurement includes the time spent waiting for resources to become available, processing the request, and dispatching the response.

**Type**: histogram

* * *

### [](#redpanda_schema_registry_request_errors_total)redpanda_schema_registry_request_errors_total

Total number of Schema Registry errors.

**Type**: counter

**Labels**:

-   `redpanda_status=("5xx" | "4xx" | "3xx")`


* * *

### [](#redpanda_schema_registry_request_latency_seconds)redpanda_schema_registry_request_latency_seconds

Latency of the request indicated by the label in the Schema Registry. The measurement includes the time spent waiting for resources to become available, processing the request, and dispatching the response.

**Type**: histogram

* * *

## [](#partition-metrics)Partition metrics

### [](#redpanda_kafka_max_offset)redpanda_kafka_max_offset

The high watermark offset of a partition. Suitable for calculating consumer group lag.

**Type**: gauge

**Labels**:

-   `redpanda_namespace`

-   `redpanda_partition`

-   `redpanda_topic`


**Related topics**:

-   [Consumer group lag](https://docs.redpanda.com/streaming/23.3/manage/monitoring/#consumer-group-lag)


* * *

### [](#redpanda_kafka_under_replicated_replicas)redpanda_kafka_under_replicated_replicas

Number of replicas in the partition that are live but not at the latest offset, [redpanda\_kafka\_max\_offset](#redpanda_kafka_max_offset).

**Type**: gauge

**Labels**:

-   `redpanda_namespace`

-   `redpanda_partition`

-   `redpanda_topic`


* * *

### [](#redpanda_raft_recovery_partition_movement_available_bandwidth)redpanda_raft_recovery_partition_movement_available_bandwidth

Bandwidth available for partition movement, in bytes per sec.

**Type**: gauge

**Labels**:

-   `shard`


* * *

## [](#topic-metrics)Topic metrics

* * *

### [](#redpanda_cluster_partition_schema_id_validation_records_failed)redpanda_cluster_partition_schema_id_validation_records_failed

Number of records that failed schema ID validation.

**Type**: counter

* * *

### [](#redpanda_kafka_partitions)redpanda_kafka_partitions

Configured number of partitions for a topic.

**Type**: gauge

**Labels**: -`redpanda_namespace` -`redpanda_topic`

* * *

### [](#redpanda_kafka_records_fetched_total)redpanda_kafka_records_fetched_total

Total number of records fetched.

**Type**: counter

* * *

### [](#redpanda_kafka_records_produced_total)redpanda_kafka_records_produced_total

Total number of records produced.

**Type**: counter

* * *

### [](#redpanda_kafka_replicas)redpanda_kafka_replicas

The number of configured replicas per topic.

**Type**: gauge

**Labels**:

-   `redpanda_namespace`

-   `redpanda_topic`


* * *

### [](#redpanda_kafka_request_bytes_total)redpanda_kafka_request_bytes_total

Total number of bytes produced or consumed per topic.

**Type**: counter

**Labels**:

-   `redpanda_namespace`

-   `redpanda_topic`

-   `redpanda_request=("produce" | "consume")`


* * *

### [](#redpanda_raft_leadership_changes)redpanda_raft_leadership_changes

Number of leadership changes across all partitions of a given topic.

**Type**: counter

**Labels**:

-   `redpanda_namespace`

-   `redpanda_topic`


* * *

## [](#broker-metrics)Broker metrics

### [](#redpanda_kafka_request_latency_seconds)redpanda_kafka_request_latency_seconds

Latency of produce/consume requests per broker. This duration measures from when a request is initiated on the partition to when the response is fulfilled.

**Type**: histogram

**Labels**:

-   `redpanda_request=("produce" | "consume")`


* * *

## [](#consumer-group-metrics)Consumer group metrics

### [](#redpanda_kafka_consumer_group_committed_offset)redpanda_kafka_consumer_group_committed_offset

Committed offset of a consumer group.

**Type**: gauge

**Labels**:

-   `group`

-   `topic`

-   `partition`


* * *

### [](#redpanda_kafka_consumer_group_consumers)redpanda_kafka_consumer_group_consumers

Number of consumers in a consumer group.

**Type**: gauge

**Labels**:

-   `group`


* * *

### [](#redpanda_kafka_consumer_group_topics)redpanda_kafka_consumer_group_topics

Number of topics in a consumer group.

**Type**: gauge

**Labels**:

-   `group`


* * *

## [](#rest-proxy-metrics)REST proxy metrics

### [](#redpanda_rest_proxy_request_errors_total)redpanda_rest_proxy_request_errors_total

Total number of REST proxy server errors.

**Type**: counter

**Labels**:

-   `redpanda_status("5xx" | "4xx" | "3xx")`


* * *

### [](#redpanda_rest_proxy_request_latency_seconds_bucket)redpanda_rest_proxy_request_latency_seconds_bucket

Internal latency of REST proxy requests.

**Type**: histogram

* * *

## [](#application-metrics)Application metrics

### [](#redpanda_application_build)redpanda_application_build

Redpanda build information.

**Type**: gauge

**Labels**:

-   `redpanda_revision=<redpanda-revision-ID>`

-   `redpanda_version=<redpanda-version-number>`


* * *

### [](#redpanda_application_uptime_seconds_total)redpanda_application_uptime_seconds_total

Redpanda application uptime in seconds.

**Type** gauge

* * *

## [](#cloud-metrics)Cloud metrics

### [](#redpanda_cloud_client_backoff)redpanda_cloud_client_backoff

Total number of requests that backed off.

**Type**: counter

**Labels**:

-   S3

    -   `redpanda_endpoint`

    -   `redpanda_region`


-   Azure Blob Storage (ABS)

    -   `redpanda_endpoint`

    -   `redpanda_storage_account`


* * *

### [](#redpanda_cloud_client_download_backoff)redpanda_cloud_client_download_backoff

Total number of download requests that backed off.

**Type**: counter

**Labels**:

-   S3

    -   `redpanda_endpoint`

    -   `redpanda_region`


-   Azure Blob Storage (ABS)

    -   `redpanda_endpoint`

    -   `redpanda_storage_account`


* * *

### [](#redpanda_cloud_client_downloads)redpanda_cloud_client_downloads

Total number of requests that downloaded an object from cloud storage.

**Type**: counter

**Labels**:

-   S3

    -   `redpanda_endpoint`

    -   `redpanda_region`


-   Azure Blob Storage (ABS)

    -   `redpanda_endpoint`

    -   `redpanda_storage_account`


* * *

### [](#redpanda_cloud_client_not_found)redpanda_cloud_client_not_found

Total number of requests for which the object was not found.

**Type**: counter

**Labels**:

-   S3

    -   `redpanda_endpoint`

    -   `redpanda_region`


-   Azure Blob Storage (ABS)

    -   `redpanda_endpoint`

    -   `redpanda_storage_account`


* * *

### [](#redpanda_cloud_client_upload_backoff)redpanda_cloud_client_upload_backoff

Total number of upload requests that backed off.

**Type**: counter

**Labels**:

-   S3

    -   `redpanda_endpoint`

    -   `redpanda_region`


-   Azure Blob Storage (ABS)

    -   `redpanda_endpoint`

    -   `redpanda_storage_account`


* * *

### [](#redpanda_cloud_client_uploads)redpanda_cloud_client_uploads

Total number of requests that uploaded an object to cloud storage.

**Type**: counter

**Labels**:

-   S3

    -   `redpanda_endpoint`

    -   `redpanda_region`


-   Azure Blob Storage (ABS)

    -   `redpanda_endpoint`

    -   `redpanda_storage_account`


* * *

## [](#tls_metrics)TLS metrics

### [](#redpanda_tls_truststore_expires_at_timestamp_seconds)redpanda_tls_truststore_expires_at_timestamp_seconds

The expiration time, represented as a Unix-style timestamp, of the shortest-lived certificate authority (CA) in the installed CA chain. This lists all resources that have at least one CA configured.

**Type**: gauge

**Labels**:

-   `area`

-   `detail`


**Usage**: This metric is recorded for all resources that may have a TLS certificate installed. The `area` and `detail` labels provide a hierarchical way of identifying a single resource you need information on.

* * *

### [](#redpanda_tls_certificate_expires_at_timestamp_seconds)redpanda_tls_certificate_expires_at_timestamp_seconds

The expiration time, represented as a Unix-style timestamp, for the shortest-lived installed certificate. While more than one certificate may be installed for a resource, this provides only the timestamp of the next certificate to expire. This lists all resources that have at least one certificate installed.

**Type**: gauge

**Labels**:

-   `area`

-   `detail`


**Usage**: This metric is recorded for all resources that may have a TLS certificate installed. The `area` and `detail` labels provide a hierarchical way of identifying a single resource you need information on.

* * *

### [](#redpanda_tls_certificate_serial)redpanda_tls_certificate_serial

The serial number of the shortest-lived installed certificate. These serial numbers are technically unbounded. As such, this will return the least significant 4B digits of the integer representation certificate’s serial number. While more than one certificate may be installed for a resource, this provides only the serial number of the next certificate to expire. This lists all resources that have at least one certificate installed.

**Type**: gauge

**Labels**:

-   `area`

-   `detail`


**Usage**: This metric is recorded for all resources that may have a TLS certificate installed. The `area` and `detail` labels provide a hierarchical way of identifying a single resource you need information on.

* * *

### [](#redpanda_tls_loaded_at_timestamp_seconds)redpanda_tls_loaded_at_timestamp_seconds

The last time, represented as a Unix-style timestamp, that at least one TLS certificate was loaded for a resource. This lists all resources that have at least one certificate installed.

**Type**: gauge

**Labels**:

-   `area`

-   `detail`


**Usage**: This metric is recorded for all resources that may have a TLS certificate installed. The `area` and `detail` labels provide a hierarchical way of identifying a single resource you need information on.

* * *

### [](#redpanda_tls_certificate_valid)redpanda_tls_certificate_valid

Indicates whether a given resource has at least one valid installed TLS certificate. Querying this will provide a check on all resources with at least one certificate installed. The value of this gauge is either `1` to indicate presence of a valid certificate or `0` to indicate the absence of a valid certificate.

**Type**: gauge

**Labels**:

-   `area`

-   `detail`


**Usage**: This metric is recorded for all resources that may have a TLS certificate installed. The `area` and `detail` labels provide a hierarchical way of identifying a single resource you need information on.

* * *

## [](#data_transform_metrics)Data transforms metrics

### [](#redpanda_transform_execution_latency_sec)redpanda_transform_execution_latency_sec

A histogram of the latency of transforming a single record, in seconds.

**Type**: histogram

**Labels**:

-   `function_name`


* * *

### [](#redpanda_transform_execution_errors)redpanda_transform_execution_errors

Data transform invocation errors.

**Type**: counter

**Labels**:

-   `function_name`


* * *

### [](#redpanda_wasm_engine_cpu_seconds_total)redpanda_wasm_engine_cpu_seconds_total

Total CPU time spent inside a WebAssembly function, in seconds.

**Type**: counter

**Labels**:

-   `function_name`


* * *

### [](#redpanda_wasm_engine_memory_usage)redpanda_wasm_engine_memory_usage

Amount of memory usage for a WebAssembly function.

**Type**: gauge

**Labels**:

-   `function_name`


* * *

### [](#redpanda_wasm_engine_max_memory)redpanda_wasm_engine_max_memory

Maximum amount of memory for a WebAssembly function.

**Type**: gauge

**Labels**:

-   `function_name`


* * *

### [](#redpanda_wasm_binary_executable_memory_usage)redpanda_wasm_binary_executable_memory_usage

The amount of executable memory usage for WebAssembly binaries.

**Type**: gauge

* * *

### [](#redpanda_transform_read_bytes)redpanda_transform_read_bytes

A counter for all the bytes that has been input to a transform.

**Type**: counter

**Labels**:

-   `function_name`


* * *

### [](#redpanda_transform_write_bytes)redpanda_transform_write_bytes

A counter for all the bytes that has been output from a transform.

**Type**: counter

**Labels**:

-   `function_name`


* * *

### [](#redpanda_transform_processor_lag)redpanda_transform_processor_lag

The amount of pending records on the input topic that have yet to be processed by the transform.

**Type**: gauge

**Labels**:

-   `function_name`


* * *

### [](#redpanda_transform_failures)redpanda_transform_failures

A counter for each time that a processor encounters a failure.

**Type**: counter

**Labels**:

-   `function_name`


* * *

### [](#redpanda_transform_state)redpanda_transform_state

The number of transform processors in a specific state (running, inactive, errored).

**Type**: gauge

**Labels**:

-   `function_name`

-   `state=("running" | "inactive" | "errored")`


* * *

## [](#related-topics)Related topics

-   [Learn how to monitor Redpanda](https://docs.redpanda.com/streaming/23.3/manage/monitoring/)

-   [Internal metrics reference](https://docs.redpanda.com/streaming/23.3/reference/internal-metrics-reference/)