# Internal Metrics

> For the complete documentation index, see [llms.txt](https://docs.redpanda.com/llms.txt). Component-specific: [streaming-full.txt](https://docs.redpanda.com/streaming-full.txt)

---
title: Internal Metrics
latest-operator-version: v26.1.4
# EOL = End-of-Life (support lifecycle status)
page-is-nearing-eol: "false"
page-is-past-eol: "true"
page-eol-date: December 22, 2024
latest-console-tag: v3.7.3
latest-connect-version: 4.93.0
docname: internal-metrics
page-component-name: streaming
page-version: "23.3"
page-component-version: "23.3"
page-component-title: Streaming
page-relative-src-path: internal-metrics.adoc
page-edit-url: https://github.com/redpanda-data/docs/edit/v/23.3/modules/reference/pages/internal-metrics.adoc
description: This page describes the original metrics provided for internal support, troubleshooting, and backward compatibility.
page-git-created-date: "2023-05-17"
page-git-modified-date: "2023-08-10"
support-status: past end-of-life
---

<!-- Source: https://docs.redpanda.com/streaming/23.3/reference/internal-metrics.md -->

This page describes the original metrics available on `<node ip>:9644/metrics`. These are provided for internal support, troubleshooting, and backward compatibility.

For information about the latest metrics available on `:9644/public_metrics`, see [Monitoring](https://docs.redpanda.com/streaming/23.3/manage/monitoring/).

## [](#configure-prometheus-internal-metrics)Configure Prometheus: internal metrics

[Prometheus](https://prometheus.io/) is a systems monitoring and alerting tool. It collects and stores metrics as time-series data identified by metric name and key/value pairs.

Redpanda exports Prometheus metrics on `:9644/public_metrics`, which has been available since v22.2. To generate the Redpanda configuration on an existing Prometheus instance for both the `/public_metrics` endpoint as well as the `/metrics` endpoint, run:

```bash
rpk generate prometheus-config --job-name redpanda-metrics-test --node-addrs 'localhost:9644' --internal-metrics
```

The output is a YAML object that you can add to the `scrape_configs` list.

```yaml
- job_name: redpanda-node
  static_configs:
  - targets:
    - 172.31.18.239:9644
    - 172.31.18.238:9643
    - 172.31.18.237:9642
```

Edit the `prometheus.yml` file in the Prometheus root folder to add the Redpanda configuration under the `scrape_configs`:

```yaml
scrape_configs:
…
  - job_name: redpanda
    static_configs:
      - targets:
        - 172.31.18.239:9644
        - 172.31.18.238:9643
        - 172.31.18.237:9642
…
```

The number of targets may change depending on the total number of running nodes.

Save the configuration file, and restart Prometheus to apply changes.

> 📝 **NOTE**
>
> -   You can use this [sample Prometheus configuration file](https://github.com/prometheus/prometheus/blob/main/documentation/examples/prometheus.yml).
>
> -   See rpk details in the [rpk command reference](https://docs.redpanda.com/streaming/23.3/reference/rpk/rpk-generate/rpk-generate-prometheus-config/) for Prometheus.

## [](#configure-grafana-internal-metrics)Configure Grafana: internal metrics

Now that you have the metrics, you need a tool to query, visualize, and generate alerts. [Grafana](https://grafana.com/oss/grafana/) talks to Prometheus and lets you create dashboards with multiple graphic components.

To generate a comprehensive Grafana dashboard, run:

```bash
rpk generate grafana-dashboard --datasource <name> --metrics-endpoint <url>
```

where:

-   `<name>` is the name of the Prometheus data source configured in your Grafana instance.

-   `<url>` is the address to a Redpanda node’s metrics endpoint. The default is `<node ip>:9644/metrics`.


See details in the [rpk command reference](https://docs.redpanda.com/streaming/23.3/reference/rpk/rpk-generate/rpk-generate-grafana-dashboard/) for Grafana.

Out of the box, Grafana generates panels tracking latency for 50%, 95%, and 99% (based on the maximum latency set), throughput, and error segmentation by type.

Pipe the command’s output to a file, and import it into Grafana.

```bash
rpk generate grafana-dashboard \
  --datasource prometheus \
  --metrics-endpoint 172.31.18.237:9642/metrics > redpanda-dashboard.json
```

In Grafana, import this generated JSON file. The default Grafana dashboard shows information about available Redpanda nodes, partitions, latency, and throughput graphics.

You can use the imported dashboard to create new panels.

1.  Click **+** in the left pane, and select **Add a new panel**.

2.  On the **Query** tab, select **Prometheus** data source.

3.  Decide which metric you want to monitor, click **Metrics browser**, and start typing `vectorized` to show available metrics from the Redpanda cluster.


Another way to see available metrics with their description is to access the cluster using a web browser on port 9644. (The port can change depending on your configuration.) You can dynamically set queries to calculate the values in the format you want.

For example, to display the total number of partitions in your cluster, run this query on Grafana:

```promql
count(count by (topic,partition)
 (vectorized_storage_log_partition_size{namespace="kafka"}))
```

To show the number of partitions for a specific topic, run:

```promql
count(count by (topic,partition)
 (vectorized_storage_log_partition_size{topic="<topic_name>"}))
```

## [](#monitor-internal-metrics)Monitor internal metrics

Most metrics are useful for debugging, but the following metrics can be useful to measure system health:

| Metric | Definition | Diagnostics |
| --- | --- | --- |
| vectorized_application_uptime | Redpanda uptime in milliseconds |  |
| vectorized_cluster_partition_last_stable_offset | Last stable offset | If this is the last record received by the cluster, then the cluster is up-to-date and ready for maintenance |
| vectorized_io_queue_delay | Total delay time in the queue | Can indicate latency caused by disk operations in seconds |
| vectorized_io_queue_queue_length | Number of requests in the queue | Can indicate latency caused by disk operations |
| vectorized_kafka_rpc_active_connections | kafka_rpc: Currently active connections | Shows the number of clients actively connected |
| vectorized_kafka_rpc_connects | kafka_rpc: Number of accepted connections | Compare to the value at a previous time to derive the rate of accepted connections |
| vectorized_kafka_rpc_received_bytes | kafka_rpc: Number of bytes received from the clients in valid requests | Compare to the value at a previous time to derive the throughput in Kafka layer in bytes/sec received |
| vectorized_kafka_rpc_requests_completed | kafka_rpc: Number of successful requests | Compare to the value at a previous time to derive the messages per second per shard |
| vectorized_kafka_rpc_requests_pending | kafka_rpc: Number of requests being processed by server |  |
| vectorized_kafka_rpc_sent_bytes | kafka_rpc: Number of bytes sent to clients |  |
| vectorized_kafka_rpc_service_errors | kafka_rpc: Number of service errors |  |
| vectorized_raft_leadership_changes | Number of leadership changes | High value can indicate nodes failing and causing leadership changes |
| vectorized_reactor_utilization | CPU utilization | Shows the true utilization of the CPU by Redpanda process |
| vectorized_storage_log_compacted_segment | Number of compacted segments |  |
| vectorized_storage_log_log_segments_created | Number of created log segments |  |
| vectorized_storage_log_partition_size | Current size of partition in bytes |  |
| vectorized_storage_log_read_bytes | Total number of bytes read |  |
| vectorized_storage_log_written_bytes | Total number of bytes written |  |