Generate a Debug Bundle in Linux

Use rpk or Redpanda Console to generate a debug bundle. You can use the bundle to diagnose issues yourself, or send it to the Redpanda support team to help resolve your issue.

Prerequisites

You must have rpk installed on your host machine.

Generate a debug bundle with rpk

To generate a debug bundle with rpk, run the rpk debug bundle command on each broker in the cluster.

  1. Execute the rpk debug bundle command on a broker:

    rpk debug bundle

    If you have an upload URL from the Redpanda support team, pass it with the --upload-url flag to upload your debug bundle to Redpanda.

    rpk debug bundle \
      --upload-url <url>

    Example output:

    Creating bundle file...
    
    Debug bundle saved to "/var/lib/redpanda/1675440652-bundle.zip"

  2. On your host machine, make a directory in which to save the debug bundle:

    mkdir debug-bundle

  3. Copy the debug bundle ZIP file to the debug-bundle directory on your host machine.

  4. Unzip the file on your host machine.

    cd debug-bundle
    unzip <bundle-name>.zip

  5. Remove the debug bundle from the Redpanda broker:

    rm /var/lib/redpanda/<bundle-name>.zip

When you’ve finished troubleshooting, remove the debug bundle from your host machine:

rm -r debug-bundle

For a description of the files and directories, see Contents of the debug bundle.
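
Steps 2 and 4 can also be scripted. The following is a minimal Python sketch, assuming the bundle ZIP file has already been copied to the host machine. The function name extract_bundle and the default destination directory are illustrative choices, not part of rpk.

```python
import zipfile
from pathlib import Path


def extract_bundle(zip_path, dest="debug-bundle"):
    """Extract a debug bundle ZIP into dest and return its top-level entries."""
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as bundle:
        bundle.extractall(dest_dir)
        # Keep only the first path component of each archive member.
        top_level = {name.split("/", 1)[0] for name in bundle.namelist() if name}
    return sorted(top_level)


# Example call (hypothetical file name):
# print(extract_bundle("1675440652-bundle.zip"))
```

The returned list gives a quick view of which files and directories the bundle contains before you start inspecting it.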

Generate a debug bundle with Redpanda Console

Inspect the debug bundle

After downloading the debug bundle files, you can inspect the contents to debug your cluster. This section provides some useful data points to check while troubleshooting.

Most files in the debug bundle are JSON files. To make it easier to read these files, this section uses jq. To install jq, see the jq downloads page.

View the version of Redpanda on all brokers

cat admin/brokers.json | jq '.[] | .version'

Example output:

"v24.3.1"
"v24.3.1"
"v24.3.1"
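
On larger clusters it can be easier to count versions than to read them line by line. This is a small Python sketch, assuming admin/brokers.json is a JSON array of objects with a version field, as the jq filter above implies. The function name version_counts and the inline sample are illustrative.

```python
from collections import Counter


def version_counts(brokers):
    """Count how many brokers report each Redpanda version."""
    return Counter(broker.get("version", "unknown") for broker in brokers)


# Inline sample with the same shape as admin/brokers.json (other fields omitted).
brokers = [
    {"node_id": 0, "version": "v24.3.1"},
    {"node_id": 1, "version": "v24.3.1"},
    {"node_id": 2, "version": "v24.3.0"},
]
counts = version_counts(brokers)
if len(counts) > 1:
    print("Mixed versions:", dict(counts))
```

To run it against a real bundle, load admin/brokers.json with json.load and pass the result to version_counts.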

View the maintenance status of all brokers

cat admin/brokers.json | jq '.[] | .node_id, .maintenance_status'

Example output:

0
{
  "draining": false,
  "finished": false,
  "errors": false,
  "partitions": 0,
  "eligible": 0,
  "transferring": 0,
  "failed": 0
}
1
{
  "draining": false,
  "finished": false,
  "errors": false,
  "partitions": 0,
  "eligible": 0,
  "transferring": 0,
  "failed": 0
}
2
{
  "draining": false,
  "finished": false,
  "errors": false,
  "partitions": 0,
  "eligible": 0,
  "transferring": 0,
  "failed": 0
}
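
To flag only the brokers that need attention, you can filter on the maintenance_status fields shown above. This Python sketch assumes those field names (draining, errors, failed) and treats a broker as noteworthy if any of them is set. That rule is an illustrative choice, not an official health check.

```python
def brokers_needing_attention(brokers):
    """Return node IDs that are draining or report maintenance errors or failures."""
    flagged = []
    for broker in brokers:
        status = broker.get("maintenance_status") or {}
        has_failures = (status.get("failed") or 0) > 0
        if status.get("draining") or status.get("errors") or has_failures:
            flagged.append(broker["node_id"])
    return flagged


# Inline sample shaped like admin/brokers.json (other fields omitted).
sample = [
    {"node_id": 0, "maintenance_status": {"draining": False, "errors": False, "failed": 0}},
    {"node_id": 1, "maintenance_status": {"draining": True, "errors": False, "failed": 0}},
]
print(brokers_needing_attention(sample))
```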

View the cluster configuration

cat admin/cluster_config.json | jq

Example output:

{
  "abort_index_segment_size": 50000,
  "abort_timed_out_transactions_interval_ms": 10000,
  "admin_api_require_auth": false,
  "aggregate_metrics": false,
  "alter_topic_cfg_timeout_ms": 5000,
  "append_chunk_size": 16384,
  "auto_create_topics_enabled": false,
  "cloud_storage_access_key": null,
  "cloud_storage_api_endpoint": null,
  "cloud_storage_api_endpoint_port": 443,
  "cloud_storage_azure_container": null,
  "cloud_storage_azure_shared_key": null,
  "cloud_storage_azure_storage_account": null,
  "cloud_storage_bucket": null,
  ...
  "target_quota_byte_rate": 2147483648,
  "tm_sync_timeout_ms": 10000,
  "topic_fds_per_partition": 5,
  "topic_memory_per_partition": 1048576,
  "topic_partitions_per_shard": 1000,
  "topic_partitions_reserve_shard0": 2,
  "transaction_coordinator_cleanup_policy": "delete",
  "transaction_coordinator_delete_retention_ms": 604800000,
  "transaction_coordinator_log_segment_size": 1073741824,
  "transactional_id_expiration_ms": 604800000,
  "tx_log_stats_interval_s": 10,
  "tx_timeout_delay_ms": 1000,
  "wait_for_leader_timeout_ms": 5000,
  "zstd_decompress_workspace_bytes": 8388608
}
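
The full configuration is long, so when you have two bundles (for example, taken before and after a change) it helps to list only the properties that differ. This is a Python sketch under that assumption. The function name config_diff and the "<missing>" marker are illustrative.

```python
MISSING = "<missing>"  # Marker for a property that is absent from one side.


def config_diff(left, right):
    """Return {key: (left_value, right_value)} for every property that differs."""
    diff = {}
    for key in sorted(set(left) | set(right)):
        a = left.get(key, MISSING)
        b = right.get(key, MISSING)
        if a != b:
            diff[key] = (a, b)
    return diff


# Inline samples using property names from the example output above.
before = {"auto_create_topics_enabled": False, "topic_partitions_per_shard": 1000}
after = {"auto_create_topics_enabled": True, "topic_partitions_per_shard": 1000}
print(config_diff(before, after))
```

Each argument is the parsed content of one admin/cluster_config.json file.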

Check Enterprise Edition license keys

cat admin/license.json | jq

Example output:

{
  "loaded": false,
  "license": {
    "format_version": 0,
    "org": "",
    "type": "",
    "expires": 0,
    "sha256": ""
  }
}
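
The license file can be summarized in one line. This Python sketch assumes that the expires field is a Unix timestamp in seconds. The example output above does not confirm the unit, so verify it against a loaded license before relying on the result. The function name license_status is illustrative.

```python
from datetime import datetime, timezone


def license_status(doc, now=None):
    """Summarize admin/license.json; assumes 'expires' is Unix time in seconds."""
    if not doc.get("loaded"):
        return "no license loaded"
    now = now or datetime.now(timezone.utc)
    expires = datetime.fromtimestamp(doc["license"]["expires"], tz=timezone.utc)
    days_left = (expires - now).days
    if days_left < 0:
        return f"expired on {expires:%Y-%m-%d}"
    return f"{days_left} days left (expires {expires:%Y-%m-%d})"


# The example output above corresponds to a cluster without a license.
print(license_status({"loaded": False, "license": {"expires": 0}}))
```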

View metadata about the Redpanda data directory

To check the size of the directories and look for anomalies:

cat utils/du.txt

Example output:

33M	/var/lib/redpanda/data/redpanda/kvstore/0_0
33M	/var/lib/redpanda/data/redpanda/kvstore
33M	/var/lib/redpanda/data/redpanda/controller/0_0
33M	/var/lib/redpanda/data/redpanda/controller
65M	/var/lib/redpanda/data/redpanda
65M	/var/lib/redpanda/data
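
To spot anomalies quickly, you can sort the entries by size. This Python sketch assumes du.txt uses the human-readable format shown above (a size with a K, M, G, or T suffix, whitespace, then a path) and that the suffixes are powers of 1024. The function names are illustrative.

```python
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Convert a du-style size such as '33M' to bytes (assumes 1024-based units)."""
    text = text.strip()
    if text and text[-1].upper() in UNITS:
        return int(float(text[:-1]) * UNITS[text[-1].upper()])
    return int(float(text))  # No suffix: the number is used as-is.


def largest_directories(du_text, limit=5):
    """Return up to `limit` (bytes, path) pairs, largest first."""
    rows = []
    for line in du_text.splitlines():
        if not line.strip():
            continue
        size, path = line.split(None, 1)
        rows.append((parse_size(size), path.strip()))
    return sorted(rows, key=lambda row: row[0], reverse=True)[:limit]
```

Pass the text of utils/du.txt to largest_directories to get the biggest directories first.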

To check the file permissions, file size, and last modification date of the files:

cat data-dir.txt | jq

Example output:

{
  "/var/lib/redpanda/data": {
    "size": "4.096kB",
    "mode": "dgrwxrwxrwx",
    "modified": "2023-02-02 15:21:12.430878371 +0000 UTC",
    "user": "",
    "group": "redpanda"
  },
  "/var/lib/redpanda/data/config_cache.yaml": {
    "size": "340B",
    "mode": "-rw-r--r--",
    "modified": "2023-02-02 15:21:22.434878593 +0000 UTC",
    "user": "",
    "group": "redpanda"
  },
  "/var/lib/redpanda/data/pid.lock": {
    "size": "2B",
    "mode": "-rw-r--r--",
    "modified": "2023-02-02 15:21:10.502878322 +0000 UTC",
    "user": "",
    "group": "redpanda"
  },
  "/var/lib/redpanda/data/redpanda": {
    "size": "4.096kB",
    "mode": "dgrwxr-xr-x",
    "modified": "2023-02-02 15:21:10.650878326 +0000 UTC",
    "user": "",
    "group": "redpanda"
  },
  "/var/lib/redpanda/data/redpanda/controller": {
    "size": "4.096kB",
    "mode": "dgrwxr-xr-x",
    "modified": "2023-02-02 15:21:10.650878326 +0000 UTC",
    "user": "",
    "group": "redpanda"
  },
  "/var/lib/redpanda/data/redpanda/controller/0_0": {
    "size": "4.096kB",
    "mode": "dgrwxr-xr-x",
    "modified": "2023-02-02 15:21:12.346878368 +0000 UTC",
    "user": "",
    "group": "redpanda"
  },
  "/var/lib/redpanda/data/redpanda/controller/0_0/0-1-v1.log": {
    "size": "4.096kB",
    "mode": "-rw-r--r--",
    "modified": "2023-02-02 15:21:32.450878771 +0000 UTC",
    "user": "",
    "group": "redpanda"
  },
  "/var/lib/redpanda/data/redpanda/kvstore": {
    "size": "4.096kB",
    "mode": "dgrwxr-xr-x",
    "modified": "2023-02-02 15:21:10.590878324 +0000 UTC",
    "user": "",
    "group": "redpanda"
  },
  "/var/lib/redpanda/data/redpanda/kvstore/0_0": {
    "size": "4.096kB",
    "mode": "dgrwxr-xr-x",
    "modified": "2023-02-02 15:21:10.602878325 +0000 UTC",
    "user": "",
    "group": "redpanda"
  },
  "/var/lib/redpanda/data/redpanda/kvstore/0_0/0-0-v1.log": {
    "size": "8.192kB",
    "mode": "-rw-r--r--",
    "modified": "2023-02-02 15:21:32.458878772 +0000 UTC",
    "user": "",
    "group": "redpanda"
  },
  "/var/lib/redpanda/data/startup_log": {
    "size": "26B",
    "mode": "-rw-r--r--",
    "modified": "2023-02-02 15:21:10.510878323 +0000 UTC",
    "user": "",
    "group": "redpanda"
  }
}
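
Permission problems are easier to find with a filter than by reading every entry. This Python sketch assumes the structure shown above: an object keyed by path, with mode and group fields, where the last nine characters of mode are the permission bits. The expected group name redpanda and the function names are illustrative assumptions.

```python
def unexpected_group(entries, expected="redpanda"):
    """Return sorted paths whose group differs from the expected group."""
    return sorted(
        path for path, meta in entries.items() if meta.get("group") != expected
    )


def world_writable(entries):
    """Return sorted paths whose mode grants write access to other users."""
    flagged = []
    for path, meta in entries.items():
        mode = meta.get("mode", "")
        # In a mode such as '-rw-r--r--', the second-to-last character
        # is the write bit for other users.
        if len(mode) >= 9 and mode[-2] == "w":
            flagged.append(path)
    return sorted(flagged)
```

Both functions take the parsed content of data-dir.txt.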

View cluster metadata

cat kafka.json | jq '.[0]'

Example output:

{
  "Name": "metadata",
  "Response": {
    "Cluster": "redpanda.14a3f9b6-1c74-4ffd-806a-4ab48db78120",
    "Controller": 0,
    "Brokers": [
      {
        "NodeID": 0,
        "Port": 9093,
        "Host": "redpanda-0.redpanda.<namespace>.svc.cluster.local.",
        "Rack": null
      },
      {
        "NodeID": 1,
        "Port": 9093,
        "Host": "redpanda-1.redpanda.<namespace>.svc.cluster.local.",
        "Rack": null
      },
      {
        "NodeID": 2,
        "Port": 9093,
        "Host": "redpanda-2.redpanda.<namespace>.svc.cluster.local.",
        "Rack": null
      }
    ],
    "Topics": {}
  },
  "Error": null
}
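
If you only need the broker addresses, you can extract them from the metadata entry. This Python sketch assumes kafka.json is an array of entries with Name and Response fields, as in the example output above. The function name broker_addresses is illustrative.

```python
def broker_addresses(kafka):
    """Map each NodeID to 'host:port' using the 'metadata' entry of kafka.json."""
    for entry in kafka:
        if entry.get("Name") == "metadata" and entry.get("Response"):
            brokers = entry["Response"].get("Brokers") or []
            return {b["NodeID"]: f'{b["Host"]}:{b["Port"]}' for b in brokers}
    return {}


# Inline sample trimmed from the example output above.
sample_kafka = [
    {
        "Name": "metadata",
        "Response": {
            "Controller": 0,
            "Brokers": [
                {"NodeID": 0, "Port": 9093, "Host": "redpanda-0.example"},
                {"NodeID": 1, "Port": 9093, "Host": "redpanda-1.example"},
            ],
        },
        "Error": None,
    }
]
print(broker_addresses(sample_kafka))
```

The host names in the sample are placeholders.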

View topic and broker configurations

cat kafka.json | jq '.[1:]'

Example output:

[
  {
    "Name": "topic_configs",
    "Response": null,
    "Error": null
  },
  {
    "Name": "broker_configs",
    "Response": [
      {
        "Name": "0",
        "Configs": [
          {
            "Key": "listeners",
            "Value": "internal://0.0.0.0:9093,default://0.0.0.0:9094",
            "Sensitive": false,
            "Source": "STATIC_BROKER_CONFIG",
            "Synonyms": [
              {
                "Key": "kafka_api",
                "Value": "internal://0.0.0.0:9093,default://0.0.0.0:9094",
                "Source": "STATIC_BROKER_CONFIG"
              },
              {
                "Key": "kafka_api",
                "Value": "plain://127.0.0.1:9092",
                "Source": "DEFAULT_CONFIG"
              }
            ]
          },
          {
            "Key": "advertised.listeners",
            "Value": "internal://redpanda-0.redpanda.<namespace>.svc.cluster.local.:9093,default://203.0.113.3:31092",
            "Sensitive": false,
            "Source": "STATIC_BROKER_CONFIG",
            "Synonyms": [
              {
                "Key": "advertised_kafka_api",
                "Value": "internal://redpanda-0.redpanda.<namespace>.svc.cluster.local.:9093,default://203.0.113.3:31092",
                "Source": "STATIC_BROKER_CONFIG"
              },
              {
                "Key": "advertised_kafka_api",
                "Value": "",
                "Source": "DEFAULT_CONFIG"
              }
            ]
          },
          {
            "Key": "log.segment.bytes",
            "Value": "134217728",
            "Sensitive": false,
            "Source": "DEFAULT_CONFIG",
            "Synonyms": [
              {
                "Key": "log_segment_size",
                "Value": "134217728",
                "Source": "DEFAULT_CONFIG"
              }
            ]
          },
          {
            "Key": "log.retention.bytes",
            "Value": "18446744073709551615",
            "Sensitive": false,
            "Source": "DEFAULT_CONFIG",
            "Synonyms": [
              {
                "Key": "retention_bytes",
                "Value": "18446744073709551615",
                "Source": "DEFAULT_CONFIG"
              }
            ]
          },
          {
            "Key": "log.retention.ms",
            "Value": "604800000",
            "Sensitive": false,
            "Source": "DEFAULT_CONFIG",
            "Synonyms": [
              {
                "Key": "delete_retention_ms",
                "Value": "604800000",
                "Source": "DEFAULT_CONFIG"
              }
            ]
          },
          {
            "Key": "num.partitions",
            "Value": "1",
            "Sensitive": false,
            "Source": "DEFAULT_CONFIG",
            "Synonyms": [
              {
                "Key": "default_topic_partitions",
                "Value": "1",
                "Source": "DEFAULT_CONFIG"
              }
            ]
          },
          {
            "Key": "default.replication.factor",
            "Value": "1",
            "Sensitive": false,
            "Source": "DEFAULT_CONFIG",
            "Synonyms": [
              {
                "Key": "default_topic_replications",
                "Value": "1",
                "Source": "DEFAULT_CONFIG"
              }
            ]
          },
          {
            "Key": "log.dirs",
            "Value": "/var/lib/redpanda/data",
            "Sensitive": false,
            "Source": "STATIC_BROKER_CONFIG",
            "Synonyms": [
              {
                "Key": "data_directory",
                "Value": "/var/lib/redpanda/data",
                "Source": "STATIC_BROKER_CONFIG"
              }
            ]
          },
          {
            "Key": "auto.create.topics.enable",
            "Value": "false",
            "Sensitive": false,
            "Source": "DEFAULT_CONFIG",
            "Synonyms": [
              {
                "Key": "auto_create_topics_enabled",
                "Value": "false",
                "Source": "DEFAULT_CONFIG"
              }
            ]
          }
        ],
        "Err": null
      },
      {
        "Name": "1",
        "Configs": [
          ...
        ]
        ...
      },
      {
        "Name": "2",
        "Configs": [
          ...
        ]
        ...
      }
    ],
    "Error": null
  },
  {
    "Name": "log_start_offsets",
    "Response": {},
    "Error": null
  },
  {
    "Name": "last_stable_offsets",
    "Response": {},
    "Error": null
  },
  {
    "Name": "high_watermarks",
    "Response": {},
    "Error": null
  },
  {
    "Name": "groups",
    "Response": null,
    "Error": null
  }
]
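
Most broker properties are defaults, so listing only the overrides shortens the output considerably. This Python sketch assumes the broker_configs structure shown above and treats any Source other than DEFAULT_CONFIG as an override. The function name non_default_configs is illustrative.

```python
def non_default_configs(kafka):
    """Return {broker_name: {key: value}} for configs not sourced from DEFAULT_CONFIG."""
    result = {}
    for entry in kafka:
        if entry.get("Name") != "broker_configs":
            continue
        for broker in entry.get("Response") or []:
            result[broker["Name"]] = {
                config["Key"]: config["Value"]
                for config in broker.get("Configs") or []
                if config.get("Source") != "DEFAULT_CONFIG"
            }
    return result
```

Pass the parsed content of kafka.json. For the example output above, broker 0 would report listeners, advertised.listeners, and log.dirs.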

View the Redpanda logs

cat redpanda.log

Check for clock drift

cat utils/ntp.txt | jq

Use the output to check for clock drift. For details about how NTP works, see the NTP documentation.

Example output:

{
  "host": "pool.ntp.org",
  "roundTripTimeMs": 3,
  "remoteTimeUTC": "2023-02-02T15:22:51.763175934Z",
  "localTimeUTC": "2023-02-02T15:22:51.698044603Z",
  "precisionMs": 0,
  "offset": -458273
}
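
One simple way to estimate the drift is to subtract localTimeUTC from remoteTimeUTC. This Python sketch does only that: it does not interpret the offset field, whose unit is not stated here, and it ignores network latency (compare the result with roundTripTimeMs). The function names are illustrative.

```python
from datetime import datetime, timezone


def to_epoch_ns(timestamp):
    """Convert a UTC timestamp such as '2023-02-02T15:22:51.763175934Z' to epoch nanoseconds."""
    base, _, fraction = timestamp.rstrip("Z").partition(".")
    whole = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
    # Pad or truncate the fractional part to exactly nine digits.
    nanos = int(fraction.ljust(9, "0")[:9]) if fraction else 0
    return int(whole.timestamp()) * 1_000_000_000 + nanos


def clock_delta_ms(ntp):
    """Return remote time minus local time in milliseconds."""
    delta_ns = to_epoch_ns(ntp["remoteTimeUTC"]) - to_epoch_ns(ntp["localTimeUTC"])
    return delta_ns / 1_000_000


ntp = {
    "remoteTimeUTC": "2023-02-02T15:22:51.763175934Z",
    "localTimeUTC": "2023-02-02T15:22:51.698044603Z",
}
print(f"{clock_delta_ms(ntp):.3f} ms")
```

A positive value means the local clock is behind the NTP server. With the example timestamps above, the difference is roughly 65 ms.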

Contents of the debug bundle

The debug bundle includes the following files and directories:

File or Directory Description

/admin

Cluster and broker configurations, cluster health data, and license key information.

/controller

Binary-encoded replicated logs that contain the history of configuration changes as well as internal settings.
Redpanda can replay the events that took place in the cluster to arrive at a similar state.

data-dir.txt

Metadata for the Redpanda data directory of the broker on which the rpk debug bundle command was executed.

kafka.json

Kafka metadata, such as broker configuration, topic configuration, offsets, groups, and group commits.

redpanda.log

Redpanda logs for the broker.
If --logs-since is passed, only the logs within the given timeframe are included.

/metrics

Prometheus metrics from both the /metrics endpoint and the public_metrics endpoint.

/proc

CPU details of the broker on which the rpk debug bundle command was executed.
The directory includes a cpuinfo file with CPU information, such as processor model, core count, cache size, and frequency, as well as an interrupts file that shows how IRQs are distributed across CPU cores.

redpanda.yaml

The Redpanda configuration file of the broker on which the rpk debug bundle command was executed.
Sensitive data is removed and replaced with (REDACTED).

resource-usage.json

Redpanda resource usage data, such as CPU usage and free memory available.

/utils

Data from the node on which the broker is running. This directory includes:

  • du.txt: The disk usage of the data directory of the broker on which the rpk debug bundle command was executed, as output by the du command.

  • ntp.txt: The NTP clock delta (using ntppool as a reference) and round trip time of the broker on which the rpk debug bundle command was executed.

  • uname.txt: System information, such as the kernel version, hostname, and architecture, as output by the uname command.

  • dig.txt: The DNS resolution information for the node, as output by the dig command.

  • dmidecode.txt: System hardware information from the node, as output by the dmidecode command. Requires root privileges.

  • free.txt: The amount of free and used memory on the node, as output by the free command.

  • ip.txt: Network interface information, including IP addresses and network configuration, as output by the ip command.

  • lspci.txt: Information about PCI devices on the node, as output by the lspci command.

  • ss.txt: Active socket connections, as output by the ss command, showing network connections, listening ports, and more.

  • sysctl.txt: Kernel parameters of the system, as output by the sysctl command.

  • top.txt: The top processes by CPU and memory usage, as output by the top command.

  • vmstat.txt: Virtual memory statistics, including CPU usage, memory, and IO operations, as output by the vmstat command.

Suggested reading