rpk debug bundle
In Kubernetes, you must run the rpk debug bundle command inside a container that’s running a Redpanda broker.
Concept
The rpk debug bundle command collects environment data that can help debug and diagnose issues with a Redpanda cluster, a broker, or the machine it’s running on. It then bundles the collected data into a ZIP file, called a diagnostics bundle.
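Because the diagnostics bundle is an ordinary ZIP file, you can inspect its contents before sharing it. The following is a minimal Python sketch, assuming a bundle already exists on disk. The function name `list_bundle_entries` is illustrative and not part of rpk.

```python
import zipfile


def list_bundle_entries(bundle_path):
    """Return the sorted names of every entry in a diagnostics bundle ZIP."""
    # ZipFile reads only the archive index here; nothing is extracted to disk.
    with zipfile.ZipFile(bundle_path) as bundle:
        return sorted(bundle.namelist())
```

Calling `list_bundle_entries` with the path to your bundle returns the file and directory names it contains, which you can compare against the lists in this page.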
Diagnostic bundle files
The files and directories in the diagnostics bundle differ depending on the environment in which Redpanda is running:
Common files
- Kafka metadata: Broker configs, topic configs, start/committed/end offsets, groups, group commits.
- Controller logs: The controller logs directory, up to a limit set by the --controller-logs-size-limit flag.
- Data directory structure: A file describing the data directory’s contents.
- redpanda configuration: The redpanda configuration file (redpanda.yaml; SASL credentials are stripped).
- /proc/cpuinfo: CPU information like make, core count, cache, frequency.
- /proc/interrupts: IRQ distribution across CPU cores.
- Resource usage data: CPU usage percentage, free memory available for the redpanda process.
- Clock drift: The NTP clock delta (using pool.ntp.org as a reference) and round trip time.
- Admin API calls: Cluster and broker configurations, cluster health data, CPU profiles, and license key information.
- Broker metrics: The broker’s Prometheus metrics, fetched through its admin API (/metrics and /public_metrics).
Bare-metal
- Kernel: The kernel logs ring buffer (syslog) and parameters (sysctl).
- DNS: The DNS info as reported by 'dig', using the hosts in /etc/resolv.conf.
- Disk usage: The disk usage for the data directory, as output by 'du'.
- Redpanda logs: The broker’s Redpanda logs written to journald since yesterday (00:00:00 of the previous day based on systemd.time). If --logs-since or --logs-until is passed, only the logs within the resulting time frame are included.
- Socket info: The active sockets data output by 'ss'.
- Running process info: As reported by 'top'.
- Virtual memory stats: As reported by 'vmstat'.
- Network config: As reported by 'ip addr'.
- lspci: List the PCI buses and the devices connected to them.
- dmidecode: The DMI table contents. Only included if this command is run as root.
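The default journald window described above starts at "yesterday", meaning 00:00:00 of the previous day. The sketch below only illustrates that definition. It is not rpk or systemd code, the function name `default_logs_since` is hypothetical, and it uses naive local datetimes for simplicity.

```python
from datetime import datetime, time, timedelta


def default_logs_since(now):
    """Return 00:00:00 of the day before `now`, the meaning of 'yesterday'."""
    previous_day = now.date() - timedelta(days=1)
    # time.min is 00:00:00, so this is midnight at the start of the previous day.
    return datetime.combine(previous_day, time.min)
```

A bundle created at any time on a given day therefore includes, by default, all journald logs from the start of the previous day onward.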
Extra requests for partitions
You can provide a list of partitions to save additional admin API requests specifically for those partitions.
The partition flag accepts the format <namespace>/[topic]/[partitions…], where the namespace is optional. If the namespace is not provided, rpk assumes 'kafka'. For example:
Topic 'foo', partitions 1, 2 and 3:
--partitions foo/1,2,3
Namespace _redpanda-internal, topic 'bar', partition 2:
--partitions _redpanda-internal/bar/2
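To make the format concrete, here is a minimal Python sketch of how such a value could be split into namespace, topic, and partition IDs. It illustrates the documented format only and is not rpk's actual parsing code. The function name `parse_partition_spec` is hypothetical.

```python
def parse_partition_spec(spec, default_namespace="kafka"):
    """Split '<namespace>/[topic]/[partitions]' into (namespace, topic, ids).

    The namespace is optional; 'kafka' is assumed when it is omitted.
    """
    parts = spec.split("/")
    if len(parts) == 2:
        # No namespace given: fall back to the default.
        namespace = default_namespace
        topic, ids = parts
    elif len(parts) == 3:
        namespace, topic, ids = parts
    else:
        raise ValueError(f"unexpected partition spec: {spec!r}")
    # Partition IDs are a comma-separated list of integers.
    partitions = [int(p) for p in ids.split(",")]
    return namespace, topic, partitions
```

With this sketch, the first example above resolves to namespace 'kafka', topic 'foo', and partitions 1, 2, and 3, while the second keeps its explicit namespace.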
If you have an upload URL from the Redpanda support team, provide it in the --upload-url flag to upload your diagnostics bundle to Redpanda.
Kubernetes
- Kubernetes Resources: Kubernetes manifests for all resources in the Kubernetes namespace given with --namespace, or the shorthand version -n.
- redpanda logs: Logs of each Pod in the given Kubernetes namespace. If --logs-since is passed, only the logs within the given timeframe are included.
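Because the command must run inside a container that is running a Redpanda broker, a common pattern is to invoke it through kubectl exec. The sketch below only assembles such a command line. The Pod name `redpanda-0`, the namespace `redpanda`, and the container name `redpanda` are assumptions about your deployment, and the helper `bundle_exec_command` is hypothetical.

```python
import shlex


def bundle_exec_command(pod, namespace, container="redpanda"):
    """Build a kubectl command that runs rpk debug bundle in a broker container."""
    return [
        "kubectl", "exec", pod,
        "--namespace", namespace,   # namespace in which kubectl finds the Pod
        "--container", container,   # assumed name of the broker container
        "--",
        "rpk", "debug", "bundle",
        "--namespace", namespace,   # namespace rpk collects resources from
    ]


# Print a copy-pasteable shell command; the values are placeholders.
print(shlex.join(bundle_exec_command("redpanda-0", "redpanda")))
```

Returning the command as a list keeps each argument separate, so it can be passed to `subprocess.run` without shell quoting issues.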
Flags
Value | Type | Description |
---|---|---|
| string | Sets the limit of the controller log size that can be stored in the bundle. Multipliers are also supported, e.g. 3MB, 1GiB (default |
| duration | Specifies the duration for collecting samples for the CPU profiler (for example, 30s, 1.5m). Must be higher than 15s (default |
| - | Display documentation for |
| stringArray | Comma-separated label selectors to filter your resources, e.g. <label>=<value>,<label>=<value> (k8s only) (default `[app.kubernetes.io/name=redpanda]`). |
| string | Include logs dated from specified date onward. This flag accepts a |
| string | Read the logs until the given size is reached. Multipliers are also supported, e.g. 3MB, 1GiB (default |
| string | Include logs older than the specified date. This flag accepts a |
| duration | The amount of time to wait before capturing the second snapshot of the metrics endpoints, for example |
| int | Number of metrics samples to take (at the interval of |
| string | The Kubernetes namespace in which the Redpanda cluster is running. Default: |
| string | The file path where the debug file will be written (default |
| stringArray | Comma-separated partition IDs; when provided, |
| duration | The amount of time to wait for child commands to execute, for example |
| string | If provided, where to upload the bundle in addition to creating a copy on disk. |
| string | Redpanda or |
| stringArray | Override |
| string | Profile to use. See |
| - | Enable verbose logging. |
Result
The files and directories in the diagnostics bundle differ depending on the environment in which Redpanda is running.
Linux
File or Directory | Description |
---|---|
| Cluster and broker configurations, cluster health data, and license key information. |
| Binary-encoded replicated logs that contain the history of configuration changes as well as internal settings. |
| Metadata for the Redpanda data directory of the broker on which the |
| Kafka metadata, such as broker configuration, topic configuration, offsets, groups, and group commits. |
| Redpanda logs for the broker. |
| Prometheus metrics from both the |
| CPU details of the broker on which the |
| The Redpanda configuration file of the broker on which the |
| Redpanda resource usage data, such as CPU usage and free memory available. |
| Data from the node on which the broker is running. This directory includes: |
Kubernetes
File or Directory | Description |
---|---|
| Cluster and broker configurations, cluster health data, and license key information. |
| Binary-encoded replicated logs that contain the history of configuration changes as well as internal settings. |
| Metadata for the Redpanda data directory of the broker on which the |
| Kafka metadata, such as broker configuration, topic configuration, offsets, groups, and group commits. |
| Redpanda logs for the broker. |
| Prometheus metrics from both the |
| CPU details of the broker on which the |
| The Redpanda configuration file of the broker on which the |
| Redpanda resource usage data, such as CPU usage and free memory available. |
| Data from the node on which the broker is running. This directory includes: |