Cloud

Troubleshoot Query Out-of-Memory Errors

If a Redpanda SQL query exhausts the memory available to it, the engine cancels the query and returns an error to the client:

ERROR:  Query Out of Memory!

Canceling the query frees its memory and allows the engine to continue serving other queries. This is a normal protection mechanism and is not a sign of cluster failure.

Use this page to:

Identify when a query was canceled because it ran out of memory
Recover from a query out-of-memory error and reduce its frequency
Monitor node memory usage to anticipate memory pressure

How Redpanda SQL uses memory

Redpanda SQL queries can read very large input sources (many terabytes), and simple operations such as filtering or projection process input incrementally with a small, roughly constant memory footprint. In those cases the engine can process far more data than fits in RAM.

Memory pressure comes from operations that materialize intermediate state: hash tables for JOIN and GROUP BY, heaps for ORDER BY and top-K, and network buffers between nodes. All of this intermediate state, along with the final result set, must fit into the aggregate memory available across the cluster. The engine does not spill intermediate state to disk, so a query that builds intermediate structures larger than available memory is canceled rather than slowed.

All concurrently running queries contribute to total memory consumption, so a single query can hit the node memory limit because of pressure from other queries running at the same time.

Recover from the error

When a single query fails with Query out of Memory, retry it. The error frees the query’s memory, so the next attempt often succeeds, especially if other concurrent queries have completed in the meantime.

If the same query keeps failing, the query itself is too memory-hungry for the current cluster size, or too many other queries are competing for memory at the same time. Reduce the query’s memory footprint or reduce concurrent load:

Reduce concurrency.

Run fewer queries in parallel against the cluster. Other queries running at the same time contribute to the total memory pressure.
Simplify the query.

Narrow the scan range with tighter WHERE filters, reduce the number of joins, or break a large aggregation into smaller ones. Operations that materialize wide intermediate results (joins, sorts, distinct aggregations) drive memory consumption the most.
Scale the cluster.

Add SQL nodes to increase the aggregate memory available to queries. See Scale Redpanda SQL.

Monitor memory usage

Use the following Prometheus gauge to track memory consumed by the SQL workload and watch for sustained growth toward the node’s limit:

Metric Description

Metric	Description
`query_memory_consumption_total`	Memory consumed by queries on the node, in bytes. Use this metric to monitor workload memory usage. Unlike `oxla_process_memory_total`, which includes cached files and can sit near the limit even when the node is idle, `query_memory_consumption_total` reflects only memory attributed to query execution.

query_memory_consumption_total

Memory consumed by queries on the node, in bytes. Use this metric to monitor workload memory usage. Unlike oxla_process_memory_total, which includes cached files and can sit near the limit even when the node is idle, query_memory_consumption_total reflects only memory attributed to query execution.

OOM prevention cancellations

The cancelled due to OOM prevention error is a separate case. Redpanda SQL’s engine includes an overseer that monitors overall node memory independently of per-query accounting. When the overseer detects that the untracked memory pool has grown unexpectedly, it cancels running queries on the affected node to keep the engine operational.

This condition is rare and almost always indicates a bug in memory accounting or an unexpected workload pattern. Collect the cluster logs from around the time of the error and contact Redpanda Support.

What do you think of this page?

Let us know more:

Let us contact you about your feedback:

Troubleshoot Query Out-of-Memory Errors

How Redpanda SQL uses memory

Recover from the error

Monitor memory usage

OOM prevention cancellations

Suggested reading

Simple online edits

Contribution guide