# Compaction Settings


---
title: Compaction Settings
latest-redpanda-tag: v25.1.1
latest-console-tag: v3.7.3
latest-operator-version: v26.1.4
# EOL = End-of-Life (support lifecycle status)
page-is-nearing-eol: "false"
page-is-past-eol: "true"
page-eol-date: April 7, 2026
latest-connect-version: 4.93.0
docname: cluster-maintenance/compaction-settings
page-component-name: streaming
page-version: "25.1"
page-component-version: "25.1"
page-component-title: Streaming
page-relative-src-path: cluster-maintenance/compaction-settings.adoc
page-edit-url: https://github.com/redpanda-data/docs/edit/v/25.1/modules/manage/pages/cluster-maintenance/compaction-settings.adoc
description: Redpanda's approach to compaction and options for configuring it.
page-git-created-date: "2023-12-22"
page-git-modified-date: "2025-04-07"
support-status: past end-of-life
---

<!-- Source: https://docs.redpanda.com/streaming/25.1/manage/cluster-maintenance/compaction-settings.md -->

Configure compaction for your cluster to optimize storage utilization.

## [](#redpanda-compaction-overview)Redpanda compaction overview

Compaction is an optional mechanism intended to reduce the storage needs of Redpanda topics. You can enable compaction through configuration of a cluster or topic’s cleanup policy. When compaction is enabled as part of the cleanup policy, a background process executes on a pre-set interval to perform compaction operations. When triggered for a partition, the process purges older versions of records for a given key and only retains the most recent record in that partition. This is done by analyzing closed segments in the partition, copying the most recent records for each key into a new segment, then deleting the source segments.

![Example of topic compaction](https://docs.redpanda.com/streaming/25.1/shared/_images/compaction-example.png)

This diagram illustrates a compacted topic. Imagine a remote sensor network that uses image recognition to track appearances of red pandas in a geographic area. The sensor network employs special devices that send records to a topic when they detect one. You might enable compaction to reduce topic storage while still maintaining a record in the topic of the last time each device saw a red panda, perhaps to see if they stop frequenting a given area. The left side of the diagram shows all records sent across the topic. The right side illustrates the results of compaction; older records for certain keys are deleted from the log.
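
The sensor example above can be sketched as a small model. This is an illustration of the "keep the most recent record per key" idea only, not Redpanda's implementation; the `Record` type, the `compact_segments` function, and the device names and values are all hypothetical.

```python
from typing import NamedTuple, Optional


class Record(NamedTuple):
    offset: int
    key: str
    value: Optional[str]


def compact_segments(segments: list[list[Record]]) -> list[Record]:
    """Return the highest-offset record per key across closed segments."""
    latest: dict[str, Record] = {}
    for segment in segments:
        for record in segment:
            current = latest.get(record.key)
            if current is None or record.offset > current.offset:
                latest[record.key] = record
    # Surviving records are emitted in offset order, as a new segment would hold them.
    return sorted(latest.values(), key=lambda r: r.offset)


# Two closed segments from a hypothetical sensor topic.
closed_segments = [
    [Record(0, "device-1", "seen 09:00"), Record(1, "device-2", "seen 09:05")],
    [Record(2, "device-1", "seen 10:30"), Record(3, "device-3", "seen 10:45")],
]
compacted = compact_segments(closed_segments)
for record in compacted:
    print(record)
```

In this model the older `device-1` record at offset 0 is dropped, while every device still has its latest sighting.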

> 📝 **NOTE**
>
> If your application requires consuming every record for a given key, consider using the `delete` [cleanup policy](https://docs.redpanda.com/streaming/25.1/develop/config-topics/#change-the-cleanup-policy) instead.

> ❗ **IMPORTANT**
>
> When using [Tiered Storage](https://docs.redpanda.com/streaming/25.1/manage/tiered-storage/), compaction functions at the local storage level. As long as a segment remains in local storage, its records are eligible for compaction. Once a segment is uploaded to object storage and removed from local storage, it is not retrieved for further compaction operations. A key may therefore appear in multiple segments across object storage and local storage.

While compaction reduces storage needs, Redpanda’s compaction (just like Kafka’s) does not guarantee perfect de-duplication of a topic. It is a best-effort mechanism, so duplicates of a key may still exist within a topic. Compaction is also not a whole-topic operation, since it operates on subsets of each partition within the topic.

## [](#configure-a-cleanup-policy)Configure a cleanup policy

A cleanup policy may be applied to a cluster or to an individual topic. If both are set, the topic-level policy overrides the cluster-level policy. The cluster-level [`log_cleanup_policy`](https://docs.redpanda.com/streaming/25.1/reference/properties/cluster-properties/#log_cleanup_policy) and the topic-level [`cleanup.policy`](https://docs.redpanda.com/streaming/25.1/reference/properties/topic-properties/#cleanuppolicy) support the following three options:

-   `delete`: Records are deleted from the topic once the specified retention period (time and/or size allocations) is exceeded. This is the default mechanism and is analogous to disabling compaction.

-   `compact`: This triggers only cleanup of records with multiple versions. A record that represents the only version for a given key is not deleted.

-   `compact,delete`: This combines both policies, deleting records exceeding the retention period while compacting multiple versions of records.
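
To compare the three options side by side, here is a simplified model. It is a sketch under stated assumptions: retention is time-based only and is applied per record for simplicity, and the `apply_cleanup` function, the `Record` type, and all sample values are hypothetical rather than part of Redpanda.

```python
from typing import NamedTuple


class Record(NamedTuple):
    offset: int
    key: str
    timestamp_ms: int


def apply_cleanup(records, policy, now_ms, retention_ms):
    """Model the effect of a cleanup policy on records given in offset order."""
    policies = {p.strip() for p in policy.split(",")}
    kept = list(records)
    if "delete" in policies:
        # Drop records older than the retention period.
        kept = [r for r in kept if now_ms - r.timestamp_ms <= retention_ms]
    if "compact" in policies:
        # Later records overwrite earlier ones with the same key.
        latest = {}
        for r in kept:
            latest[r.key] = r
        kept = sorted(latest.values(), key=lambda r: r.offset)
    return kept


records = [
    Record(0, "a", 1_000),
    Record(1, "b", 2_000),
    Record(2, "a", 9_000),
    Record(3, "c", 9_500),
    Record(4, "c", 9_800),
]
results = {
    policy: [r.offset for r in apply_cleanup(records, policy, 10_000, 5_000)]
    for policy in ("delete", "compact", "compact,delete")
}
print(results)
```

Note that under `compact` alone the old record for key `b` survives because it is the only version of that key, whereas `compact,delete` removes it once it ages out.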


> ⚠️ **WARNING**
>
> Modifying the properties of topics that are created and managed by Redpanda applications can cause unexpected errors. This may lead to connector and cluster failures.

## [](#tune-log-compaction-with-a-dirty-ratio-threshold)Tune log compaction with a dirty ratio threshold

Use the dirty ratio to control when log compaction runs in compacted topics. The dirty ratio is the size of dirty segments divided by the total size of closed segments. Dirty segments are closed but un-compacted, meaning they may still contain duplicate keys that exist earlier in the log.

```none
dirty_ratio = dirty_segment_bytes / total_closed_segment_bytes
```

Where:

-   **Dirty segments** are closed segments that may contain duplicate keys that haven’t yet been compacted.

-   **Closed segments** are all finalized segments in the log.
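
As a worked example of the formula, the following sketch computes the ratio for a log with four closed segments, one of which is dirty. The `dirty_ratio` helper and the segment sizes are hypothetical, chosen only to make the arithmetic easy to follow.

```python
MIB = 1024 * 1024


def dirty_ratio(dirty_segment_bytes: int, total_closed_segment_bytes: int) -> float:
    """Size of dirty segments divided by the total size of closed segments."""
    if total_closed_segment_bytes == 0:
        return 0.0  # No closed segments yet, so there is nothing to compact.
    return dirty_segment_bytes / total_closed_segment_bytes


# Four closed segments of 128 MiB each; only the newest one is still dirty.
closed_segment_sizes = [128 * MIB] * 4
dirty_segment_sizes = [128 * MIB]

ratio = dirty_ratio(sum(dirty_segment_sizes), sum(closed_segment_sizes))
print(ratio)
```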


### [](#configuration-options)Configuration options

| Property | Scope | Description |
| --- | --- | --- |
| `min_cleanable_dirty_ratio` | Cluster | The minimum ratio between the number of bytes in dirty segments and the total number of bytes in closed segments that must be reached before a partition’s log is eligible for compaction in a compact topic. |
| `min.cleanable.dirty.ratio` | Topic | Topic-level override of the cluster-wide dirty ratio threshold. |
| `log_compaction_interval_ms` | Cluster | Compaction frequency in milliseconds. |

Redpanda runs a scan every `log_compaction_interval_ms`. During each scan:

-   Logs are evaluated for compaction eligibility using their dirty ratio.

-   Only logs with a dirty ratio greater than the configured threshold are compacted.

-   Logs are compacted in descending order of dirty ratio to maximize efficiency.
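
The scan steps above can be sketched as a selection function. This is a minimal model, not Redpanda's scheduler: the `LogStats` type, the `plan_scan` function, the partition names, and the threshold of `0.5` are assumptions made for illustration.

```python
from typing import NamedTuple, Optional


class LogStats(NamedTuple):
    name: str
    dirty_bytes: int
    closed_bytes: int
    # Models a topic-level min.cleanable.dirty.ratio override, if one is set.
    topic_ratio: Optional[float] = None


def plan_scan(logs, cluster_ratio):
    """Return the names of eligible logs, highest dirty ratio first."""
    eligible = []
    for log in logs:
        if log.closed_bytes == 0:
            continue  # Nothing closed, so nothing to compact.
        ratio = log.dirty_bytes / log.closed_bytes
        threshold = log.topic_ratio if log.topic_ratio is not None else cluster_ratio
        if ratio > threshold:
            eligible.append((ratio, log.name))
    eligible.sort(key=lambda item: item[0], reverse=True)
    return [name for _, name in eligible]


logs = [
    LogStats("orders/0", 300, 1000),
    LogStats("orders/1", 800, 1000),
    LogStats("audit/0", 300, 1000, topic_ratio=0.2),
    LogStats("metrics/0", 600, 1000),
]
order = plan_scan(logs, cluster_ratio=0.5)
print(order)
```

Here `orders/0` and `audit/0` have the same dirty ratio, but only `audit/0` is selected because its topic-level threshold is lower than the cluster-wide one.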


### [](#use-cases-for-dirty-ratio-based-compaction)Use cases for dirty ratio-based compaction

| Use Case | Recommended Setting |
| --- | --- |
| High-throughput topics with frequent key overwrites | Lower `min_cleanable_dirty_ratio` to enable more aggressive compaction. |
| Topics with large segment sizes or expensive I/O | Raise `min_cleanable_dirty_ratio` to defer compaction until it is more efficient. |
| Topics requiring custom tuning | Use `min.cleanable.dirty.ratio` to override the cluster setting on specific topics. |

## [](#tombstone-record-removal)Tombstone record removal

Compaction also enables deletion of existing records through tombstones. For example, as data is deleted from a source system, clients produce a tombstone record to the log. A tombstone contains a key and the value `null`. A tombstone signals to brokers and consumers that records with the same key that precede it in the log should be deleted.
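
From the consumer side, handling a tombstone might look like the following sketch. It assumes records arrive as `(key, value)` pairs in log order with `None` standing in for a `null` value; the `materialize` function and the sample keys are hypothetical.

```python
def materialize(records):
    """Build a key-to-value view from records read in log order."""
    view = {}
    for key, value in records:
        if value is None:
            # Tombstone: discard whatever was stored for this key.
            view.pop(key, None)
        else:
            view[key] = value
    return view


log = [
    ("user-1", "alice"),
    ("user-2", "bob"),
    ("user-1", None),  # Tombstone for user-1.
]
view = materialize(log)
print(view)
```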

You can specify how long Redpanda keeps these tombstones for compacted topics using either the cluster configuration property [`tombstone_retention_ms`](https://docs.redpanda.com/streaming/25.1/reference/properties/cluster-properties/#tombstone_retention_ms) or the topic configuration property [`delete.retention.ms`](https://docs.redpanda.com/streaming/25.1/reference/properties/topic-properties/#deleteretentionms). If both are set, the topic-level tombstone retention policy overrides the cluster-level policy.

> 📝 **NOTE**
>
> Redpanda does not currently remove tombstone records for compacted topics with Tiered Storage enabled.
>
> You cannot enable `tombstone_retention_ms` if you have enabled any of the Tiered Storage cluster properties `cloud_storage_enabled`, `cloud_storage_enable_remote_read`, and `cloud_storage_enable_remote_write`.
>
> On the topic level, you cannot enable `delete.retention.ms` at the same time as the Tiered Storage topic configuration properties `redpanda.remote.read` and `redpanda.remote.write`.

To set the cluster-level tombstone retention policy, run the command:

```bash
rpk cluster config set tombstone_retention_ms=100
```

You can unset the tombstone retention policy for a topic so it inherits the cluster-wide default policy:

```bash
rpk topic alter-config <topic-name> --delete delete.retention.ms
```

To override the cluster-wide default for a specific topic:

```bash
rpk topic alter-config <topic-name> --set delete.retention.ms=5
```

To disable tombstone removal for a specific topic:

```bash
rpk topic alter-config <topic-name> --set delete.retention.ms=-1
```

Redpanda removes tombstones as follows:

-   For topics with a `compact`-only cleanup policy: Tombstones are removed after they exceed the tombstone retention limit. The `delete.retention.ms` or `tombstone_retention_ms` value therefore also sets the time bound that a consumer has to see a complete view of the log with tombstones present before they are removed.

-   For topics with a `compact,delete` cleanup policy: Both the tombstone retention policy and standard garbage collection can remove tombstone records.
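
The override and disable rules for tombstone retention can be summarized in a short model. This is a sketch under assumptions: an unset cluster value is treated as "removal disabled", a tombstone's age is measured against a single hypothetical clock, and the function names are illustrative rather than Redpanda APIs.

```python
from typing import Optional


def effective_tombstone_retention_ms(
    cluster_ms: Optional[int], topic_ms: Optional[int]
) -> Optional[int]:
    """Resolve the retention in effect; None means tombstones are not removed."""
    # A topic-level delete.retention.ms overrides tombstone_retention_ms.
    value = topic_ms if topic_ms is not None else cluster_ms
    if value is None or value < 0:
        return None  # Unset, or -1 to disable tombstone removal.
    return value


def tombstone_removable(
    age_ms: int, cluster_ms: Optional[int], topic_ms: Optional[int]
) -> bool:
    """Whether a tombstone of the given age has exceeded the retention limit."""
    retention = effective_tombstone_retention_ms(cluster_ms, topic_ms)
    return retention is not None and age_ms > retention


print(effective_tombstone_retention_ms(100, None))  # Inherits the cluster value.
print(effective_tombstone_retention_ms(100, 5))     # Topic override wins.
print(effective_tombstone_retention_ms(100, -1))    # Disabled for this topic.
```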


If obtaining a complete snapshot of the log, including tombstone records, is important to your consumers, set the tombstone retention value such that consumers have enough time for their reads to complete before tombstones are removed. Consumers may not see tombstones if their reads take longer than the tombstone retention in effect (`delete.retention.ms` if set on the topic, otherwise `tombstone_retention_ms`). The trade-offs to ensuring tombstone visibility to consumers are increased disk usage and potentially slower compaction.

On the other hand, if more frequent cleanup of tombstones is important for optimizing workloads and space management, consider setting a shorter tombstone retention, for example the typical default of 24 hours (86400000 ms).

## [](#compaction-policy-settings)Compaction policy settings

The various cleanup policy settings rely on proper tuning of a cluster’s compaction and retention policy options. The applicable settings are:

-   [`log_compaction_interval_ms`](https://docs.redpanda.com/streaming/25.1/reference/properties/cluster-properties/#log_compaction_interval_ms): Defines the compaction frequency in milliseconds. (default: 10,000ms)

-   [`min_cleanable_dirty_ratio`](https://docs.redpanda.com/streaming/25.1/reference/properties/cluster-properties/#min_cleanable_dirty_ratio): Minimum dirty ratio a log must exceed to be eligible for compaction.

-   [`compaction_ctrl_backlog_size`](https://docs.redpanda.com/streaming/25.1/reference/properties/cluster-properties/#compaction_ctrl_backlog_size): Defines the size for the compaction backlog of the backlog controller. (default: 10% of disk capacity)

-   [`compaction_ctrl_min_shares`](https://docs.redpanda.com/streaming/25.1/reference/properties/cluster-properties/#compaction_ctrl_min_shares): Defines the minimum number of I/O and CPU shares the compaction process can use. (default: 10)

-   [`compaction_ctrl_max_shares`](https://docs.redpanda.com/streaming/25.1/reference/properties/cluster-properties/#compaction_ctrl_max_shares): Defines the maximum number of I/O and CPU shares the compaction process can use. (default: 1,000)

-   [`storage_compaction_index_memory`](https://docs.redpanda.com/streaming/25.1/reference/properties/cluster-properties/#storage_compaction_index_memory): Defines the amount of memory in bytes that each shard may use for creating the compaction index. This index optimizes execution during compaction operations. (default: 128 MiB)

-   `storage_compaction_key_map_memory`: Defines the amount of memory in bytes that each shard may use when creating the key map for a partition during compaction operations. The compaction process uses this key map to de-dupe keys within the compacted segments. (default: 128 MiB)

-   [`compacted_log_segment_size`](https://docs.redpanda.com/streaming/25.1/reference/properties/cluster-properties/#compacted_log_segment_size): Defines the base size for a compacted log segment in bytes. (default: 268435456 \[256 MiB\])

-   [`max_compacted_log_segment_size`](https://docs.redpanda.com/streaming/25.1/reference/properties/cluster-properties/#max_compacted_log_segment_size): Defines the maximum size after consolidation for a compacted log segment in bytes. (default: 5368709120 \[5 GiB\])
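
When planning changes to these settings, it can help to see which values actually differ from the defaults listed above. The following sketch records the numeric defaults from this list and renders a command only for properties that change, using the same `rpk cluster config set` form shown earlier on this page. The `render_overrides` helper and the example value of `30_000` are hypothetical.

```python
# Numeric defaults as listed above, in the units each property uses.
DOCUMENTED_DEFAULTS = {
    "log_compaction_interval_ms": 10_000,
    "compaction_ctrl_min_shares": 10,
    "compaction_ctrl_max_shares": 1_000,
    "storage_compaction_index_memory": 128 * 1024 * 1024,
    "storage_compaction_key_map_memory": 128 * 1024 * 1024,
    "compacted_log_segment_size": 268_435_456,
    "max_compacted_log_segment_size": 5_368_709_120,
}


def render_overrides(desired: dict[str, int]) -> list[str]:
    """Render one command per property whose desired value differs from the default."""
    unknown = sorted(set(desired) - set(DOCUMENTED_DEFAULTS))
    if unknown:
        raise KeyError(f"not in the documented list: {unknown}")
    return [
        f"rpk cluster config set {name}={value}"
        for name, value in sorted(desired.items())
        if value != DOCUMENTED_DEFAULTS[name]
    ]


commands = render_overrides(
    {
        "log_compaction_interval_ms": 30_000,  # Hypothetical new value.
        "compaction_ctrl_min_shares": 10,      # Same as the default, so skipped.
    }
)
for command in commands:
    print(command)
```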


> 📝 **NOTE**
>
> Additional [tunable properties](https://docs.redpanda.com/streaming/25.1/reference/properties/cluster-properties/) are available but should only be used with direction from Redpanda support. These properties include [`compaction_ctrl_p_coeff`](https://docs.redpanda.com/streaming/25.1/reference/properties/cluster-properties/#compaction_ctrl_p_coeff), [`compaction_ctrl_i_coeff`](https://docs.redpanda.com/streaming/25.1/reference/properties/cluster-properties/#compaction_ctrl_i_coeff), [`compaction_ctrl_d_coeff`](https://docs.redpanda.com/streaming/25.1/reference/properties/cluster-properties/#compaction_ctrl_d_coeff), and [`compaction_ctrl_update_interval_ms`](https://docs.redpanda.com/streaming/25.1/reference/properties/cluster-properties/#compaction_ctrl_update_interval_ms).
