What’s New

This topic includes new content added in version 23.2. For a complete list of all product updates, see the Redpanda release notes.

Follower fetching

Follower fetching lets a consumer fetch records from the closest replica of a topic partition, regardless of whether it’s a leader or a follower. This can minimize cloud networking costs and consumer read latency for clusters deployed across different data centers and availability zones.
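
As a rough illustration of the idea, the sketch below picks a replica for a consumer by comparing rack (availability zone) labels and falls back to the leader when no replica shares the consumer's rack. The `Replica` type, the `pick_replica` function, and the use of a rack label as the measure of "closest" are assumptions made for this example, not Redpanda's actual selection logic.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Replica:
    broker_id: int
    rack: str        # hypothetical rack / availability zone label
    is_leader: bool


def pick_replica(replicas: Sequence[Replica], client_rack: Optional[str]) -> Replica:
    """Return the replica a consumer in client_rack would fetch from."""
    leader = next(r for r in replicas if r.is_leader)
    if client_rack is None:
        return leader  # no rack configured: read from the leader
    same_rack = [r for r in replicas if r.rack == client_rack]
    if leader in same_rack or not same_rack:
        return leader  # leader is already local, or nothing is local
    return same_rack[0]  # a follower in the consumer's own rack


replicas = [
    Replica(broker_id=0, rack="az-a", is_leader=True),
    Replica(broker_id=1, rack="az-b", is_leader=False),
    Replica(broker_id=2, rack="az-c", is_leader=False),
]
chosen = pick_replica(replicas, client_rack="az-b")
```

In this model, a consumer in "az-b" reads from the follower on broker 1 and avoids cross-zone traffic, while a consumer with no matching rack still reads from the leader.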

Server-side schema ID validation

With this enterprise feature, producers may use a serializer/deserializer (SerDes) client library, such as Confluent’s SerDes library, to encode record keys and values according to a schema in Schema Registry. Schema ID validation enables brokers to detect and drop records that don’t match the configured schema, as identified by the schema ID. Records associated with unregistered or incorrect schemas are thus detected and dropped earlier, by a broker rather than by a downstream consumer.
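
To make the schema ID concrete: Confluent's SerDes wire format prefixes each encoded key or value with a zero magic byte followed by the schema ID as a 4-byte big-endian integer. The sketch below reads that header and checks the ID against a set of registered IDs. The function names and the `registered_ids` set are hypothetical, and the check is a simplified model rather than the broker's actual validation logic.

```python
import struct
from typing import Optional, Set

MAGIC_BYTE = 0  # first byte of the Confluent wire format


def encode_with_schema_id(schema_id: int, payload: bytes) -> bytes:
    """Prefix a serialized payload with the magic byte and schema ID."""
    return struct.pack(">bI", MAGIC_BYTE, schema_id) + payload


def extract_schema_id(data: bytes) -> Optional[int]:
    """Return the schema ID from the 5-byte header, or None if absent."""
    if len(data) < 5 or data[0] != MAGIC_BYTE:
        return None
    return struct.unpack(">I", data[1:5])[0]


def broker_accepts(data: bytes, registered_ids: Set[int]) -> bool:
    """Simplified model: accept only records whose schema ID is known."""
    schema_id = extract_schema_id(data)
    return schema_id is not None and schema_id in registered_ids


registered_ids = {7}  # hypothetical IDs known to Schema Registry
good = encode_with_schema_id(7, b"payload")
bad = encode_with_schema_id(9, b"payload")
```

Here `good` would be accepted, while `bad` (unregistered ID) and any value without the header would be dropped before reaching a consumer.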

Tiered Storage enhancements

As data in object storage grows, so does its metadata. To support efficient long-term data retention, Redpanda splits the metadata in object storage, keeping metadata for only recently updated segments in memory or on local disk, while safely archiving the remaining metadata in object storage and caching it locally on disk. Archived metadata is then loaded only when historical data is accessed. This allows Tiered Storage to handle partitions of virtually any size or retention length.

  • Fine-grained caching: To support more concurrent consumers of historical data with less local storage, Redpanda can download small chunks of remote segments to the cache directory. For example, when a client fetch request spans a subsection of a 1 GiB segment, instead of downloading the entire 1 GiB segment, Redpanda can download 16 MiB chunks that contain just enough data required to fulfill the fetch request.

  • Automatic disk space management: Redpanda now divides disk storage into different categories to provide a flexible configuration of space: reserved disk space, cache storage, and log storage. When data usage begins to approach the target size of log storage, Redpanda does local storage housekeeping to bring usage back under the target size. This allows Redpanda to leverage available local storage safely and efficiently.

  • View space usage: You can now use rpk cluster logdirs describe to get details about Tiered Storage space usage, both in object storage and on local disk.
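
The arithmetic behind fine-grained caching can be sketched as follows. Assuming fixed 16 MiB chunks aligned to the start of the segment (an assumption for this example, as is the `chunks_for_range` helper), the function returns the starting byte offsets of the chunks needed to serve a given byte range.

```python
from typing import List

MiB = 1024 * 1024
CHUNK_SIZE = 16 * MiB  # chunk size taken from the example above


def chunks_for_range(start: int, end: int, chunk_size: int = CHUNK_SIZE) -> List[int]:
    """Return start offsets of the chunks covering bytes [start, end)."""
    if end <= start:
        return []
    first = start // chunk_size
    last = (end - 1) // chunk_size  # chunk holding the last requested byte
    return [i * chunk_size for i in range(first, last + 1)]


# A fetch that touches bytes 100 MiB to 120 MiB of a 1 GiB segment.
needed = chunks_for_range(100 * MiB, 120 * MiB)
downloaded_mib = len(needed) * CHUNK_SIZE // MiB
```

In this case the fetch needs the two chunks starting at 96 MiB and 112 MiB, so 32 MiB is downloaded instead of the full 1 GiB segment.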

Controller snapshots

Redpanda now loads the controller log from a snapshot on startup. Controller snapshots save the current cluster metadata state to disk, which significantly improves startup times of nodes in long-running Redpanda clusters. For example, with a partition that has moved several times, a snapshot can restore the latest state without replaying every move command.
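
The benefit can be illustrated with a toy model in which the controller log is a list of (partition, broker) move commands and the state is a mapping from partition to its current broker. The `replay` function and the data shapes are hypothetical and greatly simplified. The point is only that starting from a snapshot and replaying the remaining commands yields the same state as replaying the whole log.

```python
from typing import Dict, Iterable, Optional, Tuple

Command = Tuple[str, int]  # (partition, destination broker)


def replay(commands: Iterable[Command],
           state: Optional[Dict[str, int]] = None) -> Dict[str, int]:
    """Apply move commands on top of an optional starting state."""
    result = dict(state or {})
    for partition, broker in commands:
        result[partition] = broker  # only the latest placement matters
    return result


log = [("p0", 1), ("p0", 2), ("p0", 3), ("p1", 1)]

# Snapshot taken after the first three commands.
snapshot = replay(log[:3])

# On startup: load the snapshot, then replay only what came after it.
recovered = replay(log[3:], snapshot)
```

The snapshot already records that "p0" lives on broker 3, so the three earlier moves never need to be replayed.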

Delete records from a topic

Redpanda enables you to delete data from the beginning of a partition up to a specific offset. The offset represents the true creation time of the event, not the time when it was stored by Redpanda. Deleting records frees up space on local disk and in object storage, which is especially helpful if your producers are pushing more data than you anticipated when sizing your storage infrastructure, or if you want to implement a data retention policy aligned with a particular business event rather than with age or size. There are different ways to delete records from a topic, including using the rpk topic trim command or using the DeleteRecords Kafka API with Kafka clients.
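
The effect of deleting records up to an offset can be modeled as advancing the partition's start offset: every record below the requested offset becomes unreadable, and later offsets keep their numbers. The `PartitionLog` class below is a hypothetical in-memory model written for this example, not a client for the DeleteRecords API.

```python
from typing import List


class PartitionLog:
    """Toy partition: records addressed by monotonically increasing offsets."""

    def __init__(self, records: List[str]) -> None:
        self.start_offset = 0
        self._records = list(records)

    @property
    def high_watermark(self) -> int:
        return self.start_offset + len(self._records)

    def delete_records_before(self, offset: int) -> int:
        """Drop all records with an offset lower than the given offset."""
        if not self.start_offset <= offset <= self.high_watermark:
            raise ValueError("offset out of range")
        del self._records[: offset - self.start_offset]
        self.start_offset = offset
        return self.start_offset

    def read(self, offset: int) -> str:
        if not self.start_offset <= offset < self.high_watermark:
            raise IndexError("offset not available")
        return self._records[offset - self.start_offset]


log = PartitionLog(["a", "b", "c", "d", "e"])
new_start = log.delete_records_before(3)  # removes offsets 0, 1, and 2
```

After the call, offset 3 still returns "d", while reading offsets 0 through 2 fails because those records are gone.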

Topic-aware leadership balancing

The new default leader_balancer_mode property ensures that each shard in a cluster is assigned an equal number of partition leaders and attempts to spread every topic’s partition leaders evenly across all brokers in a cluster.

Additionally, Redpanda has improved data balancing, ensuring that a topic’s partitions (not just leaders) are evenly distributed across a cluster. It allocates partitions to random healthy brokers, to avoid topic hotspots, without needing to wait for a batch of moves to finish before it schedules the next batch.

Local message retention

With this release, both size-based and time-based retention policies are applied simultaneously, so it’s possible for your size-based property to override your time-based property, or vice versa. For example, if your size-based property requires removing one segment, and your time-based property requires removing three segments, then three segments are removed. Size-based properties reclaim disk space as close as possible to the maximum size, without exceeding the limit.
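
The example above can be worked through in a few lines. In this simplified model, each policy reports how many of the oldest segments it would remove, and the larger of the two counts wins. The function names, segment sizes, and ages are made up for illustration, and real retention involves details (such as which segments are eligible for removal) that the sketch ignores.

```python
from typing import List


def remove_by_size(sizes: List[int], max_bytes: int) -> int:
    """Count oldest segments to drop until the total fits in max_bytes."""
    total = sum(sizes)
    removed = 0
    for size in sizes:  # sizes are ordered oldest first
        if total <= max_bytes:
            break
        total -= size
        removed += 1
    return removed


def remove_by_time(ages_hours: List[int], max_age_hours: int) -> int:
    """Count leading (oldest) segments older than max_age_hours."""
    removed = 0
    for age in ages_hours:  # ages are ordered oldest first
        if age <= max_age_hours:
            break
        removed += 1
    return removed


sizes = [100, 100, 100, 100, 100]  # bytes per segment, oldest first
ages = [50, 40, 30, 20, 10]        # hours since each segment was written

by_size = remove_by_size(sizes, max_bytes=400)
by_time = remove_by_time(ages, max_age_hours=24)
removed = max(by_size, by_time)    # both policies apply; the stricter wins
```

With these numbers, the size-based policy asks for one segment, the time-based policy asks for three, and three segments are removed.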

Client throughput quotas

You can now manage client throughput of Kafka ingress and egress traffic allowed through each node with a single configuration setting at the cluster level.

Kubernetes Operator

Redpanda offers a new approach to deploying Redpanda in Kubernetes using an operator. You can still choose to use Helm for its simplicity, or you can use the Redpanda Operator for a more GitOps-friendly declarative deployment process.

Redpanda Console enhancements

  • To secure Redpanda Console using TLS, you can either let Redpanda Console handle TLS termination or you can offload it to an upstream component, such as a reverse proxy or a cloud HTTPS load balancer. TLS termination is the process of decrypting incoming TLS-encrypted traffic.

  • SSO authentication for Azure AD and Keycloak.

  • A connectors Docker image is now available for integrating your Redpanda data with different data systems. You can use the Redpanda Console or Kafka Connect REST API to manage connectors.

New commands and properties

New commands and properties in this release include the following:

Documentation improvements

Although not specific to version 23.2, the following sections are new or have been significantly enhanced in the documentation:

Next steps