This topic includes new content added in version 23.2. For a complete list of all product updates, see the Redpanda release notes.
Follower fetching lets a consumer fetch records from the closest replica of a topic partition, regardless of whether it’s a leader or a follower. This can minimize cloud networking costs and consumer read latency for clusters deployed across different data centers and availability zones.
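As a rough illustration of the idea, the sketch below picks a replica for a consumer by matching a rack or availability zone label. It is a toy model, not Redpanda's selection logic: the `Replica` type, the `pick_replica` function, and the fallback to the leader when no replica shares the consumer's rack are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Replica:
    broker_id: int
    rack: str        # hypothetical rack / availability zone label
    is_leader: bool


def pick_replica(replicas: Sequence[Replica], client_rack: Optional[str]) -> Replica:
    """Prefer a replica in the consumer's rack; otherwise use the leader."""
    leader = next(r for r in replicas if r.is_leader)
    if client_rack is None or leader.rack == client_rack:
        return leader
    for replica in replicas:
        if replica.rack == client_rack:
            return replica  # a follower in the same rack as the consumer
    return leader           # assumed fallback: no replica in the consumer's rack


replicas = [
    Replica(broker_id=0, rack="zone-a", is_leader=True),
    Replica(broker_id=1, rack="zone-b", is_leader=False),
    Replica(broker_id=2, rack="zone-c", is_leader=False),
]
print(pick_replica(replicas, "zone-b").broker_id)
```

In this model a consumer in `zone-b` reads from broker 1 and never crosses a zone boundary, which is where the cost and latency savings come from.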
With this enterprise feature, records produced to a topic may use a serializer/deserializer client library, such as Confluent’s SerDes library, to encode their keys and values according to a schema in Schema Registry. Schema ID validation enables brokers to detect and drop records that don’t match the configured schema, as identified by the schema ID. Records associated with unregistered or incorrect schemas are thus detected and dropped earlier, by a broker rather than a downstream consumer.
As data in object storage grows, so does its metadata. To support efficient long-term data retention, Redpanda splits this metadata: it keeps metadata for only recently updated segments in memory or on local disk, and safely archives the remaining metadata in object storage, caching it locally on disk. Archived metadata is loaded only when historical data is accessed. This allows Tiered Storage to handle partitions of virtually any size or retention length.
Fine-grained caching: To support more concurrent consumers of historical data with less local storage, Redpanda can download small chunks of remote segments to the cache directory. For example, when a client fetch request spans a subsection of a 1 GiB segment, instead of downloading the entire 1 GiB segment, Redpanda can download 16 MiB chunks that contain just enough data required to fulfill the fetch request.
Automatic disk space management: Redpanda now divides disk storage into different categories to provide a flexible configuration of space: reserved disk space, cache storage, and log storage. When data usage begins to approach the target size of log storage, Redpanda does local storage housekeeping to bring usage back under the target size. This allows Redpanda to leverage available local storage safely and efficiently.
View space usage: You can now use rpk cluster logdirs describe to get details about Tiered Storage space usage in both object storage and local disk.
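The arithmetic behind the fine-grained caching example can be sketched as follows. The 16 MiB chunk size comes from the example above; the function name and the byte offsets of the fetch are hypothetical, and the real chunk selection inside Redpanda may differ.

```python
MIB = 1024 * 1024
CHUNK_SIZE = 16 * MIB  # chunk size used in the example above


def chunks_for_range(start_byte: int, end_byte: int, chunk_size: int = CHUNK_SIZE) -> range:
    """Return the indices of the chunks covering bytes [start_byte, end_byte)."""
    if start_byte < 0 or end_byte <= start_byte:
        raise ValueError("expected 0 <= start_byte < end_byte")
    first = start_byte // chunk_size
    last = (end_byte - 1) // chunk_size  # end_byte is exclusive
    return range(first, last + 1)


# A fetch that needs 20 MiB of data starting 40 MiB into a 1 GiB segment.
needed = chunks_for_range(40 * MIB, 60 * MIB)
downloaded_mib = len(needed) * CHUNK_SIZE // MIB
print(list(needed), downloaded_mib)  # chunks 2 and 3, 32 MiB in total
```

Here only 32 MiB is downloaded to the cache instead of the full 1 GiB segment.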
Redpanda now loads the controller log from a snapshot on startup. Controller snapshots save the current cluster metadata state to disk, which significantly improves startup times of nodes in long-running Redpanda clusters. For example, with a partition that has moved several times, a snapshot can restore the latest state without replaying every move command.
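The benefit of a snapshot can be seen in a toy model of the partition-move example. This is only a sketch: the command tuples and the `replay` function are invented for illustration and do not reflect Redpanda's controller log format.

```python
from typing import Dict, Iterable, Optional, Tuple

Command = Tuple[str, int]  # (partition, broker it was moved to), hypothetical format


def replay(commands: Iterable[Command], state: Optional[Dict[str, int]] = None) -> Dict[str, int]:
    """Apply move commands in order on top of an optional starting state."""
    result = dict(state or {})
    for partition, broker in commands:
        result[partition] = broker
    return result


log = [("orders/0", 1), ("orders/0", 2), ("orders/0", 3), ("payments/0", 2)]

# Without a snapshot, startup replays every command in the log.
full_state = replay(log)

# With a snapshot taken after the third command, startup loads the saved
# state and replays only the commands written after it.
snapshot = replay(log[:3])
restored = replay(log[3:], snapshot)

assert restored == full_state
print(restored)
```

The restored state is identical, but the work at startup is proportional to the commands written after the snapshot rather than to the full history.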
Redpanda enables you to delete data from the beginning of a partition up to a specific offset. The offset represents the true creation time of the event, not the time when it was stored by Redpanda. Deleting records frees up space in local disk and in object storage, which is especially helpful if your producers are pushing more data than you anticipated when sizing your storage infrastructure, or if you want to implement a data retention policy aligned with a particular business event and not based on age or size. There are different ways to delete records from a topic, including using the rpk topic trim-prefix command or using the DeleteRecords Kafka API with Kafka clients.
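The effect of trimming a partition prefix can be modeled with a small in-memory sketch. The `Partition` class is hypothetical and follows the semantics of the Kafka DeleteRecords API, where records with offsets lower than the requested offset are removed and that offset becomes the new start of the partition. It does not call Redpanda or any Kafka client.

```python
from typing import List


class Partition:
    """Toy partition: records occupy offsets start_offset .. high_watermark - 1."""

    def __init__(self, records: List[str]) -> None:
        self.start_offset = 0
        self.records = list(records)

    @property
    def high_watermark(self) -> int:
        return self.start_offset + len(self.records)

    def trim_prefix(self, offset: int) -> int:
        """Delete all records with an offset lower than `offset`."""
        if not self.start_offset <= offset <= self.high_watermark:
            raise ValueError("offset is outside the partition's current range")
        self.records = self.records[offset - self.start_offset:]
        self.start_offset = offset
        return self.start_offset


partition = Partition(["a", "b", "c", "d", "e"])
partition.trim_prefix(3)
print(partition.start_offset, partition.records)
```

After the trim, consumers can no longer read offsets 0 through 2, and the space those records used can be reclaimed.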
The new default leader_balancer_mode property ensures that each shard in a cluster is assigned an equal number of partition leaders and attempts to spread every topic’s partition leaders evenly across all brokers in a cluster.
Additionally, Redpanda has improved data balancing, ensuring that a topic’s partitions (not just leaders) are evenly distributed across a cluster. It allocates partitions to random healthy brokers, to avoid topic hotspots, without needing to wait for a batch of moves to finish before it schedules the next batch.
With this release, both size-based and time-based retention policies are applied simultaneously, so it’s possible for your size-based property to override your time-based property, or vice versa. For example, if your size-based property requires removing one segment, and your time-based property requires removing three segments, then three segments are removed. Size-based properties reclaim disk space as close as possible to the maximum size, without exceeding the limit.
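The interaction between the two policies can be sketched as taking the larger of the two removal counts. This is a simplified model under stated assumptions: the `Segment` fields, the `segments_to_remove` function, and the rule that a segment expires when its newest record is older than the time limit are illustrative choices, not Redpanda's implementation.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Segment:
    size_bytes: int
    max_timestamp_ms: int  # timestamp of the newest record in the segment


def segments_to_remove(segments: Sequence[Segment], now_ms: int,
                       retention_bytes: int, retention_ms: int) -> int:
    """Count the oldest segments to remove; `segments` is ordered oldest first."""
    # Time-based policy: leading segments whose newest record is too old.
    by_time = 0
    for seg in segments:
        if now_ms - seg.max_timestamp_ms <= retention_ms:
            break
        by_time += 1

    # Size-based policy: drop oldest segments until the total fits the limit.
    total = sum(seg.size_bytes for seg in segments)
    by_size = 0
    for seg in segments:
        if total <= retention_bytes:
            break
        total -= seg.size_bytes
        by_size += 1

    # Both policies apply at once, so the more aggressive one wins.
    return max(by_time, by_size)


segments = [Segment(100, 0), Segment(100, 10), Segment(100, 20), Segment(100, 1000)]
print(segments_to_remove(segments, now_ms=1100, retention_bytes=300, retention_ms=500))
```

With these inputs the size-based policy asks for one segment and the time-based policy asks for three, so three segments are removed, matching the example above.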
You can now manage client throughput of Kafka ingress and egress traffic allowed through each node with a single configuration setting at the cluster level.
Redpanda offers a new approach to deploying Redpanda in Kubernetes using an operator. You can still choose to use Helm for its simplicity, or you can use the Redpanda Operator for a more GitOps-friendly declarative deployment process.
To secure Redpanda Console using TLS, you can either let Redpanda Console handle TLS termination or you can offload it to an upstream component, such as a reverse proxy or a cloud HTTPS load balancer. TLS termination is the process of decrypting incoming TLS-encrypted traffic.
A connectors Docker image is now available for integrating your Redpanda data with different data systems. You can use the Redpanda Console or Kafka Connect REST API to manage connectors.
New commands and properties in this release include the following:
rpk profile: This allows users to switch between clusters seamlessly.
rpk cloud login: This allows users to connect to cloud environments using a browser.
rpk topic trim-prefix: This allows users to delete records from a topic.
rpk topic describe-storage: This provides detailed information about storage, including local and object disk usage and time of last write to object storage.
legacy_unsafe_log_warning_interval_sec: These properties enable a Redpanda cluster operator to use unsafe control characters within strings, such as consumer group names or user names.