Glossary

Redpanda Cloud

BYOC

Bring Your Own Cloud (BYOC) is a fully-managed Redpanda Cloud deployment where clusters run in your private cloud, so all data is contained in your own environment. Redpanda handles provisioning, operations, and maintenance.

connector

Enables Redpanda to integrate with external systems, such as databases.

control plane

This part of Redpanda Cloud enforces rules in the data plane, including cluster management, operations, and maintenance.

data plane

This part of Redpanda Cloud contains Redpanda clusters and other components, such as Redpanda Console, Redpanda Operator, and rpk. It is managed by an agent that receives cluster specifications from the control plane. Sometimes used interchangeably with clusters.

Dedicated Cloud

A fully-managed Redpanda Cloud deployment option where you host your data in Redpanda’s VPC, and Redpanda handles provisioning, operations, and maintenance. Dedicated clusters are single-tenant deployments that support private networking (for example, VPC peering to talk over private IPs) for better data isolation.

Redpanda Cloud

A fully-managed data streaming service deployed with Redpanda Console. It includes automated upgrades and patching, backup and recovery, data and partition balancing, and built-in connectors. It is available in Serverless Standard, Serverless Pro, Dedicated, and Bring Your Own Cloud (BYOC) deployment options to suit different data sovereignty and infrastructure requirements.

Redpanda Console

The web-based UI for managing and monitoring Redpanda clusters and streaming workloads. You can also set up and manage connectors in Redpanda Console. Redpanda Console is an integral part of Redpanda Cloud, but it can also be used as a standalone program as part of a Redpanda Self-Managed deployment.

resource group

A container for Redpanda Cloud resources, including clusters and networks. You can rename your default resource group, and you can create more resource groups. For example, you may want different resource groups for production and testing.

Serverless

Serverless is the fastest and easiest way to start data streaming. You host your data in Redpanda’s VPC, and Redpanda handles automatic scaling, provisioning, operations, and maintenance. Serverless Pro is the enterprise version of Serverless Standard. It includes higher usage limits and dedicated support.

sink connector

Exports data from a Redpanda cluster into a target system.

source connector

Imports data from a source system into a Redpanda cluster.

Redpanda core

availability zone (AZ)

One or more data centers served by high-bandwidth links with low latency, typically within a close distance of one another.

broker

An instance of Redpanda that stores and manages event streams. Multiple brokers join together to form a Redpanda cluster.

Sometimes used interchangeably with node, but a node is typically a physical or virtual server.

See also: node

client

A producer application that writes events to Redpanda, or a consumer application that reads events from Redpanda.

This could also be a client library, like librdkafka or franz-go.

cluster

One or more brokers that work together to manage real-time data streaming, processing, and storage.

consumer group

A set of consumers that cooperate to read data for better scalability. As group members arrive and leave, partitions are re-assigned so each member receives a proportional share.
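
As a rough illustration of how partitions are shared when group membership changes, the following sketch spreads partitions over members round-robin. The function name assign_partitions and the member names are hypothetical. Real clients choose their own assignment strategy, so treat this as a simplified model rather than Redpanda's actual behavior.

```python
def assign_partitions(partitions, members):
    """Spread partitions over group members round-robin (illustrative only)."""
    ordered = sorted(members)
    assignment = {member: [] for member in ordered}
    if not ordered:
        return assignment
    for index, partition in enumerate(sorted(partitions)):
        assignment[ordered[index % len(ordered)]].append(partition)
    return assignment


# Two members share six partitions, three each.
before = assign_partitions(range(6), ["consumer-a", "consumer-b"])

# A third member joins, so partitions are re-assigned, two each.
after = assign_partitions(range(6), ["consumer-a", "consumer-b", "consumer-c"])
```

In this model each member's share shrinks from three partitions to two when the third member joins.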

consumer offset

The position of a consumer in a specific topic partition, which tracks the records the consumer has read. A consumer offset of 3 means the consumer has read messages 0-2 and will next read message 3.
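
The offset arithmetic can be sketched with a plain list standing in for one partition. The poll helper and the sample records are assumptions made for illustration, not part of any client library.

```python
# Records at offsets 0..4 in a single partition (sample data).
log = ["m0", "m1", "m2", "m3", "m4"]


def poll(log, consumer_offset, max_records=1):
    """Return the next records and the advanced consumer offset."""
    records = log[consumer_offset:consumer_offset + max_records]
    return records, consumer_offset + len(records)


# A consumer offset of 3 means offsets 0-2 were already read.
records, next_offset = poll(log, consumer_offset=3)
```

Here the consumer receives the record at offset 3, and its consumer offset advances to 4.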

consumer

A client application that subscribes to Redpanda topics to asynchronously read events.

controller broker

A broker that manages operational metadata for a Redpanda cluster and ensures replicas are distributed among brokers. At any given time, one active controller exists in a cluster. If the controller fails, another broker is automatically elected as the controller.

data sovereignty

Containing all your data in your environment.

With BYOC, Redpanda handles provisioning, monitoring, and upgrades, but you manage your streaming data without Redpanda’s control plane ever seeing it. Additionally, with a customer-managed VPC, the Redpanda Cloud agent doesn’t create any new resources or alter any settings in your account.

data stream

A continuous flow of events in real time that are produced and consumed by client applications. Redpanda is a data streaming platform. Also known as event stream.

event

A record of something changing state at a specific time. Events can be generated by various sources, including sensors, applications, and devices. Producers write events to Redpanda, and consumers read events from Redpanda.

Kafka API

Producers and consumers interact with Redpanda using the Kafka API. It uses the default port 9092.

learner

A broker that is a follower in a Raft group but is not part of quorum.

In a Raft group, a broker can be in learner status. Learners are followers that cannot vote and so do not count towards quorum (the majority). They cannot be elected leader, nor can they trigger leader elections. Brokers can be promoted or demoted between learner and voter. New Raft group members start as learners.
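
A minimal sketch of the quorum rule, assuming a Raft group is modeled as a mapping from broker ID to role. The names quorum_size and has_quorum are hypothetical helpers, not Redpanda APIs.

```python
def quorum_size(voter_count):
    """Smallest majority of the voting members."""
    return voter_count // 2 + 1


def has_quorum(members, acks):
    """members maps broker ID -> 'voter' or 'learner'; acks is a set of broker IDs."""
    voters = {broker for broker, role in members.items() if role == "voter"}
    # Acknowledgements from learners are ignored because learners cannot vote.
    return len(voters & set(acks)) >= quorum_size(len(voters))


members = {0: "voter", 1: "voter", 2: "voter", 3: "learner"}

one_voter_and_learner = has_quorum(members, {0, 3})
two_voters = has_quorum(members, {0, 1})
```

With three voters the majority is two, so an acknowledgement from one voter plus the learner is not enough, while acknowledgements from two voters are.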

For more information, see Raft Group Reconfiguration.

listener

Configuration on a broker that defines how it should accept client or inter-broker connections. Each listener is associated with a specific protocol, hostname, and port combination. The listener defines where the broker should listen for incoming connections.

For more information, see Configure Listeners.

message

One or more records representing individual events being transmitted. Redpanda transfers messages between producers and consumers.

Sometimes used interchangeably with record.

node

A machine, which could be a server, a virtual machine (instance), or a Docker container. Every node has its own disk. Partitions are stored locally on nodes. In Kubernetes, a Node is the machine that Redpanda runs on. Outside the context of Kubernetes, this term may be used interchangeably with broker, as in node_id.

See also: broker

offset commit

An acknowledgement that the event has been read.

offset

A unique integer assigned to each record to show its location in the partition.

pandaproxy

Original name for the subsystem of Redpanda that allows access to your data through a REST API. This name still appears in the HTTP Proxy API and the Schema Registry API.

partition leader

Every Redpanda partition forms a Raft group with a single elected leader. This leader handles all writes, and it replicates data to followers to ensure that a majority of brokers store the data.

partition

A subset of events in a topic, like a log file. It is an ordered, immutable sequence of records. Partitions allow you to distribute a stream, which lets producers write messages in parallel and consumers read messages in parallel. Partitions are made up of segment files on disk.

producer

A client application that writes events to Redpanda. Redpanda stores these events in sequence and organizes them into topics.

For more information, see Configure Producers.

rack

A failure zone that has one or more Redpanda brokers assigned to it.

Raft

The consensus algorithm Redpanda uses to coordinate writing data to log files and replicating that data across brokers.

For more details, see https://raft.github.io/

record

A self-contained data entity with a defined structure, representing a single event.

Sometimes used interchangeably with message.

replicas

Copies of partitions that are distributed across different brokers, so if one broker goes down, there is a copy of the data.

retention

The mechanism for determining how long Redpanda stores data on local disk or in object storage before purging it.

For more information, see Manage Disk Space.

replication factor

The number of copies of each partition in a cluster. This is set to 3 in Redpanda Cloud deployments and 1 (no replication) in Self-Managed deployments. A replication factor of at least 3 ensures that each partition has a copy of its data on at least two other brokers. One replica acts as the leader, and the other replicas are followers.

schema

An external mechanism to describe the structure of data and its encoding. Schemas validate the structure and ensure that producers and consumers can connect with data in the same format.

Seastar

An open-source thread-per-core C++ framework, which binds all work to physical cores. Redpanda is built on Seastar.

For more details, see https://seastar.io/

seed server

The initial set of brokers that a Redpanda broker contacts to join the cluster. Seed servers play a crucial role in cluster formation and recovery, acting as a point of reference for new or restarting brokers to understand the current topology of the cluster.

segment

Discrete part of a partition, used to break down a continuous stream into manageable chunks. You can set the maximum duration (segment.ms) or size (segment.bytes) for a segment to be open for writes.
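
The effect of the two limits can be sketched as follows. This is a simplified model, not Redpanda's storage code: split_into_segments is a hypothetical helper, its segment_bytes and segment_ms parameters mirror the property names above, and it uses record timestamps as a stand-in for the time a segment has been open.

```python
def split_into_segments(records, segment_bytes, segment_ms):
    """Group (timestamp_ms, size_bytes) records, in append order, into segments."""
    segments = []
    current, current_size, opened_at = [], 0, None
    for timestamp_ms, size in records:
        too_big = current_size + size > segment_bytes
        too_old = opened_at is not None and timestamp_ms - opened_at >= segment_ms
        if current and (too_big or too_old):
            # Close the open segment and start a new one.
            segments.append(current)
            current, current_size, opened_at = [], 0, None
        if opened_at is None:
            opened_at = timestamp_ms
        current.append((timestamp_ms, size))
        current_size += size
    if current:
        segments.append(current)
    return segments


records = [(0, 400), (10, 400), (20, 400), (5000, 100), (5010, 100)]
segments = split_into_segments(records, segment_bytes=1000, segment_ms=1000)
```

In this sample the first roll is triggered by size (the third record would exceed 1000 bytes) and the second by age (the fourth record arrives more than 1000 ms after its segment opened).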

serialization

The process of converting a record into a format that can be stored. Deserialization is the process of converting a record back to the original state. Redpanda Schema Registry supports Avro and Protobuf serialization formats.
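
The round trip can be sketched with Python's standard json module. JSON is used here only as a stand-in to keep the example self-contained; it is not one of the Avro or Protobuf formats named above, and the serialize and deserialize helpers are hypothetical.

```python
import json


def serialize(record):
    """Convert a record (a dict) into bytes that can be stored."""
    return json.dumps(record, sort_keys=True).encode("utf-8")


def deserialize(payload):
    """Convert stored bytes back into the original record."""
    return json.loads(payload.decode("utf-8"))


event = {"sensor": "s-1", "temperature": 21.5}
payload = serialize(event)
restored = deserialize(payload)
```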

shard

A CPU core.

subject

A logical grouping or category for schemas. When data formats are updated, a new version of the schema can be registered under the same subject, allowing for backward and forward compatibility.

thread-per-core

Programming model that allows Redpanda to pin each of its application threads to a CPU core to avoid context switching and blocking.

topic partition

A topic may be partitioned across multiple brokers. A "topic partition" represents this logical separation in Redpanda, which is managed natively by Raft.

topic

A logical stream of related events that are written to the same log. It can be divided into multiple partitions. A topic can have various clients writing events to it and reading events from it.

Redpanda features

Admin API

A REST API used to manage and monitor Redpanda clusters. It uses the default port 9644.

For more information, see Admin API.

Note: The Redpanda Admin API is different from the Kafka Admin API.

compaction

Feature that retains the latest value for each key within a partition while discarding older values.
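
The end result of compaction can be sketched on an in-memory list of records. This models only the outcome (latest value per key, original offsets preserved), not how or when Redpanda compacts segments on disk. The compact function and sample data are assumptions for illustration.

```python
def compact(records):
    """Keep the latest (offset, key, value) per key, in offset order."""
    latest = {}
    for offset, key, value in records:
        # A later record with the same key replaces the earlier one.
        latest[key] = (offset, key, value)
    return sorted(latest.values(), key=lambda record: record[0])


records = [
    (0, "a", "1"),
    (1, "b", "1"),
    (2, "a", "2"),
    (3, "c", "1"),
    (4, "b", "2"),
]
compacted = compact(records)
```

The older values for keys "a" and "b" at offsets 0 and 1 are discarded, and the surviving records keep their original offsets.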

For more information, see Compaction Settings.

controller snapshot

Snapshot of the current cluster metadata state saved to disk, so broker startup is fast.

data transforms

Framework to manipulate or enrich data written to Redpanda topics. You can develop custom data functions, which run asynchronously using a WebAssembly (Wasm) engine inside a Redpanda broker.

For more information, see Data Transforms.

HTTP Proxy

Redpanda HTTP Proxy (pandaproxy) allows access to your data through a REST API. It is built into the Redpanda binary and uses the default port 8082.

maintenance mode

A state where a Redpanda broker temporarily doesn’t take any partition leaderships. It continues to store data as a follower. This is usually done for system maintenance or a rolling upgrade.

For more information, see Maintenance Mode.

rack awareness

Feature that lets you distribute replicas of the same partition across different racks to minimize data loss and improve fault tolerance in the event of a rack failure.
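
One way to picture rack-aware placement is to visit racks in turn when choosing brokers for a partition. The sketch below is a hypothetical model, not Redpanda's partition allocator. The place_replicas function, rack names, and broker IDs are all assumptions.

```python
def place_replicas(brokers_by_rack, replication_factor):
    """Pick (rack, broker) pairs for one partition, spreading across racks first."""
    racks = sorted(brokers_by_rack)
    pools = {rack: list(brokers_by_rack[rack]) for rack in racks}
    chosen = []
    # Take one broker per rack per pass until enough replicas are placed
    # or no brokers remain.
    while len(chosen) < replication_factor and any(pools.values()):
        for rack in racks:
            if pools[rack] and len(chosen) < replication_factor:
                chosen.append((rack, pools[rack].pop(0)))
    return chosen


three_racks = place_replicas({"rack-a": [0, 1], "rack-b": [2, 3], "rack-c": [4]}, 3)
two_racks = place_replicas({"rack-a": [0, 1], "rack-b": [2]}, 3)
```

With three racks, each replica lands in a different rack, so losing one rack leaves two replicas. With only two racks, one rack must hold two replicas, and losing that rack leaves a single replica.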

For more information, see Enable Rack Awareness.

rebalancing

Process of moving partition replicas and transferring partition leadership for improved performance.

Redpanda provides various topic-aware tools to balance clusters for best performance.

  • Leadership balancing changes where data is written to first, but it does not involve any data transfer. The partition leader regularly sends heartbeats to its followers. If a follower does not receive a heartbeat within a timeout, it triggers a new leader election. Redpanda also provides leadership balancing when brokers are added or decommissioned.

  • Partition replica balancing moves partition replicas to alleviate disk pressure and to honor the configured replication factor across brokers and the additional redundancy across failure domains (such as racks). Redpanda provides partition replica rebalancing when brokers are added or decommissioned.

  • With an Enterprise license, you can additionally enable Continuous Data Balancing to continuously monitor broker and rack availability and disk usage.

For more information, see Cluster Balancing.

rolling upgrade

The process of upgrading each broker in a Redpanda cluster, one at a time, to minimize disruption and ensure continuous availability.

For more information, see Upgrade Redpanda.

rpk

Redpanda’s command-line interface tool for managing Redpanda clusters.

Remote Read Replica

A read-only topic that mirrors a topic on a different cluster, using data from Tiered Storage.

Schema Registry

Redpanda Schema Registry (pandaproxy) is the interface for storing and managing event schemas. Producers and consumers register and retrieve schemas they use from the registry. It is built into the Redpanda binary and uses the default port 8081.

For more information, see Schema Registry.

Tiered Storage

Feature that lets you offload log segments to object storage in near real-time, providing long-term data retention and topic recovery.

For more information, see Use Tiered Storage.

Redpanda in Kubernetes

cert-manager

A Kubernetes controller that simplifies the process of obtaining, renewing, and using certificates.

For more details, see https://cert-manager.io/docs/

Redpanda Helm chart

Generates and applies all the manifest files you need for deploying Redpanda in Kubernetes.

For more information, see Redpanda in Kubernetes.

Redpanda Operator

Extends Kubernetes with custom resource definitions (CRDs), which allow Redpanda clusters to be treated as native Kubernetes resources.

For more information, see Redpanda in Kubernetes.

Redpanda licenses

Redpanda Community Edition

Redpanda software that is available under the Redpanda Business Source License (BSL). These core features are free and source-available.

Redpanda Enterprise Edition

Redpanda software that is available under the Redpanda Community License (RCL). It includes the free features licensed with the Redpanda Community Edition, as well as enterprise features, such as Tiered Storage, Remote Read Replicas, and Continuous Data Balancing.

Self-Managed

Redpanda Self-Managed refers to the product offering that includes both the Enterprise Edition and the Community Edition of Redpanda.

Sometimes used interchangeably with self-hosted.

Redpanda security

access control list (ACL)

A security feature used to define and enforce granular permissions to resources, ensuring only authorized users or applications can perform specific operations. ACLs act on principals.

For more information, see Redpanda Authorization Mechanisms.

advertised listener

The address a Redpanda broker broadcasts to producers, consumers, and other brokers. It specifies the hostname and port for connections to different listeners. Clients and other brokers use advertised listeners to connect to services such as the Admin API, Kafka API, and HTTP Proxy API. The advertised address might differ from the listener address in scenarios where brokers are behind a NAT, in a Docker container, or in Kubernetes. Advertised addresses ensure clients can reach the Redpanda brokers even in complex network setups.

authentication

The process of verifying the identity of a principal, user, or service account.

For more information, see Configure Authentication.

authorization

The process of specifying access rights to resources. Access rights are enforced through roles or access control lists (ACLs).

For more information, see Redpanda Authorization Mechanisms.

bearer token

An access token used for authentication and authorization in web applications and APIs. It holds user credentials, usually in the form of random strings of characters.

For more information, see Configure Authentication.

principal

An entity (such as a user account or a service account) that accesses resources. Principals can be authenticated and granted permissions based on roles to perform operations.

service account

An identity independent of the user who created it that can be used to authenticate and perform operations. This is especially useful for authentication of machines.