# Redpanda Schema Registry

> For the complete documentation index, see [llms.txt](https://docs.redpanda.com/llms.txt). Component-specific: [cloud-data-platform-full.txt](https://docs.redpanda.com/cloud-data-platform-full.txt)

---
title: Redpanda Schema Registry
latest-operator-version: v26.1.4
latest-console-tag: v3.7.3
latest-connect-version: 4.93.0
latest-redpanda-tag: v26.1.9
docname: schema-reg/schema-reg-overview
page-component-name: cloud-data-platform
page-version: master
page-component-version: master
page-component-title: Cloud
page-relative-src-path: schema-reg/schema-reg-overview.adoc
page-edit-url: https://github.com/redpanda-data/cloud-docs/edit/main/modules/manage/pages/schema-reg/schema-reg-overview.adoc
description: Redpanda's Schema Registry provides the interface to store and manage event schemas.
page-git-created-date: "2024-07-25"
page-git-modified-date: "2026-05-26"
---

<!-- Source: https://docs.redpanda.com/cloud-data-platform/manage/schema-reg/schema-reg-overview.md -->

In Redpanda, the messages exchanged between producers and consumers contain raw bytes. Schemas enable producers and consumers to share the information needed to serialize and deserialize those messages. Producers and consumers register and retrieve the schemas they use in the Schema Registry, so both sides can verify that data matches an agreed-upon structure.

Schemas are versioned, and the registry supports configurable compatibility modes between schema versions. When a producer or a consumer requests to register a schema change, the registry checks for schema compatibility and returns an error for an incompatible change. Compatibility modes can ensure that data flowing through a system is well-structured and easily evolves.

> ❗ **IMPORTANT**
>
> **Schema size best practice**: Schema Registry works best with schemas of 128KB in size or less. Large schemas can consume significant memory resources and may cause system instability or crashes, particularly in memory-constrained environments. For Protobuf and Avro schemas, Redpanda recommends using schema [references](https://docs.redpanda.com/cloud-data-platform/manage/schema-reg/schema-reg-api/#reference-a-schema) to break up large schemas into smaller constituent parts.

> 📝 **NOTE**
>
> The Schema Registry is built directly into the Redpanda binary. It runs out of the box with Redpanda’s default configuration, and it requires no new binaries to install and no new services to deploy or maintain. You can use it with the [Schema Registry API](https://docs.redpanda.com/cloud-data-platform/manage/schema-reg/schema-reg-api/) or [Redpanda Cloud](https://docs.redpanda.com/cloud-data-platform/manage/schema-reg/schema-reg-ui/).

## [](#schema-terminology)Schema terminology

**Schema**: A schema is an external mechanism to describe the structure of data and its encoding. Producer clients and consumer clients use a schema as an agreed-upon format for sending and receiving messages. Schemas enable a loosely coupled, data-centric architecture that minimizes dependencies in code, between teams, and between producers and consumers.

**Subject**: A subject is a logical grouping for schemas. When data formats are updated, a new version of the schema can be registered under the same subject, allowing for backward and forward compatibility. A subject may have more than one schema version assigned to it, with each schema having a different numeric ID.

**Serialization format**: A serialization format defines how data is converted into bytes that are transmitted and stored. Serialization, by producers, converts an event into bytes. Redpanda then stores these bytes in topics. Deserialization, by consumers, converts the arrays of bytes back into the desired data format. Redpanda’s Schema Registry supports Avro, Protobuf, and JSON serialization formats.

**Normalization**: Normalization is the process of converting a schema into a canonical form. When a schema is normalized, it can be compared and considered equivalent to another schema that may contain minor syntactic differences. Schema normalization allows you to more easily manage schema versions and compatibility by prioritizing meaningful logical changes. Normalization is supported for Avro, JSON, and Protobuf formats during both schema registration and lookup for a subject.
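As a rough illustration of the idea only, the following Python sketch reduces a JSON-encoded schema to a canonical string so that two syntactically different but logically identical documents compare equal. It is a simplification and an assumption for illustration: the actual normalization rules applied by the Schema Registry are format-specific and are not reproduced here. The function name `canonical_form` is hypothetical.

```python
import json


def canonical_form(schema_text: str) -> str:
    """Return a whitespace-free, key-sorted rendering of a JSON document.

    Illustrative only: real schema normalization follows format-specific rules.
    """
    parsed = json.loads(schema_text)
    return json.dumps(parsed, sort_keys=True, separators=(",", ":"))


# Two spellings of the same schema that differ only in whitespace and key order
a = '{"type": "record", "name": "Product", "fields": []}'
b = '{ "name":"Product","fields":[],"type":"record" }'

same = canonical_form(a) == canonical_form(b)  # True: only syntax differed
```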

## [](#redpanda-design-overview)Redpanda design overview

Every broker allows mutating REST calls, so there’s no need to configure leadership or failover strategies. Schemas are stored in a compacted topic, and the registry uses optimistic concurrency control at the topic level to detect and avoid collisions.

> ❗ **IMPORTANT**
>
> The Schema Registry publishes an internal topic, `_schemas`, as its backend store. This internal topic is reserved strictly for schema metadata and support purposes. **Do not directly edit or manipulate the `_schemas` topic unless directed to do so by Redpanda Support.**

Redpanda Schema Registry uses the default port 8081.

## [](#wire-format)Wire format

With Schema Registry, producers and consumers can use a specific message format, called the wire format. The wire format facilitates a seamless transfer of data by ensuring that clients easily access the correct schema in the Schema Registry for a message.

The wire format is a sequence of bytes consisting of the following:

1.  The "magic byte," a single byte that always contains the value of 0.

2.  A four-byte integer containing the schema ID.

3.  The rest of the serialized message.


![Schema Registry wire format](https://docs.redpanda.com/cloud-data-platform/shared/_images/schema-registry-wire-format.png)
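The layout above can be sketched in a few lines of Python. This is a minimal sketch rather than any client library's implementation. It assumes the four-byte schema ID is encoded in big-endian (network) byte order, which the list above does not state, and the helper names `encode_wire_format` and `decode_wire_format` are hypothetical.

```python
import struct
from typing import Tuple

MAGIC_BYTE = 0   # first byte of every framed message
HEADER_SIZE = 5  # 1 magic byte + 4-byte schema ID


def encode_wire_format(schema_id: int, payload: bytes) -> bytes:
    """Prefix a serialized payload with the magic byte and the schema ID."""
    # ">bI": big-endian (assumed), one byte, then a 4-byte unsigned integer
    return struct.pack(">bI", MAGIC_BYTE, schema_id) + payload


def decode_wire_format(message: bytes) -> Tuple[int, bytes]:
    """Split a framed message into (schema_id, payload)."""
    if len(message) < HEADER_SIZE:
        raise ValueError("message is too short for the wire format")
    magic, schema_id = struct.unpack(">bI", message[:HEADER_SIZE])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, message[HEADER_SIZE:]


framed = encode_wire_format(1, b"serialized-bytes")
schema_id, payload = decode_wire_format(framed)
```

Here `framed` starts with the five header bytes `00 00 00 00 01`, followed by the unchanged payload.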

In the serialization process, the producer hands over the message to a key/value serializer that is part of the respective language-specific SDK. The serializer first checks whether the schema ID for the given subject exists in the local schema cache. The serializer derives the subject name based on several strategies, such as the topic name. You can also explicitly set the subject name.

If the schema ID isn’t in the cache, the serializer registers the schema in the Schema Registry and collects the resulting schema ID in the response.

In either case, once the serializer has the schema ID, it prefixes the message with the magic byte and the encoded schema ID, and returns the byte sequence to the producer to write to the topic.
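The serializer's cache-then-register flow might look like the following sketch. It is not the code of any particular SDK. The class `CachingSerializer`, the `register_schema` callback (a stand-in for a real registry client), the `<topic>-key` / `<topic>-value` subject naming, and the big-endian schema ID encoding are all assumptions made for illustration.

```python
import struct
from typing import Callable, Dict, Tuple


class CachingSerializer:
    """Frames payloads with a schema ID, registering a schema only on a cache miss."""

    def __init__(self, register_schema: Callable[[str, str], int]) -> None:
        # register_schema(subject, schema_text) -> schema ID (hypothetical callback)
        self._register_schema = register_schema
        self._ids: Dict[Tuple[str, str], int] = {}  # (subject, schema) -> schema ID

    def serialize(self, topic: str, schema_text: str, payload: bytes,
                  is_key: bool = False) -> bytes:
        # Assumed topic-based subject naming: "<topic>-key" or "<topic>-value"
        subject = f"{topic}-{'key' if is_key else 'value'}"
        cache_key = (subject, schema_text)
        schema_id = self._ids.get(cache_key)
        if schema_id is None:
            # Cache miss: register the schema and remember the returned ID
            schema_id = self._register_schema(subject, schema_text)
            self._ids[cache_key] = schema_id
        # Magic byte 0, then the 4-byte schema ID (big-endian assumed), then the payload
        return struct.pack(">bI", 0, schema_id) + payload


registered = []  # records which subjects reached the fake registry


def fake_register(subject: str, schema_text: str) -> int:
    registered.append(subject)
    return 1


serializer = CachingSerializer(fake_register)
first = serializer.serialize("Orders", 'syntax = "proto3";', b"payload")
second = serializer.serialize("Orders", 'syntax = "proto3";', b"payload")
```

The second call is served from the local cache, so `fake_register` runs only once.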

In the deserialization process, the consumer fetches messages from the broker and hands them over to a deserializer. The deserializer first checks the presence of the magic byte and rejects the message if it doesn’t follow the wire format.

The deserializer then reads the schema ID and checks whether that schema exists in its local cache. If it finds the schema, it deserializes the message according to that schema. Otherwise, the deserializer retrieves the schema from the Schema Registry using the schema ID, caches it, and then proceeds with deserialization.
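A comparable sketch for the consumer side follows. It only resolves which schema applies and leaves the format-specific decoding out. The class `CachingDeserializer` and the `fetch_schema` callback (a stand-in for a registry lookup by schema ID) are hypothetical, and the big-endian schema ID encoding is an assumption.

```python
import struct
from typing import Callable, Dict, Tuple


class CachingDeserializer:
    """Resolves the schema for a framed message, fetching it only on a cache miss."""

    def __init__(self, fetch_schema: Callable[[int], str]) -> None:
        # fetch_schema(schema_id) -> schema text (hypothetical callback)
        self._fetch_schema = fetch_schema
        self._schemas: Dict[int, str] = {}  # schema ID -> schema text

    def resolve(self, message: bytes) -> Tuple[str, bytes]:
        # Reject anything that does not start with the magic byte 0
        if len(message) < 5 or message[0] != 0:
            raise ValueError("message does not follow the wire format")
        (schema_id,) = struct.unpack(">I", message[1:5])  # big-endian assumed
        schema = self._schemas.get(schema_id)
        if schema is None:
            # Cache miss: look the schema up by ID and keep it for later messages
            schema = self._fetch_schema(schema_id)
            self._schemas[schema_id] = schema
        # The caller decodes the payload according to the returned schema
        return schema, message[5:]


lookups = []  # records which schema IDs reached the fake registry


def fake_fetch(schema_id: int) -> str:
    lookups.append(schema_id)
    return 'syntax = "proto3";'


deserializer = CachingDeserializer(fake_fetch)
schema, payload = deserializer.resolve(b"\x00\x00\x00\x00\x01payload")
deserializer.resolve(b"\x00\x00\x00\x00\x01other")  # cache hit, no second lookup
```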

## [](#schema-examples)Schema examples

To experiment with schemas from applications, see the clients in [redpanda-labs](https://github.com/redpanda-data/redpanda-labs/tree/main).

For a basic end-to-end example, the following Protobuf schema contains information about products: a unique ID, name, price, and category. The schema has an ID of 1 and uses the Topic name strategy with a topic named Orders. (The Topic name strategy is suitable when you want to group schemas by the topics with which they are associated.)

```protobuf
syntax = "proto3";

message Product {
  int32 ProductID = 1;
  string ProductName = 2;
  double Price = 3;
  string Category = 4;
}
```

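A producer can then send a serialized `Product` message, for example:

```python
from kafka import KafkaProducer  # kafka-python client
from productpy import Product  # Product class generated from the schema (module name is illustrative)

# Create a Kafka producer (replace the placeholder with your broker addresses)
producer = KafkaProducer(bootstrap_servers='your_kafka_brokers')

# Create a Product message
product_message = Product(
    ProductID=123,
    ProductName="Example Product",
    Price=45.99,
    Category="Electronics"
)

# Produce the Product message to the "Orders" topic.
# Without configured serializers, kafka-python expects the key and value as bytes.
# SerializeToString() returns the raw Protobuf bytes without the wire format header.
producer.send('Orders', key=b'product_key', value=product_message.SerializeToString())

# Block until buffered messages have been sent
producer.flush()
```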

To add an additional field for product variants, like size or color, the new schema (version 2, ID 2) would look like this:

```protobuf
syntax = "proto3";

message Product {
  int32 ProductID = 1;
  string ProductName = 2;
  double Price = 3;
  string Category = 4;
  repeated string Variants = 5;
}
```

The compatibility setting should allow adding new fields without breakage. Adding an optional new field to a schema is inherently backward-compatible: consumers that use the new schema can still process events written with the old schema, and consumers that still use the old schema ignore the new field.

## [](#json-schema)JSON Schema

All CRUD operations are supported for JSON Schema (`json-schema`), and Redpanda supports [all published JSON Schema specifications](https://json-schema.org/specification), which include:

-   draft-04

-   draft-06

-   draft-07

-   2019-09

-   2020-12


### [](#limitations)Limitations

Schemas are held in subjects. Each subject has a compatibility configuration, either specified directly by a user or inherited from the default. See `PUT /config` and `PUT /config/{subject}` in the [Schema Registry API](https://docs.redpanda.com/api/doc/schema-registry/).
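As a minimal sketch, a `PUT /config/{subject}` request could be built with Python's standard library as follows. The base URL `http://localhost:8081` (the default port on an assumed local host), the subject name `Orders-value`, the `compatibility` body field, and the `application/vnd.schemaregistry.v1+json` content type are assumptions for illustration, and authentication is omitted. Check the API reference for the exact request shape in your environment. The request is only constructed here, not sent.

```python
import json
import urllib.parse
import urllib.request


def build_compatibility_request(base_url: str, subject: str,
                                level: str) -> urllib.request.Request:
    """Build (but do not send) a request that sets a subject's compatibility level."""
    body = json.dumps({"compatibility": level}).encode("utf-8")
    # Percent-encode the subject so special characters are safe in the URL path
    path = "/config/" + urllib.parse.quote(subject, safe="")
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
    )


request = build_compatibility_request("http://localhost:8081", "Orders-value", "BACKWARD")
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```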

If you insert a second schema into a subject where the compatibility level is anything but `NONE`, then any JSON Schema containing the following items is rejected:

-   `$ref`

-   `$defs` (`definitions` prior to draft 2019-09)

-   `dependentSchemas` / `dependentRequired` (`dependencies` prior to draft 2019-09)

-   `prefixItems`


Consequently, you cannot [structure a complex schema](https://json-schema.org/understanding-json-schema/structuring) using these features.
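To catch these keywords before registering a schema, a local pre-check along the following lines may help. This is a hypothetical helper, not Redpanda's validation logic. It is deliberately conservative: it flags any object key with one of the listed names, so a property that merely happens to be called `definitions` is also reported.

```python
import json
from typing import Any, Set

# Keywords listed above, including their pre-2019-09 spellings
RESTRICTED_KEYWORDS = {
    "$ref",
    "$defs", "definitions",
    "dependentSchemas", "dependentRequired", "dependencies",
    "prefixItems",
}


def find_restricted_keywords(node: Any) -> Set[str]:
    """Recursively collect restricted keywords used anywhere in a parsed schema."""
    found: Set[str] = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key in RESTRICTED_KEYWORDS:
                found.add(key)
            found |= find_restricted_keywords(value)
    elif isinstance(node, list):
        for item in node:
            found |= find_restricted_keywords(item)
    return found


schema = json.loads("""
{
  "type": "object",
  "properties": {"price": {"$ref": "#/$defs/money"}},
  "$defs": {"money": {"type": "number"}}
}
""")
flagged = find_restricted_keywords(schema)  # contains "$ref" and "$defs"
```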

## [](#metadata-properties)Metadata properties

Schema Registry lets you store and retrieve arbitrary key-value metadata properties alongside schemas. Properties such as `owner`, `team`, or `application.version` travel with the schema through its lifecycle. You can register a new schema with associated metadata properties by sending a `POST` request to `/subjects/{subject}/versions` with a `metadata.properties` object in the request body:

```json
{
  "schema": "{\"type\":\"string\"}",
  "metadata": {
    "properties": {
      "owner": "platform-team",
      "application.version": "2.1.0"
    }
  }
}
```
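Note that the `schema` field is itself a JSON document encoded as a string, which is easy to get wrong by hand. The following sketch builds the request body programmatically. The helper name `build_registration_body` is hypothetical, and sending the body to the registry is left out.

```python
import json
from typing import Dict


def build_registration_body(schema: dict, properties: Dict[str, str]) -> str:
    """Build a registration body with the schema embedded as an escaped JSON string."""
    return json.dumps({
        # The inner json.dumps turns the schema into a string, and the outer call
        # escapes its quotes when encoding the whole body.
        "schema": json.dumps(schema, separators=(",", ":")),
        "metadata": {"properties": properties},
    })


body = build_registration_body(
    {"type": "string"},
    {"owner": "platform-team", "application.version": "2.1.0"},
)
```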

To set metadata using rpk, use the [`--metadata-properties`](https://docs.redpanda.com/cloud-data-platform/reference/rpk/rpk-registry/rpk-registry-schema-create/) flag (shorthand: `-p`). The flag accepts `key=value` pairs or a JSON string (for example, `{"key":"value"}`), and you can pass it multiple times to set multiple properties:

```bash
# key=value pairs — pass the flag multiple times for multiple properties
rpk registry schema create my-subject --schema schema.avsc \
  --metadata-properties owner=platform-team \
  --metadata-properties env=prod

# JSON string — useful when values contain special characters
rpk registry schema create my-subject --schema schema.avsc \
  --metadata-properties '{"owner":"platform-team","application.version":"2.1.0"}'
```

Metadata properties are returned on `GET /subjects/{subject}/versions/{version}` and `GET /schemas/ids/{id}` responses. To view metadata on an existing schema, add `--print-metadata` to `rpk registry schema get`. You can also view metadata properties in Redpanda Cloud.

When you register a new schema version without a `metadata` field, the new version automatically inherits properties from the most recent version of that subject. To avoid inheriting the previous version’s metadata, you can send `"metadata": {}` to register a schema with explicitly no metadata. Registering the same schema definition with different metadata properties creates a new schema version.
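The inheritance rule described above can be modeled as a small function. This is a sketch of the documented behavior, not the registry's implementation: `effective_properties` is a hypothetical name, and cases the text does not describe (such as `"metadata": null`) are not modeled.

```python
from typing import Dict, Optional


def effective_properties(request_body: dict,
                         previous: Optional[Dict[str, str]]) -> Dict[str, str]:
    """Return the metadata properties the new schema version ends up with."""
    if "metadata" not in request_body:
        # No metadata field: inherit from the most recent version, if any
        return dict(previous or {})
    # An explicit metadata object wins, and {} means explicitly no metadata
    return dict(request_body["metadata"].get("properties", {}))


latest = {"owner": "platform-team"}

inherited = effective_properties({"schema": "..."}, latest)
cleared = effective_properties({"schema": "...", "metadata": {}}, latest)
replaced = effective_properties(
    {"schema": "...", "metadata": {"properties": {"owner": "data-team"}}}, latest
)
```

With these inputs, `inherited` equals the previous properties, `cleared` is empty, and `replaced` holds only the new `owner` value.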

> 📝 **NOTE**
>
> Redpanda supports only `metadata.properties` from the Confluent Data Contracts specification. The following configuration objects are not supported:
>
> -   `metadata.tags`
>
> -   `ruleSet`
>
> -   `defaultMetadata` and `overrideMetadata` (configuration options)
>
> -   `defaultRuleSet` and `overrideRuleSet` (configuration options)
>
> -   `compatibilityGroup` (configuration option)

## [](#next-steps)Next steps

-   [Use the Schema Registry API](https://docs.redpanda.com/cloud-data-platform/manage/schema-reg/schema-reg-api/)

-   [Schema Registry Contexts](https://docs.redpanda.com/cloud-data-platform/manage/schema-reg/schema-reg-contexts/)


## [](#suggested-reading)Suggested reading

-   [Schema Registry API](https://docs.redpanda.com/api/doc/schema-registry/)

-   [Deserialization](https://docs.redpanda.com/cloud-data-platform/manage/schema-reg/record-deserialization/)

-   [Monitor Schema Registry service-level metrics](https://docs.redpanda.com/cloud-data-platform/manage/monitor-cloud/#service-level-queries)