Manage Schemas with the Redpanda Operator

Use the Schema resource to declaratively create and manage schemas as part of a Redpanda deployment in Kubernetes. Each Schema resource maps to a schema in your Redpanda cluster, allowing you to define data structures, compatibility, and schema evolution in a declarative way.

Prerequisites

Ensure you have the following:

  • kubectl: The kubectl command-line tool, installed and configured to communicate with your cluster.

  • Redpanda cluster: Redpanda Operator version v2.3.0-24.3.1 or later, and a Redpanda resource that is deployed and accessible.

Create a schema

  1. Define a schema using the Schema resource. Here’s a basic example configuration that defines an Avro schema:

    schema.yaml
    apiVersion: cluster.redpanda.com/v1alpha2
    kind: Schema
    metadata:
      name: example-schema
      namespace: <namespace>
    spec:
      cluster:
        clusterRef:
          name: <cluster-name>
      schemaType: avro
      compatibilityLevel: Backward
      text: |
        {
          "type": "record",
          "name": "ExampleRecord",
          "fields": [
            { "type": "string", "name": "field1" },
            { "type": "int", "name": "field2" }
          ]
        }

    Replace the following placeholders:

    • <namespace>: The namespace in which to deploy the Schema resource. The Schema resource must be deployed in the same namespace as the Redpanda resource defined in clusterRef.name.

    • <cluster-name>: The name of the Redpanda resource that defines the Redpanda cluster to which you want to upload the schema.

  2. Apply the manifest:

    kubectl apply -f schema.yaml --namespace <namespace>

    When the manifest is applied, the schema will be created in your Redpanda cluster.

  3. Check the status of the Schema resource using the following command:

    kubectl get schema example-schema --namespace <namespace>

  4. Create an alias to simplify running rpk commands on your cluster:

    alias internal-rpk="kubectl --namespace <namespace> exec -i -t <pod-name> -c redpanda -- rpk"

    Replace <pod-name> with the name of a Pod that’s running Redpanda.

  5. Verify that the schema was created in Redpanda:

    internal-rpk registry subject list

    You should see example-schema in the output.

Schema examples

These examples demonstrate how to define schemas in Avro, Protobuf, and JSON Schema formats.

Create an Avro schema

avro-schema.yaml
# This manifest creates an Avro schema named "customer-profile" in the "basic" cluster.
# The schema defines a record with fields for customer ID, name, and age.
---
apiVersion: cluster.redpanda.com/v1alpha2
kind: Schema
metadata:
  name: customer-profile
spec:
  cluster:
    clusterRef:
      name: basic
  schemaType: avro
  compatibilityLevel: Backward
  text: |
    {
      "type": "record",
      "name": "CustomerProfile",
      "fields": [
        { "type": "string", "name": "customer_id" },
        { "type": "string", "name": "name" },
        { "type": "int", "name": "age" }
      ]
    }

Create a Protobuf schema

proto-schema.yaml
# This manifest creates a Protobuf schema named "product-catalog" in the "basic" cluster.
# The schema defines a message "Product" with fields for product ID, name, price, and category.
---
apiVersion: cluster.redpanda.com/v1alpha2
kind: Schema
metadata:
  name: product-catalog
spec:
  cluster:
    clusterRef:
      name: basic
  schemaType: protobuf
  compatibilityLevel: Backward
  text: |
    syntax = "proto3";

    message Product {
      int32 product_id = 1;
      string product_name = 2;
      double price = 3;
      string category = 4;
    }

Create a JSON schema

json-schema.yaml
# This manifest creates a JSON schema named "order-event" in the "basic" cluster.
# The schema requires an "order_id" (string) and a "total" (number) field, with no additional properties allowed.
---
apiVersion: cluster.redpanda.com/v1alpha2
kind: Schema
metadata:
  name: order-event
spec:
  cluster:
    clusterRef:
      name: basic
  schemaType: json
  compatibilityLevel: None
  text: |
    {
      "$schema": "http://json-schema.org/draft-07/schema#",
      "type": "object",
      "properties": {
        "order_id": { "type": "string" },
        "total": { "type": "number" }
      },
      "required": ["order_id", "total"],
      "additionalProperties": false
    }

Configuration

The Schema resource provides options to control schema behavior. This section covers schema types, compatibility modes, and schema references, and explains how to use each one to maintain data integrity, manage dependencies, and support schema evolution.

You can find all configuration options for the Schema resource in the CRD reference.

schema.yaml
apiVersion: cluster.redpanda.com/v1alpha2
kind: Schema
metadata:
  name: <subject-name> # (1)
  namespace: <namespace> # (2)
spec:
  cluster:
    clusterRef:
      name: <cluster-name> # (3)
  schemaType: avro # (4)
  compatibilityLevel: Backward # (5)
  references: [] # (6)
  text: | # (7)
    {
      "type": "record",
      "name": "test",
      "fields": [
        { "type": "string", "name": "field1" },
        { "type": "int", "name": "field2" }
      ]
    }
1 Subject name: The name of the subject for the schema. When data formats are updated, a new version of the schema can be registered under the same subject, enabling backward and forward compatibility.
2 Namespace: The namespace in which to deploy the Schema resource. The Schema resource must be deployed in the same namespace as the Redpanda resource defined in clusterRef.name.
3 Cluster name: The name of the Redpanda resource that defines the Redpanda cluster to which you want to upload the schema.
4 Schema type: Specifies the type of the schema. Options are avro (default), protobuf, or json. For JSON Schema, include "$schema" in the text to indicate the JSON Schema draft version. See Choose a schema type.
5 Compatibility level: Defines the compatibility level for the schema. Options are Backward (default), BackwardTransitive, Forward, ForwardTransitive, Full, FullTransitive, or None. See Choose a compatibility mode.
6 References: Any references you want to add to other schemas. If no references are needed, this can be an empty list (default). See Use schema references.
7 Schema body: The body of the schema, which defines the data structure.

Choose a schema type

Redpanda’s Schema Registry supports the following schema types:

  • Avro: A widely used serialization format in event-driven architectures.

  • Protobuf: Popular for defining data structures in gRPC APIs and efficient data serialization.

  • JSON Schema: Dynamic, schema-based validation for JSON documents.

If no type is specified, Redpanda defaults to Avro.
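As a minimal sketch of relying on that default, the manifest below omits schemaType and compatibilityLevel, so the defaults described on this page (avro and Backward) should apply. The resource name minimal-schema, the record name MinimalRecord, and the id field are hypothetical and used only for illustration.

```yaml
# Sketch only: relies on the documented defaults instead of setting them explicitly.
apiVersion: cluster.redpanda.com/v1alpha2
kind: Schema
metadata:
  name: minimal-schema        # hypothetical subject name
  namespace: <namespace>
spec:
  cluster:
    clusterRef:
      name: <cluster-name>
  # schemaType is omitted, so the schema is treated as Avro.
  # compatibilityLevel is omitted, so Backward applies.
  text: |
    {
      "type": "record",
      "name": "MinimalRecord",
      "fields": [
        { "type": "string", "name": "id" }
      ]
    }
```

Setting schemaType explicitly is still a reasonable choice when several schema formats live in the same repository, because the manifest then documents its own format.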

Choose a compatibility mode

Compatibility modes determine how schema versions within a subject can evolve without breaking existing data consumers. Redpanda supports the following compatibility levels:

  • None: Disables compatibility checks, allowing any schema change.

  • Backward: Consumers using the new schema (for example, version 10) can read data from producers using the previous schema (for example, version 9).

  • BackwardTransitive: Enforces backward compatibility across all versions, not just the latest.

  • Forward: Consumers using the previous schema (for example, version 9) can read data from producers using the new schema (for example, version 10).

  • ForwardTransitive: Ensures forward compatibility across all schema versions.

  • Full: Combines backward and forward compatibility, requiring that changes maintain compatibility in both directions. A new schema and the previous schema (for example, versions 10 and 9) are both backward and forward-compatible with each other.

  • FullTransitive: Enforces full compatibility across all schema versions.

For example, to set full compatibility, configure the Schema resource with:

apiVersion: cluster.redpanda.com/v1alpha2
kind: Schema
metadata:
  name: fully-compatible-schema
  namespace: redpanda
spec:
  cluster:
    clusterRef:
      name: basic
  schemaType: avro
  compatibilityLevel: Full
  text: |
    {
      "type": "record",
      "name": "ExampleRecord",
      "fields": [
        { "type": "string", "name": "field1" },
        { "type": "int", "name": "field2" }
      ]
    }

Compatibility settings are essential for maintaining data consistency, especially when updating schemas over time.
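To illustrate what a change that satisfies Full compatibility might look like, the following sketch evolves the example above by adding one field with a default value. The field name field3 and its empty-string default are assumptions made for illustration. In Avro, a reader using the new schema can fill in the default when it reads older records (backward compatible), and a reader using the previous schema ignores the extra field in newer records (forward compatible).

```yaml
# Sketch only: a second version of fully-compatible-schema.
# The added field "field3" and its default are hypothetical.
apiVersion: cluster.redpanda.com/v1alpha2
kind: Schema
metadata:
  name: fully-compatible-schema
  namespace: redpanda
spec:
  cluster:
    clusterRef:
      name: basic
  schemaType: avro
  compatibilityLevel: Full
  text: |
    {
      "type": "record",
      "name": "ExampleRecord",
      "fields": [
        { "type": "string", "name": "field1" },
        { "type": "int", "name": "field2" },
        { "type": "string", "name": "field3", "default": "" }
      ]
    }
```

By contrast, adding field3 without a default would break backward compatibility, because a reader using the new schema would have no value to use for records written before the field existed.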

Use schema references

For complex data structures, you can define schema references that allow one schema to reference another, enabling modular and reusable schema components. Schema references are helpful for shared data structures across topics like product information or user profiles, reducing redundancy.

This feature is supported for Avro and Protobuf schemas.

Define a schema reference using the references field. The reference includes the name, subject, and version of the referenced schema:

apiVersion: cluster.redpanda.com/v1alpha2
kind: Schema
metadata:
  name: order-schema
  namespace: redpanda
spec:
  cluster:
    clusterRef:
      name: basic
  references:
    - name: product-schema
      subject: product
      version: 1
  text: |
    {
      "type": "record",
      "name": "Order",
      "fields": [
        { "name": "product", "type": "Product" }
      ]
    }
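The reference above points at version 1 of a subject named product, so that subject presumably needs to be registered before order-schema is applied. A minimal sketch of such a schema follows. It assumes that the subject name is taken from metadata.name, as described in the Configuration section, and the fields of the Product record are hypothetical.

```yaml
# Sketch only: the schema that order-schema refers to.
# The Product fields are illustrative, not taken from a real deployment.
apiVersion: cluster.redpanda.com/v1alpha2
kind: Schema
metadata:
  name: product               # matches "subject: product" in the reference
  namespace: redpanda
spec:
  cluster:
    clusterRef:
      name: basic
  schemaType: avro
  text: |
    {
      "type": "record",
      "name": "Product",
      "fields": [
        { "name": "product_id", "type": "string" },
        { "name": "name", "type": "string" }
      ]
    }
```

Keeping shared records such as Product in their own Schema resource lets several schemas reference the same definition instead of each carrying a copy.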

Update a schema

To update a schema, modify the Schema resource and apply the changes:

kubectl apply -f <manifest-filename>.yaml --namespace <namespace>

Check schema version

Ensure the schema has been versioned by running:

kubectl get schema <subject-name> --namespace <namespace>

You can also retrieve registered schemas by schema ID. The --id flag takes a schema ID rather than a subject version number:

internal-rpk registry schema get --id 1
internal-rpk registry schema get --id 2

Delete a schema

To delete a schema, use the following command:

kubectl delete schema <subject-name> --namespace <namespace>

Verify that the schema was deleted by checking the Redpanda Schema Registry:

internal-rpk registry subject list

Suggested reading

For more details on using schemas in Redpanda, see: