Skip to main content
Version: 22.3

Protobuf Deserialization

If you have one or more topics that contain Protobuf serialized messages, you can configure Redpanda Console to deserialize the binary content into JSON, which makes the message human-readable. The resulting message can also be used in the programmable push filters like a JavaScript object.

Protobuf serialization is commonly used with Kafka clusters in two ways:

  • With schema registry (schema meta information gets embedded as part of the record's value)
  • To write Protobuf serialized content into Kafka topics without a binary or custom wrapper

This is a technical difference because most Kafka clients that serialize Protobuf messages put the serialized byte array into a binary wrapper that contains meta information such as the schema ID or the used prototypes. Therefore, the application that deserializes the Kafka records has to recognize the format. Redpanda Console supports both formats. The deserialization process requires Redpanda Console to be aware of the used .proto files as well as a mapping what prototype should be used for each topic. This information can either be sourced from the schema registry or by providing additional configuration so that the files can be pulled from the local filesystem or a GitHub repository.

info

To support imports, all prototypes are first registered in a proto registry, so that your imports can be resolved. You must also ensure that all imported prototypes are part of the repository. Standard types (such as Google's timestamp type) are included by default, so you don't need to add them.

Schema registry

Messages that have been serialized using Confluent's KafkaProtobufSerializer can only be deserialized if the schema registry is configured. Unlike other providers, the schema registry does not require you to set up mappings that define what topics use which prototypes. Instead, this information is inferred from the messages, and the schema registry finds the right prototype for deserialization.

The protobuf deserializer uses the same schema registry client that is configured under kafka.schemaRegistry. See Schema registry for information about how to set up the schema registry in Redpanda Console. A valid configuration for the Protobuf support with schema registry support looks like this:

kafka:
schemaRegistry:
enabled: true
urls: ["https://my-schema-registry.com"]
username: console
password: redacted # Or set using flags or env variable
protobuf:
enabled: true
schemaRegistry:
enabled: true # This tells the proto service to consider the schema registry when deserializing messages
refreshInterval: 5m # How often the compiled proto schemas in the cache should be updated

Local filesystem

You can provide all required proto files by mounting them from your filesystem. All files must use the .proto file extension. In the config.yaml file, configure Redpanda Console to search one or more paths for your provided proto files so that it can build a registry with all types:

kafka:
protobuf:
enabled: true
mappings:
- topicName: orders
valueProtoType: fake_model.Order # You can specify the proto type for the record key and/or value (just one will work too)
keyProtoType: package.Type
fileSystem:
enabled: true
refreshInterval: 5m # 5min is the default refresh interval
paths:
- /etc/protos

GitHub repository

To provide the proto files through a GitHub repository, first place the files in a directory. The directory doesn't matter, because Redpanda Console searches for all files with the .proto extension, up to a directory depth of five levels. Next, enable git in the config.yaml file, as shown in the following example.

kafka:
protobuf:
enabled: true
mappings: []
- topicName: xy
valueProtoType: fake_model.Order # You can specify the proto type for the record key and/or value (just one will work too)
keyProtoType: package.Type
git:
enabled: true
refreshInterval: 5m
repository:
url: https://github.com/redpanda-data/owlshop-protos.git
basicAuth:
enabled: true
username: token # API token from basic auth
password: redacted

Topic mapping configuration

If you don't use the schema registry for Protobuf deserialization, you must provide a mapping configuration so that Redpanda Console is aware of what Proto types it should use for each Kafka topic. Assume you have a Kafka topic called address-v1 and the respective address.proto file in your GitHub repository, which looks like the following:

syntax = "proto3";
package fake_models;

option go_package = "pkg/protobuf";

message Address {
int32 version = 1;
string id = 2;
message Customer {
string customer_id = 1;
string customer_type = 2;
}
}

The required mapping configuration looks like the following:

kafka:
protobuf:
enabled: true
mappings:
- topicName: address-v1
valueProtoType: fake_model.Address # The full prototype URL is required
# keyProtoType: The key is a plain string in Kafka, hence we don't have a prototype for the record's key