Protobuf Deserialization

If one or more of your topics contain Protobuf-serialized messages, you can configure Redpanda Console to deserialize the binary content into JSON, which makes the messages human-readable. The deserialized message can also be used as a JavaScript object in programmable push filters.

Protobuf serialization is commonly used with Kafka clusters in two ways:

  • With a schema registry: schema meta information is embedded as part of the record’s value.

  • As plain Protobuf: the serialized content is written into Kafka topics without a binary or custom wrapper.

The difference matters because most Kafka clients that serialize Protobuf messages put the serialized byte array into a binary wrapper that contains meta information, such as the schema ID or the prototypes used. The application that deserializes the Kafka records therefore has to recognize the format. Redpanda Console supports both formats. To deserialize messages, Redpanda Console must know the .proto files in use and which prototype to apply to each topic. This information can come from the schema registry, or from additional configuration that pulls the files from the local file system or a GitHub repository.
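To make the two formats concrete, the following Python sketch checks whether a record value carries a wrapper. It assumes the Confluent wire format, in which the value starts with a magic byte of 0 followed by a 4-byte big-endian schema ID (for Protobuf, a list of message indexes follows before the payload). The function name `read_schema_id` is hypothetical, and this is only a heuristic illustration, not how Redpanda Console itself detects the format.

```python
import struct
from typing import Optional

MAGIC_BYTE = 0  # Assumption: first byte of a Confluent-wrapped record value


def read_schema_id(value: bytes) -> Optional[int]:
    """Return the schema ID if the value looks wrapped, otherwise None.

    Hypothetical helper: it only inspects the first five bytes and does
    not validate the Protobuf payload that follows.
    """
    if len(value) < 5 or value[0] != MAGIC_BYTE:
        return None
    # Bytes 1..4 hold the schema ID as an unsigned big-endian integer.
    (schema_id,) = struct.unpack(">I", value[1:5])
    return schema_id


# A wrapped value: magic byte, schema ID 42, message-index byte, payload.
wrapped = b"\x00" + struct.pack(">I", 42) + b"\x00" + b"\x08\x01"
# A plain Protobuf payload with no wrapper (field 1 = 1, field 2 = "abc").
plain = b"\x08\x01\x12\x03abc"

wrapped_id = read_schema_id(wrapped)
plain_id = read_schema_id(plain)
print(wrapped_id, plain_id)
```

A wrapped record needs the schema registry to resolve the schema ID, whereas a plain record needs a topic mapping that names the prototype.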

So that imports can be resolved, all prototypes are first registered in a proto registry. You must also ensure that all imported prototypes are part of the repository. Standard types (such as Google’s timestamp type) are included by default, so you don’t need to add them.

Schema Registry

Messages that have been serialized using Confluent’s KafkaProtobufSerializer can only be deserialized if the schema registry is configured. Unlike other providers, the schema registry does not require you to set up mappings that define which topics use which prototypes. Instead, this information is inferred from the messages, and the schema registry finds the right prototype for deserialization.

The Protobuf deserializer uses the same schema registry client that is configured under kafka.schemaRegistry. See Schema Registry for information about how to set up the schema registry in Redpanda Console. A valid configuration for Protobuf support with the schema registry looks like this:

kafka:
  schemaRegistry:
    enabled: true
    urls: ["https://my-schema-registry.com"]
    username: console
    password: redacted # Or set using flags or env variable
  protobuf:
    enabled: true
    schemaRegistry:
      enabled: true # This tells the proto service to consider the schema registry when deserializing messages
      refreshInterval: 5m # How often the compiled proto schemas in the cache should be updated

Local file system

You can provide all required proto files by mounting them from your file system. All files must use the .proto file extension. In the config.yaml file, configure Redpanda Console to search one or more paths for your provided proto files so that it can build a registry with all types:

kafka:
  protobuf:
    enabled: true
    mappings:
      - topicName: orders
        valueProtoType: fake_model.Order # You can specify the proto type for the record key and/or value (just one will work too)
        keyProtoType: package.Type
    fileSystem:
      enabled: true
      refreshInterval: 5m # 5min is the default refresh interval
      paths:
        - /etc/protos

GitHub repository

To provide the proto files through a GitHub repository, first place the files in a directory. The directory doesn’t matter, because Redpanda Console searches for all Proto files up to a directory depth of five levels. To use multiple different import paths, set the importPaths property. For example, if your repository includes third-party Protobuf types, such as Google’s types, that are in a different directory to your own types, set the importPaths property to the paths of both directories.

Enable git in the config.yaml file, as shown in the following example.

kafka:
  protobuf:
    enabled: true
    mappings:
      - topicName: xy
        valueProtoType: fake_model.Order # You can specify the proto type for the record key and/or value (just one will work too)
        keyProtoType: package.Type
    # importPaths is a list of paths from which to import Proto files into Redpanda Console.
    # Paths are relative to the root directory.
    # The `git` configuration must be enabled to use this feature.
    importPaths: []
    git:
      enabled: true
      refreshInterval: 5m
      repository:
        url: https://github.com/redpanda-data/owlshop-protos.git
      basicAuth:
        enabled: true
        username: token # API token from basic auth
        password: redacted
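To check whether the files in a local clone of your repository fall within the five-level search depth described above, you could use a sketch like the following. The helper `find_proto_files` is hypothetical, and it assumes that the repository root counts as level 0 and that files in directories nested deeper than five levels are skipped. How Redpanda Console counts levels may differ, so treat the result as an approximation.

```python
import os
import tempfile
from typing import List


def find_proto_files(root: str, max_depth: int = 5) -> List[str]:
    """List .proto files (relative to root) in directories up to max_depth."""
    root = os.path.abspath(root)
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        # Assumption: the root itself is depth 0, its children are depth 1.
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        if depth >= max_depth:
            dirnames[:] = []  # Stop os.walk from descending any further
        for name in filenames:
            if name.endswith(".proto"):
                path = os.path.join(dirpath, name)
                found.append(os.path.relpath(path, root))
    return sorted(found)


def touch(path: str) -> None:
    """Create an empty file, including any missing parent directories."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    open(path, "w").close()


# Demo with a throwaway directory tree standing in for a cloned repository.
with tempfile.TemporaryDirectory() as repo:
    touch(os.path.join(repo, "protos", "order.proto"))               # depth 1
    touch(os.path.join(repo, "a", "b", "c", "d", "e", "edge.proto"))  # depth 5
    touch(os.path.join(repo, "a", "b", "c", "d", "e", "f", "deep.proto"))  # depth 6
    result = find_proto_files(repo)

print(result)
```

In this demo, `deep.proto` is not listed because its directory is nested six levels below the root.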

Topic mapping configuration

If you don’t use the schema registry for Protobuf deserialization, you must provide a mapping configuration so that Redpanda Console is aware of what Proto types it should use for each Kafka topic. Assume you have a Kafka topic called address-v1 and the respective address.proto file in your GitHub repository, which looks like the following:

syntax = "proto3";
package fake_models;

option go_package = "pkg/protobuf";

message Address {
  int32 version = 1;
  string id = 2;
  message Customer {
    string customer_id = 1;
    string customer_type = 2;
  }
}

The required mapping configuration looks like the following:

kafka:
  protobuf:
    enabled: true
    mappings:
      - topicName: address-v1
        valueProtoType: fake_models.Address # The fully qualified type name (package and message name) is required
        # keyProtoType: The key is a plain string in Kafka, hence we don't have a prototype for the record's key
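Because a mapping refers to a type by its package and message name, it can help to list the fully qualified names that a .proto file defines before writing the mapping. The following Python sketch does this for the address.proto example with a small regex-based scanner. The helper `qualified_message_names` is hypothetical, it ignores comments and string literals that contain braces, and it is not a substitute for a real Protobuf parser.

```python
import re
from typing import List, Optional

ADDRESS_PROTO = """
syntax = "proto3";
package fake_models;

option go_package = "pkg/protobuf";

message Address {
  int32 version = 1;
  string id = 2;
  message Customer {
    string customer_id = 1;
    string customer_type = 2;
  }
}
"""


def qualified_message_names(proto_source: str) -> List[str]:
    """Return package-qualified names of all messages, including nested ones."""
    package_match = re.search(
        r"^\s*package\s+([\w.]+)\s*;", proto_source, re.MULTILINE
    )
    package = package_match.group(1) if package_match else ""

    names: List[str] = []
    # Stack of enclosing blocks: a message name, or None for other blocks.
    scope: List[Optional[str]] = []
    for token in re.finditer(r"\bmessage\s+(\w+)\s*\{|\{|\}", proto_source):
        text = token.group(0)
        if text == "}":
            if scope:
                scope.pop()
        elif text == "{":
            scope.append(None)
        else:
            scope.append(token.group(1))
            parts = [package] + [name for name in scope if name]
            names.append(".".join(part for part in parts if part))
    return names


type_names = qualified_message_names(ADDRESS_PROTO)
print(type_names)
```

Any name in the resulting list is a candidate for valueProtoType or keyProtoType. For the address-v1 topic, the name that ends in Address is the relevant one.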