Converters and Serialization

Connectors are a translation layer working between Redpanda and the remote system.

For sink connectors the translation happens in the following phases:

  1. Converter deserializes data from Redpanda message format (for example JSON or Avro) to a universal in-memory connect data format.

  2. The in-memory connect data structure is translated by the connector to the data model of the remote system.

For source connectors it is vice versa, the phases are:

  1. Connector translates the data model from remote system format to the in-memory connect data structure.

  2. Converter serializes the data from a universal in-memory connect format to a Redpanda message.

Each Redpanda message is a key and value record. Record key and value converters are configured separately with the Redpanda message key format and Redpanda message value format properties. Key and value converters can be different.

If an external system requires structured data (like BigQuery or a SQL database), then you must provide data with a schema. Use the Avro, Protobuf, or JSON converter with a schema.

ByteArray converter

The ByteArray converter is the most primitive and high-throughput converter. Schema is ignored. This is the default converter type for managed connectors.

To use the converter, select the ByteArray option as a key or value message format.

String converter

The String converter is a high-throughput converter. Schema is ignored. All data is converted to a string.

To use the converter, select the String option as a key or value message format.

JSON converter

The JSON converter supports a JSON schema embedded in the message, where each message contains a schema. It results in a bigger message size. The connector needs a message schema to check message format.

To use the converter, select the JSON option as a key or value message format.

Example JSON message with embedded schema:

{
  "schema": {
    "type": "struct",
    "fields": [
      {
        "type": "int64",
        "optional": false,
        "field": "person_id"
      },
      {
        "type": "string",
        "optional": false,
        "field": "name"
      }
    ]
  },
  "payload": {
    "person_id": 1,
    "name": "Redpanda"
  }
}

If you consume JSON data with no message schema, the schema check for the connector must be disabled with the Message key JSON contains schema or Message value JSON contains schema option.

Avro converter

The Avro converter requires a schema in Schema Registry.

Avro supports primitive types and complex types, like records, enums, arrays, maps, and unions.

To specify a timestamp in an Avro schema for use with Kafka Connect, use:

{
    "name": "time1",
    "type": [
        "null",
        {
            "type": "long",
            "connect.version": 1,
            "connect.name": "org.apache.kafka.connect.data.Timestamp",
            "logicalType": "timestamp-millis"
        }
    ],
    "default": null
}

See also:

CloudEvents converter

The CloudEvents converter is specific to Debezium PostgreSQL and MySQL source connectors.

Protobuf converter Beta

The Protobuf converter requires a schema in Schema Registry. The converter only supports sink connectors. Source connectors are not supported.

To use the converter, select the Protobuf option as a key or value message format.

Set property keys

Redpanda Connectors use a set of <property.key>=<value> to set up its properties.

For example if you want to set the property topic.creation.enable to true, use topic.creation.enable=true in the property settings page.