avro

Consumes a stream of Avro Object Container File (OCF) datums.

Common:

scanners:
  avro:

Advanced:

scanners:
  avro:
    raw_json: false

Avro JSON format

This scanner creates documents formatted as Avro JSON when decoding with Avro schemas. In this format, the value of a union is encoded in JSON as follows:

  • If the union’s type is null, it is encoded as a JSON null.

  • Otherwise, the union is encoded as a JSON object with one name/value pair. The "name" is the type’s name and the "value" is the recursively encoded value. For Avro’s named types (record, fixed or enum), the user-specified name is used. For other types, the type name is used.

For example, the union schema ["null","string","Transaction"], where Transaction is a record name, would encode:

  • The null as a JSON null

  • The string "a" as {"string": "a"}

  • A Transaction instance as {"Transaction": {…}}, where {…} indicates the JSON encoding of a Transaction instance
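
The union rules above can be sketched in a few lines of Python. This is an illustration of the encoding only, not the scanner's implementation: the helper encode_union and the Transaction field id are hypothetical names chosen for the example.

```python
import json

def encode_union(value, type_name=None):
    """Wrap a union value the way the Avro JSON format describes.

    Hypothetical helper for illustration: `type_name` is the name of the
    union branch that `value` belongs to (ignored for null).
    """
    if value is None:
        # The null branch is encoded as a bare JSON null.
        return None
    # Any other branch becomes an object with one name/value pair.
    return {type_name: value}

examples = [
    encode_union(None),
    encode_union("a", "string"),
    # The field "id" is an assumed example field of a Transaction record.
    encode_union({"id": 1}, "Transaction"),
]

for doc in examples:
    print(json.dumps(doc))
```

Running the sketch prints null, {"string": "a"} and {"Transaction": {"id": 1}}, matching the three cases listed above.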

Alternatively, you can create documents in standard/raw JSON format by setting the field raw_json to true.
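
To make the difference between the two output formats concrete, the following Python sketch unwraps union values in a single document. It is an assumption-laden illustration rather than the scanner's logic: the helper unwrap_union, the field names note and parent, and the union_fields set are all hypothetical, and a real decoder would consult the Avro schema to know which fields are unions.

```python
def unwrap_union(value):
    """Return the bare value from an Avro JSON union wrapper.

    Hypothetical helper: only call it on fields known to be unions,
    because a one-field record looks the same as a wrapper.
    """
    if isinstance(value, dict) and len(value) == 1:
        (inner,) = value.values()
        return inner
    return value

# A document as it might look in Avro JSON format (assumed field names).
avro_json_doc = {"id": 7, "note": {"string": "a"}, "parent": None}

# Assumed to come from the schema: which fields are union-typed.
union_fields = {"note", "parent"}

raw_doc = {
    key: unwrap_union(val) if key in union_fields else val
    for key, val in avro_json_doc.items()
}

print(raw_doc)
```

Here the note field changes from {"string": "a"} to the bare value "a", while the null union value and the non-union id field are left unchanged.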

Metadata

This scanner emits the following metadata for each message:

  • The @avro_schema field: The canonical Avro schema.

  • The @avro_schema_fingerprint field: The schema ID or fingerprint.

Fields

raw_json

Whether to decode messages into standard (raw) JSON rather than Avro JSON. When true, union values are emitted as bare values instead of being wrapped in a {"type": value} object.

Type: bool

Default: false