Golang SDK for Data Transforms

API reference documentation for Redpanda data transforms.

Data transforms

Package transform contains the core functions and types for transforming data within Redpanda.

Data transforms functions

OnRecordWritten

func OnRecordWritten(fn OnRecordWrittenCallback)

The OnRecordWritten function registers a callback of type OnRecordWrittenCallback, which is invoked when a record is written to the input topic.

The function should be called in a package’s main function to register the transform function that will be applied.


Data transforms types

OnRecordWrittenCallback

type OnRecordWrittenCallback func(e WriteEvent, w RecordWriter) error

The OnRecordWrittenCallback type is a callback to transform records after a write event happens in the input topic. It’s the type of the parameter for the OnRecordWritten function.

Record

type Record struct {
	// Key is an optional field.
	Key []byte
	// Value is the blob of data that is written to Redpanda.
	Value []byte
	// Headers are client specified key/value pairs that are
	// attached to a record.
	Headers []RecordHeader
	// Attrs is the attributes of a record.
	//
	// Output records should leave these unset.
	Attrs RecordAttrs
	// The timestamp associated with this record.
	//
	// For output records this can be left unset as it will
	// always be the same value as the input record.
	Timestamp time.Time
	// The offset of this record in the partition.
	//
	// For output records this field is left unset,
	// as it will be set by Redpanda.
	Offset int64
}

The Record type is a record that has been written to Redpanda.

RecordAttrs

type RecordAttrs struct {
	// contains filtered or unexported fields
}

RecordHeader

type RecordHeader struct {
	Key   []byte
	Value []byte
}

The RecordHeader type is an optional key/value pair that is passed along with records.

WriteEvent

type WriteEvent interface {
	// Access the record associated with this event
	Record() Record
}

The WriteEvent type contains information about a record that was written.

RecordWriter

type RecordWriter interface {
      // Write writes a record to the output topic.
      //
      // When writing a record, only the key, value and headers are
      // used other information like the timestamp will be overridden
      // by the broker.
      Write(Record) error
}

RecordWriter is an interface for writing transformed records to the destination topic.

Schema Registry client

Package sr is a Schema Registry client for usage within Redpanda data transforms.

Data transforms support reading schemas and writing schemas to Redpanda’s Schema Registry.

After a schema is obtained, a record may be decoded using the appropriate library. Using the TinyGo compiler and runtime, the following libraries for Apache Avro and Protocol Buffers are known to work:

Schema Registry variables

var (
	// ErrNotRegistered is returned from Serde when attempting to encode a
	// value or decode an ID that has not been registered, or when using
	// Decode with a missing new value function.
	ErrNotRegistered = errors.New("registration is missing for encode/decode")

	// ErrBadHeader is returned from Decode when the input slice is shorter
	// than five bytes, or if the first byte is not the magic 0 byte.
	ErrBadHeader = errors.New("5 byte header for value is missing or does not have the 0 magic byte")
)

Schema Registry functions

ExtractID

func ExtractID(b []byte) (int, error)

Extract the ID from the header of a Schema Registry encoded value.

Returns ErrBadHeader if the array is missing the leading magic byte or is too small.

Schema Registry types

ClientOpt

type ClientOpt interface {
	// contains filtered or unexported methods
}

ClientOpt is an option to configure a SchemaRegistryClient

MaxCacheEntries

func MaxCacheEntries(size int) ClientOpt

MaxCacheEntries configures how many entries to cache within the client.

By default the cache is unbounded. Use 0 to disable the cache.

Reference

type Reference struct {
	Name    string
	Subject string
	Version int
}

SchemaReference is a way for one schema to reference another schema. The details for how referencing is done are type specific; for example, JSON objects that use the key "$ref" can refer to another schema via URL. See Reference a schema.

Schema

type Schema struct {
	Schema     string
	Type       SchemaType
	References []Reference
}

Schema is a schema that can be registered within the Schema Registry.

SchemaRegistryClient

type SchemaRegistryClient interface {
	// LookupSchemaById looks up a schema via its global ID.
	LookupSchemaById(id int) (s *Schema, err error)
	// LookupSchemaByVersion looks up a schema via a subject for a specific version.
	//
	// Use version -1 to get the latest version.
	LookupSchemaByVersion(subject string, version int) (s *SubjectSchema, err error)
	// CreateSchema creates a schema under the given subject, returning the version and ID.
	//
	// If an equivalent schema already exists globally, that schema ID will be reused.
	// If an equivalent schema already exists within that subject, this will be a noop and the
	// existing schema will be returned.
	CreateSchema(subject string, schema Schema) (s *SubjectSchema, err error)
}

SchemaRegistryClient is a client for interacting with Redpanda’s Schema Registry.

The client provides caching out of the box, which can be configured with options.

NewClient

func NewClient(opts ...ClientOpt) (c SchemaRegistryClient)

NewClient creates a new SchemaRegistryClient with the specified options applied.

SchemaType

type SchemaType int

SchemaType is an enum for the different types of schemas that can be stored in the Schema Registry.

const (
	TypeAvro SchemaType = iota
	TypeProtobuf
	TypeJSON
)

Serde

type Serde[T any] struct {
	// contains filtered or unexported fields
}

Serde encodes and decodes values according to the Schema Registry wire format. A Serde itself does not perform schema auto-discovery and type auto-decoding. To aid in strong typing and validated encoding/decoding, you must register IDs and values.

To use a Serde for encoding, you must first preregister the schema IDs and values that you will encode. The latest registered ID that supports encoding is used to encode.

To use a Serde for decoding, you can either preregister the schema IDs and values that you will consume, or you can discover the schema every time you receive an ErrNotRegistered error from decode.

(*Serde[T]) AppendEncode

func (s *Serde[T]) AppendEncode(b []byte, v T) ([]byte, error)

AppendEncode appends an encoded value to b according to the schema registry wire format and returns it. If EncodeFn was not used, this returns ErrNotRegistered.

(*Serde[T]) Decode

func (s *Serde[T]) Decode(b []byte, v T) error

Decode decodes b into v. If the DecodeFn option was not used, this returns ErrNotRegistered.

Serde does not handle references in schemas. You must register the full decode function for any top-level ID, regardless of how many other schemas are referenced in the top-level ID.

(*Serde[T]) Encode

func (s *Serde[T]) Encode(v T) ([]byte, error)

Encode encodes a value according to the Schema Registry wire format and returns it. If EncodeFn was not used, this returns ErrNotRegistered.

(*Serde[T]) MustAppendEncode

func (s *Serde[T]) MustAppendEncode(b []byte, v T) []byte

MustAppendEncode returns the value of AppendEncode, panicking on error. This is a shortcut for when your encode function cannot error.

(*Serde[T]) MustEncode

func (s *Serde[T]) MustEncode(v T) []byte

MustEncode returns the value of Encode, panicking on error. This is a shortcut for when your encode function cannot error.

(*Serde[T]) Register

func (s *Serde[T]) Register(id int, opts ...SerdeOpt[T])

Register registers a schema ID and the value it corresponds to, as well as the encoding or decoding functions. Register functions depending on whether you are only encoding, only decoding, or both.

(*Serde[T]) SetDefaults

func (s *Serde[T]) SetDefaults(opts ...SerdeOpt[T])

SetDefaults sets default options to apply to every registered type. These options are always applied first, so you can override them as necessary when registering.

This can be useful if you always want to use the same encoding or decoding functions.

SerdeOpt

type SerdeOpt[T any] interface {
	// contains filtered or unexported methods
}

SerdeOpt is an option to configure a Serde.

AppendEncodeFn

func AppendEncodeFn[T any](fn func([]byte, T) ([]byte, error)) SerdeOpt[T]

AppendEncodeFn allows Serde to encode a value to an existing slice. This can be more efficient than EncodeFn; this function is used if it exists.

DecodeFn

func DecodeFn[T any](fn func([]byte, T) error) SerdeOpt[T]

DecodeFn allows Serde to decode into a value.

EncodeFn

func EncodeFn[T any](fn func(T) ([]byte, error)) SerdeOpt[T]

EncodeFn allows Serde to encode a value.

SubjectSchema

type SubjectSchema struct {
	Schema

	Subject string
	Version int
	ID      int
}

SchemaSubject is a schema along with the subject, version, and ID of the schema.