# Specify Iceberg Schema


---
title: Specify Iceberg Schema
latest-operator-version: v26.1.4
latest-console-tag: v3.7.3
latest-connect-version: 4.93.0
latest-redpanda-tag: v26.1.9
docname: iceberg/specify-iceberg-schema
page-component-name: cloud-data-platform
page-version: master
page-component-version: master
page-component-title: Cloud
page-relative-src-path: iceberg/specify-iceberg-schema.adoc
page-edit-url: https://github.com/redpanda-data/cloud-docs/edit/main/modules/manage/pages/iceberg/specify-iceberg-schema.adoc
description: Learn about supported Iceberg modes and how you can integrate schemas with Iceberg topics.
page-git-created-date: "2025-07-31"
page-git-modified-date: "2026-05-26"
---

<!-- Source: https://docs.redpanda.com/cloud-data-platform/manage/iceberg/specify-iceberg-schema.md -->

In [Iceberg-enabled clusters](https://docs.redpanda.com/cloud-data-platform/manage/iceberg/about-iceberg-topics/#enable-iceberg-integration), the `redpanda.iceberg.mode` topic property determines how Redpanda maps topic data to the Iceberg table structure. You can have the generated Iceberg table match the structure of a schema in the Schema Registry, or you can use the `key_value` mode where Redpanda stores the record values as-is in the table.

## [](#supported-iceberg-modes)Supported Iceberg modes

Redpanda supports the following modes for Iceberg topics:

### [](#key_value)key_value

Creates an Iceberg table with a simple two-column schema: one column for the record metadata, including the key, and one binary column for the record’s value.

### [](#value_schema_id_prefix)value_schema_id_prefix

Creates an Iceberg table whose structure matches the Redpanda schema for the topic, with columns corresponding to each field. You must register a schema in the [Schema Registry](https://docs.redpanda.com/cloud-data-platform/manage/schema-reg/schema-reg-overview/) and producers must write to the topic using the Schema Registry wire format.

In the [Schema Registry wire format](https://docs.redpanda.com/cloud-data-platform/manage/schema-reg/schema-reg-overview/#wire-format), a "magic byte" and schema ID are embedded in the message payload header. Producers to the topic must use the wire format in the serialization process so Redpanda can determine the schema used for each record, use the schema to define the Iceberg table, and store the topic values in the corresponding table columns.
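
As a rough illustration of the framing described above, the following Python sketch prepends a header to an already-serialized payload and splits it off again. It assumes the commonly used layout of one zero-valued magic byte followed by a 4-byte big-endian schema ID; the function names are hypothetical, and Protobuf payloads may carry extra bytes after the schema ID that this sketch does not model. In practice, a Schema Registry-aware serializer normally produces this header for you.

```python
import struct

MAGIC_BYTE = 0  # assumption: the leading "magic byte" is zero


def add_wire_header(schema_id: int, payload: bytes) -> bytes:
    """Prefix a serialized payload with the magic byte and schema ID."""
    # ">bI" = big-endian, 1-byte signed int, 4-byte unsigned int (5 bytes total)
    return struct.pack(">bI", MAGIC_BYTE, schema_id) + payload


def split_wire_header(record_value: bytes) -> tuple[int, bytes]:
    """Return (schema_id, payload) for a record value that carries the header."""
    if len(record_value) < 5 or record_value[0] != MAGIC_BYTE:
        raise ValueError("record value does not start with a wire format header")
    (schema_id,) = struct.unpack(">I", record_value[1:5])
    return schema_id, record_value[5:]
```

A record value framed this way is five bytes longer than the serialized payload alone, which is why the choice of Iceberg mode has to match how producers serialize their records.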

### [](#value_schema_latest)value_schema_latest

Creates an Iceberg table whose structure matches the latest schema registered for the subject in the Schema Registry. You must register a schema in the Schema Registry.

Producers cannot use the wire format in `value_schema_latest` mode. Redpanda expects the serialized message as-is without the magic byte or schema ID prefix in the record value.

> 📝 **NOTE**
>
> The `value_schema_latest` mode is not compatible with the [`rpk topic produce`](#reference:rpk/rpk-topic/rpk-topic-produce) command, which embeds the wire format header. You must use your own producer code to produce to topics in `value_schema_latest` mode.

The latest schema is cached periodically. The cache period is defined by the cluster property `iceberg_latest_schema_cache_ttl_ms` (default: 5 minutes).

### [](#disabled)disabled

Default for `redpanda.iceberg.mode`. Disables writing to an Iceberg table for the topic.

> 📝 **NOTE**
>
> The following modes are compatible with producing to an Iceberg topic using Redpanda Console:
>
> -   `key_value`
>
> -   Starting in version 25.2, `value_schema_latest` with a JSON schema
>
> Otherwise, records may fail to write to the Iceberg table and instead write to the [dead-letter queue](https://docs.redpanda.com/cloud-data-platform/manage/iceberg/iceberg-troubleshooting/#dead-letter-queue).

## [](#configure-iceberg-mode-for-a-topic)Configure Iceberg mode for a topic

You can set the Iceberg mode for a topic when you create the topic, or you can update the mode for an existing topic.

Option 1. Create a new topic and set `redpanda.iceberg.mode`:

```bash
rpk topic create <topic-name> --topic-config=redpanda.iceberg.mode=<iceberg-mode>
```

Option 2. Set `redpanda.iceberg.mode` for an existing topic:

```bash
rpk topic alter-config <topic-name> --set redpanda.iceberg.mode=<iceberg-mode>
```
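
If you script topic setup, the two commands above can be assembled programmatically. The following Python sketch only builds the argument lists and does not run anything; the helper names are hypothetical, and the check against the four documented mode names is an illustrative safeguard rather than something `rpk` requires.

```python
# The four modes documented on this page.
ICEBERG_MODES = {"key_value", "value_schema_id_prefix", "value_schema_latest", "disabled"}


def _check_mode(mode: str) -> None:
    # value_schema_latest may carry overrides after a colon, so only
    # the part before the first colon is compared to the known names.
    base = mode.split(":", 1)[0]
    if base not in ICEBERG_MODES:
        raise ValueError(f"unknown Iceberg mode: {mode!r}")


def create_topic_cmd(topic: str, mode: str) -> list[str]:
    """Arguments for creating a topic with redpanda.iceberg.mode set."""
    _check_mode(mode)
    return ["rpk", "topic", "create", topic,
            f"--topic-config=redpanda.iceberg.mode={mode}"]


def alter_topic_cmd(topic: str, mode: str) -> list[str]:
    """Arguments for setting redpanda.iceberg.mode on an existing topic."""
    _check_mode(mode)
    return ["rpk", "topic", "alter-config", topic,
            "--set", f"redpanda.iceberg.mode={mode}"]
```

To execute a command, you could pass the returned list to `subprocess.run(..., check=True)` on a machine where `rpk` is installed and configured for your cluster.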

### [](#override-value-schema-latest-default)Override `value_schema_latest` default

In `value_schema_latest` mode, you only need to set the property value to the string `value_schema_latest`. This enables the default behavior of the mode, which uses the TopicNameStrategy to determine the subject for the topic. For example, if your topic is named `sensor`, the schema is looked up in the `sensor-value` subject. For Protobuf data, the default behavior also deserializes records using the first message defined in the corresponding Protobuf schema stored in the Schema Registry.

If you use a strategy other than the topic name to derive the subject name, you can override the default behavior of `value_schema_latest` mode and explicitly set the subject name.

To override the default behavior, use the following optional syntax:

```bash
value_schema_latest:subject=<subject-name>,protobuf_name=<protobuf-message-full-name>
```

-   For both Avro and Protobuf, specify a different subject name by using the key-value pair `subject=<subject-name>`, for example `value_schema_latest:subject=sensor-data`.

-   For Protobuf only:

    -   Specify a different message definition by using a key-value pair `protobuf_name=<message-full-name>`. You must use the fully qualified name, which includes the package name, for example, `value_schema_latest:protobuf_name=com.example.manufacturing.SensorData`.

    -   To specify both a different subject and message definition, separate the key-value pairs with a comma, for example: `value_schema_latest:subject=my_protobuf_schema,protobuf_name=com.example.manufacturing.SensorData`.


    > 📝 **NOTE**
    >
    > If you don’t specify the fully qualified Protobuf message name, Redpanda pauses the data translation to the Iceberg table until you fix the topic misconfiguration.
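
The override syntax is simple enough to build by hand, but a small helper can reduce typos when you generate topic configurations. The following Python sketch composes the property value from the optional `subject` and `protobuf_name` pairs and shows the default subject derived from the topic name. Both function names are hypothetical.

```python
from typing import Optional


def default_subject(topic: str) -> str:
    """Subject used by default: the topic name with a -value suffix."""
    return f"{topic}-value"


def value_schema_latest_mode(subject: Optional[str] = None,
                             protobuf_name: Optional[str] = None) -> str:
    """Build the redpanda.iceberg.mode value for value_schema_latest."""
    pairs = []
    if subject:
        pairs.append(f"subject={subject}")
    if protobuf_name:
        # Must be the fully qualified message name, including the package.
        pairs.append(f"protobuf_name={protobuf_name}")
    if not pairs:
        return "value_schema_latest"
    return "value_schema_latest:" + ",".join(pairs)
```

For example, `value_schema_latest_mode(subject="sensor-data")` yields `value_schema_latest:subject=sensor-data`, which you can then pass as `<iceberg-mode>` when creating or altering the topic.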


## [](#how-iceberg-modes-translate-to-table-format)How Iceberg modes translate to table format

Redpanda generates an Iceberg table with the same name as the topic. In each mode, Redpanda writes to a `redpanda` table column that stores a single Iceberg [struct](https://iceberg.apache.org/spec/#nested-types) per record, containing nested columns of the metadata from each record, including the record key, headers, timestamp, the partition it belongs to, and its offset.

For example, if you produce to a topic `ClickEvent` according to the following Avro schema:

```avro
{
    "type": "record",
    "name": "ClickEvent",
    "fields": [
        {
            "name": "user_id",
            "type": "int"
        },
        {
            "name": "event_type",
            "type": "string"
        },
        {
            "name": "ts",
            "type": "string"
        }
    ]
}
```

The `key_value` mode writes to the following table format:

```sql
CREATE TABLE ClickEvent (
    redpanda struct<
        partition:      integer,
        timestamp:      timestamptz,
        offset:         long,
        headers:        array<struct<key: string, value: binary>>,
        key:            binary,
        timestamp_type: integer
    >,
    value binary
)
```

Use `key_value` mode if you want to use the Iceberg data in its semi-structured format.

The `value_schema_id_prefix` and `value_schema_latest` modes can use the schema to translate to the following table format:

```sql
CREATE TABLE ClickEvent (
    redpanda struct<
        partition:      integer,
        timestamp:      timestamptz,
        offset:         long,
        headers:        array<struct<key: string, value: binary>>,
        key:            binary,
        timestamp_type: integer
    >,
    user_id integer NOT NULL,
    event_type string,
    ts string
)
```
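
To make the difference between the two table layouts concrete, the following Python sketch models one row as a plain dictionary in each layout. It is only a conceptual illustration of the column shapes shown above, not how Redpanda performs the translation; the function names are hypothetical, and the millisecond timestamp input is an assumption.

```python
from datetime import datetime, timezone


def redpanda_struct(partition, offset, timestamp_ms, key, headers, timestamp_type):
    """Model the nested `redpanda` metadata column for one record."""
    return {
        "partition": partition,
        "timestamp": datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc),
        "offset": offset,
        "headers": [{"key": k, "value": v} for k, v in headers],
        "key": key,  # always raw bytes, never decoded with a schema
        "timestamp_type": timestamp_type,
    }


def key_value_row(meta: dict, raw_value: bytes) -> dict:
    """key_value mode: metadata plus the record value as one binary column."""
    return {"redpanda": meta, "value": raw_value}


def schema_row(meta: dict, decoded_value: dict) -> dict:
    """Schema-based modes: metadata plus one column per schema field."""
    return {"redpanda": meta, **decoded_value}
```

With the `ClickEvent` schema, `schema_row` produces the top-level columns `redpanda`, `user_id`, `event_type`, and `ts`, while `key_value_row` produces only `redpanda` and `value`.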

As you produce records to the topic, the data also becomes available in object storage for Iceberg-compatible clients to consume. You can use the same analytical tools to [read the Iceberg topic data](https://docs.redpanda.com/cloud-data-platform/manage/iceberg/query-iceberg-topics/) in a data lake as you would for a relational database.

If Redpanda fails to translate the record to the columnar format as defined by the schema, it writes the record to a dead-letter queue (DLQ) table. See [Troubleshoot Iceberg Topics](https://docs.redpanda.com/cloud-data-platform/manage/iceberg/iceberg-troubleshooting/) for more information.

> 📝 **NOTE**
>
> You cannot use schemas to parse or decode record keys for Iceberg. The record keys are always stored in binary format in the `redpanda.key` column.

### [](#schema-types-translation)Schema types translation

Redpanda supports direct translations of the following types to Iceberg value domains:

#### Avro

| Avro type | Iceberg type |
| --- | --- |
| boolean | boolean |
| int | int |
| long | long |
| float | float |
| double | double |
| bytes | binary |
| string | string |
| record | struct |
| array | list |
| map | map |
| fixed | fixed* |
| decimal | decimal |
| uuid | uuid* |
| date | date |
| time | time* |
| timestamp | timestamp |

\*These types are not currently supported in Unity Catalog managed Iceberg tables.

There are some cases where the Avro type does not map directly to an Iceberg type and Redpanda applies the following transformations:

-   Enums are translated into the Iceberg `string` type.

-   Different flavors of time (such as `time-millis`) and timestamp (such as `timestamp-millis`) types are translated to the same Iceberg `time` and `timestamp` types, respectively.

-   Avro unions are flattened to Iceberg structs with optional fields. For example:

    -   The union `["int", "long", "float"]` is represented as an Iceberg struct `struct<0 INT NULLABLE, 1 LONG NULLABLE, 2 FLOAT NULLABLE>`.

    -   The union `["int", "null", "float"]` is represented as an Iceberg struct `struct<0 INT NULLABLE, 1 FLOAT NULLABLE>`.


-   Two-field unions that contain `null` are represented as a single optional field only (no struct). For example, the union `["null", "long"]` is represented as `long`.
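
The union rules above can be summarized in a few lines of Python. This is a minimal sketch for unions of primitive Avro types only, written to reproduce the examples in this section; the function name is hypothetical, and the renumbering of struct fields after dropping `null` is inferred from those examples rather than from a formal specification.

```python
# Primitive Avro types and their Iceberg counterparts, taken from the table above.
AVRO_TO_ICEBERG = {
    "boolean": "boolean", "int": "int", "long": "long", "float": "float",
    "double": "double", "bytes": "binary", "string": "string",
}


def translate_union(branches: list[str]) -> str:
    """Translate a union of primitive Avro types to an Iceberg type string."""
    non_null = [b for b in branches if b != "null"]
    # Two-field union containing null: a single optional field, no struct.
    if len(branches) == 2 and len(non_null) == 1:
        return AVRO_TO_ICEBERG[non_null[0]]
    # Otherwise: a struct with one nullable field per non-null branch.
    fields = [
        f"{index} {AVRO_TO_ICEBERG[branch].upper()} NULLABLE"
        for index, branch in enumerate(non_null)
    ]
    return "struct<" + ", ".join(fields) + ">"
```

Calling `translate_union(["null", "long"])` returns `long`, while a three-branch union returns a struct with positional field names.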


Some Avro types are not supported:

-   The Avro `duration` logical type is ignored.

-   The Avro `null` type is ignored and not represented in the Iceberg schema.

-   Recursive types are not supported.

#### Protobuf

| Protobuf type | Iceberg type |
| --- | --- |
| bool | boolean |
| double | double |
| float | float |
| int32 | int |
| sint32 | int |
| int64 | long |
| sint64 | long |
| sfixed32 | int |
| sfixed64 | long |
| string | string |
| bytes | binary |
| map | map |
| message | struct |

There are some cases where the Protobuf type does not map directly to an Iceberg type and Redpanda applies the following transformations:

-   Repeated values are translated into Iceberg `list` types.

-   Enums are translated into the Iceberg `string` type.

-   `uint32` and `fixed32` are translated into Iceberg `long` types because that is how unsigned 32-bit values are represented in Iceberg.

-   `uint64` and `fixed64` values are translated into their Base-10 string representation.

-   `google.protobuf.Timestamp` is translated into `timestamp` in Iceberg.


Recursive types are not supported.
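
The handling of unsigned Protobuf integers follows from simple range arithmetic: an unsigned 32-bit value can exceed the signed 32-bit range but always fits in a signed 64-bit `long`, whereas an unsigned 64-bit value can exceed the signed 64-bit range, so it is written as a decimal string. The following Python sketch illustrates this reasoning; the function name is hypothetical and the range checks are added only for illustration.

```python
INT32_MAX = 2**31 - 1    # largest value of a signed 32-bit int
INT64_MAX = 2**63 - 1    # largest value of a signed 64-bit long
UINT32_MAX = 2**32 - 1   # largest uint32 / fixed32 value
UINT64_MAX = 2**64 - 1   # largest uint64 / fixed64 value


def iceberg_value_for_unsigned(proto_type: str, value: int):
    """Return the value as it would appear in the Iceberg column."""
    if proto_type in ("uint32", "fixed32"):
        if not 0 <= value <= UINT32_MAX:
            raise ValueError("value out of range for an unsigned 32-bit field")
        return value  # fits in an Iceberg long without loss
    if proto_type in ("uint64", "fixed64"):
        if not 0 <= value <= UINT64_MAX:
            raise ValueError("value out of range for an unsigned 64-bit field")
        return str(value)  # Base-10 string representation
    raise ValueError(f"not an unsigned Protobuf integer type: {proto_type}")
```

Because `uint64` and `fixed64` columns hold strings, queries that need numeric comparisons on them have to cast or parse the values first.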

#### JSON Schema

Requirements:

-   Only JSON Schema Draft-07 is currently supported.

-   You must declare the JSON Schema dialect using the `$schema` keyword, for example `"$schema": "http://json-schema.org/draft-07/schema#"`.

-   You must use a JSON Schema that constrains JSON documents to a strict type so Redpanda can translate to Iceberg. In most cases this means each subschema uses the `type` keyword, but a subschema can also use `$ref` if the referenced schema resolves to a strict type.


Valid JSON Schema example

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "productId": {
      "type": "integer"
    },
    "tags": {
      "type": "array",
      "items": {
        "type": "string"
      }
    }
  }
}
```

| JSON type | Iceberg type | Notes |
| --- | --- | --- |
| array | list | The keywords `items` and `additionalItems` must be used to constrain element types. |
| boolean | boolean |  |
| null |  | The `null` type is only supported as a nullability marker, either in a type array (for example, `["string", "null"]`) or in an exclusive `oneOf` nullable pattern. |
| number | double |  |
| integer | long |  |
| string | string | The `format` keyword can be used for custom Iceberg types. See the format annotation translation table for details. |
| object | struct or map | Use `properties` to define struct fields and constrain their types. `additionalProperties: false` is supported for closed objects. If `additionalProperties` contains a schema, it translates to an Iceberg `map<string, T>`. You cannot combine `properties` and `additionalProperties` in an object if `additionalProperties` is set to a schema. |

Format annotation translation

| `format` value | Iceberg type |
| --- | --- |
| date-time | timestamptz |
| date | date |
| time | time |
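
To see how these mappings combine for a nested schema, the following Python sketch walks a JSON Schema dictionary and returns an Iceberg-style type string. It is a simplified illustration, not Redpanda's translator: the function name is hypothetical, it ignores `$ref`, `oneOf`, and tuple-style `items`, and treating an unrecognized `format` value as a plain `string` is an assumption.

```python
FORMAT_TYPES = {"date-time": "timestamptz", "date": "date", "time": "time"}
SCALAR_TYPES = {"boolean": "boolean", "number": "double", "integer": "long"}


def to_iceberg(schema: dict) -> str:
    """Translate a strictly typed JSON Schema subschema to an Iceberg type string."""
    json_type = schema.get("type")
    # A type array such as ["string", "null"] only marks nullability.
    if isinstance(json_type, list):
        non_null = [t for t in json_type if t != "null"]
        if len(non_null) != 1:
            raise ValueError("type array must contain exactly one non-null type")
        json_type = non_null[0]
    if json_type in SCALAR_TYPES:
        return SCALAR_TYPES[json_type]
    if json_type == "string":
        # Assumption: formats outside the table fall back to string.
        return FORMAT_TYPES.get(schema.get("format"), "string")
    if json_type == "array":
        items = schema.get("items")
        if not isinstance(items, dict):
            raise ValueError("array needs an items subschema to constrain elements")
        return f"list<{to_iceberg(items)}>"
    if json_type == "object":
        props = schema.get("properties") or {}
        extra = schema.get("additionalProperties")
        if extra is True:
            raise ValueError("additionalProperties: true is not supported")
        if isinstance(extra, dict):
            if props:
                raise ValueError("cannot combine properties with a schema-valued additionalProperties")
            return f"map<string, {to_iceberg(extra)}>"
        fields = ", ".join(f"{name}: {to_iceberg(sub)}" for name, sub in props.items())
        return f"struct<{fields}>"
    raise ValueError(f"unsupported or missing type: {json_type!r}")
```

Applied to the valid JSON Schema example in this section, the sketch returns `struct<productId: long, tags: list<string>>`.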

The following keywords have specific behavior:

-   The `$ref` keyword is supported for internal references resolved from schema resources declared in the same document (using `$id`), including relative and absolute URI forms. References to external resources and references to unknown keywords are not supported. A root-level `$ref` schema is not supported.

-   The `oneOf` keyword is supported only for the nullable serializer pattern where exactly one branch is `{"type":"null"}` and the other branch is a non-null schema (`T|null`).

-   In Iceberg output, Redpanda writes all fields as nullable regardless of serializer nullability annotations.


The following are not supported for JSON Schema:

-   The `$dynamicRef` keyword

-   The `default` keyword

-   Conditional typing (`if`, `then`, `else`, `dependencies` keywords)

-   Boolean JSON Schema combinations (`allOf`, `anyOf`, and non-nullable `oneOf` patterns)

-   Dynamic object members with the `patternProperties` keyword

-   The `additionalProperties` keyword when set to `true`