Cloud

Query Iceberg-enabled Topics

With Iceberg-enabled topics, you can retain streaming data in your lakehouse beyond your Redpanda topic retention window and continue to query it with SQL. A single query returns both the live records still in the topic and the Iceberg-committed history that has aged out under topic retention, in one result.

For the streaming-only query path (live records only, without Iceberg history), see Query Streaming Topics.

After reading this page, you will be able to:

Describe the storage and catalog objects that Redpanda Cloud configures for Iceberg-enabled topic queries
Query both the live records and the Iceberg-committed history of a topic in a single result
Handle schema differences between a Redpanda topic and its Iceberg table

Prerequisites

A Redpanda BYOC cluster on AWS with Redpanda SQL enabled. See Enable Redpanda SQL.
The cluster’s iceberg_catalog_type property is set to rest. The object_storage catalog type does not support Iceberg queries from Redpanda SQL.
The cluster’s Iceberg REST catalog is configured. See Integrate with REST Catalogs for the supported REST catalog options (AWS Glue, Snowflake, Databricks Unity, Polaris, and others) and their configuration steps.
An Iceberg-enabled Redpanda topic with a schema (Protobuf, Avro, or JSON) registered in Schema Registry. To enable Iceberg on a topic, see About Iceberg Topics.
Connect to Redpanda SQL with psql or another PostgreSQL client. See Connect to Redpanda SQL.

How the query catalogs are set up

When you enable Redpanda SQL on a BYOC cluster with the Iceberg REST catalog configured, Cloud runs the following statements under the hood so you don’t have to. The result is a default_redpanda_catalog already linked to an Iceberg catalog, ready for CREATE TABLE against an Iceberg-enabled topic. The storage and Iceberg-catalog options match the cluster’s REST catalog configuration (endpoint, credentials, region).

A storage connection holds the object-storage credentials your cluster’s REST catalog uses:

CREATE STORAGE iceberg_storage TYPE = S3 WITH (
    endpoint = '<rest-catalog-storage-endpoint>',
    access_key_id = '<access-key-id>',
    secret_access_key = '<secret-access-key>',
    region = '<region>',
    path_style = 'true'
);

An Iceberg catalog points at the cluster’s REST catalog endpoint:

CREATE ICEBERG CATALOG lakehouse_catalog STORAGE iceberg_storage WITH (
    uri = '<rest-catalog-uri>',
    warehouse = '<warehouse>',
    auth_type = '<auth-type>'
);

The default_redpanda_catalog is linked to the Iceberg catalog. The pandaproxy_url option is required when a Redpanda catalog is linked:
```
ALTER REDPANDA CATALOG default_redpanda_catalog
USING CATALOG lakehouse_catalog WITH (
    pandaproxy_url = '<pandaproxy-url>'
);
```

To inspect the link on your cluster, run DESCRIBE REDPANDA CATALOG default_redpanda_catalog. For full option lists, see CREATE STORAGE, CREATE ICEBERG CATALOG, and CREATE REDPANDA CATALOG.

Map a topic as a SQL table

Define a SQL table against the Iceberg-enabled topic in default_redpanda_catalog. Replace orders and orders-value with your topic name and Schema Registry value subject:

CREATE TABLE default_redpanda_catalog=>orders WITH (
  topic = 'orders',
  schema_subject = 'orders-value'
);

Redpanda SQL reads the registered schema from Schema Registry to map fields to SQL columns.

When you query a table mapped from a Redpanda topic, Redpanda SQL also exposes two reserved metadata columns alongside your schema’s columns:

redpanda (a struct with topic-level metadata such as partition, offset, timestamp, key, and headers)
redpanda_raw (raw key and value bytes)

These column names are reserved. A topic schema with a top-level redpanda or redpanda_raw field conflicts with the metadata columns.

Query live and historical records together

Query the table using standard SELECT syntax. The query returns records from both the live Redpanda topic and the Iceberg-committed history in one result, with no overlap between them.

After any shape change to the Iceberg table or its corresponding Redpanda topic schema, run REFRESH <catalog>⇒<table> to update the schema view Redpanda SQL uses for query planning. Without a refresh, the next query against the table fails at planning time with Schema not found for Iceberg table '<table>'. REFRESH updates only the schema view, not data; you don’t run it between writes.

SELECT * FROM default_redpanda_catalog=>orders LIMIT 10;

Redpanda SQL plans the union internally, so you don’t write a UNION ALL.

Iceberg-committed data persists independently of Redpanda topic retention. Queries continue to return records past the Redpanda topic’s retention window, provided they were committed to Iceberg first.

Handle schema differences

A topic schema can evolve over time. You might add or remove fields in your Schema Registry value subject as your application changes. Redpanda writes new records to the Iceberg table forward-only: it adds columns to the Iceberg table when the topic schema widens, but it does not drop columns from the Iceberg table when the topic schema narrows. As a result, the Iceberg table can carry columns the current topic schema doesn’t have.

Redpanda SQL handles this divergence by treating the topic’s Schema Registry subject as canonical and filling in missing fields on the Iceberg side. The rules that apply to a query against a linked Redpanda catalog:

The result’s column order, count, and types come from the topic schema.
If the Iceberg table is missing a column that the topic schema has, Redpanda SQL returns NULL for that column on Iceberg-side rows. This is common after adding a new column to the topic schema, since earlier records already in Iceberg don’t have it.
If the Iceberg table has columns the topic schema does not, the query fails at planning time with the error Kafka schema must be a name-superset of Iceberg schema. This happens when a column is removed from the topic’s Schema Registry subject. To resolve, re-add the column to the topic’s Schema Registry subject so the topic schema covers every column present in the Iceberg table.

Iceberg type limitations

When Redpanda SQL maps an Iceberg-enabled topic as a SQL table, it reads the Iceberg schema and converts each column type to a SQL type. The following column types currently have no valid Redpanda SQL type mapping and cause CREATE TABLE or REFRESH to fail with an error.

Columns of type map are not supported, whether at the top level or nested inside a struct.
decimal columns with precision greater than 18 cannot be mapped. Iceberg allows decimal precision up to 38, but Redpanda SQL stores decimals as 64-bit integers with a maximum precision of 18. Decimals with precision 18 or lower work as expected.
A list whose element type is also a list is not supported. Simple lists (list of primitives or list of structs) work as expected.

These schema-mapping constraints apply only to columns in the Iceberg schema. Columns sourced from live topic records are not subject to these type restrictions.

Next steps

Query streaming topics: the streaming-only query path, without the Iceberg history layer.
Integrate with REST Catalogs: managed REST catalog options and their configuration.
About Iceberg Topics: how Iceberg topics work in Redpanda and how to enable them.
CREATE STORAGE: full reference for the storage statement.
CREATE ICEBERG CATALOG: full reference for the Iceberg catalog statement.
CREATE REDPANDA CATALOG: full reference for the Redpanda catalog statement, including the USING CATALOG clause.
Redpanda SQL Reference: supported SQL statements, clauses, data types, and functions.

Was this helpful?

group Ask in the community

mail Share your feedback

group_add Make a contribution

What do you think of this page?

Let us know more:

Let us contact you about your feedback: