# Use Iceberg Catalogs (Beta)

> **Note:** This feature requires an enterprise license. To get a trial license key or extend your trial period, generate a new trial license key. To purchase a license, contact Redpanda Sales. If Redpanda has enterprise features enabled and cannot find a valid license, restrictions apply.

To read from a Redpanda-generated Iceberg table, your Iceberg-compatible client or tool needs access to the catalog to retrieve the table metadata and determine the current state of the table. The catalog provides the current table metadata, which includes the locations of all the table's data files. You can configure Redpanda either to connect to a REST-based catalog or to use a file-system based catalog.

For production deployments, Redpanda recommends using an external REST catalog to manage Iceberg metadata. A REST catalog enables built-in table maintenance, safely handles multiple engines and tools accessing tables at the same time, facilitates data governance, and maximizes data discovery. However, if it is not possible to use a REST catalog, you can use the file-system based catalog, which does not require you to maintain a separate service to access the Iceberg data.

In either case, you use the catalog to load, query, or refresh the Iceberg table as you produce to the Redpanda topic. See the documentation for your query engine or Iceberg-compatible tool for specific guidance on adding Iceberg tables to your data warehouse or lakehouse using the catalog.

> **Important:** After you have selected a catalog type at the cluster level and enabled the Iceberg integration for a topic, you cannot switch to another catalog type.

## Connect to a REST catalog

Redpanda supports connecting to an Iceberg REST catalog using the standard REST API supported by many catalog providers. Use this catalog integration type with REST-enabled Iceberg catalog services, such as Databricks Unity and Snowflake Open Catalog.

To connect to a REST catalog, set the following cluster configuration properties (a sample sketch follows this list):

- `iceberg_catalog_type`: Set to `rest`.
- `iceberg_rest_catalog_endpoint`: The endpoint URL for your Iceberg catalog, which you either manage directly or have managed for you by an external catalog service.
- `iceberg_rest_catalog_client_id`: The ID used to connect to the REST catalog.
- `iceberg_rest_catalog_client_secret`: The secret used to connect to the REST catalog.

For REST catalogs that use self-signed certificates, also configure these properties:

- `iceberg_rest_catalog_trust_file`: The path to a file containing a certificate chain to trust for the REST catalog.
- `iceberg_rest_catalog_crl_file`: The path to the certificate revocation list for the specified trust file.

See Cluster Configuration Properties for the full list of cluster properties to configure for a catalog integration.
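As a minimal sketch, assuming you manage cluster configuration with `rpk`, you could set these properties as follows. The endpoint URL and credential values are placeholders for your own catalog service:

```bash
# Select the REST catalog integration (cluster-wide setting).
rpk cluster config set iceberg_catalog_type rest

# Endpoint URL of your Iceberg REST catalog service (placeholder value).
rpk cluster config set iceberg_rest_catalog_endpoint http://catalog-service:8181

# Credentials used to authenticate to the REST catalog (placeholder values).
rpk cluster config set iceberg_rest_catalog_client_id <rest-connection-user>
rpk cluster config set iceberg_rest_catalog_client_secret <rest-connection-password>
```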
### Example REST catalog configuration

For example, suppose you have Redpanda cluster configuration properties set to connect to a REST catalog:

```
iceberg_catalog_type: rest
iceberg_rest_catalog_endpoint: http://catalog-service:8181
iceberg_rest_catalog_client_id: <rest-connection-user>
iceberg_rest_catalog_client_secret: <rest-connection-password>
```

And you use Apache Spark as a processing engine, configured to use a catalog named `streaming`:

```
spark.sql.catalog.streaming = org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.streaming.type = rest
spark.sql.catalog.streaming.uri = http://catalog-service:8181

# You may need to configure additional properties based on your object storage provider.
# See https://iceberg.apache.org/docs/latest/spark-configuration/#catalog-configuration and https://spark.apache.org/docs/latest/configuration.html
# For example, for AWS S3:
# spark.sql.catalog.streaming.io-impl = org.apache.iceberg.aws.s3.S3FileIO
# spark.sql.catalog.streaming.warehouse = s3://<bucket-name>/
# spark.sql.catalog.streaming.s3.endpoint = http://<s3-uri>
```

Redpanda recommends setting credentials in environment variables so Spark can securely access your Iceberg data in object storage. For example, for AWS, use `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`.

Using Spark SQL, you can query the Iceberg table directly by specifying the catalog name, the namespace, and the table name:

```sql
SELECT * FROM streaming.redpanda.<table-name>;
```

The Iceberg table name is the name of your Redpanda topic. Redpanda puts the Iceberg table into a namespace called `redpanda`, creating the namespace if necessary. The Spark engine can use the REST catalog to automatically discover the topic's Iceberg table.

Depending on your processing engine, you may also need to create a table in the engine to point the data lakehouse to the table location in the catalog. For an example, see Query Iceberg Topics using Snowflake and Open Catalog.

## Integrate file-system based catalog (object_storage)

By default, Iceberg topics use the file-system based catalog (the `iceberg_catalog_type` cluster configuration property set to `object_storage`). Redpanda stores the table metadata in HadoopCatalog format in the same object storage bucket or container as the data files.

If you use the `object_storage` catalog type, you provide the object storage URI of the table's `metadata.json` file to an Iceberg client so it can access the catalog and data files for your Redpanda Iceberg tables.

### Example file-system based catalog configuration

To configure Apache Spark to use a file-system based catalog, specify at least the following properties (a sample launch sketch follows below):

```
spark.sql.catalog.streaming = org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.streaming.type = hadoop
# URI for table metadata: AWS S3 example
spark.sql.catalog.streaming.warehouse = s3a://<bucket-name>/redpanda-iceberg-catalog

# You may need to configure additional properties based on your object storage provider.
# See https://iceberg.apache.org/docs/latest/spark-configuration/#spark-configuration and https://spark.apache.org/docs/latest/configuration.html
# For example, for AWS S3:
# spark.hadoop.fs.s3.impl = org.apache.hadoop.fs.s3a.S3AFileSystem
# spark.hadoop.fs.s3a.endpoint = http://<s3-uri>
# spark.sql.catalog.streaming.s3.endpoint = http://<s3-uri>
```

Redpanda recommends setting credentials in environment variables so Spark can securely access your Iceberg data in object storage. For example, for AWS, use `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`.
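As a minimal sketch, assuming a Spark 3.5 deployment that can reach your bucket, you could pass these catalog properties directly on the `spark-sql` command line. The `iceberg-spark-runtime` version and bucket name are assumptions; adapt them to your Spark, Scala, and Iceberg versions:

```bash
# Launch spark-sql with a file-system (Hadoop) catalog named "streaming".
# The runtime coordinate below is an assumption; match it to your environment.
spark-sql \
  --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.6.1 \
  --conf spark.sql.catalog.streaming=org.apache.iceberg.spark.SparkCatalog \
  --conf spark.sql.catalog.streaming.type=hadoop \
  --conf spark.sql.catalog.streaming.warehouse=s3a://<bucket-name>/redpanda-iceberg-catalog
```

Once the shell starts, the same `SELECT * FROM streaming.redpanda.<table-name>;` query shown for the REST catalog applies here, because the catalog name, namespace, and table name are unchanged.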
Depending on your processing engine, you may also need to create a new table in the engine to point the data lakehouse to the table location.

## Specify metadata location

The `iceberg_catalog_base_location` cluster configuration property stores the base path for the file-system based catalog when you use the `object_storage` catalog type. The default value is `redpanda-iceberg-catalog`. Do not change the `iceberg_catalog_base_location` value after you have enabled Iceberg integration for a topic.

## Next steps

- Query Iceberg Topics
- Query Iceberg Topics using Snowflake and Open Catalog

## Suggested labs

- Redpanda Iceberg Docker Compose Example