Use Iceberg Topics with GCP BigLake

This feature requires an enterprise license. To get a trial license key or extend your trial period, generate a new trial license key. To purchase a license, contact Redpanda Sales.

If Redpanda has enterprise features enabled and it cannot find a valid license, restrictions apply.

This guide walks you through querying Redpanda topics as Iceberg tables stored in Google Cloud Storage, using a REST catalog integration with Google BigLake. In this guide, you do the following:

  • Create Google Cloud resources such as a storage bucket and service account

  • Grant permissions to the service account to access Iceberg data in the bucket

  • Create a catalog in BigLake

  • Configure the BigLake integration for your Redpanda cluster

  • Query the Iceberg data in Google BigQuery

For general information about Iceberg catalog integrations in Redpanda, see Use Iceberg Catalogs.

Check the BigLake product page for the latest status and availability of the REST Catalog API.

Prerequisites

  • A Google Cloud Platform (GCP) project.

    If you do not have permissions to manage GCP resources such as VMs, storage buckets, and service accounts in your project, ask your project owner to create or update them for you.

  • The gcloud CLI installed and configured for your GCP project.

  • BigLake API enabled for your GCP project.

  • Redpanda v25.3.1-rc2 or later. Your Redpanda cluster must be deployed on GCP VMs.

  • rpk installed or updated to the latest version.

  • Object storage configured for your cluster and Tiered Storage enabled for the topics for which you want to generate Iceberg tables.

    You also use the GCS bucket URI to set the warehouse location for the BigLake catalog.

Limitations

Multi-region bucket support

BigLake metastore does not support multi-region buckets. Use single-region buckets to store your Iceberg topics.
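
As a quick sanity check before you configure the catalog, the following sketch classifies a bucket location string. It is a heuristic, not an official check: the function name `is_single_region` is hypothetical, and the pattern assumes that single-region names look like `europe-west2` (a dash followed by a name ending in a digit), while multi-region and predefined dual-region names such as `US`, `EU`, or `NAM4` do not.

```shell
# Hypothetical helper: returns 0 when a GCS location string looks like a
# single region (for example europe-west2), and 1 otherwise.
is_single_region() {
  # Lowercase the input so that "US" and "us" are treated the same.
  _loc=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$_loc" in
    *-*[0-9]) return 0 ;;  # contains a dash and ends in a digit
    *) return 1 ;;         # for example us, eu, asia, nam4
  esac
}

for candidate in europe-west2 US NAM4; do
  if is_single_region "$candidate"; then
    echo "$candidate: looks like a single region"
  else
    echo "$candidate: not a single region"
  fi
done
```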

Catalog deletion

Currently, it is not possible to delete non-empty BigLake Iceberg catalogs through the BigLake interface. If you need to reconfigure your setup, create a new bucket or use the REST API to remove the existing catalog.

Topic names

BigLake does not support Iceberg table names that contain dots (.). When creating Iceberg topics in Redpanda that you plan to access through BigLake, either:

  • Use the iceberg_topic_name_dot_replacement cluster property to set a replacement string for dots in topic names. Ensure that the replacement value does not cause table name collisions. For example, current.orders and current_orders would both map to the same table name if you set the replacement to an underscore (_).

  • Ensure that the new topic names do not include dots.

You must also set the iceberg_dlq_table_suffix property to a value that does not include dots or tildes (~). See Configure Redpanda for Iceberg for the list of cluster properties to set when enabling the BigLake REST catalog integration.
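
The naming rules above can be checked before you create topics. The following is a minimal sketch that only imitates the dot replacement for illustration; it is not Redpanda's implementation. The function names `table_name_for_topic`, `find_collisions`, and `valid_dlq_suffix` are hypothetical, and the sketch assumes the replacement string contains no characters that are special to `sed` (`/`, `&`, or `\`).

```shell
# Illustration only: replace every dot in a topic name with a replacement string.
# $1 = topic name, $2 = replacement string
table_name_for_topic() {
  printf '%s\n' "$1" | sed "s/\./$2/g"
}

# Print every table name that more than one topic would map to.
# $1 = replacement string, remaining arguments = topic names
find_collisions() {
  _repl=$1
  shift
  for _topic in "$@"; do
    table_name_for_topic "$_topic" "$_repl"
  done | sort | uniq -d
}

# Return 0 when a DLQ table suffix contains neither dots nor tildes.
valid_dlq_suffix() {
  case "$1" in
    *.*|*"~"*) return 1 ;;
    *) return 0 ;;
  esac
}

# Example: an underscore replacement makes these two topics collide.
find_collisions _ current.orders current_orders payments

if valid_dlq_suffix "_dlq"; then
  echo "_dlq is an acceptable suffix"
fi
```

Any name that `find_collisions` prints indicates that you should pick a different replacement string or rename a topic.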

Set up Google Cloud resources

Create a service account for Redpanda

If you don’t already have a Google Cloud service account to use, create one for the VMs that run Redpanda. Redpanda uses this account to write Tiered Storage data and Iceberg data and metadata, and to interact with the BigLake catalog:

gcloud iam service-accounts create <service-account-name> --display-name "<display-name>"

Replace the placeholder values:

  • <service-account-name>: You can use a name that contains lowercase alphanumeric characters and dashes.

  • <display-name>: Enter a display name for the service account.
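
To catch naming mistakes before calling gcloud, you could validate the name locally. This sketch is an assumption-laden convenience, not part of the gcloud CLI: the function name `is_valid_sa_name` is hypothetical, and the rules for length (6 to 30 characters), a leading lowercase letter, and no trailing dash reflect Google's documented service account ID format as best understood here, so verify them against the IAM documentation.

```shell
# Use the C locale so that [a-z] matches ASCII lowercase letters only.
LC_ALL=C
export LC_ALL

# Hypothetical helper: returns 0 when the name looks like a valid
# service account ID, and 1 otherwise.
is_valid_sa_name() {
  _name=$1
  _len=${#_name}
  # Assumed length limits: 6 to 30 characters.
  [ "$_len" -ge 6 ] && [ "$_len" -le 30 ] || return 1
  case "$_name" in
    *[!a-z0-9-]*) return 1 ;;  # only lowercase letters, digits, and dashes
    [!a-z]*) return 1 ;;       # assumed: must start with a lowercase letter
    *-) return 1 ;;            # assumed: must not end with a dash
  esac
  return 0
}

if is_valid_sa_name "redpanda-iceberg"; then
  echo "redpanda-iceberg is a usable service account name"
fi
```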

Grant required permissions

Grant the necessary permissions to your service account. To run the following commands, replace the placeholder values:

  • <service-account-name>: The name of your service account.

  • <bucket-name>: The name of your storage bucket.

    1. Grant the service account the Storage Object Admin role to access the bucket:

      gcloud storage buckets add-iam-policy-binding gs://<bucket-name> \
        --member="serviceAccount:<service-account-name>@$(gcloud config get-value project).iam.gserviceaccount.com" \
        --role="roles/storage.objectAdmin"
    2. Grant Service Usage Consumer and BigLake Editor roles for using the Iceberg REST catalog:

      gcloud projects add-iam-policy-binding $(gcloud config get-value project) \
        --member="serviceAccount:<service-account-name>@$(gcloud config get-value project).iam.gserviceaccount.com" \
        --role="roles/serviceusage.serviceUsageConsumer"
      
      gcloud projects add-iam-policy-binding $(gcloud config get-value project) \
        --member="serviceAccount:<service-account-name>@$(gcloud config get-value project).iam.gserviceaccount.com" \
        --role="roles/biglake.editor"
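
Because the same member string appears in every binding, it can be convenient to build it once and review the commands before running them. The following dry-run sketch only prints the project-level commands shown above; it does not call gcloud. The helper names `sa_member` and `print_project_bindings` and the example values `redpanda-iceberg` and `my-project` are hypothetical.

```shell
# Hypothetical helper: prints the IAM member string for a service account.
# $1 = service account name, $2 = GCP project ID
sa_member() {
  printf 'serviceAccount:%s@%s.iam.gserviceaccount.com\n' "$1" "$2"
}

# Dry run: print (do not execute) one project-level binding per role.
# $1 = service account name, $2 = GCP project ID
print_project_bindings() {
  _member=$(sa_member "$1" "$2")
  for _role in roles/serviceusage.serviceUsageConsumer roles/biglake.editor; do
    printf 'gcloud projects add-iam-policy-binding %s --member="%s" --role="%s"\n' \
      "$2" "$_member" "$_role"
  done
}

print_project_bindings redpanda-iceberg my-project
```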

Create a BigLake catalog

Create a BigLake Iceberg REST catalog:

gcloud alpha biglake iceberg catalogs create \
    <bucket-name> \
    --project <gcp-project-id> \
    --catalog-type CATALOG_TYPE_GCS_BUCKET \
    --credential-mode CREDENTIAL_MODE_END_USER

Replace the placeholder values:

  • <bucket-name>: Use the name of your storage bucket as the catalog ID.

  • <gcp-project-id>: Your GCP project ID.

Optional: Deploy Redpanda quickstart on GCP

If you want to quickly test Iceberg topics in BigLake, you can deploy a test environment using the Redpanda Self-Managed quickstart. In this section, you create a new storage bucket for Tiered Storage and Iceberg data. You configure a Redpanda cluster for the BigLake catalog integration and deploy the cluster on a GCP Linux VM instance using Docker Compose.

If you already have a Redpanda cluster deployed on GCP, skip to Configure Redpanda for Iceberg.

Create a storage bucket

Create a new Google Cloud Storage bucket to store Iceberg data:

gcloud storage buckets create gs://<bucket-name> --location=<region>

Replace the placeholder values:

  • <bucket-name>: A globally unique name for your bucket.

  • <region>: The region where you want to create the bucket, for example, europe-west2.

Create VM instances

Create a VM instance to run Redpanda:

gcloud compute instances create <instance-name> \
    --zone=<zone> \
    --machine-type=e2-medium \
    --service-account=<service-account-name>@$(gcloud config get-value project).iam.gserviceaccount.com \
    --scopes=https://www.googleapis.com/auth/cloud-platform \
    --create-disk=auto-delete=yes,boot=yes,device-name=<instance-name>,image=projects/debian-cloud/global/images/debian-12-bookworm-v20251014,mode=rw,size=20,type=pd-standard

Replace the placeholder values:

  • <instance-name>: A name for your VM instance.

  • <service-account-name>: The name of the service account you created earlier.

  • <zone>: The fully-qualified zone name, for example, europe-west2-a.
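
This guide does not require the bucket and the VM to share a region. If you want them co-located, one way to derive the bucket region from the zone is to strip the trailing zone letter. The helper name `region_from_zone` is hypothetical, and the sketch assumes zone names follow the `<region>-<letter>` pattern used in the examples above.

```shell
# Hypothetical helper: strips the final "-<letter>" from a zone name.
# $1 = zone, for example europe-west2-a
region_from_zone() {
  printf '%s\n' "${1%-*}"
}

region_from_zone europe-west2-a
```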

Install and configure Redpanda

  1. Connect to your VM instance:

    gcloud compute ssh --zone <zone> <instance-name>
  2. Install Docker and Docker Compose by following the Docker installation guide for Debian:

    # Add Docker's official GPG key:
    sudo apt-get update
    sudo apt-get install ca-certificates curl
    sudo install -m 0755 -d /etc/apt/keyrings
    sudo curl -fsSL https://download.docker.com/linux/debian/gpg -o /etc/apt/keyrings/docker.asc
    sudo chmod a+r /etc/apt/keyrings/docker.asc
    
    # Add the repository to Apt sources:
    echo \
      "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/debian \
      $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
      sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
    
    sudo apt-get update
    
    # Install Docker Engine, CLI, and Compose
    sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
  3. Download the Redpanda Self-Managed quickstart files:

    # Create and navigate to the redpanda-quickstart directory,
    # download and extract the archive, then
    # navigate to the Docker Compose configuration directory.
    mkdir redpanda-quickstart && cd redpanda-quickstart && \
    curl -sSL https://docs.redpanda.com/25.3-redpanda-quickstart.tar.gz | tar xzf - && \
    cd docker-compose
  4. Edit the bootstrap.yml file to enable Tiered Storage and Iceberg features. Add or modify these sections:

    # Enable Tiered Storage
    cloud_storage_enabled: true
    cloud_storage_region: n/a # GCP does not require region to be set
    cloud_storage_api_endpoint: storage.googleapis.com
    cloud_storage_api_endpoint_port: 443
    cloud_storage_disable_tls: false
    cloud_storage_bucket: <bucket-name>
    cloud_storage_credentials_source: gcp_instance_metadata
    
    # Configure Iceberg REST catalog integration with BigLake
    iceberg_enabled: true
    iceberg_catalog_type: rest
    iceberg_rest_catalog_endpoint: https://biglake.googleapis.com/iceberg/v1/restcatalog
    iceberg_rest_catalog_oauth2_server_uri: https://oauth2.googleapis.com/token
    iceberg_rest_catalog_authentication_mode: gcp
    iceberg_rest_catalog_warehouse: gs://<bucket-name>/
    iceberg_rest_catalog_gcp_user_project: <gcp-project-id>
    iceberg_dlq_table_suffix: _dlq
    • Replace <bucket-name> with your bucket name and <gcp-project-id> with your Google Cloud project ID.

    • You must set the iceberg_dlq_table_suffix property to a value that does not include dots or tildes (~). The example above uses _dlq as the suffix for the dead-letter queue (DLQ) table.

    If you edit bootstrap.yml, you can skip the cluster configuration step in Configure Redpanda for Iceberg and proceed to the next step in that section to enable Iceberg for a topic.
  5. Start Redpanda:

    docker compose up -d
  6. Install and configure rpk:

    sudo apt-get install unzip
    
    curl -LO https://github.com/redpanda-data/redpanda/releases/latest/download/rpk-linux-amd64.zip &&
      mkdir -p ~/.local/bin &&
      export PATH="$HOME/.local/bin:$PATH" &&
      unzip rpk-linux-amd64.zip -d ~/.local/bin/
    
    rpk profile create quickstart --from-profile rpk-profile.yaml
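
If you set up more than one test environment, you may prefer to generate the Iceberg properties from step 4 rather than edit the placeholders by hand. The following sketch prints the same property values used in this guide with the bucket name and project ID filled in. The function name `render_iceberg_config` and the example values `my-bucket` and `my-project` are hypothetical; review the output before pasting it into your bootstrap file.

```shell
# Hypothetical helper: prints the Iceberg catalog properties used in this guide.
# $1 = bucket name, $2 = GCP project ID
render_iceberg_config() {
  cat <<EOF
iceberg_enabled: true
iceberg_catalog_type: rest
iceberg_rest_catalog_endpoint: https://biglake.googleapis.com/iceberg/v1/restcatalog
iceberg_rest_catalog_oauth2_server_uri: https://oauth2.googleapis.com/token
iceberg_rest_catalog_authentication_mode: gcp
iceberg_rest_catalog_warehouse: gs://$1/
iceberg_rest_catalog_gcp_user_project: $2
iceberg_dlq_table_suffix: _dlq
EOF
}

render_iceberg_config my-bucket my-project
```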

Configure Redpanda for Iceberg

  1. Edit your cluster configuration to set the iceberg_enabled property to true, and set the catalog integration properties listed in the example below.

    Run rpk cluster config edit to update these properties:

    iceberg_enabled: true
    iceberg_catalog_type: rest
    iceberg_rest_catalog_endpoint: https://biglake.googleapis.com/iceberg/v1/restcatalog
    iceberg_rest_catalog_oauth2_server_uri: https://oauth2.googleapis.com/token
    iceberg_rest_catalog_authentication_mode: gcp
    iceberg_rest_catalog_warehouse: gs://<bucket-name>/
    iceberg_rest_catalog_gcp_user_project: <gcp-project-id>
    iceberg_dlq_table_suffix: _dlq
    • Replace <bucket-name> with your bucket name and <gcp-project-id> with your Google Cloud project ID.

    • You must set the iceberg_dlq_table_suffix property to a value that does not include dots or tildes (~). The example above uses _dlq as the suffix for the dead-letter queue (DLQ) table.

  2. If you changed the configuration of a running cluster, restart the cluster before you continue.

  3. Enable the REST catalog integration for a topic by configuring the topic property redpanda.iceberg.mode. The following examples show how to use rpk to either create a new topic or alter the configuration for an existing topic and set the Iceberg mode to key_value. The key_value mode creates a two-column Iceberg table for the topic, with one column for the record metadata including the key, and another binary column for the record’s value. See Specify Iceberg Schema for more details on Iceberg modes.

    Create a new topic and set redpanda.iceberg.mode:
    rpk topic create <topic-name> --topic-config=redpanda.iceberg.mode=key_value
    Set redpanda.iceberg.mode for an existing topic:
    rpk topic alter-config <topic-name> --set redpanda.iceberg.mode=key_value

    If you’re using the Self-Managed quickstart for testing, your Redpanda cluster includes a transactions topic with data in it, and a sample schema in the Schema Registry. To enable Iceberg for the transactions topic, run:

    rpk topic alter-config transactions --set redpanda.iceberg.mode=value_schema_latest:subject=transactions
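
When you enable Iceberg for several existing topics, a small dry-run helper can print the `rpk topic alter-config` commands for review before you run them. This is a sketch only: the function name `iceberg_enable_cmd` and the topic name `orders` are hypothetical, and the helper defaults to the `key_value` mode described above.

```shell
# Hypothetical helper: prints (does not run) the rpk command that sets the
# Iceberg mode for an existing topic.
# $1 = topic name, $2 = Iceberg mode (optional, defaults to key_value)
iceberg_enable_cmd() {
  printf 'rpk topic alter-config %s --set redpanda.iceberg.mode=%s\n' \
    "$1" "${2:-key_value}"
}

iceberg_enable_cmd orders
iceberg_enable_cmd transactions value_schema_latest:subject=transactions
```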

Query Iceberg topics in BigQuery

  1. Navigate to the BigQuery console.

  2. Query your Iceberg topic using SQL. For example, to query the transactions topic:

    SELECT
        *
    FROM `<bucket-name>>redpanda`.transactions
    ORDER BY
        redpanda.timestamp DESC
    LIMIT 10

    Replace <bucket-name> with your bucket name.

Your Redpanda topics are now available as Iceberg tables in BigLake, allowing you to run analytics queries directly on your streaming data.

Optional: Clean up resources

When you’re finished with the quickstart example, you can clean up the resources you created:

# Delete VM instances
gcloud compute instances delete <instance-name> --zone=<zone>

# Delete the storage bucket
gcloud storage buckets delete gs://<bucket-name>

# Delete the service account
gcloud iam service-accounts delete <service-account-name>@$(gcloud config get-value project).iam.gserviceaccount.com
These commands do not remove the BigLake catalog. Delete it manually using the REST API.