# Ingest data into Snowflake

> For the complete documentation index, see [llms.txt](https://docs.redpanda.com/llms.txt). Component-specific: [connect-full.txt](https://docs.redpanda.com/connect-full.txt)

---
title: Ingest data into Snowflake
latest-connect-version: 4.93.0
latest-operator-version: v26.1.4
latest-console-tag: v3.7.3
latest-redpanda-tag: v26.1.9
docname: snowflake_ingestion
page-component-name: connect
page-version: master
page-component-version: master
page-component-title: Connect
page-relative-src-path: snowflake_ingestion.adoc
page-edit-url: https://github.com/redpanda-data/rp-connect-docs/edit/main/modules/cookbooks/pages/snowflake_ingestion.adoc
description: Configure Redpanda Connect to ingest data from a Redpanda topic into Snowflake using Snowpipe Streaming.
page-git-created-date: "2025-01-28"
page-git-modified-date: "2026-05-26"
---

<!-- Source: https://docs.redpanda.com/connect/cookbooks/snowflake_ingestion.md -->

Configure a Redpanda Connect pipeline to generate and write data into a local Redpanda topic, and then ingest that data into [Snowflake](https://www.snowflake.com/en/) using [Snowpipe Streaming](https://docs.snowflake.com/en/user-guide/data-load-snowpipe-streaming-overview).

See also: [Ingest data into Snowflake](https://docs.redpanda.com/cloud-data-platform/develop/connect/cookbooks/snowflake_ingestion/) using Redpanda Cloud

## [](#prerequisites)Prerequisites

-   [`rpk` installed](https://docs.redpanda.com/current/get-started/rpk-install/)

-   A [Snowflake account](https://trial.snowflake.com/)

-   `openssl` command-line tool


## [](#set-up-your-redpanda-cluster)Set up your Redpanda cluster

Run [rpk container start](https://docs.redpanda.com/current/reference/rpk/rpk-container/rpk-container-start/) to create a local Redpanda cluster for development:

```none
rpk container start
```

Next, create a `demo_topic` to use as the data source for ingesting data into Snowflake:

```bash
rpk topic create demo_topic
```

Create a user with minimal [ACLs](https://docs.redpanda.com/current/manage/security/authorization/acl/) to run the ingestion pipeline into Snowflake:

```bash
rpk security user create ingestion_user --password Testing1234
```

Now that the user exists, give them read permissions to `demo_topic`, as well as full control over any consumer group with the prefix `redpanda_connect`:

```bash
rpk security acl create --allow-principal ingestion_user --operation read --topic demo_topic
rpk security acl create --allow-principal ingestion_user --resource-pattern-type prefixed --operation all --group redpanda_connect
```

## [](#set-up-your-snowflake-account)Set up your Snowflake account

Log in to your Snowflake account with a user who has the ACCOUNTADMIN role. Then, run the following SQL commands in a worksheet. They set up another user with minimal permissions to write data into a specified database and schema, ready for streaming data to Snowflake.

```sql
-- Set default values for multiple variables
SET PWD = 'Test1234567';
SET USER = 'STREAMING_USER';
SET DB = 'STREAMING_DB';
SET ROLE = 'REDPANDA_CONNECT';
SET WH = 'STREAMING_WH';
USE ROLE ACCOUNTADMIN;
-- Create users
CREATE USER IF NOT EXISTS IDENTIFIER($USER) PASSWORD=$PWD  COMMENT='STREAMING USER FOR REDPANDA CONNECT';
-- Create roles
CREATE OR REPLACE ROLE IDENTIFIER($ROLE);
-- Create the destination database and virtual warehouse
CREATE DATABASE IF NOT EXISTS IDENTIFIER($DB);
USE IDENTIFIER($DB);
CREATE OR REPLACE WAREHOUSE IDENTIFIER($WH) WITH WAREHOUSE_SIZE = 'SMALL';
-- Grant privileges
GRANT CREATE WAREHOUSE ON ACCOUNT TO ROLE IDENTIFIER($ROLE);
GRANT ROLE IDENTIFIER($ROLE) TO USER IDENTIFIER($USER);
GRANT OWNERSHIP ON DATABASE IDENTIFIER($DB) TO ROLE IDENTIFIER($ROLE);
GRANT USAGE ON WAREHOUSE IDENTIFIER($WH) TO ROLE IDENTIFIER($ROLE);
-- Set defaults
ALTER USER IDENTIFIER($USER) SET DEFAULT_ROLE=$ROLE;
ALTER USER IDENTIFIER($USER) SET DEFAULT_WAREHOUSE=$WH;
-- Run the following commands to find your account identifier. Copy it down for later use.
-- It will be something like `organization_name-account_name`
-- e.g. ykmxgak-wyb52636
WITH HOSTLIST AS
(SELECT * FROM TABLE(FLATTEN(INPUT => PARSE_JSON(SYSTEM$allowlist()))))
SELECT REPLACE(VALUE:host,'.snowflakecomputing.com','') AS ACCOUNT_IDENTIFIER
FROM HOSTLIST
WHERE VALUE:type = 'SNOWFLAKE_DEPLOYMENT_REGIONLESS';
```

### [](#create-an-rsa-key-pair)Create an RSA key pair

Create an [RSA key pair](https://docs.snowflake.com/en/user-guide/key-pair-auth) using `openssl` to authenticate Redpanda Connect to Snowflake. When you’re prompted to give an encryption password, record it for later.

```bash
openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -passout pass:Testing123 -out rsa_key.p8
```

Create a public key. You’re prompted to enter your encryption password.

```bash
openssl rsa -in rsa_key.p8 -pubout -passout pass:Testing123 -out rsa_key.pub
```

To register the public key in Snowflake, remove the public key delimiters and output only the base64-encoded portion of the PEM file. Run the following bash command to print it:

```bash
cat rsa_key.pub | sed -e '1d' -e '$d' | tr -d '\n'
```

In the Snowflake worksheet, add the output of the bash command you just ran to the following SQL command and execute it:

```sql
use role accountadmin;
alter user streaming_user set rsa_public_key='< PubKeyWithoutDelimiters >';
```

### [](#create-a-schema-using-streaming_user)Create a schema using `streaming_user`

Log out of Snowflake and sign back in as the default user (`streaming_user`) with the associated password (default: `Test1234567`). You created these credentials in [Set up your Snowflake account](#set-up-your-snowflake-account).

Run the following SQL commands in a worksheet to create a schema (e.g. `STREAMING_SCHEMA`) in the default database (e.g. `STREAMING_DB`):

```sql
SET DB = 'STREAMING_DB';
SET SCHEMA = 'STREAMING_SCHEMA';
USE IDENTIFIER($DB);
CREATE OR REPLACE SCHEMA IDENTIFIER($SCHEMA);
```

## [](#create-a-pipeline-from-your-redpanda-cluster-to-snowflake)Create a pipeline from your Redpanda cluster to Snowflake

You can now create the pipeline locally and inject [secrets](https://docs.redpanda.com/connect/configuration/secrets/) using environment variables. First, create a `connect.yaml` file:

```yaml
input:
  # Read data from the local `demo_topic`
  kafka_franz:
    seed_brokers: ["localhost:9092"]
    topics: ["demo_topic"]
    consumer_group: "redpanda_connect_to_snowflake"
    tls: {enabled: true}
    checkpoint_limit: 4096
    sasl:
      - mechanism: SCRAM-SHA-256
        username: ingestion_user
        password: ${REDPANDA_PASS}
    # Define a batching policy. For this cookbook, create small batches,
    # but in a production environment use the largest file size you can.
    batching:
      count: 100 # Collect 10 messages before flushing
      period: 10s # or after 10 seconds, which ever comes first
output:
  snowflake_streaming:
    # Replace this placeholder with your account identifier
    account: "< OrgName-AccountName >"
    user: STREAMING_USER
    role: REDPANDA_CONNECT
    database: STREAMING_DB
    schema: STREAMING_SCHEMA
    table: STREAMING_DATA
    # Inject your private key and password
    private_key_file: rsa_key.p8
    private_key_pass: "${SNOWFLAKE_KEY_PASS}"
    schema_evolution:
      enabled: true
    max_in_flight: 1
```

Now run the pipeline, and any JSON data produced into the topic is streamed into Snowflake with minimal latency.

```bash
REDPANDA_PASS=Testing1234 SNOWFLAKE_KEY_PASS=Testing123 rpk connect run ./connect.yaml
```

You now can produce some data using `rpk` to test that everything works:

```bash
echo '{"animal":"redpanda","attributes":"cute","age":6}' | rpk topic produce demo_topic -f '%v\n'
echo '{"animal":"polar bear","attributes":"cool","age":13}' | rpk topic produce demo_topic -f '%v\n'
echo '{"animal":"unicorn","attributes":"rare","age":999}' | rpk topic produce demo_topic -f '%v\n'
```

The data produced into the `demo_topic` is consumed and streamed into Snowflake in seconds. Go back to the Snowflake worksheet and run the following query to see data arrive in Snowflake with the schema from the JSON data you produced.

```sql
SELECT * FROM STREAMING_DB.STREAMING_SCHEMA.STREAMING_DATA LIMIT 50;
```

See also:

-   The [`kafka_franz` input](https://docs.redpanda.com/connect/components/inputs/kafka_franz/)

-   The [`snowflake_streaming`](https://docs.redpanda.com/connect/components/outputs/snowflake_streaming/) output