Create a Snowflake Sink Connector

You can use the Snowflake Sink connector to ingest and store Redpanda structured data into a Snowflake database for analytics and decision-making.

Prerequisites

Before you can create a Snowflake Sink connector in the Redpanda Cloud, you must:

  1. Create a role for use by Kafka Connect.

  2. Create a key pair for authentication.

  3. Create a database to hold the data you intend to stream from Redpanda Cloud messages.

Limitations

Refer to the Snowflake Kafka Connector Limitations documentation for details.

Create a Snowflake Sink connector

To create a Snowflake Sink connector:

  1. In Redpanda Cloud, click Connectors in the navigation menu, and then click Create Connector.

  2. Select Export to Snowflake.

  3. On the Create Connector page, specify the following required connector configuration options:

    Property name Property key Description

    Topics to export

    topics

    A comma-separated list of the cluster topics you want to export to Snowflake.

    Topics regex

    topics.regex

    Java regular expression of topics to replicate. For example: specify .* to replicate all available topics in the cluster. Applicable only when Use regular expressions is selected.

    Snowflake URL name

    snowflake.url.name

    The Snowflake URL to be used for the connection.

    Snowflake database name

    snowflake.database.name

    The Snowflake database name to be used for the exported data.

    Snowflake user name

    snowflake.user.name

    The name of the user who created the key pair.

    Snowflake private key

    snowflake.private.key

    The private key name for the Snowflake user.

    Snowflake private key passphrase

    snowflake.private.key.passphrase

    (Optional) If created and encrypted, the passphrase of the private key.

    Snowflake role name

    snowflake.role.name

    The name of the role created in Prerequisites.

    Kafka message value format

    value.converter

    The format of the value in the Redpanda topic. The default is SNOWFLAKE_JSON.

    Max Tasks

    tasks.max

    Maximum number of tasks to use for this connector. The default is 1. Each task replicates exclusive set of partitions assigned to it.

    Connector name

    name

    Globally-unique name to use for this connector.

  4. Click Next. Review the connector properties specified, then click Create.

Advanced Snowflake Sink connector configuration

In most instances, the preceding basic configuration properties are sufficient. If you require additional property settings, then specify any of the following optional advanced connector configuration properties by selecting Show advanced options on the Create Connector page:

Property name Property key Description

Snowflake schema name

snowflake.schema.name

The Snowflake database schema name. The default is PUBLIC.

Snowflake ingestion method

snowflake.ingestion.method

The default, SNOWPIPE, allows for structured data, while SNOWPIPE_STREAMING is lower latency option.

Snowflake topic2table map

snowflake.topic2table.map

(Optional) Map of topics to tables. Format is comma-separated tuples. For example, <topic-1>:<table-1>,<topic-2>:<table-2>.

Buffer count records

buffer.count.records

Number of records buffered in memory per partition before triggering Snowflake ingestion. Default is 10000.

Buffer flush time

buffer.flush.time

The time in seconds to flush cached data. Default is 120.

Buffer size bytes

buffer.size.bytes

Cumulative size of records buffered in memory per partition before triggering Snowflake ingestion. Default is 5000000.

Error tolerance

errors.tolerance

Error tolerance response during connector operation. Default value is none and signals that any error will result in an immediate connector task failure. Value of all changes the behavior to skip over problematic records.

Dead letter queue topic name

errors.deadletterqueue.topic.name

The name of the topic to be used as the dead letter queue (DLQ) for messages that result in an error when processed by this sink connector, its transformations, or converters. The topic name is blank by default, which means that no messages are recorded in the DLQ.

Dead letter queue topic replication factor

errors.deadletterqueue.topic .replication.factor

Replication factor used to create the dead letter queue topic when it doesn’t already exist.

Enable error context headers

errors.deadletterqueue.context .headers.enable

When true, adds a header containing error context to the messages written to the dead letter queue. To avoid clashing with headers from the original record, all error context header keys, start with __connect.errors.

Map data

Use the appropriate key or value converter (input data format) for your data as follows:

  • JSON formatted records should use SNOWFLAKE_JSON (com.snowflake.kafka.connector.records.SnowflakeJsonConverter).

  • AVRO formatted records that use Kafka’s Schema Registry Service should use SNOWFLAKE_AVRO (com.snowflake.kafka.connector.records.SnowflakeAvroConverter).

  • AVRO formatted records that contain the schema (and therefore do not need Kafka’s Schema Registry Service) should use SNOWFLAKE_AVRO_WITHOUT_SCHEMA_REGISTRY (com.snowflake.kafka.connector.records.SnowflakeAvroConverterWithoutSchemaRegistry).

  • Plain text formatted records should use STRING (org.apache.kafka.connect.storage.StringConverter).

Test the connection

After the connector is created, verify in your Snowflake worksheet that your table is populated:

SELECT * FROM TEST.PUBLIC.TABLE_NAME;

It may take a couple of minutes for the records to be visible in Snowflake.

Troubleshoot

After submitting the connector for creation in Redpanda Console, the Snowflake Sink connector attempts to authenticate to the Snowflake database to validate the configuration. This validation must be successful before the connector is created. It can take up 10 seconds or more to respond. If the connector fails, check the error message or select Show Logs to view error details.

Additional errors and corrective actions follow.

Message Action

snowflake.url.name is not a valid snowflake url

Check to make sure Snowflake URL name contains a valid Snowflake URL.

snowflake.user.name: Cannot connect to Snowflake

Check to make sure Snowflake user name contains a valid Snowflake user.

snowflake.private.key must be a valid PEM RSA private key / java.lang.IllegalArgumentException: Last encoded character (before the padding, if any) is a valid base 64 alphabet but not a possible value. Expect the discarded bits to be zero.

Snowflake private key is invalid. Provide a valid key.

snowflake.database.name+ database does not exist

Specify a valid database name in snowflake.database.name.

Object does not exist, or operation cannot be performed

Snowflake error that can have several causes: an invalid role is being used, there is no existing Snowflake table, or an incorrect schema name is specified. Verify that the connector configuration and Snowflake settings are valid.

Config:value.converter has provided value:com.snowflake.kafka.connector.records.SnowflakeJsonConverter. If ingestionMethod is:snowpipe_streaming, Snowflake Custom Converters are not allowed.

Use STRING for the Kafka message value format.

Suggested reading