Version: 23.1

Create a GCS Sink Connector

The Google Cloud Storage (GCS) Sink connector stores Redpanda messages in a Google Cloud Storage bucket.


Before you can create a GCS Sink connector in Redpanda Cloud, you must:

  1. Create a Google Cloud account.

  2. Create a service account that will be used to connect to the GCS service.

  3. Create a service account key and download it.

  4. Create a custom role, which must have the following permissions:

    • storage.objects.create to create items in the GCS bucket
    • storage.objects.delete to overwrite items in the GCS bucket
  5. Create a GCS bucket to which to send data.

  6. Grant permissions to the bucket you created for your service account, using the role created in step 4.
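
The service account key you download in step 3 is a JSON file. Before pasting it into the connector configuration, a quick sanity check that it parses and contains the fields a standard Google Cloud key includes can save a failed first run. The field names below follow the usual Google Cloud key format; this is a minimal sketch, not an exhaustive validation:

```python
import json

# Fields a standard Google Cloud service account key contains.
REQUIRED_FIELDS = {"type", "project_id", "private_key", "client_email"}

def check_key(key_json: str) -> list:
    """Return a list of problems; an empty list means the key looks usable."""
    key = json.loads(key_json)
    problems = sorted(REQUIRED_FIELDS - key.keys())
    if key.get("type") != "service_account":
        problems.append("type is not 'service_account'")
    return problems

# Example with a fake, illustrative key body:
sample = json.dumps({
    "type": "service_account",
    "project_id": "my-project",
    "private_key": "-----BEGIN PRIVATE KEY-----\n...",
    "client_email": "connector@my-project.iam.gserviceaccount.com",
})
print(check_key(sample))  # [] means all required fields are present
```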


The GCS Sink connector has the following limitations:

  • You can use only the STRING and BYTES input formats with the CSV output format.
  • You can use the PARQUET format only when your messages contain a schema.

Create a GCS Sink connector

To create the GCS Sink connector:

  1. In Redpanda Cloud, click Connectors in the navigation menu, and then click Create Connector.

  2. Select Export to Google Cloud Storage.

  3. On the Create Connector page, specify the following required connector configuration options:

    Topics to export: Comma-separated list of the cluster topics you want to replicate to GCS.
    Topics regex: Java regular expression of topics to replicate. For example, specify .* to replicate all available topics in the cluster. Applicable only when Use regular expressions is selected.
    GCS Credentials JSON: JSON object with the GCS credentials.
    GCS bucket name: Name of an existing GCS bucket to store output files in.
    Kafka message key format: Format of the key in the Kafka topic. Use BYTES for no conversion.
    Kafka message value format: Format of the value in the Kafka topic. Use BYTES for no conversion.
    GCS file format: Format of the files created in GCS: CSV (the default), JSON, JSONL, AVRO, or PARQUET. You can use the CSV output format only with the BYTES and STRING input formats.
    Avro codec: The Avro compression codec to be used for Avro output files. Available values: null (the default), deflate, snappy, and bzip2.
    Max Tasks: Maximum number of tasks to use for this connector. The default is 1. Each task replicates an exclusive set of partitions assigned to it.
    Connector name: Globally-unique name to use for this connector.
  4. Click Next. Review the connector properties specified, then click Create.
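
The Topics regex property uses Java regular expression syntax. For common patterns, Python's re module behaves the same way, so you can preview which topics a pattern would select before saving the connector. The topic names below are made up for illustration:

```python
import re

def matching_topics(pattern: str, topics: list) -> list:
    """Preview which topic names a regex selects (matched against the full name)."""
    rx = re.compile(pattern)
    return [t for t in topics if rx.fullmatch(t)]

topics = ["orders", "orders-eu", "payments", "audit-log"]  # hypothetical topics
print(matching_topics(r"orders.*", topics))  # ['orders', 'orders-eu']
print(matching_topics(r".*", topics))        # all four topics
```

Note that Java and Python regex dialects differ in a few advanced constructs (for example, possessive quantifiers), so treat this only as a quick preview.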

Advanced GCS Sink connector configuration

In most instances, the preceding basic configuration properties are sufficient. If you require additional property settings, specify any of the following optional advanced configuration properties by selecting Show advanced options on the Create Connector page:

File name template: The template for file names on GCS. Supports {{ variable }} placeholders for substituting variables. Supported placeholders are:
  • topic
  • partition
  • start_offset (the offset of the first record in the file)
  • timestamp
  • key (when used, other placeholders are not substituted)
File name prefix: The prefix to be added to the name of each file put in GCS.
Output fields: Fields to place into output files. Supported values are: 'key', 'value', 'offset', 'timestamp', and 'headers'.
Value field encoding: The type of encoding to be used for the value field. Supported values are: 'none' and 'base64'.
Output file compression: The compression type to be used for files put into GCS. Supported values are: 'none', 'gzip', 'snappy', and 'zstd'.
Max records per file: The maximum number of records to put in a single file. Must be a non-negative number. The default is 0, which is interpreted as "unlimited"; in this case, files are only flushed after the file flush interval elapses.
File flush interval milliseconds: The time interval at which to periodically flush files and commit offsets. Must be a non-negative number. The default is 60 seconds. 0 disables interval-based flushing; in this case, files are only flushed after reaching the maximum number of records per file.
GCS bucket check: If set to true, the connector attempts to put a test file into the GCS bucket to validate access. The default is true.
GCS retry backoff initial delay milliseconds: Initial retry delay in milliseconds. The default value is 1000.
GCS retry backoff max delay milliseconds: Maximum retry delay in milliseconds. The default value is 32000.
GCS retry backoff delay multiplier: Retry delay multiplier. The default value is 2.0.
GCS retry backoff max attempts: Maximum number of retry attempts. The default value is 6.
GCS retry backoff total timeout milliseconds: Total retry timeout in milliseconds. The default value is 50000.
Retry back-off: Retry backoff in milliseconds. Useful for performing recovery in case of transient exceptions. The maximum value is 86400000 (24 hours).
Error tolerance: Error tolerance response during connector operation. The default value is none, meaning any error results in an immediate connector task failure. A value of all changes the behavior to skip over problematic records.
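
To make the {{ variable }} placeholder substitution concrete, here is a minimal sketch of how a template such as a hypothetical {{topic}}-{{partition}}-{{start_offset}}.gz might expand. The exact rendering the connector performs (for example, any zero-padding of offsets) may differ:

```python
import re

def render_template(template: str, **values) -> str:
    """Substitute {{ variable }} placeholders, tolerating optional inner spaces."""
    def repl(match):
        return str(values[match.group(1)])
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", repl, template)

# Hypothetical template and values for illustration:
name = render_template(
    "{{topic}}-{{partition}}-{{start_offset}}.gz",
    topic="orders", partition=3, start_offset=1200,
)
print(name)  # orders-3-1200.gz
```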

Map data

Use the appropriate key or value converter (input data format) for your data as follows:

  • JSON when your messages are JSON-encoded. Select Message JSON contains schema if your messages include the schema and payload fields.
  • AVRO when your messages are AVRO-encoded, with the schema stored in the Schema Registry.
  • STRING when your messages contain textual data.
  • BYTES when your messages contain arbitrary data.

You can also select the output data format for your GCS files as follows:

  • CSV to produce data in the CSV format. CSV output supports only the STRING and BYTES input formats.
  • JSON to produce data in the JSON format as an array of record objects.
  • JSONL to produce data in the JSON format, with each message as a separate JSON object, one per line.
  • PARQUET to produce data in the PARQUET format when your messages contain a schema.
  • AVRO to produce data in the AVRO format when your messages contain a schema.
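
To make the JSON vs. JSONL distinction concrete, here is how the same two records would be laid out in each format. The record shape below is illustrative, not the connector's exact field layout:

```python
import json

records = [
    {"key": "k1", "value": "v1", "offset": 0},
    {"key": "k2", "value": "v2", "offset": 1},
]

# JSON: the whole file is one array of record objects.
as_json = json.dumps(records)

# JSONL: each record is a separate JSON object on its own line.
as_jsonl = "\n".join(json.dumps(r) for r in records)

print(as_json)
print(as_jsonl)
```

JSONL is often easier to stream-process, since each line can be parsed independently without loading the whole file.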

Test the connection

After the connector is created, check the GCS bucket for a new file. Files should appear after the file flush interval (default is 60 seconds).


If there are any connection issues, an error message is returned. Depending on the GCS bucket check property value, the error results in a failed connector (GCS bucket check = true) or a failed task (GCS bucket check = false).

Additional errors and corrective actions follow.

Failed to read credentials from JSON string: The credentials given as a JSON file in the GCS credentials JSON property are incorrect. Copy a valid key from the Google Cloud service account.
The specified bucket does not exist: Create the bucket if it does not exist, or correct the GCS bucket name value if the bucket exists but the specified name is incorrect.
No files in the GCS bucket: Wait until the connector performs the first file flush (the default interval is 60 seconds).
