gcp_cloud_storage

Type: input

Downloads objects within a Google Cloud Storage bucket, optionally filtered by a prefix.

Introduced in version 3.43.0.

# Common config fields, showing default values
input:
  label: ""
  gcp_cloud_storage:
    bucket: "" # No default (required)
    prefix: ""
    scanner:
      to_the_end: {}

# All config fields, showing default values
input:
  label: ""
  gcp_cloud_storage:
    bucket: "" # No default (required)
    prefix: ""
    scanner:
      to_the_end: {}
    delete_objects: false

Metadata

This input adds the following metadata fields to each message:

- gcs_key
- gcs_bucket
- gcs_last_modified
- gcs_last_modified_unix
- gcs_content_type
- gcs_content_encoding
- All user-defined metadata

You can access these metadata fields using function interpolation.
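As a minimal sketch of function interpolation, the config below writes each downloaded object to a local file named after its gcs_key metadata value. The bucket name and output directory are hypothetical placeholders, and the file output is only one example of a field that accepts interpolation.

```yaml
input:
  gcp_cloud_storage:
    bucket: example-bucket # hypothetical bucket name

output:
  file:
    # ${! ... } is a function interpolation; meta("gcs_key") reads the
    # object key that this input attaches to each message.
    path: './downloads/${! meta("gcs_key") }'
```

The other fields listed above, such as gcs_bucket or gcs_content_type, can be referenced in the same way.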

Credentials

By default, Redpanda Connect uses a shared credentials file when connecting to GCP services. For more information, see the Google Cloud Platform guide.

Fields

bucket

The name of the bucket from which to download objects.

Type: string

prefix

An optional path prefix. If set, only objects whose names begin with the prefix are consumed.

Type: string

Default: ""
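For illustration, a config that restricts the input to one "directory" of a bucket might look like the following. Both the bucket name and the prefix are hypothetical values.

```yaml
input:
  gcp_cloud_storage:
    bucket: example-bucket # hypothetical bucket name
    prefix: logs/2024/     # hypothetical prefix; other objects are skipped
```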

scanner

The scanner by which the stream of bytes consumed will be broken out into individual messages. Scanners are useful for processing large sources of data without holding the entirety of it within memory. For example, the csv scanner allows you to process individual CSV rows without loading the entire CSV file in memory at once.

Type: scanner

Default: {"to_the_end":{}}

Requires version 4.25.0 or newer
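The CSV case described above could be sketched as follows, assuming the csv scanner is used with its default settings. The bucket name and prefix are hypothetical.

```yaml
input:
  gcp_cloud_storage:
    bucket: example-bucket # hypothetical bucket name
    prefix: exports/       # hypothetical prefix
    scanner:
      csv: {} # emit one message per CSV row rather than one per object
```

With the default to_the_end scanner, each object is instead read in full and delivered as a single message.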

delete_objects

Whether to delete downloaded objects from the bucket once they are processed.

Type: bool

Default: false
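A minimal sketch of enabling deletion, using a hypothetical bucket name:

```yaml
input:
  gcp_cloud_storage:
    bucket: example-bucket # hypothetical bucket name
    delete_objects: true   # remove each object once it has been processed
```

Because deletion is destructive, it may be worth trying a config like this against a test bucket first.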