gcp_pubsub

Sends messages to a GCP Cloud Pub/Sub topic. Metadata from messages is sent as attributes.

# Common config fields, showing default values
output:
  label: ""
  gcp_pubsub:
    project: "" # No default (required)
    credentials_json: "" # No default (optional)
    topic: "" # No default (required)
    endpoint: ""
    max_in_flight: 64
    count_threshold: 100
    delay_threshold: 10ms
    byte_threshold: 1000000
    metadata:
      exclude_prefixes: []
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""
# All config fields, showing default values
output:
  label: ""
  gcp_pubsub:
    project: "" # No default (required)
    credentials_json: "" # No default (optional)
    topic: "" # No default (required)
    endpoint: ""
    ordering_key: "" # No default (optional)
    max_in_flight: 64
    count_threshold: 100
    delay_threshold: 10ms
    byte_threshold: 1000000
    publish_timeout: 1m0s
    metadata:
      exclude_prefixes: []
    flow_control:
      max_outstanding_bytes: -1
      max_outstanding_messages: 1000
      limit_exceeded_behavior: block
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""
      processors: [] # No default (optional)

For information on how to set up credentials, see this guide.

Troubleshooting

If you’re consistently seeing Failed to send message to gcp_pubsub: context deadline exceeded error logs without any further information, it is possible that you are encountering https://github.com/benthosdev/benthos/issues/1042, which occurs when metadata values contain characters that are not valid UTF-8. This can frequently happen when consuming from Kafka, as the key metadata field may be populated with an arbitrary binary value, but the issue is not exclusive to Kafka.

If you are blocked by this issue then a workaround is to delete the specific problematic keys:

pipeline:
  processors:
    - mapping: |
        meta kafka_key = deleted()

Or delete all keys with:

pipeline:
  processors:
    - mapping: meta = deleted()
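Alternatively, if the metadata values need to be retained, one possible workaround (a sketch, not from the official guidance on the issue) is to re-encode the problematic keys into valid UTF-8, for example as hex:

```yaml
pipeline:
  processors:
    - mapping: |
        # Hex-encode the raw binary key so the attribute value is valid UTF-8.
        # The key name kafka_key is an example; adjust to your own metadata.
        meta kafka_key = meta("kafka_key").encode("hex")
```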

Fields

project

The project ID of the topic to publish to.

Type: string

credentials_json

An optional field to set Google Service Account Credentials JSON.

This field contains sensitive information that usually shouldn’t be added to a configuration directly. For more information, see Secrets.

Type: string

Default: ""
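Since this field holds a secret, a common pattern is to inject it through an environment variable using config interpolation (the variable name and topic below are illustrative):

```yaml
output:
  gcp_pubsub:
    project: my-project # example project ID
    topic: my-topic
    # Resolved from the environment at config parse time.
    credentials_json: ${GCP_SERVICE_ACCOUNT_JSON}
```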

topic

The topic to publish to. This field supports interpolation functions.

Type: string
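Because the field supports interpolation functions, the target topic can be derived per message, for example from a metadata value (the metadata key here is illustrative):

```yaml
output:
  gcp_pubsub:
    project: my-project
    # Routes each message to a topic based on its tenant metadata.
    topic: events-${! meta("tenant") }
```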

endpoint

An optional endpoint to override the default of pubsub.googleapis.com:443. This can be used to connect to a region-specific Pub/Sub endpoint. For a list of valid values, see this document.

Type: string

Default: ""

# Examples

endpoint: us-central1-pubsub.googleapis.com:443

endpoint: us-west3-pubsub.googleapis.com:443

ordering_key

The ordering key to use for publishing messages. This field supports interpolation functions.

Type: string
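Messages that share an ordering key are delivered in publish order. As an illustration, the Kafka message key could be reused as the ordering key (assuming it is valid UTF-8):

```yaml
output:
  gcp_pubsub:
    project: my-project
    topic: my-topic
    # Messages with the same Kafka key are published in order.
    ordering_key: ${! meta("kafka_key") }
```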

max_in_flight

The maximum number of messages to have in flight at a given time. Increasing this may improve throughput.

Type: int

Default: 64

count_threshold

Publish a Pub/Sub buffer when it has this many messages.

Type: int

Default: 100

delay_threshold

Publish a non-empty Pub/Sub buffer after this delay has passed.

Type: string

Default: "10ms"

byte_threshold

Publish a batch when its size in bytes reaches this value.

Type: int

Default: 1000000
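The three threshold fields above control the Pub/Sub client’s internal buffering, and whichever threshold is reached first triggers a publish. For example, to publish every 500 messages, 5 MB, or 50ms, whichever comes first (values are illustrative):

```yaml
output:
  gcp_pubsub:
    project: my-project
    topic: my-topic
    count_threshold: 500   # flush after 500 buffered messages
    delay_threshold: 50ms  # or after 50ms with a non-empty buffer
    byte_threshold: 5000000 # or after 5 MB of buffered data
```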

publish_timeout

The maximum length of time to wait before abandoning a publish attempt for a message.

Type: string

Default: "1m0s"

# Examples

publish_timeout: 10s

publish_timeout: 5m

publish_timeout: 60m

metadata

Specify criteria for which metadata values are sent as attributes. All metadata is sent by default.

Type: object

metadata.exclude_prefixes

Provide a list of explicit metadata key prefixes to be excluded when adding metadata to sent messages.

Type: array

Default: []
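For example, to prevent internal metadata from being sent as Pub/Sub attributes (the prefixes shown are illustrative):

```yaml
output:
  gcp_pubsub:
    project: my-project
    topic: my-topic
    metadata:
      # Any metadata key starting with one of these prefixes is dropped
      # from the published message's attributes.
      exclude_prefixes:
        - kafka_
        - _internal
```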

flow_control

For a given topic, configures the PubSub client’s internal buffer for messages to be published.

Type: object

flow_control.max_outstanding_bytes

Maximum size of buffered messages to be published. If less than or equal to zero, this is disabled.

Type: int

Default: -1

flow_control.max_outstanding_messages

Maximum number of buffered messages to be published. If less than or equal to zero, this is disabled.

Type: int

Default: 1000

flow_control.limit_exceeded_behavior

Configures the behavior when trying to publish additional messages while the flow controller is full. The available options are block (default), ignore (disable), and signal_error (publish results will return an error).

Type: string

Default: "block"

Options: ignore, block, signal_error.
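For example, to cap the buffer at 10000 outstanding messages and have publishes fail instead of blocking once the limit is reached (values are illustrative):

```yaml
output:
  gcp_pubsub:
    project: my-project
    topic: my-topic
    flow_control:
      max_outstanding_messages: 10000
      # Publish attempts beyond the limit return an error rather than block.
      limit_exceeded_behavior: signal_error
```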

batching

Configures a batching policy on this output. While the Pub/Sub client maintains its own internal buffering mechanism, preparing larger batches of messages can further trade off some latency for increased throughput.

Type: object

# Examples

batching:
  byte_size: 5000
  count: 0
  period: 1s

batching:
  count: 10
  period: 1s

batching:
  check: this.contains("END BATCH")
  count: 0
  period: 1m

batching.count

A number of messages at which the batch should be flushed. A value of 0 disables count-based batching.

Type: int

Default: 0

batching.byte_size

An amount of bytes at which the batch should be flushed. A value of 0 disables size-based batching.

Type: int

Default: 0

batching.period

A period in which an incomplete batch should be flushed regardless of its size.

Type: string

Default: ""

# Examples

period: 1s

period: 1m

period: 500ms

batching.check

A Bloblang query that should return a boolean value indicating whether a message should end a batch.

Type: string

Default: ""

# Examples

check: this.type == "end_of_transaction"

batching.processors

A list of processors to apply to a batch as it is flushed. This allows you to aggregate and archive the batch however you see fit. Please note that all resulting messages are flushed as a single batch; therefore, splitting the batch into smaller batches using these processors is a no-op.

Type: array

# Examples

processors:
  - archive:
      format: concatenate

processors:
  - archive:
      format: lines

processors:
  - archive:
      format: json_array