aws_cloudwatch_logs

Consumes log events from AWS CloudWatch Logs.

# Common configuration fields, showing default values
input:
  label: ""
  aws_cloudwatch_logs:
    log_group_name: "" # No default (required)
    log_stream_names: [] # No default (optional)
    log_stream_prefix: "" # No default (optional)
    filter_pattern: "" # No default (optional)
    start_time: "" # No default (optional)
    poll_interval: 5s
    auto_replay_nacks: true

# Advanced configuration fields, showing default values
input:
  label: ""
  aws_cloudwatch_logs:
    log_group_name: "" # No default (required)
    log_stream_names: [] # No default (optional)
    log_stream_prefix: "" # No default (optional)
    filter_pattern: "" # No default (optional)
    start_time: "" # No default (optional)
    poll_interval: 5s
    limit: 1000
    structured_log: true
    api_timeout: 30s
    auto_replay_nacks: true
    region: "" # No default (optional)
    endpoint: "" # No default (optional)
    tcp:
      connect_timeout: 0s
      keep_alive:
        idle: 15s
        interval: 15s
        count: 9
      tcp_user_timeout: 0s
    credentials:
      profile: "" # No default (optional)
      id: "" # No default (optional)
      secret: "" # No default (optional)
      token: "" # No default (optional)
      from_ec2_role: false
      role: "" # No default (optional)
      role_external_id: "" # No default (optional)

Polls CloudWatch Log Groups for log events. Supports filtering by log streams, CloudWatch filter patterns, and configurable start times.
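
As an illustration of how these options combine, the following sketch polls one log group, restricts it to streams that share a prefix, and applies a filter pattern. The log group name, prefix, and pattern are placeholder values, not taken from a real deployment.

```yaml
input:
  aws_cloudwatch_logs:
    log_group_name: my-app-logs   # placeholder log group name
    log_stream_prefix: prod-      # only streams whose names start with "prod-"
    filter_pattern: "ERROR"       # events that contain the term ERROR
    start_time: now               # ignore events older than startup
    poll_interval: 10s            # poll less often than the 5s default
```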

Each log event becomes a separate message with metadata including the log group name, log stream name, timestamp, and ingestion time.

This input provides at-least-once delivery. It tracks its read position in memory only, so after a process restart it resumes from the configured start_time (or from the beginning of the available logs if start_time is not set). As a result, events can be delivered more than once across restarts. For exactly-once outcomes, make downstream processing idempotent or deduplicate events downstream.
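
One possible way to deduplicate downstream is to key a dedupe processor on the event ID that this input attaches as metadata. The sketch below assumes a Redis cache at a hypothetical address, so that the set of seen IDs survives a restart of the pipeline. The cache label, URL, and TTL are illustrative choices only.

```yaml
input:
  aws_cloudwatch_logs:
    log_group_name: my-app-logs            # placeholder log group name
    start_time: "2024-01-01T00:00:00Z"

pipeline:
  processors:
    # Drop any event whose ID is already present in the cache.
    - dedupe:
        cache: seen_events
        key: ${! metadata("cloudwatch_event_id") }

cache_resources:
  - label: seen_events                     # hypothetical resource label
    redis:
      url: redis://localhost:6379          # hypothetical Redis instance
      default_ttl: 24h                     # how long an event ID is remembered
```

An in-memory cache would not help here, because it is lost in the same restart that causes the replay. The TTL should cover at least the window of events that can be replayed from start_time.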

Credentials

By default, Redpanda Connect uses a shared credentials file when connecting to AWS services. You can also set credentials explicitly at the component level to transfer data across accounts. You can find out more in AWS credentials.
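
For example, a cross-account setup might assume a role in the account that owns the log group. This is a minimal sketch: the account ID, role name, external ID, and region are made-up placeholders.

```yaml
input:
  aws_cloudwatch_logs:
    log_group_name: my-app-logs            # placeholder log group name
    region: us-east-1                      # placeholder region
    credentials:
      # Hypothetical role in the account that owns the log group.
      role: arn:aws:iam::123456789012:role/example-log-reader
      role_external_id: example-external-id
```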

Metadata

This input adds the following metadata fields to each message:

  • cloudwatch_log_group: The name of the log group.

  • cloudwatch_log_stream: The name of the log stream.

  • cloudwatch_timestamp: The timestamp of the log event (Unix milliseconds).

  • cloudwatch_ingestion_time: The ingestion timestamp (Unix milliseconds).

  • cloudwatch_event_id: The unique event ID.

You can access these metadata fields using function interpolation.
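
As a rough sketch, the mapping below copies some of these metadata fields into the message body. It assumes structured_log is set to false, so that the payload is the plain-text log line. The output field names (message, log_group, and so on) are arbitrary.

```yaml
input:
  aws_cloudwatch_logs:
    log_group_name: my-app-logs   # placeholder log group name
    structured_log: false         # assumption: keep the payload as plain text

pipeline:
  processors:
    - mapping: |
        root.message = content().string()
        root.log_group = metadata("cloudwatch_log_group")
        root.log_stream = metadata("cloudwatch_log_stream")
        root.event_id = metadata("cloudwatch_event_id")
```

In fields that support interpolation, the same lookup can be written inline, for example ${! metadata("cloudwatch_log_stream") }.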

Fields

api_timeout

The maximum time to wait for an API request to complete.

Type: string

Default: 30s

auto_replay_nacks

Whether messages that are rejected (nacked) at the output level should be automatically replayed indefinitely, eventually resulting in back pressure if the cause of the rejections is persistent. If set to false, these messages are deleted instead. Disabling auto replays can greatly improve the memory efficiency of high-throughput streams, because the original shape of the data can be discarded immediately upon consumption and mutation.

Type: bool

Default: true

credentials

Optional manual configuration of AWS credentials to use. More information can be found in Amazon Web Services.

Type: object

credentials.from_ec2_role

Use the credentials of a host EC2 machine configured to assume an IAM role associated with the instance.

Type: bool

credentials.id

The ID of credentials to use.

Type: string

credentials.profile

A profile from ~/.aws/credentials to use.

Type: string

credentials.role

A role ARN to assume.

Type: string

credentials.role_external_id

An external ID to provide when assuming a role.

Type: string

credentials.secret

The secret for the credentials being used.

This field contains sensitive information that usually shouldn’t be added to a configuration directly. For more information, see Manage Secrets before adding it to your configuration.

Type: string

credentials.token

The token for the credentials being used. Required when using short-term credentials.

Type: string

endpoint

Allows you to specify a custom endpoint for the AWS API.

Type: string
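
A custom endpoint is typically useful when testing against a local AWS emulator. The sketch below assumes an emulator listening on http://localhost:4566 that accepts dummy credentials. Both the address and the credential values are assumptions for illustration.

```yaml
input:
  aws_cloudwatch_logs:
    log_group_name: my-app-logs          # placeholder log group name
    region: us-east-1                    # placeholder region
    endpoint: http://localhost:4566      # assumed local emulator address
    credentials:
      id: test                           # dummy values for a local emulator
      secret: test
```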

filter_pattern

An optional CloudWatch Logs filter pattern to apply when querying log events. See AWS documentation for filter pattern syntax.

Type: string

# Examples:
filter_pattern: "[ERROR]"

limit

The maximum number of log events to return in a single API call. Valid range: 1-10000.

Type: int

Default: 1000

log_group_name

The name of the CloudWatch Log Group to consume from.

Type: string

# Examples:
log_group_name: my-app-logs

log_stream_names[]

An optional list of log stream names to consume from. If not set, events from all streams in the log group will be consumed.

Type: array

# Examples:
log_stream_names:
  - stream-1
  - stream-2

log_stream_prefix

An optional log stream name prefix to filter streams. Only streams starting with this prefix will be consumed.

Type: string

# Examples:
log_stream_prefix: prod-

poll_interval

The interval at which to poll for new log events.

Type: string

Default: 5s

region

The AWS region to target.

Type: string

start_time

The time to start consuming log events from. Can be an RFC3339 timestamp (e.g., 2024-01-01T00:00:00Z) or the string now to start consuming from the current time. If not set, starts from the beginning of available logs.

Type: string

# Examples:
start_time: 2024-01-01T00:00:00Z

# ---

start_time: now

structured_log

Whether to output each log event as a structured JSON object that includes all metadata fields. If set to false, each log event is output as a plain-text message and the metadata fields are attached as message metadata.

Type: bool

Default: true

tcp

TCP socket configuration.

Type: object

tcp.connect_timeout

Maximum amount of time a dial will wait for a connection to complete. Zero disables.

Type: string

Default: 0s

tcp.keep_alive

TCP keep-alive probe configuration.

Type: object

tcp.keep_alive.count

Maximum unanswered keep-alive probes before dropping the connection. Zero defaults to 9.

Type: int

Default: 9

tcp.keep_alive.idle

Duration the connection must be idle before sending the first keep-alive probe. Zero defaults to 15s. Negative values disable keep-alive probes.

Type: string

Default: 15s

tcp.keep_alive.interval

Duration between keep-alive probes. Zero defaults to 15s.

Type: string

Default: 15s

tcp.tcp_user_timeout

Maximum time to wait for acknowledgment of transmitted data before killing the connection. Linux-only (kernel 2.6.37+), ignored on other platforms. When enabled, keep_alive.idle must be greater than this value per RFC 5482. Zero disables.

Type: string

Default: 0s
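
To illustrate the constraint between tcp_user_timeout and keep_alive.idle, the following sketch enables a 30s user timeout and raises the keep-alive idle period above it. The specific durations are arbitrary examples, not recommendations.

```yaml
input:
  aws_cloudwatch_logs:
    log_group_name: my-app-logs   # placeholder log group name
    tcp:
      connect_timeout: 10s        # give up on a dial after 10 seconds
      keep_alive:
        idle: 60s                 # must be greater than tcp_user_timeout
        interval: 15s
        count: 9
      tcp_user_timeout: 30s       # Linux-only, ignored on other platforms
```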