# aws_s3

> For the complete documentation index, see [llms.txt](https://docs.redpanda.com/llms.txt). Component-specific: [cloud-data-platform-full.txt](https://docs.redpanda.com/cloud-data-platform-full.txt)

---
title: aws_s3
latest-operator-version: v26.1.4
latest-console-tag: v3.7.3
latest-connect-version: 4.93.0
latest-redpanda-tag: v26.1.9
docname: connect/components/outputs/aws_s3
page-component-name: cloud-data-platform
page-version: master
page-component-version: master
page-component-title: Cloud
page-relative-src-path: connect/components/outputs/aws_s3.adoc
page-edit-url: https://github.com/redpanda-data/cloud-docs/edit/main/modules/develop/pages/connect/components/outputs/aws_s3.adoc
page-git-created-date: "2024-09-09"
page-git-modified-date: "2026-05-26"
---

<!-- Source: https://docs.redpanda.com/cloud-data-platform/develop/connect/components/outputs/aws_s3.md -->

**Type:** Output

**Available in:** Cloud, [Self-Managed](https://docs.redpanda.com/connect/components/outputs/aws_s3/ "View the Self-Managed version of this component")

Uploads messages to an Amazon S3 bucket as objects, using the path specified in the `path` field.

#### Common

```yml
output:
  label: ""
  aws_s3:
    bucket: "" # No default (required)
    path: ${!counter()}-${!timestamp_unix_nano()}.txt
    tags: {}
    content_type: application/octet-stream
    metadata:
      exclude_prefixes: []
    max_in_flight: 64
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""
      processors: [] # No default (optional)
```

#### Advanced

```yml
output:
  label: ""
  aws_s3:
    bucket: "" # No default (required)
    path: ${!counter()}-${!timestamp_unix_nano()}.txt
    tags: {}
    content_type: application/octet-stream
    content_encoding: ""
    cache_control: ""
    content_disposition: ""
    content_language: ""
    website_redirect_location: ""
    metadata:
      exclude_prefixes: []
    storage_class: STANDARD
    kms_key_id: ""
    checksum_algorithm: ""
    server_side_encryption: ""
    force_path_style_urls: false
    max_in_flight: 64
    timeout: 5s
    object_canned_acl: ""
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""
      processors: [] # No default (optional)
    region: "" # No default (optional)
    endpoint: "" # No default (optional)
    tcp:
      connect_timeout: 0s
      keep_alive:
        idle: 15s
        interval: 15s
        count: 9
      tcp_user_timeout: 0s
    credentials:
      profile: "" # No default (optional)
      id: "" # No default (optional)
      secret: "" # No default (optional)
      token: "" # No default (optional)
      from_ec2_role: false # No default (optional)
      role: "" # No default (optional)
      role_external_id: "" # No default (optional)
```

To use a different path for each object, use [function interpolation](https://docs.redpanda.com/cloud-data-platform/develop/connect/configuration/interpolation/#bloblang-queries), which is evaluated for each message in a batch.

## [](#metadata)Metadata

Redpanda Connect sends metadata fields as headers. To mutate or remove these values, see the [metadata docs](https://docs.redpanda.com/cloud-data-platform/develop/connect/configuration/metadata/).
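
As a minimal sketch of limiting which metadata is attached to uploaded objects, the following configuration excludes every metadata key that starts with `kafka_`. The bucket name and the `kafka_` prefix are illustrative assumptions, not values defined on this page.

```yaml
output:
  aws_s3:
    bucket: example-bucket # hypothetical bucket name
    path: ${!counter()}-${!timestamp_unix_nano()}.json
    metadata:
      # Assumed prefix: skips keys such as kafka_key or kafka_topic
      exclude_prefixes:
        - kafka_
```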

## [](#tags)Tags

The `tags` field accepts key/value pairs to attach to objects as tags, and the values support [interpolation functions](https://docs.redpanda.com/cloud-data-platform/develop/connect/configuration/interpolation/#bloblang-queries):

```yaml
output:
  aws_s3:
    bucket: TODO
    path: ${!counter()}-${!timestamp_unix_nano()}.tar.gz
    tags:
      Key1: Value1
      Timestamp: ${!meta("Timestamp")}
```

## [](#credentials)Credentials

By default, Redpanda Connect uses a shared credentials file when connecting to AWS services. You can also set credentials explicitly at the component level, which lets you transfer data across accounts. For more information, see [AWS credentials](https://docs.redpanda.com/cloud-data-platform/develop/connect/guides/cloud/aws/).
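
For example, a cross-account upload might assume a role in the account that owns the bucket. This is a sketch only: the bucket name, region, role ARN, and external ID are hypothetical placeholders that you would replace with your own values.

```yaml
output:
  aws_s3:
    bucket: example-cross-account-bucket # hypothetical bucket name
    path: ${!uuid_v4()}.json
    region: us-east-1 # assumed region
    credentials:
      # Hypothetical role ARN and external ID
      role: arn:aws:iam::123456789012:role/example-s3-writer
      role_external_id: example-external-id
```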

## [](#batching)Batching

It’s common to upload messages to S3 as batched archives. The easiest way to do this is to batch your messages at the output level, combine each batch into a single message with an [`archive`](https://docs.redpanda.com/cloud-data-platform/develop/connect/components/processors/archive/) processor, and optionally compress the result with a [`compress`](https://docs.redpanda.com/cloud-data-platform/develop/connect/components/processors/compress/) processor.

For example, the following configuration uploads messages as a `.tar.gz` archive of documents:

```yaml
output:
  aws_s3:
    bucket: TODO
    path: ${!counter()}-${!timestamp_unix_nano()}.tar.gz
    batching:
      count: 100
      period: 10s
      processors:
        - archive:
            format: tar
        - compress:
            algorithm: gzip
```

This configuration uploads JSON documents as a single large document containing an array of objects:

```yaml
output:
  aws_s3:
    bucket: TODO
    path: ${!counter()}-${!timestamp_unix_nano()}.json
    batching:
      count: 100
      processors:
        - archive:
            format: json_array
```

## [](#bucket-name-format)Bucket name format

The `bucket` field accepts a bucket name only, not an ARN. For example, use `my-bucket`, not `arn:aws:s3:::my-bucket`.

## [](#s3-compatible-storage)S3-compatible storage

The `endpoint` and `force_path_style_urls` fields let you connect to S3-compatible storage services such as Cloudflare R2, MinIO, or DigitalOcean Spaces.

For Cloudflare R2, set `endpoint` to your account endpoint URL and enable `force_path_style_urls`:

```yaml
output:
  aws_s3:
    bucket: r2-bucket
    path: ${!uuid_v4()}.json
    endpoint: https://<account-id>.r2.cloudflarestorage.com
    force_path_style_urls: true
    region: auto
    credentials:
      id: <r2-access-key-id>
      secret: <r2-secret-access-key>
```

Find your account ID in the Cloudflare dashboard under **R2 > Overview > Account Details**. Generate API credentials under **R2 > Manage R2 API Tokens**.
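
A similar configuration might work for a self-hosted MinIO deployment. This sketch assumes the server is reachable at the hypothetical URL `https://minio.example.com:9000`, that it expects path-style URLs, and that it accepts `us-east-1` as the region. Check your MinIO setup for the actual values.

```yaml
output:
  aws_s3:
    bucket: minio-bucket # hypothetical bucket name
    path: ${!uuid_v4()}.json
    endpoint: https://minio.example.com:9000 # hypothetical endpoint
    force_path_style_urls: true
    region: us-east-1 # assumed; depends on the server configuration
    credentials:
      id: <minio-access-key>
      secret: <minio-secret-key>
```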

## [](#performance)Performance

This output performs better when it sends multiple messages in parallel. To tune the maximum number of in-flight messages (or message batches), use the `max_in_flight` field.
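
As an illustration, the following sketch raises `max_in_flight` above its default of `64` and combines it with output-level batching, so that each in-flight upload carries a whole batch. The specific numbers are assumptions to tune against your own workload, not recommendations.

```yaml
output:
  aws_s3:
    bucket: example-bucket # hypothetical bucket name
    path: ${!counter()}-${!timestamp_unix_nano()}.txt
    max_in_flight: 128 # assumed value; the default is 64
    batching:
      count: 500
      period: 5s
      processors:
        # Write each batch as one object with one message per line
        - archive:
            format: lines
```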

## [](#fields)Fields

### [](#batching-2)`batching`

Configure a [batching policy](https://docs.redpanda.com/cloud-data-platform/develop/connect/configuration/batching/).

**Type**: `object`

```yaml
# Examples:
batching:
  byte_size: 5000
  count: 0
  period: 1s

# ---

batching:
  count: 10
  period: 1s

# ---

batching:
  check: this.contains("END BATCH")
  count: 0
  period: 1m
```

### [](#batching-byte_size)`batching.byte_size`

The number of bytes at which the batch is flushed. Set to `0` to disable size-based batching.

**Type**: `int`

**Default**: `0`

### [](#batching-check)`batching.check`

A [Bloblang query](https://docs.redpanda.com/cloud-data-platform/develop/connect/guides/bloblang/about/) that should return a boolean value indicating whether a message should end a batch.

**Type**: `string`

**Default**: `""`

```yaml
# Examples:
check: this.type == "end_of_transaction"
```

### [](#batching-count)`batching.count`

The number of messages after which the batch is flushed. Set to `0` to disable count-based batching.

**Type**: `int`

**Default**: `0`

### [](#batching-period)`batching.period`

A period in which an incomplete batch should be flushed regardless of its size.

**Type**: `string`

**Default**: `""`

```yaml
# Examples:
period: 1s

# ---

period: 1m

# ---

period: 500ms
```

### [](#batching-processors)`batching.processors[]`

A list of [processors](https://docs.redpanda.com/cloud-data-platform/develop/connect/components/processors/about/) to apply to a batch as it is flushed. This allows you to aggregate and archive the batch however you see fit. All resulting messages are flushed as a single batch, so using these processors to split the batch into smaller batches has no effect.

**Type**: `processor`

```yaml
# Examples:
processors:
  - archive:
      format: concatenate

# ---

processors:
  - archive:
      format: lines

# ---

processors:
  - archive:
      format: json_array
```

### [](#bucket)`bucket`

The bucket to upload messages to.

**Type**: `string`

### [](#cache_control)`cache_control`

The cache control to set for each object. This field supports [interpolation functions](https://docs.redpanda.com/cloud-data-platform/develop/connect/configuration/interpolation/#bloblang-queries).

**Type**: `string`

**Default**: `""`

### [](#checksum_algorithm)`checksum_algorithm`

The algorithm used to validate each object during its upload to the Amazon S3 bucket.

**Type**: `string`

**Default**: `""`

**Options**: `CRC32`, `CRC32C`, `SHA1`, `SHA256`
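
A minimal sketch that enables SHA-256 validation for each upload. The bucket name is a hypothetical placeholder.

```yaml
output:
  aws_s3:
    bucket: example-bucket # hypothetical bucket name
    path: ${!uuid_v4()}.json
    checksum_algorithm: SHA256
```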

### [](#content_disposition)`content_disposition`

The content disposition to set for each object. This field supports [interpolation functions](https://docs.redpanda.com/cloud-data-platform/develop/connect/configuration/interpolation/#bloblang-queries).

**Type**: `string`

**Default**: `""`

### [](#content_encoding)`content_encoding`

An optional content encoding to set for each object. This field supports [interpolation functions](https://docs.redpanda.com/cloud-data-platform/develop/connect/configuration/interpolation/#bloblang-queries).

**Type**: `string`

**Default**: `""`
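
One plausible use of this field is to mark objects that were gzip-compressed before upload. The sketch below assumes each batch is archived as a JSON array and then compressed, and that you want the stored object to carry a matching `gzip` content encoding. The bucket name and batch sizes are illustrative.

```yaml
output:
  aws_s3:
    bucket: example-bucket # hypothetical bucket name
    path: ${!counter()}-${!timestamp_unix_nano()}.json
    content_type: application/json
    content_encoding: gzip # describes the compression applied below
    batching:
      count: 100
      period: 10s
      processors:
        - archive:
            format: json_array
        - compress:
            algorithm: gzip
```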

### [](#content_language)`content_language`

The content language to set for each object. This field supports [interpolation functions](https://docs.redpanda.com/cloud-data-platform/develop/connect/configuration/interpolation/#bloblang-queries).

**Type**: `string`

**Default**: `""`

### [](#content_type)`content_type`

The content type to set for each object. This field supports [interpolation functions](https://docs.redpanda.com/cloud-data-platform/develop/connect/configuration/interpolation/#bloblang-queries).

**Type**: `string`

**Default**: `application/octet-stream`
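
The object header fields in this reference can be combined in one configuration. The following sketch uses assumed values (a one-hour cache lifetime, a hypothetical download file name, and English as the content language) purely to show where each field goes.

```yaml
output:
  aws_s3:
    bucket: example-bucket # hypothetical bucket name
    path: ${!uuid_v4()}.json
    content_type: application/json
    cache_control: max-age=3600 # assumed cache lifetime
    content_disposition: 'attachment; filename="report.json"' # hypothetical file name
    content_language: en
```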

### [](#credentials-2)`credentials`

Optional manual configuration of AWS credentials to use. More information can be found in [Amazon Web Services](https://docs.redpanda.com/cloud-data-platform/develop/connect/guides/cloud/aws/).

**Type**: `object`

### [](#credentials-from_ec2_role)`credentials.from_ec2_role`

Use the credentials of a host EC2 machine configured to assume [an IAM role associated with the instance](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_switch-role-ec2.html).

**Type**: `bool`

### [](#credentials-id)`credentials.id`

The ID of credentials to use.

**Type**: `string`

### [](#credentials-profile)`credentials.profile`

A profile from `~/.aws/credentials` to use.

**Type**: `string`

### [](#credentials-role)`credentials.role`

A role ARN to assume.

**Type**: `string`

### [](#credentials-role_external_id)`credentials.role_external_id`

An external ID to provide when assuming a role.

**Type**: `string`

### [](#credentials-secret)`credentials.secret`

The secret for the credentials being used.

> ⚠️ **CAUTION**
>
> This field contains sensitive information that usually shouldn’t be added to a configuration directly. For more information, see [Manage Secrets](https://docs.redpanda.com/cloud-data-platform/develop/connect/configuration/secret-management/) before adding it to your configuration.

**Type**: `string`

### [](#credentials-token)`credentials.token`

The token for the credentials being used. Required when using short-term credentials.

**Type**: `string`

### [](#endpoint)`endpoint`

Allows you to specify a custom endpoint for the AWS API.

**Type**: `string`

### [](#force_path_style_urls)`force_path_style_urls`

Forces the client API to use path style URLs, which helps when connecting to custom endpoints.

**Type**: `bool`

**Default**: `false`

### [](#kms_key_id)`kms_key_id`

An optional server-side encryption key.

**Type**: `string`

**Default**: `""`

### [](#max_in_flight)`max_in_flight`

The maximum number of messages to have in flight at a given time. Increase this to improve throughput.

**Type**: `int`

**Default**: `64`

### [](#metadata-2)`metadata`

Specify criteria for which metadata values are attached to objects as headers.

**Type**: `object`

### [](#metadata-exclude_prefixes)`metadata.exclude_prefixes[]`

Provide a list of explicit metadata key prefixes to be excluded when adding metadata to sent messages.

**Type**: `array`

**Default**: `[]`

### [](#object_canned_acl)`object_canned_acl`

The object canned ACL value. Leave empty to omit the ACL from upload requests, which is required for buckets that have ACLs disabled (the AWS default since 2023).

**Type**: `string`

**Default**: `""`

**Options**: `""`, `private`, `public-read`, `public-read-write`, `authenticated-read`, `aws-exec-read`, `bucket-owner-read`, `bucket-owner-full-control`

### [](#path)`path`

The path of each message to upload. This field supports [interpolation functions](https://docs.redpanda.com/cloud-data-platform/develop/connect/configuration/interpolation/#bloblang-queries).

**Type**: `string`

**Default**: `${!counter()}-${!timestamp_unix_nano()}.txt`

```yaml
# Examples:
path: ${!counter()}-${!timestamp_unix_nano()}.txt

# ---

path: ${!meta("kafka_key")}.json

# ---

path: ${!json("doc.namespace")}/${!json("doc.id")}.json
```

### [](#region)`region`

The AWS region to target.

**Type**: `string`

### [](#server_side_encryption)`server_side_encryption`

An optional server-side encryption algorithm.

**Type**: `string`

**Default**: `""`
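
This page does not list the accepted values for this field. The sketch below assumes the value is passed through to Amazon S3 unchanged, in which case the values that AWS documents for server-side encryption, such as `AES256` and `aws:kms`, would apply. The KMS key ARN is a hypothetical placeholder.

```yaml
output:
  aws_s3:
    bucket: example-bucket # hypothetical bucket name
    path: ${!uuid_v4()}.json
    # Assumed value, taken from the AWS S3 API rather than this page
    server_side_encryption: "aws:kms"
    # Hypothetical key ARN
    kms_key_id: arn:aws:kms:us-east-1:123456789012:key/example-key-id
```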

### [](#storage_class)`storage_class`

The storage class to set for each object. This field supports [interpolation functions](https://docs.redpanda.com/cloud-data-platform/develop/connect/configuration/interpolation/#bloblang-queries).

**Type**: `string`

**Default**: `STANDARD`

**Options**: `STANDARD`, `REDUCED_REDUNDANCY`, `GLACIER`, `STANDARD_IA`, `ONEZONE_IA`, `INTELLIGENT_TIERING`, `DEEP_ARCHIVE`
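
Because this field supports interpolation, the storage class can vary per message. The following sketch assumes an upstream processor sets a metadata key named `storage_class` (a hypothetical name) and falls back to `STANDARD` when the key is missing.

```yaml
output:
  aws_s3:
    bucket: example-bucket # hypothetical bucket name
    path: ${!uuid_v4()}.json
    # Hypothetical metadata key; falls back to STANDARD when unset
    storage_class: ${!meta("storage_class").or("STANDARD")}
```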

### [](#tags-2)`tags`

Key/value pairs to store with the object as tags. This field supports [interpolation functions](https://docs.redpanda.com/cloud-data-platform/develop/connect/configuration/interpolation/#bloblang-queries).

**Type**: `string`

**Default**: `{}`

```yaml
# Examples:
tags:
  Key1: Value1
  Timestamp: ${!meta("Timestamp")}
```

### [](#tcp)`tcp`

Configure TCP socket-level settings to optimize network performance and reliability. These low-level controls are useful for:

-   **High-latency networks**: Increase `connect_timeout` to allow more time for connection establishment
-   **Long-lived connections**: Configure `keep_alive` settings to detect and recover from stale connections
-   **Unstable networks**: Tune keep-alive probes to balance between quick failure detection and avoiding false positives
-   **Linux systems with specific requirements**: Use `tcp_user_timeout` (Linux 2.6.37+) to control data acknowledgment timeouts

Most users should keep the default values. Only modify these settings if you’re experiencing connection stability issues or have specific network requirements.

**Type**: `object`
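
As a rough sketch of tuning for a high-latency or unstable network, the following configuration sets a connection timeout and adjusts the keep-alive probes. All durations and counts are assumed values for illustration. Note that `keep_alive.idle` is kept greater than `tcp_user_timeout`, as the `tcp.tcp_user_timeout` field requires.

```yaml
output:
  aws_s3:
    bucket: example-bucket # hypothetical bucket name
    path: ${!uuid_v4()}.json
    tcp:
      connect_timeout: 10s # assumed value; 0s disables the timeout
      keep_alive:
        idle: 30s # must stay greater than tcp_user_timeout
        interval: 10s
        count: 5
      tcp_user_timeout: 20s # Linux only; ignored on other platforms
```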

### [](#tcp-connect_timeout)`tcp.connect_timeout`

Maximum amount of time to wait for a connection attempt to complete. Set to zero to disable the timeout.

**Type**: `string`

**Default**: `0s`

### [](#tcp-keep_alive)`tcp.keep_alive`

TCP keep-alive probe configuration.

**Type**: `object`

### [](#tcp-keep_alive-count)`tcp.keep_alive.count`

Maximum unanswered keep-alive probes before dropping the connection. Zero defaults to 9.

**Type**: `int`

**Default**: `9`

### [](#tcp-keep_alive-idle)`tcp.keep_alive.idle`

Duration the connection must be idle before sending the first keep-alive probe. Zero defaults to 15s. Negative values disable keep-alive probes.

**Type**: `string`

**Default**: `15s`

### [](#tcp-keep_alive-interval)`tcp.keep_alive.interval`

Duration between keep-alive probes. Zero defaults to 15s.

**Type**: `string`

**Default**: `15s`

### [](#tcp-tcp_user_timeout)`tcp.tcp_user_timeout`

Maximum time to wait for acknowledgment of transmitted data before killing the connection. Linux-only (kernel 2.6.37+), ignored on other platforms. When enabled, `keep_alive.idle` must be greater than this value per RFC 5482. Zero disables.

**Type**: `string`

**Default**: `0s`

### [](#timeout)`timeout`

The maximum period to wait on an upload before abandoning it and reattempting.

**Type**: `string`

**Default**: `5s`
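
Large batched objects may need longer than the default `5s` to upload. The sketch below raises the timeout to an assumed `30s` for a compressed archive. The right value depends on your object sizes and network.

```yaml
output:
  aws_s3:
    bucket: example-bucket # hypothetical bucket name
    path: ${!counter()}-${!timestamp_unix_nano()}.tar.gz
    timeout: 30s # assumed value; the default is 5s
    batching:
      count: 1000
      period: 30s
      processors:
        - archive:
            format: tar
        - compress:
            algorithm: gzip
```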

### [](#website_redirect_location)`website_redirect_location`

The website redirect location to set for each object. This field supports [interpolation functions](https://docs.redpanda.com/cloud-data-platform/develop/connect/configuration/interpolation/#bloblang-queries).

**Type**: `string`

**Default**: `""`