# archive

> For the complete documentation index, see [llms.txt](https://docs.redpanda.com/llms.txt). Component-specific: [cloud-data-platform-full.txt](https://docs.redpanda.com/cloud-data-platform-full.txt)

---
title: archive
latest-operator-version: v26.1.4
latest-console-tag: v3.7.3
latest-connect-version: 4.93.0
latest-redpanda-tag: v26.1.9
docname: connect/components/processors/archive
page-component-name: cloud-data-platform
page-version: master
page-component-version: master
page-component-title: Cloud
page-relative-src-path: connect/components/processors/archive.adoc
page-edit-url: https://github.com/redpanda-data/cloud-docs/edit/main/modules/develop/pages/connect/components/processors/archive.adoc
page-git-created-date: "2024-09-09"
page-git-modified-date: "2026-05-26"
---

<!-- Source: https://docs.redpanda.com/cloud-data-platform/develop/connect/components/processors/archive.md -->

**Available in:** Cloud, [Self-Managed](https://docs.redpanda.com/connect/components/processors/archive/ "View the Self-Managed version of this component")

Archives all the messages of a batch into a single message according to the selected archive format.

```yml
# Config fields, showing default values
label: ""
archive:
  format: "" # No default (required)
  path: ""
```

Some archive formats (such as tar and zip) treat each archive item (message part) as a file with a path. Since message parts only contain raw data, a unique path must be generated for each part. This can be done by using function interpolations on the `path` field, as described in [Bloblang queries](https://docs.redpanda.com/cloud-data-platform/develop/connect/configuration/interpolation/#bloblang-queries). For formats that aren’t file-based (such as binary), the `path` field is ignored.

The resulting archived message adopts the metadata of the _first_ message part of the batch.

This processor only has an effect when it is applied to messages that are batched. For more information, see [Message batching](https://docs.redpanda.com/cloud-data-platform/develop/connect/configuration/batching/).
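As a minimal sketch of the batching requirement, the archive processor can be placed inside a batching policy so that it runs once per flushed batch. The example below assumes an `aws_s3` output; the bucket name, object path, and batch sizes are hypothetical placeholders rather than recommended values.

```yaml
output:
  aws_s3:
    bucket: example-bucket # Hypothetical bucket name
    # Object key for each uploaded archive (one object per batch)
    path: batches/${!timestamp_unix_nano()}.tar
    batching:
      count: 100  # Flush after 100 messages (illustrative value)
      period: 10s # Or after 10 seconds, whichever comes first
      processors:
        # Runs on each flushed batch, collapsing it into one tar message
        - archive:
            format: tar
            path: ${!json("doc.id")}.json
```

Placing the processor in the batching policy means it always receives a complete batch. Applied to unbatched messages, it would archive each message on its own.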

## [](#fields)Fields

### [](#format)`format`

The archiving format to apply.

**Type**: `string`

| Option | Summary |
| --- | --- |
| binary | Archive messages to a binary blob format. |
| concatenate | Join the raw contents of each message into a single binary message. |
| json_array | Attempt to parse each message as a JSON document and append the result to an array, which becomes the contents of the resulting message. |
| lines | Join the raw contents of each message and insert a line break between each one. |
| tar | Archive messages to a Unix standard tape archive. |
| zip | Archive messages to a zip file. |
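
To illustrate how a format shapes the result, the sketch below applies `json_array` to a hypothetical batch of three messages. The input and output shown in the comments are made-up sample data, and the sketch assumes the messages already arrive as a single batch.

```yaml
# Hypothetical batch of three input messages:
#   {"id":"a"}
#   {"id":"b"}
#   {"id":"c"}
pipeline:
  processors:
    - archive:
        format: json_array # The path field is not needed for this format
# Resulting single message:
#   [{"id":"a"},{"id":"b"},{"id":"c"}]
```

With `lines` instead of `json_array`, the same batch would become one message containing the three raw documents separated by line breaks.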

### [](#path)`path`

The path to set for each message in the archive (when applicable). This field supports [interpolation functions](https://docs.redpanda.com/cloud-data-platform/develop/connect/configuration/interpolation/#bloblang-queries).

**Type**: `string`

**Default**: `""`

```yaml
# Examples:
path: ${!count("files")}-${!timestamp_unix_nano()}.txt

# ---

path: ${!meta("kafka_key")}-${!json("id")}.json
```

## [](#examples)Examples

### [](#tar-archive)Tar Archive

If we had JSON messages in a batch, each of the form:

```json
{"doc":{"id":"foo","body":"hello world 1"}}
```

And we wished to archive them as a tar file, setting each filename to the message's unique ID (with the extension `.json`), our config might look like this:

```yaml
pipeline:
  processors:
    - archive:
        format: tar
        path: ${!json("doc.id")}.json
```