# group_by

---
title: group_by
latest-connect-version: 4.93.0
latest-operator-version: v26.1.4
latest-console-tag: v3.7.3
latest-redpanda-tag: v26.1.9
docname: processors/group_by
page-component-name: connect
page-version: master
page-component-version: master
page-component-title: Connect
page-relative-src-path: processors/group_by.adoc
page-edit-url: https://github.com/redpanda-data/rp-connect-docs/edit/main/modules/components/pages/processors/group_by.adoc
page-git-created-date: "2024-05-24"
page-git-modified-date: "2026-05-26"
---

<!-- Source: https://docs.redpanda.com/connect/components/processors/group_by.md -->

**Available in:** [Cloud](https://docs.redpanda.com/cloud-data-platform/develop/connect/components/processors/group_by/ "View the Cloud version of this component"), Self-Managed

Splits a [batch of messages](https://docs.redpanda.com/connect/configuration/batching/) into N batches, where each resulting batch contains a group of messages determined by a [Bloblang query](https://docs.redpanda.com/connect/guides/bloblang/about/).

```yml
# Config fields, showing default values
label: ""
group_by: [] # No default (required)
```

Once the groups are established, each group's list of processors is applied to its batch, which can be used to label the batch according to its grouping. Messages that do not pass the check of any specified group are placed in their own group.
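As a minimal sketch of this behaviour, the following config defines two groups and labels each resulting batch with a metadata key. The `type` field values and the `grouping` metadata key are illustrative assumptions, not names the processor requires:

```yaml
pipeline:
  processors:
    - group_by:
        # Group 1: messages whose `type` field equals "foo" (hypothetical field)
        - check: this.type == "foo"
          processors:
            - mapping: 'meta grouping = "foo"'
        # Group 2: messages whose `type` field equals "bar"
        - check: this.type == "bar"
          processors:
            - mapping: 'meta grouping = "bar"'
        # Messages that pass neither check are placed in their own group,
        # so that batch carries no `grouping` metadata key.
```

Downstream components can then branch on `meta("grouping")`, treating a missing value as the group of unmatched messages.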

This processor only has an effect when it is applied to messages that are batched. You can find out more about batching in the [batching documentation](https://docs.redpanda.com/connect/configuration/batching/).
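For local experimentation, one way to make sure `group_by` receives batches rather than single messages is to have the input emit them. The sketch below assumes a `generate` input with its `batch_size` field; the generated `id` and `type` fields are hypothetical test data:

```yaml
input:
  generate:
    count: 20        # stop after 20 messages in total
    interval: ""     # generate as fast as downstream allows
    batch_size: 5    # emit messages in batches of five
    mapping: |
      # Hypothetical test payload: a random mix of "foo" and "bar" messages
      root.id = uuid_v4()
      root.type = if random_int() % 2 == 0 { "foo" } else { "bar" }
```

With real sources, a batching policy on the input or output usually plays the same role, as described in the batching documentation linked above.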

## [](#fields)Fields

### [](#check)`check`

A [Bloblang query](https://docs.redpanda.com/connect/guides/bloblang/about/) that should return a boolean value indicating whether a message belongs to a given group.

**Type**: `string`

```yaml
# Examples:
check: this.type == "foo"

# ---

check: this.contents.urls.contains("https://benthos.dev/")

# ---

check: true
```

### [](#processors)`processors[]`

A list of [processors](https://docs.redpanda.com/connect/components/processors/about/) to execute on the newly formed group.

**Type**: `processor`

**Default**: `[]`

## [](#examples)Examples

### [](#grouped-processing)Grouped Processing

Imagine we have a batch of messages that we wish to split into a group of foos and a group of everything else, with each group sent to a different output destination. We also need to send the foos as a gzip-compressed tar archive. For this purpose we can use the `group_by` processor with a [`switch`](https://docs.redpanda.com/connect/components/outputs/switch/) output:

```yaml
pipeline:
  processors:
    - group_by:
      - check: content().contains("this is a foo")
        processors:
          - archive:
              format: tar
          - compress:
              algorithm: gzip
          - mapping: 'meta grouping = "foo"'

output:
  switch:
    cases:
      - check: meta("grouping") == "foo"
        output:
          gcp_pubsub:
            project: foo_prod
            topic: only_the_foos
      - output:
          gcp_pubsub:
            project: somewhere_else
            topic: no_foos_here
```