# git

---
title: git
latest-operator-version: v26.1.4
latest-console-tag: v3.7.3
latest-connect-version: 4.93.0
latest-redpanda-tag: v26.1.9
docname: connect/components/inputs/git
page-component-name: cloud-data-platform
page-version: master
page-component-version: master
page-component-title: Cloud
page-relative-src-path: connect/components/inputs/git.adoc
page-edit-url: https://github.com/redpanda-data/cloud-docs/edit/main/modules/develop/pages/connect/components/inputs/git.adoc
page-git-created-date: "2025-05-02"
page-git-modified-date: "2026-05-26"
---

<!-- Source: https://docs.redpanda.com/cloud-data-platform/develop/connect/components/inputs/git.md -->

**Available in:** Cloud, [Self-Managed](https://docs.redpanda.com/connect/components/inputs/git/ "View the Self-Managed version of this component")

Clones a Git repository, reads its contents, then polls for new commits at a configurable interval. Any updates are emitted as new messages.

```yml
input:
  label: ""
  git:
    repository_url: "" # No default (required)
    branch: main
    poll_interval: 10s
    include_patterns: []
    exclude_patterns: []
    max_file_size: 10485760
    checkpoint_cache: "" # No default (optional)
    checkpoint_key: git_last_commit
    auth:
      basic:
        username: ""
        password: ""
      ssh_key:
        private_key_path: ""
        private_key: ""
        passphrase: ""
      token:
        value: ""
    auto_replay_nacks: true
```

## [](#metadata)Metadata

This input adds the following metadata fields to each message:

-   `git_file_path`

-   `git_file_size`

-   `git_file_mode`

-   `git_file_modified`

-   `git_commit`

-   `git_mime_type`

-   `git_is_binary`

-   `git_deleted` (when a source file is deleted)


You can access these metadata fields using [function interpolation](https://docs.redpanda.com/cloud-data-platform/develop/connect/configuration/interpolation/#bloblang-queries).
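
As a minimal sketch of how these fields might be used, the following pipeline logs the path and commit of each file with function interpolation, then copies the same metadata into a structured payload. The output field names (`path`, `commit`, `body`) are illustrative assumptions, and `content().string()` is only a sensible choice for text files.

```yaml
pipeline:
  processors:
    # Interpolate metadata into a log line for each file read.
    - log:
        level: INFO
        message: 'Read ${! metadata("git_file_path") } at commit ${! metadata("git_commit") }'
    # Build a structured document from the metadata and the file contents.
    - mapping: |
        root.path = metadata("git_file_path")
        root.commit = metadata("git_commit")
        root.body = content().string()
```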

## [](#fields)Fields

### [](#auth)`auth`

Options for authenticating with your Git repository.

**Type**: `object`

### [](#auth-basic)`auth.basic`

Allows you to specify basic authentication.

**Type**: `object`

### [](#auth-basic-password)`auth.basic.password`

A password to authenticate with.

> ⚠️ **CAUTION**
>
> This field contains sensitive information that usually shouldn’t be added to a configuration directly. For more information, see [Manage Secrets](https://docs.redpanda.com/cloud-data-platform/develop/connect/configuration/secret-management/) before adding it to your configuration.

**Type**: `string`

**Default**: `""`

### [](#auth-basic-username)`auth.basic.username`

The username to use for authentication.

**Type**: `string`

**Default**: `""`

### [](#auth-ssh_key)`auth.ssh_key`

Allows you to specify SSH key authentication.

**Type**: `object`

### [](#auth-ssh_key-passphrase)`auth.ssh_key.passphrase`

The passphrase for your SSH private key.

> ⚠️ **CAUTION**
>
> This field contains sensitive information that usually shouldn’t be added to a configuration directly. For more information, see [Manage Secrets](https://docs.redpanda.com/cloud-data-platform/develop/connect/configuration/secret-management/) before adding it to your configuration.

**Type**: `string`

**Default**: `""`

### [](#auth-ssh_key-private_key)`auth.ssh_key.private_key`

Your private SSH key. When using an encrypted key, you must also set a value for [`passphrase`](#auth-ssh_key-passphrase).

> ⚠️ **CAUTION**
>
> This field contains sensitive information that usually shouldn’t be added to a configuration directly. For more information, see [Manage Secrets](https://docs.redpanda.com/cloud-data-platform/develop/connect/configuration/secret-management/) before adding it to your configuration.

**Type**: `string`

**Default**: `""`

### [](#auth-ssh_key-private_key_path)`auth.ssh_key.private_key_path`

The path to your private SSH key file. When using an encrypted key, you must also set a value for [`passphrase`](#auth-ssh_key-passphrase).

**Type**: `string`

**Default**: `""`
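
The following sketch shows one way the SSH key fields could fit together for an encrypted key. It assumes the `${secrets.NAME}` reference form described in Manage Secrets; the secret names `GIT_SSH_KEY` and `GIT_SSH_PASSPHRASE` and the SSH-style repository URL are hypothetical and may need adjusting for your Git host.

```yaml
input:
  git:
    # Hypothetical SSH-style URL; check the form your Git host expects.
    repository_url: git@github.com:username/repo.git
    branch: main
    auth:
      ssh_key:
        # Hypothetical secret names; keep key material out of the config itself.
        private_key: "${secrets.GIT_SSH_KEY}"
        passphrase: "${secrets.GIT_SSH_PASSPHRASE}"
```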

### [](#auth-token)`auth.token`

Allows you to specify token-based authentication.

**Type**: `object`

### [](#auth-token-value)`auth.token.value`

The token value to use for token-based authentication.

> ⚠️ **CAUTION**
>
> This field contains sensitive information that usually shouldn’t be added to a configuration directly. For more information, see [Manage Secrets](https://docs.redpanda.com/cloud-data-platform/develop/connect/configuration/secret-management/) before adding it to your configuration.

**Type**: `string`

**Default**: `""`
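
A minimal sketch of token-based authentication, assuming the `${secrets.NAME}` reference form described in Manage Secrets. The secret name `GIT_TOKEN` is hypothetical.

```yaml
input:
  git:
    repository_url: https://github.com/username/repo.git
    branch: main
    auth:
      token:
        # Hypothetical secret name; avoids placing the token in the config directly.
        value: "${secrets.GIT_TOKEN}"
```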

### [](#auto_replay_nacks)`auto_replay_nacks`

Whether to automatically replay messages that are rejected (nacked) at the output level. If the cause of rejections is persistent, leaving this option enabled can result in back pressure.

Set `auto_replay_nacks` to `false` to delete rejected messages. Disabling auto replays can greatly improve memory efficiency of high throughput streams, as the original shape of the data is discarded immediately upon consumption and mutation.

**Type**: `bool`

**Default**: `true`

### [](#branch)`branch`

The repository branch to check out.

**Type**: `string`

**Default**: `main`

### [](#checkpoint_cache)`checkpoint_cache`

Specify a [`cache`](https://docs.redpanda.com/cloud-data-platform/develop/connect/components/caches/about/) resource to store the last processed commit hash. After a restart, Redpanda Connect can then continue processing changes from where it left off, avoiding the need to reprocess all detected updates.

**Type**: `string`

### [](#checkpoint_key)`checkpoint_key`

The key to use when storing the last processed commit hash in the cache.

**Type**: `string`

**Default**: `git_last_commit`
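
As an illustration of how `checkpoint_cache` and `checkpoint_key` relate, the sketch below points the input at a cache resource. The resource label `git_checkpoints` and the Redis URL are assumptions; a `redis` cache is used here because an in-memory cache would not retain the commit hash across restarts.

```yaml
input:
  git:
    repository_url: https://github.com/username/repo.git
    checkpoint_cache: git_checkpoints # Must match the cache resource label below
    checkpoint_key: git_last_commit   # Key under which the commit hash is stored

cache_resources:
  - label: git_checkpoints
    redis:
      url: redis://localhost:6379 # Hypothetical address
```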

### [](#exclude_patterns)`exclude_patterns[]`

A list of file patterns to exclude. For example, you could choose not to read content from certain Git directories or image files: `'.git/**', '**/*.png'`. These patterns take precedence over `include_patterns`.

The following patterns are supported:

-   Glob patterns: `*`, `/**/`, `?`

-   Character ranges: `[a-z]`. Escape any character with a special meaning using a backslash.


**Type**: `array`

**Default**: `[]`

### [](#include_patterns)`include_patterns[]`

A list of file patterns to read from. For example, you could read content from only Markdown and YAML files: `'**/*.md', 'configs/*.yaml'`.

The following patterns are supported:

-   Glob patterns: `*`, `/**/`, `?`

-   Character ranges: `[a-z]`. Escape any character with a special meaning using a backslash.


If this field is left empty, all files are read.

**Type**: `array`

**Default**: `[]`
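
The sketch below combines both pattern lists. The `docs/drafts/` directory is a hypothetical path, chosen to show an exclude pattern overriding an include pattern that would otherwise match the same files.

```yaml
input:
  git:
    repository_url: https://github.com/username/repo.git
    include_patterns:
      - '**/*.md'        # Markdown files in any directory
      - 'configs/*.yaml' # YAML files directly under configs/
    exclude_patterns:
      - 'docs/drafts/**' # Hypothetical path; skipped even though *.md files match above
```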

### [](#max_file_size)`max_file_size`

The maximum size, in bytes, of files to read. Files that exceed this limit are skipped. Set to `0` for unlimited file sizes. The default of `10485760` bytes is 10 MiB.

**Type**: `int`

**Default**: `10485760`
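
Because the limit is expressed in bytes, a smaller cap has to be converted by hand. As a sketch, 1 MiB is 1024 × 1024 = 1048576 bytes:

```yaml
input:
  git:
    repository_url: https://github.com/username/repo.git
    max_file_size: 1048576 # 1 MiB; larger files are skipped
```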

### [](#poll_interval)`poll_interval`

How frequently this input polls the Git repository for changes.

**Type**: `string`

**Default**: `10s`

```yaml
# Examples:
poll_interval: 10s
```

### [](#repository_url)`repository_url`

The URL of the Git repository to clone.

**Type**: `string`

```yaml
# Examples:
repository_url: https://github.com/username/repo.git
```