# microsoft_sql_server_cdc

> For the complete documentation index, see [llms.txt](https://docs.redpanda.com/llms.txt). Component-specific: [connect-full.txt](https://docs.redpanda.com/connect-full.txt)

---
title: microsoft_sql_server_cdc
latest-connect-version: 4.93.0
latest-operator-version: v26.1.4
latest-console-tag: v3.7.3
latest-redpanda-tag: v26.1.9
docname: inputs/microsoft_sql_server_cdc
page-component-name: connect
page-version: master
page-component-version: master
page-component-title: Connect
page-relative-src-path: inputs/microsoft_sql_server_cdc.adoc
page-edit-url: https://github.com/redpanda-data/rp-connect-docs/edit/main/modules/components/pages/inputs/microsoft_sql_server_cdc.adoc
description: Enables Change Data Capture by consuming from Microsoft SQL Server's change tables.
page-git-created-date: "2025-10-24"
page-git-modified-date: "2026-05-26"
---

<!-- Source: https://docs.redpanda.com/connect/components/inputs/microsoft_sql_server_cdc.md -->

**Available in:** [Cloud](https://docs.redpanda.com/cloud-data-platform/develop/connect/components/inputs/microsoft_sql_server_cdc/%20%22View%20the%20Cloud%20version%20of%20this%20component%22), Self-Managed

**License**: This component requires an [enterprise license](https://docs.redpanda.com/redpanda-connect/get-started/licensing/). You can either [upgrade to an Enterprise Edition license](https://www.redpanda.com/upgrade), or [generate a trial license key](http://redpanda.com/try-enterprise) that's valid for 30 days.

Enables Change Data Capture by consuming from Microsoft SQL Server’s change tables.

Introduced in version 4.67.5.

#### Common

```yaml
inputs:
  label: ""
  microsoft_sql_server_cdc:
    connection_string: "" # No default (required)
    stream_snapshot: false
    max_parallel_snapshot_tables: 1
    snapshot_max_batch_size: 1000
    include: [] # No default (required)
    exclude: [] # No default (optional)
    checkpoint_cache: "" # No default (optional)
    checkpoint_cache_table_name: rpcn.CdcCheckpointCache
    checkpoint_cache_connection_string: "" # No default (optional)
    checkpoint_cache_key: microsoft_sql_server_cdc
    checkpoint_limit: 1024
    stream_backoff_interval: 5s
    auto_replay_nacks: true
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""
      processors: [] # No default (optional)
```

#### Advanced

```yaml
inputs:
  label: ""
  microsoft_sql_server_cdc:
    connection_string: "" # No default (required)
    stream_snapshot: false
    max_parallel_snapshot_tables: 1
    snapshot_max_batch_size: 1000
    include: [] # No default (required)
    exclude: [] # No default (optional)
    checkpoint_cache: "" # No default (optional)
    checkpoint_cache_table_name: rpcn.CdcCheckpointCache
    checkpoint_cache_connection_string: "" # No default (optional)
    checkpoint_cache_key: microsoft_sql_server_cdc
    checkpoint_limit: 1024
    stream_backoff_interval: 5s
    auto_replay_nacks: true
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""
      processors: [] # No default (optional)
```

Streams changes from a Microsoft SQL Server database for Change Data Capture (CDC). Additionally, if `stream_snapshot` is set to true, then the existing data in the database is also streamed too.

## [](#metadata)Metadata

This input adds the following metadata fields to each message:

-   schema (Schema of the table that the message originated from)

-   table (Name of the table that the message originated from)

-   operation (Type of operation that generated the message: "read", "delete", "insert", or "update\_before" and "update\_after". "read" is from messages that are read in the initial snapshot phase.)

-   lsn (the Log Sequence Number in Microsoft SQL Server)


## [](#permissions)Permissions

To use the default Microsoft SQL Server cache, the user must have permissions to create tables and stored procedures. Refer to [`checkpoint_cache_table_name`](#checkpoint_cache_table_name) for additional details.

## [](#fields)Fields

### [](#auto_replay_nacks)`auto_replay_nacks`

Whether to automatically replay messages that are rejected (nacked) at the output level. If the cause of rejections is persistent, leaving this option enabled can result in back pressure.

Set `auto_replay_nacks` to `false` to delete rejected messages. Disabling auto replays can greatly improve memory efficiency of high throughput streams, as the original shape of the data is discarded immediately upon consumption and mutation.

**Type**: `bool`

**Default**: `true`

### [](#batching)`batching`

Configure a [batching policy](https://docs.redpanda.com/connect/configuration/batching/).

**Type**: `object`

```yaml
# Examples:
batching:
  byte_size: 5000
  count: 0
  period: 1s

# ---

batching:
  count: 10
  period: 1s

# ---

batching:
  check: this.contains("END BATCH")
  count: 0
  period: 1m
```

### [](#batching-byte_size)`batching.byte_size`

The number of bytes at which the batch is flushed. Set to `0` to disable size-based batching.

**Type**: `int`

**Default**: `0`

### [](#batching-check)`batching.check`

A [Bloblang query](https://docs.redpanda.com/connect/guides/bloblang/about/) that returns a boolean value indicating whether a message should end a batch.

**Type**: `string`

**Default**: `""`

```yaml
# Examples:
check: this.type == "end_of_transaction"
```

### [](#batching-count)`batching.count`

The number of messages after which the batch is flushed. Set to `0` to disable count-based batching.

**Type**: `int`

**Default**: `0`

### [](#batching-period)`batching.period`

The period of time after which an incomplete batch is flushed regardless of its size. This field accepts Go duration format strings such as `100ms`, `1s`, or `5s`.

**Type**: `string`

**Default**: `""`

```yaml
# Examples:
period: 1s

# ---

period: 1m

# ---

period: 500ms
```

### [](#batching-processors)`batching.processors[]`

A list of [processors](https://docs.redpanda.com/connect/components/processors/about/) to apply to a batch as it is flushed. This allows you to aggregate and archive the batch however you see fit. All resulting messages are flushed as a single batch, and therefore splitting the batch into smaller batches using these processors is a no-op.

**Type**: `processor`

```yaml
# Examples:
processors:
  - archive:
      format: concatenate

# ---

processors:
  - archive:
      format: lines

# ---

processors:
  - archive:
      format: json_array
```

### [](#checkpoint_cache)`checkpoint_cache`

A [cache resource](https://docs.redpanda.com/connect/components/caches/about/) to store the current Log Sequence Number (LSN) position. This enables the connector to resume from the last processed position after restarts, preventing data loss and duplicate processing. The cache stores the highest LSN that has been successfully delivered downstream.

**Type**: `string`

### [](#checkpoint_cache_connection_string)`checkpoint_cache_connection_string`

An optional connection string for a remote Microsoft SQL Server to use for the checkpoint cache. When set, this creates the checkpoint cache table on the remote server instead of the source database. If `checkpoint_cache` is also set, that takes precedence.

**Type**: `string`

```yaml
# Examples:
checkpoint_cache_connection_string: sqlserver://username:password@remotehost/instance?param1=value&param2=value
```

### [](#checkpoint_cache_key)`checkpoint_cache_key`

The key to use to store the snapshot position in `checkpoint_cache`. An alternative key can be provided if multiple CDC inputs share the same cache.

**Type**: `string`

**Default**: `microsoft_sql_server_cdc`

### [](#checkpoint_cache_table_name)`checkpoint_cache_table_name`

The multipart identifier for the checkpoint cache table name. If no `checkpoint_cache` field is specified, this input will automatically create a table and stored procedure under the `rpcn` schema to act as a checkpoint cache. This table stores the latest processed Log Sequence Number (LSN) that has been successfully delivered, allowing Redpanda Connect to resume from that point upon restart rather than reconsume the entire change table.

**Type**: `string`

**Default**: `rpcn.CdcCheckpointCache`

```yaml
# Examples:
checkpoint_cache_table_name: dbo.checkpoint_cache
```

### [](#checkpoint_limit)`checkpoint_limit`

The maximum number of messages that can be processed concurrently before applying back pressure. Higher values enable better parallelization and batching but increase memory usage. Messages are processed in LSN order, and a given LSN is only acknowledged after all previous LSNs have been successfully delivered, ensuring at-least-once guarantees.

**Type**: `int`

**Default**: `1024`

### [](#connection_string)`connection_string`

The connection string for the Microsoft SQL Server database. Use the format `sqlserver://username:password@host/instance?param1=value&param2=value`. For Windows Authentication, use `sqlserver://host/instance?trusted_connection=yes`. Include additional parameters like `TrustServerCertificate=true` for self-signed certificates or `encrypt=disable` to disable encryption.

**Type**: `string`

```yaml
# Examples:
connection_string: sqlserver://username:password@host/instance?param1=value&param2=value
```

### [](#exclude)`exclude[]`

Regular expressions for tables to exclude from CDC streaming. Use this to filter out specific tables from the include patterns. Table names should follow the `schema.table` format. Exclude patterns are applied after include patterns, allowing you to include broad patterns while excluding specific tables.

**Type**: `array`

```yaml
# Examples:
exclude: dbo.privatetable
```

### [](#include)`include[]`

Regular expressions for tables to include in CDC streaming. Specify table names using the format `schema.table` (such as `dbo.orders`, `sales.customers`). Each pattern is treated as a regular expression, allowing wildcards and pattern matching. All specified tables must have CDC enabled in SQL Server.

**Type**: `array`

```yaml
# Examples:
include: dbo.products
```

### [](#max_parallel_snapshot_tables)`max_parallel_snapshot_tables`

Specifies a number of tables that will be processed in parallel during the snapshot processing stage.

**Type**: `int`

**Default**: `1`

### [](#snapshot_max_batch_size)`snapshot_max_batch_size`

The maximum number of rows to stream in a single batch during the initial snapshot phase. Larger batch sizes can improve throughput for initial data loads but may increase memory usage. This setting only applies when `stream_snapshot` is enabled.

**Type**: `int`

**Default**: `1000`

### [](#stream_backoff_interval)`stream_backoff_interval`

The time interval to wait between polling attempts when no new CDC data is available. For low-traffic tables, increasing this value reduces database load and network traffic. Use Go duration format like `5s`, `30s`, or `1m`. Shorter intervals provide lower latency for new changes but increase server load.

**Type**: `string`

**Default**: `5s`

```yaml
# Examples:
stream_backoff_interval: 5s

# ---

stream_backoff_interval: 1m
```

### [](#stream_snapshot)`stream_snapshot`

Whether to stream a snapshot of all existing data before streaming CDC changes. When enabled, the connector first queries all existing table data, then switches to streaming incremental changes from the transaction log. Set to `false` to start streaming only new changes from the current LSN position.

**Type**: `bool`

**Default**: `false`

```yaml
# Examples:
stream_snapshot: true
```