cassandra
Executes a query and creates a message for each row received.
Common
input:
  label: ""
  cassandra:
    addresses: [] # No default (required)
    timeout: 600ms
    reconnect_interval: 60s
    query: "" # No default (required)
    auto_replay_nacks: true
Advanced
input:
  label: ""
  cassandra:
    addresses: [] # No default (required)
    tls:
      enabled: false
      skip_cert_verify: false
      enable_renegotiation: false
      root_cas: ""
      root_cas_file: ""
      client_certs: []
    password_authenticator:
      enabled: false
      username: ""
      password: ""
    disable_initial_host_lookup: false
    max_retries: 3
    backoff:
      initial_interval: 1s
      max_interval: 5s
    timeout: 600ms
    host_selection_policy:
      local_dc: "" # No default (optional)
      local_rack: "" # No default (optional)
    reconnect_interval: 60s
    exponential_reconnection:
      max_retries: "" # No default (required)
      initial_interval: "" # No default (required)
      max_interval: "" # No default (required)
    query: "" # No default (required)
    auto_replay_nacks: true
Examples
Minimal Select (Cassandra/Scylla)
Let’s presume that we have 3 Cassandra nodes, set up as in the tutorial by Sebastian Sigl from freeCodeCamp.
To select everything from the table users_by_country, use the configuration below. With the stdout output, each row is emitted as a message that looks like:
{"age":23,"country":"UK","first_name":"Bob","last_name":"Sandler","user_email":"bob@email.com"}
This configuration also works for Scylla.
input:
  cassandra:
    addresses:
      - 172.17.0.2
    query:
      'SELECT * FROM learn_cassandra.users_by_country'
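To print the rows as JSON lines like the sample above, the input can be paired with an output that writes to standard output. The following is a minimal sketch that assumes the same single node address and keyspace as the example; the `stdout` output block is an addition for illustration and is not part of the original example.

```yaml
input:
  cassandra:
    addresses:
      - 172.17.0.2
    query: 'SELECT * FROM learn_cassandra.users_by_country'

# Assumption: rows are written to standard output, one message per row.
output:
  stdout: {}
```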
Fields
addresses[]
A list of Cassandra nodes to connect to. Multiple comma-separated addresses can be specified on a single line.
Type: array
# Examples:
addresses:
  - "localhost:9042"
# ---
addresses:
  - "foo:9042"
  - "bar:9042"
# ---
addresses:
  - "foo:9042,bar:9042"
auto_replay_nacks
Whether messages that are rejected (nacked) at the output level should be automatically replayed indefinitely, eventually resulting in back pressure if the cause of the rejections is persistent. If set to false, these messages are deleted instead. Disabling auto replays can greatly improve the memory efficiency of high-throughput streams, because the original shape of the data can be discarded immediately upon consumption and mutation.
Type: bool
Default: true
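As an illustration of the trade-off described above, the sketch below disables automatic replays, so rejected messages are dropped rather than retried. The address and query are placeholders reused from the examples on this page.

```yaml
input:
  cassandra:
    addresses:
      - "localhost:9042"
    query: 'SELECT * FROM learn_cassandra.users_by_country'
    # Nacked messages are deleted instead of replayed; this may reduce memory
    # use, at the cost of losing any message the output rejects.
    auto_replay_nacks: false
```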
backoff.initial_interval
The initial period to wait between retry attempts. The retry interval increases for each failed attempt, up to the backoff.max_interval value. This field accepts Go duration format strings such as 100ms, 1s, or 5s.
Type: string
Default: 1s
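The retry fields work together: max_retries bounds the number of attempts, while backoff.initial_interval and backoff.max_interval bound the wait between them. A minimal sketch follows; the values are illustrative assumptions, not recommendations.

```yaml
input:
  cassandra:
    addresses:
      - "localhost:9042"
    query: 'SELECT * FROM learn_cassandra.users_by_country'
    max_retries: 5            # illustrative; the default is 3
    backoff:
      initial_interval: 500ms # first wait between retry attempts
      max_interval: 10s       # the wait grows per failure but never exceeds this
```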
disable_initial_host_lookup
If enabled the driver will not attempt to get host info from the system.peers table. This can speed up queries but will mean that data_centre, rack and token information will not be available.
Type: bool
Default: false
exponential_reconnection
Configure exponential backoff for reconnection attempts to DOWN nodes. When enabled, this replaces the driver’s default constant reconnection policy with an exponential backoff strategy that gradually increases the delay between reconnection attempts. This reduces connection storm scenarios during widespread outages while ensuring eventual recovery.
Requires version 4.66.0 or later.
Type: object
exponential_reconnection.initial_interval
The initial period to wait between retry attempts.
Type: string
exponential_reconnection.max_interval
The maximum period to wait between retry attempts.
Type: string
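A sketch of an exponential_reconnection block is shown below. All three subfields are marked as required in the advanced configuration above. The values are illustrative, and the sketch assumes that max_retries accepts an integer and that the interval fields accept Go duration strings like the other duration fields on this page.

```yaml
input:
  cassandra:
    addresses:
      - "localhost:9042"
    query: 'SELECT * FROM learn_cassandra.users_by_country'
    exponential_reconnection:
      max_retries: 10      # assumed integer; illustrative value
      initial_interval: 1s # assumed Go duration; illustrative value
      max_interval: 60s    # assumed Go duration; illustrative value
```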
host_selection_policy
Advanced host selection policy settings for Cassandra clusters. Use these options to optimize query routing in multi-datacenter (DC) and multi-rack deployments. By specifying a local DC and rack, you can ensure queries are directed to the closest nodes, reducing latency and improving fault tolerance. If not set, the default policy is round-robin across all available nodes. Host selection is always token-aware if the token can be calculated from the query.
Requires version 4.61.0 or later.
Type: object
# Examples:
host_selection_policy:
  local_dc: dc-east
  local_rack: rack1
host_selection_policy.local_dc
The name of the local datacenter to prioritize for query routing. Enables DC-aware host selection, ensuring queries are sent to nodes within this datacenter whenever possible. Recommended for clusters spanning multiple datacenters to minimize cross-DC traffic.
Type: string
host_selection_policy.local_rack
The name of the local rack to prioritize for query routing. Requires local_dc to be set. Enables rack-aware host selection, further optimizing query placement within the specified datacenter. Useful for deployments with multiple racks per datacenter to improve resilience and reduce intra-DC latency.
Type: string
password_authenticator.password
The password to authenticate with.
This field contains sensitive information that usually shouldn’t be added to a configuration directly. For more information, see Secrets.
Type: string
Default: ""
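One way to keep the password out of the configuration file is environment variable interpolation, as in the ${KEY_PASSWORD} example used elsewhere on this page. The sketch below assumes a hypothetical username and a hypothetical CASSANDRA_PASSWORD environment variable.

```yaml
input:
  cassandra:
    addresses:
      - "localhost:9042"
    query: 'SELECT * FROM learn_cassandra.users_by_country'
    password_authenticator:
      enabled: true
      username: cassandra_user        # hypothetical username
      password: ${CASSANDRA_PASSWORD} # hypothetical variable, resolved from the environment
```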
reconnect_interval
The interval at which Redpanda Connect attempts to reconnect to Cassandra nodes that are marked as DOWN. This setting helps maintain connectivity in unstable network conditions or during node maintenance. Use Go duration format such as 30s, 1m, or 5m. Setting this too low may create unnecessary connection attempts, while setting it too high may delay recovery from network issues.
Requires version 4.66.0 or later.
Type: string
Default: 60s
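For example, a deployment that wants faster recovery after node maintenance might shorten the interval. The value below is an illustrative assumption, not a recommendation.

```yaml
input:
  cassandra:
    addresses:
      - "localhost:9042"
    query: 'SELECT * FROM learn_cassandra.users_by_country'
    # Retry DOWN nodes every 30 seconds instead of the default 60s.
    reconnect_interval: 30s
```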
tls.client_certs[]
A list of client certificates to use. For each certificate either the fields cert and key, or cert_file and key_file should be specified, but not both.
Type: object
Default: []
# Examples:
client_certs:
  - cert: foo
    key: bar
# ---
client_certs:
  - cert_file: ./example.pem
    key_file: ./example.key
tls.client_certs[].key
A plain text certificate key to use.
This field contains sensitive information that usually shouldn’t be added to a configuration directly. For more information, see Secrets.
Type: string
Default: ""
tls.client_certs[].password
A plain text password for when the private key is password encrypted in PKCS#1 or PKCS#8 format. The obsolete pbeWithMD5AndDES-CBC algorithm is not supported for the PKCS#8 format.
Because the obsolete pbeWithMD5AndDES-CBC algorithm does not authenticate the ciphertext, it is vulnerable to padding oracle attacks that can let an attacker recover the plaintext.
This field contains sensitive information that usually shouldn’t be added to a configuration directly. For more information, see Secrets.
Type: string
Default: ""
# Examples:
password: foo
# ---
password: ${KEY_PASSWORD}
tls.enable_renegotiation
Whether to allow the remote server to repeatedly request renegotiation. Enable this option if you’re seeing the error message local error: tls: no renegotiation.
Requires version 3.45.0 or later.
Type: bool
Default: false
tls.root_cas
An optional root certificate authority to use. This is a string, representing a certificate chain from the parent trusted root certificate, to possible intermediate signing certificates, to the host certificate.
This field contains sensitive information that usually shouldn’t be added to a configuration directly. For more information, see Secrets.
Type: string
Default: ""
# Examples:
root_cas: |-
  -----BEGIN CERTIFICATE-----
  ...
  -----END CERTIFICATE-----
tls.root_cas_file
An optional path of a root certificate authority file to use. This is a file, often with a .pem extension, containing a certificate chain from the parent trusted root certificate, to possible intermediate signing certificates, to the host certificate.
Type: string
Default: ""
# Examples:
root_cas_file: ./root_cas.pem
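Putting the TLS fields together, a configuration that uses file-based certificates might look like the sketch below. The host name is hypothetical, and the file paths are the placeholder paths from the field examples above.

```yaml
input:
  cassandra:
    addresses:
      - "cassandra.example.internal:9042" # hypothetical host
    query: 'SELECT * FROM learn_cassandra.users_by_country'
    tls:
      enabled: true
      root_cas_file: ./root_cas.pem       # CA chain used to verify the server
      client_certs:
        - cert_file: ./example.pem        # client certificate
          key_file: ./example.key         # matching private key
```

Referencing files rather than inlining cert, key, or root_cas keeps sensitive material out of the configuration itself.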