sftp

Consumes files from an SFTP server.

  • Common

  • Advanced

# Common config fields, showing default values
input:
  label: ""
  sftp:
    address: "" # No default (required)
    credentials:
      username: ""
      password: ""
      private_key_file: ""
      private_key_pass: ""
    paths: [] # No default (required)
    auto_replay_nacks: true
    scanner:
      to_the_end: {}
    watcher:
      enabled: false
      minimum_age: 1s
      poll_interval: 1s
      cache: ""
# All config fields, showing default values
input:
  label: ""
  sftp:
    address: "" # No default (required)
    credentials:
      username: ""
      password: ""
      private_key_file: ""
      private_key_pass: ""
    paths: [] # No default (required)
    auto_replay_nacks: true
    scanner:
      to_the_end: {}
    delete_on_finish: false
    watcher:
      enabled: false
      minimum_age: 1s
      poll_interval: 1s
      cache: ""

Metadata

This input adds the following metadata fields to each message:

  • sftp_path

You can access these metadata fields using function interpolation.

Fields

address

The address of the server to connect to.

Type: string

credentials

The credentials to use to log into the target server.

Type: object

credentials.username

The username to connect to the SFTP server.

Type: string

Default: ""

credentials.password

The password for the username to connect to the SFTP server.

This field contains sensitive information. Review your cluster security before adding it to your configuration.

Type: string

Default: ""

credentials.private_key_file

The private key for the username to connect to the SFTP server.

Type: string

Default: ""

credentials.private_key_pass

Optional passphrase for private key.

This field contains sensitive information. Review your cluster security before adding it to your configuration.

Type: string

Default: ""

paths

A list of paths to consume sequentially. Glob patterns are supported.

Type: array

auto_replay_nacks

Whether messages that are rejected (nacked) at the output level should be automatically replayed indefinitely, eventually resulting in back pressure if the cause of the rejections is persistent. If set to false these messages will instead be deleted. Disabling auto replays can greatly improve memory efficiency of high throughput streams as the original shape of the data can be discarded immediately upon consumption and mutation.

Type: bool

Default: true

scanner

The scanner by which the stream of bytes consumed will be broken out into individual messages. Scanners are useful for processing large sources of data without holding the entirety of it within memory. For example, the csv scanner allows you to process individual CSV rows without loading the entire CSV file in memory at once.

Type: scanner

Default: {"to_the_end":{}}

delete_on_finish

Whether to delete files from the server once they are processed.

Type: bool

Default: false

watcher

An experimental mode whereby the input will periodically scan the target paths for new files and consume them, when all files are consumed the input will continue polling for new files.

Type: object

watcher.enabled

Whether file watching is enabled.

Type: bool

Default: false

watcher.minimum_age

The minimum period of time since a file was last updated before attempting to consume it. Increasing this period decreases the likelihood that a file will be consumed whilst it is still being written to.

Type: string

Default: "1s"

# Examples

minimum_age: 10s

minimum_age: 1m

minimum_age: 10m

watcher.poll_interval

The interval between each attempt to scan the target paths for new files.

Type: string

Default: "1s"

# Examples

poll_interval: 100ms

poll_interval: 1s

watcher.cache

A cache resource for storing the paths of files already consumed.

Type: string

Default: ""