file

Available in: Self-Managed

Consumes data from files on disk, emitting messages according to a chosen codec.

    # Common config fields, showing default values
    input:
      label: ""
      file:
        paths: [] # No default (required)
        scanner:
          lines: {}
        auto_replay_nacks: true

    # All config fields, showing default values
    input:
      label: ""
      file:
        paths: [] # No default (required)
        scanner:
          lines: {}
        delete_on_finish: false
        auto_replay_nacks: true

Metadata

This input adds the following metadata fields to each message:

- path
- mod_time_unix
- mod_time (RFC3339)

You can access these metadata fields using function interpolation.

Fields

paths

A list of paths to consume sequentially. Glob patterns are supported, including super globs (double star).

Type: array

scanner

The scanner by which the stream of bytes consumed will be broken out into individual messages. Scanners are useful for processing large sources of data without holding the entirety of it within memory. For example, the csv scanner allows you to process individual CSV rows without loading the entire CSV file in memory at once.

Type: scanner
Default: {"lines":{}}
Requires version 4.25.0 or newer

delete_on_finish

Whether to delete input files from the disk once they are fully consumed.

Type: bool
Default: false

auto_replay_nacks

Whether messages that are rejected (nacked) at the output level should be automatically replayed indefinitely, eventually resulting in back pressure if the cause of the rejections is persistent. If set to false these messages will instead be deleted. Disabling auto replays can greatly improve memory efficiency of high throughput streams as the original shape of the data can be discarded immediately upon consumption and mutation.
Type: bool
Default: true

Examples

Read a Bunch of CSVs

To consume a directory of CSV files as structured documents, use a glob pattern and the csv scanner:

    input:
      file:
        paths: [ ./data/*.csv ]
        scanner:
          csv: {}
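As a further illustration of the fields and metadata described above, the following is a minimal sketch rather than an official example. It combines a super glob, the default lines scanner, and delete_on_finish, then copies the path and mod_time metadata into each message. The label, the ./logs directory, and the output field names are hypothetical, and the sketch assumes the Bloblang metadata function and the mapping processor are available in your version.

```yaml
input:
  label: "log_files"              # hypothetical label
  file:
    # Hypothetical directory; the double star (super glob) also matches nested folders.
    paths: [ ./logs/**/*.log ]
    scanner:
      lines: {}                   # default scanner: one message per line
    delete_on_finish: true        # deletes each file from disk once it is fully consumed
    auto_replay_nacks: true       # default: rejected messages are replayed

pipeline:
  processors:
    - mapping: |
        # Field names on the left are illustrative only.
        root.line = content().string()
        root.source_path = metadata("path")
        root.modified_at = metadata("mod_time")
```

Because delete_on_finish removes the source files, it may be safer to leave it at its default of false until the rest of the pipeline has been verified.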