switch

Select a child scanner dynamically for source data based on factors such as the filename.

# Config fields, showing default values
switch: [] # No default (required)

This scanner outlines a list of potential child scanner candidates to be chosen, and for each source of data the first candidate to pass will be selected. A candidate without any conditions acts as a catch-all and will pass for every source, it is recommended to always have a catch-all scanner at the end of your list. If a given source of data does not pass a candidate an error is returned and the data is rejected.

Fields

[].re_match_name

A regular expression to test against the name of each source of data fed into the scanner (filename or equivalent). If this pattern matches the child scanner is selected.

Type: string

[].scanner

The scanner to activate if this candidate passes.

Type: scanner

Examples

  • Switch based on file name

In this example a file input chooses a scanner based on the extension of each file

input:
  file:
    paths: [ ./data/* ]
    scanner:
      switch:
        - re_match_name: '\.avro$'
          scanner: { avro: {} }

        - re_match_name: '\.csv$'
          scanner: { csv: {} }

        - re_match_name: '\.csv.gz$'
          scanner:
            decompress:
              algorithm: gzip
              into:
                csv: {}

        - re_match_name: '\.tar$'
          scanner: { tar: {} }

        - re_match_name: '\.tar.gz$'
          scanner:
            decompress:
              algorithm: gzip
              into:
                tar: {}

        - scanner: { to_the_end: {} }