azure_data_lake_gen2

Beta

Sends message parts as files to an Azure Data Lake Gen2 file system. Each file is uploaded with the file name specified in the path field.

# Configuration fields, showing default values
output:
  label: ""
  azure_data_lake_gen2:
    storage_account: "" # No default (optional)
    storage_access_key: "" # No default (optional)
    storage_connection_string: "" # No default (optional)
    storage_sas_token: "" # No default (optional)
    filesystem: messages-${!timestamp("2006")} # No default (required)
    path: ${!counter()}-${!timestamp_unix_nano()}.txt
    max_in_flight: 64

To give each file a different path (file name), use interpolation functions in the path field. They are evaluated separately for each message in a batch.
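
As an illustration of per-message paths, the sketch below reuses one of the path examples listed under Fields. The account name and the message shape are assumptions made for the example, not values from this page.

```yaml
output:
  azure_data_lake_gen2:
    storage_account: examplestorageacct # hypothetical account name
    filesystem: messages-${!timestamp("2006")}
    # Evaluated once per message. Assuming a message body such as
    # {"doc":{"namespace":"orders","id":"42"}}, the file would be
    # written to orders/42.json.
    path: ${!json("doc.namespace")}/${!json("doc.id")}.json
```

Because the path is derived from message content, two messages with the same namespace and id would resolve to the same file name, so a unique component such as the default counter and timestamp may be worth keeping.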

Authentication methods

This output supports multiple authentication methods. You must configure at least one method from the following list:

  • storage_connection_string

  • storage_account and storage_access_key

  • storage_account and storage_sas_token

  • storage_account only, to authenticate using DefaultAzureCredential

If you configure multiple authentication methods, the storage_connection_string takes precedence.
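
The options above might be laid out as in the following sketch. The account name is hypothetical, and the `${...}` references assume that the secrets are supplied through environment variables with these made-up names.

```yaml
output:
  azure_data_lake_gen2:
    storage_account: examplestorageacct # hypothetical account name
    # Enable at most one of the three credential fields below.
    storage_access_key: "${AZURE_STORAGE_ACCESS_KEY}" # assumed env var
    # storage_sas_token: "${AZURE_STORAGE_SAS_TOKEN}" # assumed env var
    # storage_connection_string: "${AZURE_STORAGE_CONNECTION_STRING}" # assumed env var
    filesystem: messages-${!timestamp("2006")}
    path: ${!counter()}-${!timestamp_unix_nano()}.txt
```

Leaving all three credential fields unset, with only storage_account populated, corresponds to the DefaultAzureCredential option in the list.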

Performance

This output sends multiple messages in parallel to improve throughput. Use the max_in_flight field to tune the maximum number of in-flight messages (or message batches).
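
A minimal sketch of raising the limit is shown here. The value 128 is purely illustrative and the account name is hypothetical; a suitable setting depends on the workload and is best found by measuring.

```yaml
output:
  azure_data_lake_gen2:
    storage_account: examplestorageacct # hypothetical account name
    filesystem: messages-${!timestamp("2006")}
    path: ${!counter()}-${!timestamp_unix_nano()}.txt
    max_in_flight: 128 # illustrative value; the default is 64
```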

Fields

storage_account

The storage account to access. This field is ignored when the storage_connection_string field is populated.

Type: string

Default: ""

storage_access_key

The access key for the storage account. Use this field along with storage_account for authentication. This field is ignored when the storage_connection_string field is populated.

Type: string

Default: ""

storage_connection_string

The connection string for the storage account. You must enter a value for this field if no other authentication method is specified.

If the storage_connection_string field does not contain the AccountName parameter value, specify it in the storage_account field.

Type: string

Default: ""
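
For the case where the connection string omits AccountName, a configuration could look like the sketch below. The account name and the environment variable name are assumptions, and the connection string itself is deliberately not spelled out.

```yaml
output:
  azure_data_lake_gen2:
    # Assumed env var holding a connection string without AccountName.
    storage_connection_string: "${AZURE_STORAGE_CONNECTION_STRING}"
    # Supplies the account name that the connection string lacks.
    storage_account: examplestorageacct # hypothetical account name
    filesystem: messages-${!timestamp("2006")}
```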

storage_sas_token

The SAS token for the storage account. Use this field along with storage_account for authentication. This field is ignored when either the storage_connection_string or the storage_access_key field is populated.

Type: string

Default: ""

filesystem

The name of the data lake storage file system you want to upload messages to. This field supports interpolation functions.

Type: string

# Examples
filesystem: messages-${!timestamp("2006")}

path

The path (file name) of each message to upload to the data lake storage file system. This field supports interpolation functions.

Type: string

Default: ${!counter()}-${!timestamp_unix_nano()}.txt

# Examples
path: ${!counter()}-${!timestamp_unix_nano()}.json
path: ${!meta("kafka_key")}.json
path: ${!json("doc.namespace")}/${!json("doc.id")}.json

max_in_flight

The maximum number of messages to have in flight at a given time. Increase this number to improve throughput until performance plateaus.

Type: int

Default: 64