Docs Cloud Redpanda Connect Components Processors parquet_decode parquet_decode Available in: Cloud, Self-Managed Decodes Parquet files into a batch of structured messages. # Configuration fields, showing default values label: "" parquet_decode: handle_logical_types: v1 Fields handle_logical_types Set to v2 to enable enhanced decoding of logical types, or keep the default value (v1) to ignore logical type metadata when decoding values. In Parquet format, logical types are represented using standard physical types along with metadata that provides additional context. For example, UUIDs are stored as a FIXED_LEN_BYTE_ARRAY physical type, but the schema metadata identifies them as UUIDs. By enabling v2, this processor uses the metadata descriptions of logical types to produce more meaningful values during decoding. For backward compatibility, this field enables logical-type handling for the specified Parquet format version, and all earlier versions. When creating new pipelines, Redpanda recommends that you use the newest documented version. Type: string Default: v1 Options: Option Description v1 No special handling of logical types. v2 Logical types with enhanced decoding: TIMESTAMP: Decodes as an RFC3339 string describing the time. If the isAdjustedToUTC flag is set to true in the Parquet file, the time zone is set to UTC. If the flag is set to false, the time zone is set to local time. UUID: Decodes as a string: 00112233-4455-6677-8899-aabbccddeeff. # Examples handle_logical_types: v2 Examples Reading Parquet files from AWS S3 In this example, a pipeline consumes Parquet files as soon as they are uploaded to an AWS S3 bucket. The pipeline listens to an SQS queue for upload events, and uses the to_the_end scanner to read the files into memory in full. The parquet_decode processor then decodes each file into a batch of structured messages. Finally, the data is written to local files in newline-delimited JSON format. input: aws_s3: bucket: TODO prefix: files/ scanner: to_the_end: {} sqs: url: TODO processors: - parquet_decode: handle_logical_types: v2 output: file: codec: lines path: './files/${! meta("s3_key") }.jsonl' Back to top × Simple online edits For simple changes, such as fixing a typo, you can edit the content directly on GitHub. Edit on GitHub Or, open an issue to let us know about something that you want us to change. Open an issue Contribution guide For extensive content updates, or if you prefer to work locally, read our contribution guide . Was this helpful? thumb_up thumb_down group Ask in the community mail Share your feedback group_add Make a contribution parallel parquet_encode