Monitor Data Transforms

This topic provides guidelines on how to monitor the health of your data transforms and view logs.

Prerequisites

Performance

You can identify performance bottlenecks by monitoring latency and CPU usage.

If latency is high, investigate the transform logic for inefficiencies or consider scaling the resources. High CPU usage might indicate the need for optimization in the code or an increase in allocated CPU resources.
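As a rough illustration of the latency check, the sketch below parses Prometheus text-format output and derives a mean execution latency from a histogram's `_sum` and `_count` series. The metric name `redpanda_transform_execution_latency_sec`, the `function_name` label, and the sample values are assumptions made for the example; confirm the actual names against your cluster's metrics endpoint. The parser also assumes that samples carry no trailing timestamps.

```python
def parse_samples(text):
    """Parse Prometheus text-format lines into (name, labels, value) tuples.

    Comment lines (starting with '#') and blank lines are skipped.
    Assumes each sample line ends with its value (no timestamp).
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        series, _, value = line.rpartition(" ")
        name, _, labels = series.partition("{")
        samples.append((name, labels.rstrip("}"), float(value)))
    return samples


def mean_latency_seconds(samples, histogram_name):
    """Return sum/count for a histogram, or None if nothing was observed."""
    total = sum(v for n, _, v in samples if n == histogram_name + "_sum")
    count = sum(v for n, _, v in samples if n == histogram_name + "_count")
    return total / count if count else None


# Hypothetical scrape output, for illustration only.
SAMPLE = """\
# TYPE redpanda_transform_execution_latency_sec histogram
redpanda_transform_execution_latency_sec_sum{function_name="demo"} 0.5
redpanda_transform_execution_latency_sec_count{function_name="demo"} 100
"""
```

Because histogram counters are cumulative, this value is a mean over the lifetime of the process. To see recent behavior, compute the same ratio from the difference between two scrapes.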

Reliability

Tracking execution errors and error states helps you maintain the reliability of your data transforms.

Make sure to implement robust error handling and logging within your transform functions to help with troubleshooting.

Resource usage

Monitoring memory usage and total execution time helps ensure that the Wasm engine stays within its allocated resources.

If memory usage is consistently high or exceeds the maximum allocated memory:

Review and optimize your transform functions to reduce memory consumption. This step can involve optimizing data structures, reducing memory allocations, and ensuring efficient handling of records.
Consider increasing the allocated memory for the Wasm engine. Adjust the data_transforms_per_core_memory_reservation and data_transforms_per_function_memory_limit settings to provide more memory to each function and the overall Wasm engine.
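The relationship between the two settings can be sketched with simple arithmetic. The example below assumes that the per-core reservation is shared among functions, each capped at the per-function limit, so their ratio bounds how many functions fit. The byte values and the 80% threshold are illustrative, not recommendations, and the helper names are hypothetical.

```python
MIB = 1024 * 1024


def max_functions_per_core(per_core_reservation, per_function_limit):
    """Number of functions that fit in the per-core reservation.

    Assumes each function may use up to per_function_limit bytes.
    """
    if per_function_limit <= 0:
        raise ValueError("per_function_limit must be positive")
    return per_core_reservation // per_function_limit


def memory_pressure(usage_bytes, max_bytes, threshold=0.8):
    """True when usage has reached the given fraction of the maximum."""
    return max_bytes > 0 and usage_bytes / max_bytes >= threshold


# Illustrative values: a 20 MiB reservation with a 2 MiB per-function limit.
print(max_functions_per_core(20 * MIB, 2 * MIB))
```

Under this assumption, raising the per-function limit without also raising the per-core reservation reduces the number of functions that fit, so the two settings are best adjusted together.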

Throughput

Tracking read and write bytes and processor lag helps you understand the data flow through your transforms, which supports capacity planning and scaling.

If there is a significant lag or low throughput, investigate potential bottlenecks in the data flow or consider scaling your infrastructure to handle higher throughput.
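To make "low throughput" and "significant lag" concrete, the sketch below turns two samples of a cumulative byte counter into a rate and checks whether a series of lag readings is strictly increasing. The function names are hypothetical, and the reset handling assumes the counter restarts from zero when the process restarts.

```python
def rate_per_second(prev_value, curr_value, elapsed_seconds):
    """Average rate between two samples of a cumulative counter."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    if curr_value < prev_value:
        # Counter went backwards: treat it as a reset to zero.
        return curr_value / elapsed_seconds
    return (curr_value - prev_value) / elapsed_seconds


def lag_is_growing(lag_samples):
    """True when every lag reading is larger than the one before it."""
    return len(lag_samples) >= 2 and all(
        later > earlier for earlier, later in zip(lag_samples, lag_samples[1:])
    )
```

A read rate that holds steady while lag keeps growing suggests that the transform cannot keep up with its input, which points at the transform logic or its allocated resources as the place to investigate first.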

View logs for data transforms

Runtime logs for transform functions are written to an internal topic called _redpanda.transform_logs. You can read these logs by using the rpk transform logs command.

rpk transform logs <transform-name>

Replace <transform-name> with the configured name of the transform function.
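If you want to filter logs from a script, one option is to wrap the command shown above. This is a minimal sketch, assuming rpk is on your PATH, is already configured to reach the cluster, and exits after printing the available logs. The helper names are hypothetical, and only the base command from this topic is used.

```python
import subprocess


def logs_command(transform_name):
    """Build the argument list for `rpk transform logs <transform-name>`."""
    return ["rpk", "transform", "logs", transform_name]


def matching_lines(lines, keyword):
    """Keep only lines that contain keyword, ignoring case."""
    needle = keyword.lower()
    return [line for line in lines if needle in line.lower()]


def fetch_matching_logs(transform_name, keyword):
    """Run rpk and return the log lines that mention keyword."""
    result = subprocess.run(
        logs_command(transform_name),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if rpk exits with an error
    )
    return matching_lines(result.stdout.splitlines(), keyword)
```

For example, `fetch_matching_logs("my-transform", "error")` would return only the lines that mention an error, where `my-transform` stands in for your transform's configured name.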

You can also view logs in Redpanda Console.

By default, Redpanda provides several settings to manage logging for data transforms, such as buffer capacity, flush interval, and maximum log line length. These settings ensure that logging operates efficiently without overwhelming the system. However, you may need to adjust these settings based on your specific requirements and workloads. For information on how to configure logging, see the Configure transform logging section of the configuration guide.
