Deploy Data Transforms

This is documentation for Self-Managed v24.2. To view the latest available version of the docs, see v24.3.

Learn how to build, deploy, share, and troubleshoot data transforms in Redpanda.

Prerequisites

Before you begin, ensure that you have the following:

- Data transforms enabled in your Redpanda cluster.
- The rpk command-line client installed on your host machine and configured to connect to your Redpanda cluster.
- A data transform project.

Build the Wasm binary

To build a Wasm binary:

1. Ensure your project directory contains a transform.yaml file.
2. Build the Wasm binary using the rpk transform build command.

    rpk transform build

You should now have a Wasm binary named <transform-name>.wasm, where <transform-name> is the name specified in your transform.yaml file. This binary is your data transform function, ready to be deployed to a Redpanda cluster or hosted on a network for others to use.

Deploy the Wasm binary

You can deploy your transform function using the rpk transform deploy command.

1. Validate your setup against the pre-deployment checklist:
   - Do you meet the Prerequisites?
   - Does your transform function access any environment variables? If so, make sure to set them in the transform.yaml file or on the command line when you deploy the binary.
   - Do your configured input and output topics already exist? Input and output topics must exist in your Redpanda cluster before you deploy the Wasm binary.
2. Deploy the Wasm binary:

    rpk transform deploy

When the transform function reaches Redpanda, it starts processing new records that are written to the input topic.

Reprocess records

In some cases, you may need to reprocess records from an input topic that already contains data.
Processing existing records can be useful, for example, to process historical data into a different format for a new consumer, to re-create lost data from a deleted topic, or to resolve issues with a previous version of a transform that processed data incorrectly.

To reprocess records, you can specify the starting point from which the transform function should process records in each partition of the input topic. The starting point can be either a partition offset or a timestamp.

The --from-offset flag is only effective the first time you deploy a transform function. On subsequent deployments of the same function, Redpanda resumes processing from the last committed offset. To reprocess existing records using an existing function, delete the function and redeploy it with the --from-offset flag.

To deploy a transform function and start processing records from a specific partition offset, use the following syntax:

    rpk transform deploy --from-offset +/-<offset>

In this example, the transform function will start processing records from the beginning of each partition of the input topic:

    rpk transform deploy --from-offset +0

To deploy a transform function and start processing records from a specific timestamp, use the following syntax:

    rpk transform deploy --from-timestamp @<unix-timestamp>

In this example, the transform function will start processing from the first record in each partition of the input topic that was committed after the given timestamp:

    rpk transform deploy --from-timestamp @1617181723

Share Wasm binaries

You can also deploy data transforms on a Redpanda cluster by providing an addressable path to the Wasm binary. This is useful for sharing transform functions across multiple clusters or teams within your organization.
For example, if the Wasm binary is hosted at https://my-site/my-transform.wasm, use the following command to deploy it:

    rpk transform deploy --file=https://my-site/my-transform.wasm

Edit existing transform functions

To make changes to an existing transform function:

1. Make your changes to the code.
2. Rebuild the Wasm binary.
3. Redeploy the Wasm binary to the same Redpanda cluster.

When you redeploy a Wasm binary with the same name, it resumes processing from the last offset it previously processed. If you need to reprocess existing records, you must delete the transform function and redeploy it with the --from-offset flag.

Deploy-time configuration overrides must be provided each time you redeploy a Wasm binary. Otherwise, they are overwritten by default values or by the contents of the configuration file.

Delete a transform function

To delete a transform function, use the following command:

    rpk transform delete <transform-name>

For more details about this command, see rpk transform delete.

You can also delete transform functions in Redpanda Console.

Troubleshoot

This section provides guidance on how to diagnose and troubleshoot issues with building or deploying data transforms.

Invalid transform environment

This error means that one or more of your configured custom environment variables are invalid. Check your custom environment variables against the list of limitations.

Invalid WebAssembly

This error indicates that the binary is missing a required callback function:

    Invalid WebAssembly - the binary is missing required transform functions. Check the broker support for the version of the data transforms SDK being used.

All transform functions must register a callback with the OnRecordWritten() method. For more details, see Develop Data Transforms.

Next steps

Set up monitoring for data transforms.
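
As a recap of the reprocessing rule described in the Reprocess records and Edit existing transform functions sections, the following is a minimal sketch of a helper that prints, but does not run, the commands for reprocessing an input topic with an already-deployed transform. The function names deploy_cmd and reprocess_cmds and the transform name my-transform are hypothetical and exist only for illustration. The rpk commands and flags are copied as they appear on this page, so check them against your rpk version before running anything.

```shell
#!/bin/sh
# Sketch only: prints the rpk commands instead of executing them.
# deploy_cmd and reprocess_cmds are hypothetical helper names.

# Print the deploy command for a given starting point.
#   $1: "offset" or "timestamp"
#   $2: the value, for example +0 or 1617181723
deploy_cmd() {
  case "$1" in
    offset)    printf 'rpk transform deploy --from-offset %s\n' "$2" ;;
    timestamp) printf 'rpk transform deploy --from-timestamp @%s\n' "$2" ;;
    *)         printf 'unknown start kind: %s\n' "$1" >&2; return 1 ;;
  esac
}

# Print the two steps needed when the transform is already deployed:
# delete it first, then redeploy with an explicit starting point.
#   $1: transform name, $2: start kind, $3: start value
reprocess_cmds() {
  # Validate the starting point before printing anything.
  deploy_line=$(deploy_cmd "$2" "$3") || return 1
  printf 'rpk transform delete %s\n' "$1"
  printf '%s\n' "$deploy_line"
}

# Example: reprocess from the beginning of each partition.
reprocess_cmds my-transform offset +0
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere. The delete step comes first because, as noted above, a starting point only takes effect on the first deployment of a function.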