Programmable Push Filters

You can use push-down filters in Redpanda Cloud to search for specific records within a Kafka topic.

Push-down filters are TypeScript/JavaScript function bodies that you define in Redpanda Cloud. The backend executes them against every individual record in a topic. The code must return a boolean: if it returns true, the backend includes the record in the results sent to the frontend; otherwise, the record is skipped. Redpanda Cloud continues to consume records until it reaches either the selected maximum number of search results or the end of the topic.
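The evaluation loop described above can be pictured with a small sketch. This is not Redpanda's backend code: buildFilter, searchTopic, and the sample records are hypothetical names and data used only to illustrate how a function body that returns a boolean might be applied record by record until a result limit is reached.

```javascript
// Conceptual sketch only: this is NOT Redpanda's backend implementation.
// buildFilter and searchTopic are hypothetical names used for illustration.

// Wrap a user-supplied function body so that the record's fields are
// available inside it as named parameters.
function buildFilter(body) {
  return new Function("partitionId", "offset", "key", "value", "headers", body);
}

// Scan records in order and keep those for which the filter returns true.
// Stop once maxResults matches are collected or the records run out.
function searchTopic(records, filterBody, maxResults) {
  const filter = buildFilter(filterBody);
  const results = [];
  for (const r of records) {
    if (results.length >= maxResults) break;
    if (filter(r.partitionId, r.offset, r.key, r.value, r.headers) === true) {
      results.push(r);
    }
  }
  return results;
}

// Made-up sample data; the empty headers object is a placeholder.
const sampleRecords = [
  { partitionId: 0, offset: 0, key: "a", value: { n: 1 }, headers: {} },
  { partitionId: 0, offset: 1, key: "b", value: { n: 2 }, headers: {} },
  { partitionId: 0, offset: 2, key: "c", value: { n: 3 }, headers: {} },
];

// With a limit of 1, the scan stops after the first matching record.
const matches = searchTopic(sampleRecords, "return value.n >= 2", 1);
console.log(matches.map((r) => r.offset));
```

In this sketch the record at offset 2 would also match, but it is never evaluated because the result limit is reached first.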

On a topic’s Messages page, click Add filter > JavaScript Filter.

Redpanda Cloud can inject the following properties into your function, which you can use in your filter code:

  • partitionId - The record’s partition ID

  • offset - The record’s offset within its partition

  • key - The record’s key in its decoded form

  • value - The record’s value in its decoded form

  • headers - The record’s headers in their decoded form

Keys, values, and headers are passed into your JavaScript code in their decoded form. The deserialization logic (for example, decoding an Avro-serialized byte array to a JSON object) is applied first, and the result is then injected into the JavaScript function. If your record is presented as a JSON object in the UI, you can also access it like a JavaScript object in your filter code.
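As an illustration of how the injected properties might be combined, the following sketch wraps a filter body in an ordinary function so it can be run locally. The wrapper exampleFilter, the "order-" key prefix, and the total field are assumptions made up for this example; in Redpanda Cloud you would enter only the lines inside the function.

```javascript
// Hypothetical stand-in for the function that wraps your filter code.
// The parameter names come from the property list; the wrapper is illustrative.
function exampleFilter(partitionId, offset, key, value, headers) {
  // --- The lines below are what you would enter as the filter body. ---

  // Keep only records on partition 0 at offset 100 or later.
  if (partitionId !== 0 || offset < 100) return false;

  // Assumes keys decode to strings; the "order-" prefix is made up.
  if (typeof key !== "string" || !key.startsWith("order-")) return false;

  // Assumes the decoded value is an object with a numeric `total` field.
  return typeof value === "object" && value !== null && value.total > 50;
}

console.log(exampleFilter(0, 150, "order-1", { total: 75 }, {}));
```

Guarding with typeof checks keeps the filter returning a boolean even when a record has a null key or a value that is not an object.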

Suppose you have a series of Avro, JSON, or Protobuf encoded record values that deserialize to JSON objects like this:

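{
  "event_type": "BASKET_ITEM_ADDED",
  "event_id": "777036dd-1bac-499c-993a-8cc86cee3ccc",
  "item": {
    "id": "895e443a-f1b7-4fe5-ad66-b9adfe5420b9",
    "name": "milk"
  }
}

For example, the following filter code matches only records whose item ID is 895e443a-f1b7-4fe5-ad66-b9adfe5420b9:

return value.item.id == "895e443a-f1b7-4fe5-ad66-b9adfe5420b9"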

When a filter function returns true, the record is sent to the frontend. If you use more than one filter function at the same time, the filters are combined with a logical AND, so a record must pass every filter to appear in the results. The offset you specify is also effectively combined with the filters using AND.
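The AND semantics can be sketched as follows. This is a conceptual model rather than Redpanda's implementation: passesAll, activeFilters, and the sample records (including the BASKET_ITEM_REMOVED event type) are hypothetical, and the sketch assumes the specified offset acts as a lower bound on the record's offset.

```javascript
// Conceptual sketch; passesAll and startOffset are hypothetical names.
// Assumption: the specified offset is a lower bound on the record offset.
function passesAll(record, filters, startOffset) {
  // Every condition must hold: the offset bound AND each individual filter.
  return record.offset >= startOffset && filters.every((f) => f(record.value));
}

// Two filters that are active at the same time.
const activeFilters = [
  (value) => value.event_type === "BASKET_ITEM_ADDED",
  (value) => value.item.name === "milk",
];

// Made-up sample records shaped like the example value above.
const records = [
  { offset: 10, value: { event_type: "BASKET_ITEM_ADDED", item: { name: "milk" } } },
  { offset: 11, value: { event_type: "BASKET_ITEM_ADDED", item: { name: "bread" } } },
  { offset: 12, value: { event_type: "BASKET_ITEM_REMOVED", item: { name: "milk" } } },
  { offset: 13, value: { event_type: "BASKET_ITEM_ADDED", item: { name: "milk" } } },
];

// Only the record at offset 13 satisfies the offset bound and both filters.
const shown = records.filter((r) => passesAll(r, activeFilters, 11));
console.log(shown.map((r) => r.offset));
```

The record at offset 10 passes both filters but is excluded by the offset bound, which shows why the offset behaves like one more AND condition.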

Resource usage and performance

You can use the filter engine against topics with millions of records, because the filter code is evaluated in the backend, where more resources are available. However, although the filter engine is fairly efficient, it can consume all available CPU resources and cause significant network traffic because of the number of Kafka records it consumes.

Performance is usually constrained by available CPU resources. Depending on the JavaScript code and the records, expect around 15,000 to 20,000 filtered records per second for each available core. A request is processed on a single instance of Redpanda Cloud and cannot be shared across multiple instances.
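To get a rough feel for these numbers, the following back-of-the-envelope sketch converts the quoted per-core range into a scan-time estimate. estimateScanSeconds is a hypothetical helper, and the assumption that throughput scales linearly with the cores of the single instance is a simplification; real throughput depends on the filter code and the records.

```javascript
// Rough estimate only, based on the per-core range quoted above.
const RATE_LOW = 15000;  // filtered records per second per core (low end)
const RATE_HIGH = 20000; // filtered records per second per core (high end)

// Hypothetical helper. Assumes throughput scales linearly with the number
// of cores on the single instance that processes the request.
function estimateScanSeconds(recordCount, cores = 1) {
  return {
    atHighRate: recordCount / (RATE_HIGH * cores),
    atLowRate: recordCount / (RATE_LOW * cores),
  };
}

// Scanning 10 million records on one core:
// 10,000,000 / 20,000 = 500 seconds at the high end of the range,
// 10,000,000 / 15,000 = roughly 667 seconds at the low end.
const estimate = estimateScanSeconds(10000000);
console.log(estimate.atHighRate, Math.round(estimate.atLowRate));
```

Under these assumptions, a full scan of a 10-million-record topic takes on the order of 8 to 11 minutes on one core, so narrowing the search range or lowering the maximum number of results can shorten a search considerably.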