# Guardrail Policy Reference


---
title: Guardrail Policy Reference
latest-operator-version: v26.1.5
latest-console-tag: v3.7.4
latest-connect-version: 4.96.1
latest-redpanda-tag: v26.1.10
docname: guardrails/types-reference
page-component-name: agentic-data-plane
page-version: master
page-component-version: master
page-component-title: Agentic Data Plane
page-relative-src-path: guardrails/types-reference.adoc
page-edit-url: https://github.com/redpanda-data/adp-docs/edit/main/modules/control/pages/guardrails/types-reference.adoc
description: Reference for the guardrail policy types, their configuration fields, actions, directions, and limits.
page-topic-type: reference
personas: security_compliance_lead, platform_engineer
page-git-created-date: "2026-05-28"
page-git-modified-date: "2026-06-11"
---

<!-- Source: https://docs.redpanda.com/agentic-data-plane/control/guardrails/types-reference.md -->

A guardrail bundles a set of policies, each backed by AWS Bedrock Guardrails. Each policy is optional, but a guardrail must enforce at least one. This page documents each policy type’s configuration fields, available actions, direction settings, and limits. Regional availability varies and is covered in the AWS documentation linked below.

## [](#common-settings)Common settings

Two settings recur across policies:

-   Action: what the policy does when it matches. `None` detects and records the match in the trace without intervening. `Block` stops the request and returns the configured blocked message. The sensitive-information policy adds `Anonymize`.

-   Direction: most policies evaluate input, output, or both, and you set the action per direction. Some policies are fixed to one direction (noted below).
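
The two common settings can be pictured as a small data model. The following is a minimal sketch, not ADP's actual schema: the `Action` and `DirectionalAction` names and the `allow_anonymize` flag are hypothetical and exist only to illustrate that each direction carries its own action and that `Anonymize` is limited to the sensitive-information policy.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "None"            # detect and record the match in the trace only
    BLOCK = "Block"          # stop the request and return the blocked message
    ANONYMIZE = "Anonymize"  # offered only by the sensitive-information policy


@dataclass(frozen=True)
class DirectionalAction:
    """Hypothetical container holding one action per direction."""
    input: Action = Action.NONE
    output: Action = Action.NONE

    def validate(self, allow_anonymize: bool = False) -> None:
        # Reject Anonymize unless the owning policy supports it.
        for direction, action in (("input", self.input), ("output", self.output)):
            if action is Action.ANONYMIZE and not allow_anonymize:
                raise ValueError(
                    f"Anonymize is not available for {direction} on this policy"
                )
```

For example, `DirectionalAction(input=Action.BLOCK, output=Action.NONE)` blocks matching prompts while only recording matches in responses.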


> 📝 **NOTE**
>
> Feature availability varies by AWS region. Choose a region that supports the policies you need, and see the [AWS Bedrock Guardrails documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html) for exhaustive behavior and regional support.

## [](#content-filters)Content filters

Classify prompts and responses against harmful-content categories, then detect or block matches per category.

| Field | Description |
| --- | --- |
| Categories | Hate, Insults, Sexual, Violence, Misconduct, and Prompt attack. Configure each category independently. Prompt-attack detection evaluates input only. |
| Strength | Per category and direction. Sets the confidence cutoff for a match: None scores the category in the trace without acting, Low matches only high-confidence content, Medium matches medium-confidence and above, and High matches any non-negligible content. Higher strength is stricter. |
| Action | Per category and direction: None (detect) or Block. |
| Modality | Text or Image. |
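
The relationship between strength and confidence in the table above can be sketched as a lookup. This is an illustration only: the four-level `Confidence` scale and the reading of "non-negligible" as "low confidence or higher" are assumptions, and `is_match` is a hypothetical helper, not part of any ADP or Bedrock API.

```python
from enum import IntEnum


class Confidence(IntEnum):
    """Assumed classifier confidence scale for one category."""
    NONE = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Lowest confidence that counts as a match at each strength.
# Strength "None" is absent: it scores the category but never matches.
MIN_CONFIDENCE_FOR_STRENGTH = {
    "Low": Confidence.HIGH,       # only high-confidence content
    "Medium": Confidence.MEDIUM,  # medium confidence and above
    "High": Confidence.LOW,       # any non-negligible content (assumed: LOW and above)
}


def is_match(strength: str, confidence: Confidence) -> bool:
    """Return True when content at this confidence matches at this strength."""
    if strength == "None":
        return False
    return confidence >= MIN_CONFIDENCE_FOR_STRENGTH[strength]
```

Raising the strength lowers the confidence needed to match, which is why higher strength is stricter.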

## [](#word-filters)Word filters

Block or detect exact words and phrases.

| Field | Description |
| --- | --- |
| Custom words | Your own list of words and phrases to match. |
| Managed lists | Platform-managed lists. Profanity is available today. |
| Action | Per direction (input and output): None (detect) or Block. Set independently for custom words and for each managed list. |

## [](#denied-topics)Denied topics

Block content by meaning rather than exact words, so the policy catches paraphrases and misspellings. A policy holds up to 30 topics.

| Field | Description |
| --- | --- |
| Name | Topic name, 1 to 100 characters. |
| Definition | What the topic covers, up to 1000 characters. The definition drives the semantic match, so write it as a clear, self-contained statement. Keep example phrases and negations out of the definition; put concrete examples in the Examples field instead, which improves accuracy. |
| Examples | Up to five example phrases, each up to 100 characters, that match the topic. |
| Action | Per direction (input and output): None (detect) or Block. |
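
The limits above lend themselves to a pre-flight check before you submit a policy. The sketch below is a hypothetical client-side validator: the `DeniedTopic` class and `validate_topics` function are illustrative names, and the service may enforce additional rules that are not listed on this page.

```python
from dataclasses import dataclass, field

MAX_TOPICS = 30
MAX_NAME_LEN = 100
MAX_DEFINITION_LEN = 1000
MAX_EXAMPLES = 5
MAX_EXAMPLE_LEN = 100


@dataclass
class DeniedTopic:
    name: str
    definition: str
    examples: list[str] = field(default_factory=list)


def validate_topics(topics: list[DeniedTopic]) -> list[str]:
    """Return human-readable errors; an empty list means all limits are met."""
    errors = []
    if len(topics) > MAX_TOPICS:
        errors.append(f"a policy holds up to {MAX_TOPICS} topics")
    for topic in topics:
        if not 1 <= len(topic.name) <= MAX_NAME_LEN:
            errors.append(f"{topic.name!r}: name must be 1 to {MAX_NAME_LEN} characters")
        if len(topic.definition) > MAX_DEFINITION_LEN:
            errors.append(f"{topic.name!r}: definition exceeds {MAX_DEFINITION_LEN} characters")
        if len(topic.examples) > MAX_EXAMPLES:
            errors.append(f"{topic.name!r}: more than {MAX_EXAMPLES} examples")
        for example in topic.examples:
            if len(example) > MAX_EXAMPLE_LEN:
                errors.append(f"{topic.name!r}: example exceeds {MAX_EXAMPLE_LEN} characters")
    return errors
```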

## [](#sensitive-information)Sensitive information

Detect personally identifiable information (PII) by built-in entity type or by your own regular expressions, then detect, block, or anonymize it.

| Field | Description |
| --- | --- |
| Entities | Built-in entity types. Each entity has a per-direction action. |
| Regexes | Custom patterns. Each rule has a name (1 to 100 characters), an RE2 pattern (1 to 500 characters; lookaround is not supported), an optional description, and a per-direction action. |
| Action | Per direction (input and output): None (detect), Block, or Anonymize. Block replaces the whole payload with the blocked message. Anonymize replaces each match in place with its entity type, such as {EMAIL}, and applies to text only. Anonymize behaves differently in each direction. On output, ADP delivers the redacted response to the caller. On input, this release does not forward the redacted prompt to the model: an Anonymize match short-circuits the request like a block, so the model is never called and ADP returns your configured blocked input message. |
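
The limits on custom regex rules in the table above can be checked before submission. The following is a rough sketch under stated assumptions: `validate_regex_rule` is a hypothetical helper, and the lookaround check is a plain substring heuristic that can flag an escaped parenthesis such as `\(?=` by mistake. It does not compile the pattern, because Python's `re` module is not RE2 and accepts syntax that RE2 rejects.

```python
# Tokens that open lookahead and lookbehind groups, which RE2 does not support.
LOOKAROUND_TOKENS = ("(?=", "(?!", "(?<=", "(?<!")


def validate_regex_rule(name: str, pattern: str) -> list[str]:
    """Return errors for one custom regex rule; an empty list means it passes."""
    errors = []
    if not 1 <= len(name) <= 100:
        errors.append("name must be 1 to 100 characters")
    if not 1 <= len(pattern) <= 500:
        errors.append("pattern must be 1 to 500 characters")
    # Heuristic only: looks for the literal opening tokens anywhere in the pattern.
    if any(token in pattern for token in LOOKAROUND_TOKENS):
        errors.append("lookaround is not supported")
    return errors
```

For instance, `validate_regex_rule("employee-id", r"EMP-\d{6}")` returns an empty list, while a pattern such as `(?<=id:)\d+` is reported because it uses lookbehind.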

The built-in entity types are: `ADDRESS`, `AGE`, `NAME`, `EMAIL`, `PHONE`, `USERNAME`, `PASSWORD`, `DRIVER_ID`, `LICENSE_PLATE`, `VEHICLE_IDENTIFICATION_NUMBER`, `CREDIT_DEBIT_CARD_CVV`, `CREDIT_DEBIT_CARD_EXPIRY`, `CREDIT_DEBIT_CARD_NUMBER`, `PIN`, `INTERNATIONAL_BANK_ACCOUNT_NUMBER`, `SWIFT_CODE`, `IP_ADDRESS`, `MAC_ADDRESS`, `URL`, `AWS_ACCESS_KEY`, `AWS_SECRET_KEY`, `US_BANK_ACCOUNT_NUMBER`, `US_BANK_ROUTING_NUMBER`, `US_INDIVIDUAL_TAX_IDENTIFICATION_NUMBER`, `US_PASSPORT_NUMBER`, `US_SOCIAL_SECURITY_NUMBER`, `CA_HEALTH_NUMBER`, `CA_SOCIAL_INSURANCE_NUMBER`, `UK_NATIONAL_HEALTH_SERVICE_NUMBER`, `UK_NATIONAL_INSURANCE_NUMBER`, and `UK_UNIQUE_TAXPAYER_REFERENCE_NUMBER`.
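
The per-direction behavior of the three actions is easiest to see side by side. The sketch below models it for a single entity type. It is an illustration, not ADP's implementation: the `EMAIL` regex is a simplified stand-in for the real entity detection, and `apply_email_policy` and its string-valued arguments are hypothetical.

```python
import re

# Simplified stand-in detector; the actual policy relies on Bedrock's entity detection.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def apply_email_policy(text: str, direction: str, action: str, blocked_message: str) -> str:
    """Return the payload that continues downstream for one direction."""
    if action == "None" or not EMAIL.search(text):
        return text  # detect-only or no match: the payload is unchanged
    if action == "Block":
        return blocked_message  # the whole payload is replaced
    if action == "Anonymize":
        if direction == "input":
            # In this release an input match short-circuits like a block.
            return blocked_message
        return EMAIL.sub("{EMAIL}", text)  # output: redact each match in place
    raise ValueError(f"unknown action: {action}")
```

With `Anonymize` on output, `"Contact ana@example.com today"` becomes `"Contact {EMAIL} today"`. With `Anonymize` on input, the same text yields the blocked message instead.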

## [](#contextual-grounding)Contextual grounding

For retrieval-augmented generation (RAG) applications, check model output against a source and the user’s query. This policy evaluates output only and has two independent sub-filters.

| Sub-filter | Description |
| --- | --- |
| Grounding | Checks that the response is factually grounded in the provided source. |
| Relevance | Checks that the response is relevant to the user’s query. |

Each sub-filter has its own enable toggle, a threshold between `0.0` and `0.99`, and an action. The action fires when the response scores below the threshold, so higher thresholds are stricter. The action is `None` (detect) or `Block`.
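
The threshold rule can be written as a one-line comparison. This is a minimal sketch with a hypothetical function name; it assumes scores and thresholds share the same scale and that a score exactly equal to the threshold does not fire, since the action applies to scores below it.

```python
def sub_filter_fires(score: float, threshold: float, enabled: bool = True) -> bool:
    """Return True when the sub-filter's configured action should fire."""
    if not 0.0 <= threshold <= 0.99:
        raise ValueError("threshold must be between 0.0 and 0.99")
    # A disabled sub-filter never fires; otherwise fire on scores below the threshold.
    return enabled and score < threshold
```

A response scoring 0.6 fires a sub-filter with a threshold of 0.75 but not one with a threshold of 0.5, which is why raising the threshold makes the check stricter.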

## [](#automated-reasoning)Automated reasoning

Mathematically verify model output against formal Bedrock Automated Reasoning policies. This policy is detect-only: it never blocks, and its findings appear in the trace.

| Field | Description |
| --- | --- |
| Policy ARNs | One or two Bedrock Automated Reasoning policy Amazon Resource Names (ARNs) to attach. Each ARN must point at a specific numeric version (for example, ending in :1 or :2); the DRAFT version is rejected. Create and publish the policies in the AWS Bedrock console first, then reference their versioned ARNs here. |
| Confidence threshold | A value between 0.0 and 1.0. A finding whose confidence falls below this value is reported as non-definitive. |
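
The ARN rules above can be checked before you attach policies. The sketch below inspects only the count and the trailing version segment. `validate_policy_arns` is a hypothetical helper, and the ARN used in the usage note is a made-up placeholder, not a documented format.

```python
def validate_policy_arns(arns: list[str]) -> list[str]:
    """Return errors for a list of Automated Reasoning policy ARNs."""
    errors = []
    if not 1 <= len(arns) <= 2:
        errors.append("attach one or two policy ARNs")
    for arn in arns:
        # The version is assumed to be the text after the final colon.
        version = arn.rsplit(":", 1)[-1]
        if not (version.isascii() and version.isdigit()):
            errors.append(f"{arn}: must end in a numeric version such as :1")
    return errors
```

A placeholder ARN ending in `:1` passes, while the same ARN ending in `:DRAFT` is reported.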

## [](#next-steps)Next steps

-   [Create a guardrail](https://docs.redpanda.com/agentic-data-plane/control/guardrails/create-guardrail/)

-   [Review blocked requests](https://docs.redpanda.com/agentic-data-plane/control/guardrails/violations/)

-   [How guardrails work](https://docs.redpanda.com/agentic-data-plane/control/guardrails/overview/)