Agentic Data Plane

Configure an LLM Provider

Create an LLM provider to give your applications a managed proxy URL: Redpanda handles the upstream API keys, forwards requests to the provider, and records usage for you. Create a provider for each upstream you use, whether that’s OpenAI, Anthropic, Google AI, AWS Bedrock, or an OpenAI-compatible endpoint.

After reading this page, you will be able to:

  • Create an LLM provider for OpenAI, Anthropic, Google AI, AWS Bedrock, or an OpenAI-compatible endpoint

  • Select the models you want to expose through the provider

  • Verify the provider is reachable using the built-in Test Connection control

Prerequisites

  • An API key (or AWS credentials for Bedrock) for the upstream provider you want to configure.

  • One or more secrets already created in your dataplane’s secret store for the provider’s credentials. Secret references must use UPPER_SNAKE_CASE. For example: OPENAI_API_KEY, ANTHROPIC_API_KEY, AWS_ACCESS_KEY_ID.
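A quick pre-flight check can catch badly named references before you open the form. The sketch below is illustrative only: this page says references must be UPPER_SNAKE_CASE but does not give the exact rule, so the regular expression and the helper name is_upper_snake_case are assumptions, and the secret store may enforce something stricter or looser.

```python
import re

# Assumed pattern: uppercase words made of letters and digits, joined by
# single underscores, starting with a letter. The real rule may differ.
SECRET_REF = re.compile(r"[A-Z][A-Z0-9]*(?:_[A-Z0-9]+)*")


def is_upper_snake_case(name: str) -> bool:
    """Return True if name looks like an UPPER_SNAKE_CASE secret reference."""
    return SECRET_REF.fullmatch(name) is not None
```

With this pattern, OPENAI_API_KEY and AWS_ACCESS_KEY_ID pass, while openai_api_key and OPENAI-API-KEY do not.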

Open the Create LLM provider page

  1. Open LLM Providers in the sidebar.

  2. Click Create provider.

Fill in the provider card

The first card on the page collects identity fields. Enter a Display name; the form auto-derives the Resource ID from it as you type.

| Field | Required | Notes |
| --- | --- | --- |
| Display name | Yes | Human-readable label shown in dashboards and model selectors. Up to 253 characters. The form auto-derives the Resource ID from this value. |
| Resource ID | Yes | Machine identifier used in API calls and CLI commands. Lowercase letters, numbers, and hyphens only (^[a-z][a-z0-9-]*$), up to 63 characters. Immutable after creation. Appears in the proxy URL (/llm/v1/providers/<resource-id>/…). Auto-derived from the Display name; edit it if you want something different. |

The Summary panel labels the Resource ID as Name.
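If you create providers from scripts, it can help to check a Resource ID against the documented pattern before submitting it. In the sketch below, the pattern and the 63-character limit come from this page; derive_resource_id is a hypothetical approximation of the form’s auto-derivation, whose actual algorithm is not documented here.

```python
import re

RESOURCE_ID_PATTERN = r"^[a-z][a-z0-9-]*$"  # pattern quoted on this page
MAX_RESOURCE_ID_LENGTH = 63                  # limit quoted on this page


def is_valid_resource_id(value: str) -> bool:
    """Check a candidate Resource ID against the documented rules."""
    if len(value) > MAX_RESOURCE_ID_LENGTH:
        return False
    return re.fullmatch(RESOURCE_ID_PATTERN, value) is not None


def derive_resource_id(display_name: str) -> str:
    """Hypothetical slug derivation; the form may behave differently."""
    # Collapse every run of characters outside a-z and 0-9 into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", display_name.lower()).strip("-")
    # The pattern requires a leading letter, so drop leading digits/hyphens.
    slug = re.sub(r"^[^a-z]+", "", slug)
    return slug[:MAX_RESOURCE_ID_LENGTH].rstrip("-")
```

Under these assumptions, a Display name of "Production OpenAI" yields production-openai. Because the Resource ID is immutable after creation, validate it before you save.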

Choose a provider type

The Provider type card offers five options. Pick the one that matches your upstream.

| Type | Use when |
| --- | --- |
| OpenAI | Proxy GPT, o-series, and embeddings through the OpenAI API. Best when you already hold an OpenAI API key or want the broadest GPT model catalog. |
| Anthropic | Call Claude Opus, Sonnet, and Haiku directly. Strong at coding, long-context reasoning, and tool use. Supports forwarding client Authorization headers to Anthropic for enterprise and Max-plan subscription passthrough (see Anthropic: Authorization passthrough). |
| Google AI | Reach Gemini Pro, Flash, and multimodal models through Google AI Studio. Ideal for long-context workloads and image/video inputs. |
| AWS Bedrock | Invoke foundation models (Claude, Llama, Titan, Nova, Mistral, AI21 Jamba) hosted inside your AWS account. Requires an AWS region and credentials (static, STS-assumed role, or the default credential chain). Supports the native Bedrock APIs (InvokeModel, Converse) and an OpenAI-compatible Chat Completions endpoint for gpt-oss models. See AWS Bedrock: Inference profiles and IAM for picking the right model identifier, and Set up AWS Bedrock as an LLM provider for a step-by-step IAM and access-key walkthrough. |
| OpenAI-compatible | Point at any OpenAI-compatible endpoint that ships /v1/chat/completions (vLLM, Ollama, LM Studio, LocalAI, Together, Groq, OpenRouter). Useful for self-hosted models and aggregator gateways. Requires a Base URL. Authentication is optional. |

Selecting a type reveals the type-specific configuration fields.

Fill in the type-specific configuration

Each API key reference and credential field points at a secret-store entry, not the secret value itself. Use the Existing tab to pick a secret already in your dataplane’s secret store, or the New tab to create one inline.

  • OpenAI

  • Anthropic

  • Google AI

  • AWS Bedrock

  • OpenAI-compatible

OpenAI

| Field | Notes |
| --- | --- |
| Base URL | Optional. Leave empty for the standard OpenAI API (https://api.openai.com/v1). Override for Azure OpenAI or other OpenAI-hosted endpoints. |
| API key reference | Required. Secret-store reference for the OpenAI API key. Must be UPPER_SNAKE_CASE, for example OPENAI_API_KEY. |

Anthropic

| Field | Notes |
| --- | --- |
| Base URL | Optional. Leave empty for the standard Anthropic API (https://api.anthropic.com). |
| API key reference | Required unless Authorization passthrough is on. UPPER_SNAKE_CASE, for example ANTHROPIC_API_KEY. |
| Authorization passthrough | Optional toggle. When on, AI Gateway forwards the client’s Authorization header to Anthropic instead of using a server-side API key. Used for enterprise and Max-plan OAuth passthrough: each client authenticates with its own Anthropic subscription. Leave the API key reference empty when using passthrough. |

Google AI

| Field | Notes |
| --- | --- |
| Base URL | Optional. Leave empty for the standard Google AI API (https://generativelanguage.googleapis.com). |
| API key reference | Required. Secret-store reference for the Google AI API key. UPPER_SNAKE_CASE, for example GOOGLE_AI_API_KEY. |

Gemini uses the x-goog-api-key header for authentication, not Authorization: Bearer. This matters when you wire up clients. See Connect your app to AI Gateway.
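The header difference is easy to miss when one client library talks to several providers. The sketch below shows one way a client might pick the header by provider type. The provider-type labels, the function name, and the idea of passing a single credential string are assumptions for illustration; Connect your app to AI Gateway remains the authority on what each client must send.

```python
from typing import Dict


def auth_header(provider_type: str, credential: str) -> Dict[str, str]:
    """Return the authentication header for a provider type.

    The labels "google", "openai", and "openai-compatible" are
    hypothetical names used only in this sketch.
    """
    if provider_type == "google":
        # Gemini expects the key in x-goog-api-key, not a Bearer token.
        return {"x-goog-api-key": credential}
    if provider_type in ("openai", "openai-compatible"):
        # OpenAI-style APIs use the Authorization: Bearer scheme.
        return {"Authorization": f"Bearer {credential}"}
    # Other provider types are deliberately not covered here.
    raise ValueError(f"no header rule in this sketch for {provider_type!r}")
```

Raising on unknown types, rather than defaulting to Bearer, keeps a misconfigured client from silently sending the wrong header and receiving a 401.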

AWS Bedrock

| Field | Notes |
| --- | --- |
| Region | Required. AWS region where the Bedrock endpoint is deployed, for example us-east-1. |
| Base URL | Optional. Override the default regional Bedrock endpoint. |
| Credential type | How AI Gateway authenticates to Bedrock: Default chain, Static keys, or Assume IAM role. The fields below depend on the mode you pick. |
| Access key ID reference | Static keys only. Secret-store reference for the AWS access key ID, UPPER_SNAKE_CASE (typically AWS_ACCESS_KEY_ID). |
| Secret access key reference | Static keys only. Secret-store reference for the AWS secret access key, UPPER_SNAKE_CASE (typically AWS_SECRET_ACCESS_KEY). |
| Role ARN | Assume IAM role only. Required. ARN of the IAM role AI Gateway assumes through AWS STS, for example arn:aws:iam::123456789012:role/BedrockRole. |
| External ID | Assume IAM role only. Optional. External ID for cross-account role assumption. Set it only when the role’s trust policy mandates an external ID. |
| Session name | Assume IAM role only. Optional. Session name that appears in AWS CloudTrail audit logs, for example redpanda-adp. |
| Guardrail | Optional. Name of a guardrail to attach to this provider, or empty for none. Only the Bedrock provider type exposes this setting. AI Gateway validates the name when you save: it rejects a guardrail that doesn’t exist or is being deleted, so set the field to an existing guardrail or leave it empty. See Create a guardrail. |

Pick a Credential type to control how AI Gateway authenticates to Bedrock:

  • Default chain (default): Leave the credentials unset to use the AWS SDK’s default provider chain (environment variables, shared config, EKS Pod Identity, IRSA, or instance profile). Use this when the gateway already runs with an AWS identity.

  • Static keys: An access key pair stored in the secret store. Use this when no ambient AWS identity is available. This is the path the Bedrock setup guide walks through.

  • Assume IAM role: AI Gateway assumes an IAM role through AWS STS. Use this for cross-account access or when your security policy requires short-lived credentials.
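The three credential types translate into different required fields. The sketch below checks a configuration dictionary for the fields each mode needs. The key names (credential_type, access_key_id_ref, and so on) and the mode labels are hypothetical, and treating both key references as mandatory in static mode is an assumption based on the fact that an access key pair has two parts.

```python
from typing import Dict, List


def missing_bedrock_fields(config: Dict[str, str]) -> List[str]:
    """List required fields that are absent for the chosen credential type."""
    required = ["region"]  # Region is required in every mode.
    mode = config.get("credential_type", "default_chain")
    if mode == "static_keys":
        # Assumed: both halves of the access key pair must be referenced.
        required += ["access_key_id_ref", "secret_access_key_ref"]
    elif mode == "assume_role":
        # Role ARN is required; external ID and session name are optional.
        required += ["role_arn"]
    elif mode != "default_chain":
        raise ValueError(f"unknown credential type: {mode!r}")
    return [name for name in required if not config.get(name)]
```

For example, a static-keys configuration that sets only the region and the access key ID reference reports secret_access_key_ref as missing.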

OpenAI-compatible

| Field | Notes |
| --- | --- |
| Base URL | Required. URL of your OpenAI-compatible endpoint, for example http://vllm.internal:8000/v1, http://ollama.local:11434/v1, or an aggregator like Together / Groq / OpenRouter. |
| API key reference | Optional. Leave empty for endpoints with no authentication (common for local runtimes). UPPER_SNAKE_CASE if set. |

OpenAI-compatible endpoints can serve any model. Enter the exact model identifiers your upstream server exposes (for example, meta-llama/Llama-3.3-70B-Instruct or qwen3:8b).

For the OpenAI, Google AI, and AWS Bedrock provider types, AI Gateway validates that the credential references resolve before it accepts the create or update. AI Gateway rejects a missing or empty secret reference at save time instead of failing at first call. The OpenAI-compatible type does not require a credential reference, so it can be created with no authentication for local runtimes such as Ollama or vLLM.

Select models

Models you select on this form become the catalog the provider exposes. Leave the list empty to allow every model the upstream catalog returns.
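The empty-list rule is worth spelling out, because an empty selection means "everything", not "nothing". The sketch below models that rule as described above; the function name and arguments are illustrative, and how AI Gateway evaluates the rule internally is not documented here.

```python
from typing import Iterable, Set


def is_model_exposed(model: str, selected: Iterable[str], upstream_catalog: Set[str]) -> bool:
    """Model the selection rule: an empty selection allows the whole catalog."""
    selected_set = set(selected)
    if not selected_set:
        # Nothing selected: every model the upstream catalog returns is allowed.
        return model in upstream_catalog
    # Otherwise only the explicitly selected models are exposed.
    return model in selected_set
```

If you intend to restrict a provider to a short list, confirm the list is not accidentally empty after an edit, since that widens access instead of narrowing it.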

For OpenAI, Anthropic, Google AI, and AWS Bedrock, the form shows a picker backed by the provider’s catalog. Each model in the picker shows its input and output price per million tokens. Pick from the list, or type a model identifier the catalog doesn’t show. For OpenAI-compatible, the form takes a freeform list: type the exact identifiers your upstream serves.

Redpanda maintains the catalog of available models in the picker. When an upstream provider publishes a new model, it usually appears in the picker within a day or two; admins don’t have to wait for a Redpanda release. New models aren’t enabled automatically: an admin still selects the model in the catalog to make it callable through this provider.

For Bedrock, the picker exposes inference profiles, not raw foundation-model IDs. See AWS Bedrock: Inference profiles and IAM.

Redpanda stores models as structured ProviderModel entries (one entry per model, with the model name as the only required field). Each model can carry custom pricing overrides that replace the catalog rates for that model in cost reporting; see Override per-model pricing. The legacy flat models field still works on writes for backward compatibility.

Override per-model pricing

Cost reporting prices each call at the catalog rates for the model. If your organization negotiates non-standard rates, or you track spend against an internal chargeback rate, override the rates per model on this provider.

In the model picker, each selected model carries a pencil icon (Override pricing). Click it to open the pricing dialog for that model. The dialog lists one field per billing bucket, in the same order as the provider’s published rate card:

| Bucket | What it bills |
| --- | --- |
| Input | Per 1M input tokens. Tool-use input also bills at this rate. |
| Output | Per 1M output tokens. Reasoning tokens also bill at this rate. |
| Cached input | Per 1M tokens read from prompt cache. |
| Cache write (5-minute TTL) | Per 1M tokens written to a 5-minute prompt cache. |
| Cache write (1-hour TTL) | Per 1M tokens written to a 1-hour prompt cache. |

Enter rates in dollars per million tokens. Each field is independent:

  • Leave a field blank to keep the catalog rate for that bucket. The catalog rate shows as the field’s placeholder.

  • Enter a positive value to replace the catalog rate for that bucket only.

  • Enter 0 to make that bucket explicitly free, which is different from leaving it blank.

Cache writes with an unknown TTL always bill at the catalog rate; they have no override field.
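The difference between a blank field and an explicit 0 is easiest to see in a small calculation. The sketch below resolves effective rates per bucket and prices a call with them. The bucket keys, function names, and all dollar figures are made up for illustration; they are not catalog values or part of any ADP API.

```python
from typing import Dict, Optional

BUCKETS = ("input", "output", "cached_input", "cache_write_5m", "cache_write_1h")


def effective_rates(catalog: Dict[str, float],
                    overrides: Dict[str, Optional[float]]) -> Dict[str, float]:
    """Resolve the rate per bucket, in dollars per million tokens."""
    rates = {}
    for bucket in BUCKETS:
        override = overrides.get(bucket)  # None or missing means "left blank"
        if override is None:
            rates[bucket] = catalog[bucket]   # blank keeps the catalog rate
        elif override < 0:
            raise ValueError(f"negative rate for {bucket}")
        else:
            rates[bucket] = override          # 0.0 makes the bucket free
    return rates


def cost_usd(tokens: Dict[str, int], rates: Dict[str, float]) -> float:
    """Price token counts per bucket at the given per-million rates."""
    return sum(count / 1_000_000 * rates[bucket] for bucket, count in tokens.items())


# Made-up catalog rates, for illustration only.
catalog = {"input": 3.0, "output": 15.0, "cached_input": 0.3,
           "cache_write_5m": 3.75, "cache_write_1h": 6.0}
rates = effective_rates(catalog, {"input": 2.5, "cached_input": 0.0})
```

Here input is overridden to 2.5, cached input is explicitly free, and the remaining buckets keep their catalog rates because they were left blank.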

Use the reset control on a field to clear a single override, or clear every field to drop all overrides for the model. Overrides are scoped to this provider and model, and they change what ADP’s cost reporting computes, not what the upstream provider actually charges you.

After you create the provider, its detail page has two tabs: Overview and Connect. The Overview tab shows:

  • A Last 7 days KPI strip (TOTAL SPEND, REQUESTS, TOKENS) with sparklines and a View more link on each card.

  • The Connection card: provider type, status, authentication passthrough state, proxy URL, upstream base URL, and the API key secret reference.

  • The model list, where each model shows its input and output prices per million tokens and its spend from requests routed through this provider over the last 7 days.

For analysis across providers, use the Cost & Usage page under Governance (see View cost and usage).

The Connect tab generates ready-made client configuration for this provider: a gateway-token step, setup instructions for popular clients such as Claude Code, and code examples in several languages, all with the provider’s proxy URL prefilled. See Connect your app to AI Gateway for the underlying flow.

Configure transcript logging

The Transcripts card controls whether AI Gateway records the message bodies this provider proxies. It has two independent toggles, both off by default:

| Toggle | What it captures |
| --- | --- |
| Record inputs | Captures the full request body (prompt content and tool-call arguments) on observability traces. |
| Record outputs | Captures the full response body (completion content and tool-call results) on observability traces. |

Because both toggles default to off, AI Gateway does not retain message bodies for a new provider until you turn them on. Enable them to power turn-by-turn investigation and per-conversation drill-down in the Transcripts view. Leave them off for workloads where the message body must not be retained, such as regulated PII or customer secrets.

These are per-provider settings, not per-request: applications cannot opt in or out at call time. To split sensitive from non-sensitive traffic, create one provider with recording on and another with it off, and route each application to whichever proxy URL matches its data class.
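One way to apply this split on the client side is a small lookup from data class to provider. In the sketch below, the gateway host, the two resource IDs, and the data-class labels are all hypothetical; only the /llm/v1/providers/<resource-id>/ path shape comes from this page, and whatever follows the resource ID is left to the client library.

```python
from urllib.parse import quote

GATEWAY_BASE = "https://gateway.example.com"  # hypothetical host

# Hypothetical providers: one created with recording off, one with it on.
PROVIDER_BY_DATA_CLASS = {
    "sensitive": "anthropic-no-transcripts",
    "general": "anthropic-with-transcripts",
}


def proxy_url_for(data_class: str) -> str:
    """Return the proxy URL prefix for an application's data class."""
    resource_id = PROVIDER_BY_DATA_CLASS[data_class]  # KeyError if unclassified
    return f"{GATEWAY_BASE}/llm/v1/providers/{quote(resource_id, safe='')}/"
```

Failing with a KeyError for an unclassified workload is deliberate in this sketch: it avoids silently routing unknown traffic through the provider that records message bodies.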

Recording settings do not affect cost and usage telemetry. Token counts, latency, and provider/model attribution are always recorded, so the Cost & Usage page reports spend for traffic on the provider regardless of these toggles; only the message bodies are withheld when the toggles are off.

Changing a toggle takes effect for new requests. Transcripts already captured under the previous setting are not retroactively redacted; delete or rotate the provider if you need to purge historical content.

Save and verify

  1. Click Create provider. The button activates after Name and Type are both set. The Summary panel checks them off as you fill them in.

  2. On the provider’s detail page, the Connection card shows your Proxy URL, Discovery URL, Base URL, and API key ref. Copy the Proxy URL: this is where your applications point.

  3. Scroll to the Verify connection section. Pick a model from the dropdown and click Test Connection. The status updates from Not tested yet to a pass/fail indicator. Use the Show commands disclosure if you want to see the equivalent curl or SDK call.

  4. To wire up an application, open Connect your app further down the page or follow Connect your app to AI Gateway.

A successful Test Connection result confirms that the provider’s credentials, region (Bedrock), and network path are all correct. If the call fails, see Troubleshooting.

AWS Bedrock: Inference profiles and IAM

Bedrock has three concepts that affect how you configure a provider: foundation models, cross-region inference profiles, and IAM. Get these right and the Test Connection check passes. Get them wrong and you see AccessDenied or ValidationException errors.

Foundation models versus inference profiles

A foundation model is the base model AWS exposes (for example, anthropic.claude-sonnet-4-6). It runs in the AWS region you call.

A cross-region inference profile wraps a foundation model with a geography prefix that routes requests across multiple regions for higher availability and throughput. The prefix tells AWS which geography the request should run in:

| Prefix | Geography |
| --- | --- |
| us. | US regions |
| eu. | EU regions |
| apac. | Asia-Pacific regions |
| au. | Australia regions |
| jp. | Japan regions |
| global. | Any region; routes for lowest cost |

Examples: us.anthropic.claude-sonnet-4-6 (Claude Sonnet 4.6 routed across US regions), eu.anthropic.claude-haiku-4-5 (Haiku 4.5 routed across EU regions).

Anthropic Claude 4.6+ models (Sonnet 4.6, Opus 4.6, Opus 4.7) cannot be invoked with the bare foundation-model ID; they require an inference profile. If you try the bare ID, Bedrock returns:

"Invocation of model ID … with on-demand throughput isn’t supported. Retry your request with the ID or ARN of an inference profile that contains this model."

Claude 4.5 and earlier models still accept bare IDs.

Pricing varies by profile. The bare foundation-model ID and the global. profile share AWS’s headline rate; geo profiles (us., eu., apac., au., jp.) carry approximately a 10% cross-region inference premium. Use global. when you want the headline rate and don’t need a specific geography. Use a geo profile (us., eu., apac., au., or jp.) when data residency matters.

AI Gateway preserves the regional prefix end to end when it records spend, so the Cost & Usage page attributes usage to the correct regional rate. A call to eu.anthropic.claude-haiku-4-5 is billed at the EU Haiku rate, not the headline foundation-model rate.
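To make the prefix rules concrete, the sketch below splits a model identifier into its profile prefix and base model ID, then applies the approximate 10% geo premium to a headline rate. The function names are illustrative, the premium is the approximation quoted above rather than an exact AWS price, and the real regional rates used by cost reporting may differ.

```python
from typing import Optional, Tuple

GEO_PREFIXES = {"us", "eu", "apac", "au", "jp"}  # geo profiles from the table above
GLOBAL_PREFIX = "global"
APPROX_GEO_PREMIUM = 0.10  # "approximately 10%", per this page


def split_profile(model_id: str) -> Tuple[Optional[str], str]:
    """Return (prefix, base model ID); prefix is None for a bare model ID."""
    prefix, _, rest = model_id.partition(".")
    if prefix in GEO_PREFIXES or prefix == GLOBAL_PREFIX:
        return prefix, rest
    return None, model_id


def estimated_rate(model_id: str, headline_rate: float) -> float:
    """Rough per-million rate: geo profiles carry the premium, others do not."""
    prefix, _ = split_profile(model_id)
    if prefix in GEO_PREFIXES:
        return headline_rate * (1 + APPROX_GEO_PREMIUM)
    return headline_rate
```

For us.anthropic.claude-sonnet-4-6 the prefix is us and the base ID is anthropic.claude-sonnet-4-6; the bare ID anthropic.claude-sonnet-4-6 returns no prefix because anthropic is a vendor name, not a geography.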

IAM policy patterns

Bedrock IAM resources have different ARN structures depending on whether you reference a foundation model, a system-defined inference profile, or an account-scoped application inference profile. The provider’s IAM principal needs bedrock:InvokeModel and bedrock:InvokeModelWithResponseStream on every resource it calls.

| Resource type | ARN shape |
| --- | --- |
| Foundation model | arn:aws:bedrock:{region}::foundation-model/{model-id} (no account ID; AWS-owned) |
| System-defined inference profile | arn:aws:bedrock:{region}:*:inference-profile/{profile-id} (wildcard account; system-defined) |
| Application inference profile (account-scoped) | arn:aws:bedrock:{region}:{account-id}:application-inference-profile/{profile-id} |

A minimal policy granting access to all foundation models plus all cross-region profiles:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": ["bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream"],
    "Resource": [
      "arn:aws:bedrock:*::foundation-model/*",
      "arn:aws:bedrock:*:*:inference-profile/*"
    ]
  }]
}

For production, scope to specific models and regions instead of using wildcards.
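As a starting point for a scoped policy, the sketch below builds the Resource list from the ARN shapes in the table above for a chosen set of regions. The function names are illustrative. Which regions a given cross-region profile can route to is not covered on this page, so treat the region list as an assumption you must confirm against AWS documentation before using the output.

```python
import json
from typing import Dict, List

ACTIONS = ["bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream"]


def foundation_model_arn(region: str, model_id: str) -> str:
    # Foundation models are AWS-owned, so the account field stays empty.
    return f"arn:aws:bedrock:{region}::foundation-model/{model_id}"


def system_profile_arn(region: str, profile_id: str) -> str:
    # System-defined profiles use a wildcard account, per the table above.
    return f"arn:aws:bedrock:{region}:*:inference-profile/{profile_id}"


def scoped_policy(regions: List[str], model_id: str, profile_id: str) -> Dict:
    """Build a policy limited to one model and one profile in given regions."""
    resources = []
    for region in regions:
        resources.append(foundation_model_arn(region, model_id))
        resources.append(system_profile_arn(region, profile_id))
    return {
        "Version": "2012-10-17",
        "Statement": [{"Effect": "Allow", "Action": ACTIONS, "Resource": resources}],
    }


policy = scoped_policy(
    ["us-east-1", "us-west-2"],  # example regions; verify for your profile
    "anthropic.claude-sonnet-4-6",
    "us.anthropic.claude-sonnet-4-6",
)
policy_json = json.dumps(policy, indent=2)
```

Printing policy_json gives a document with the same structure as the wildcard policy above, but with four explicit ARNs in place of the two wildcard patterns.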

Anthropic: Authorization passthrough

If you want each client to authenticate against Anthropic with its own subscription (Claude Pro, Max, Team, or enterprise), enable Authorization passthrough instead of configuring a server-side API key. In this mode:

  • Leave the API key reference field empty.

  • Clients must send their own Anthropic Authorization header with every request. AI Gateway forwards it unchanged.

  • Use this when you want to aggregate individual client subscriptions rather than share a single API account.

The provider detail page shows whether Authorization passthrough is enabled in the Connection card.

Browse providers in the list view

The LLM Providers list page is the at-a-glance home for every provider in your dataplane. Open it from the sidebar’s LLM Providers entry.

| Column | What it shows |
| --- | --- |
| Provider | User-given name plus the provider-type icon (OpenAI, Anthropic, Google, AWS Bedrock, OpenAI-compatible) and a copyable preview of the proxy base URL. |
| Status | Shows Active for an enabled provider. A disabled provider rejects requests to its proxy URL until you enable it again. |
| Models | First two model identifiers exposed by the provider, plus a +N overflow chip when more are configured. |
| Spend (7d) | Spend over the last 7 days with a small sparkline. The window is fixed at 7 days on this view. Longer-range analysis runs through the Cost & Usage page under Governance (see View cost and usage). |
| Updated | Relative timestamp of the last edit. |

The Filter button narrows the list by provider type, status, or name. The Create provider button opens the create flow described in Open the Create LLM provider page. The list paginates, with a rows-per-page selector in the footer.

View cost and usage

The Cost & Usage page tracks spend, request volume, and token volume over time across providers and models. Open Cost & Usage under Governance in the sidebar. Use it when you want to understand which provider or model generated usage during a selected time window.

The page includes these charts:

  • Spend over time: Estimated spend in USD for the selected range.

  • Requests over time: Request count for the selected range.

  • Tokens over time: Token count for the selected range.

Use Group by to switch the chart breakdown between providers, models, and token type. Group by provider to see which upstream consumed the most budget. Group by model to see which model drove spend inside one or more providers. Group by token type to separate input, output, cached, cache-write, and reasoning usage where those buckets apply.

Use Filter to narrow the charts by provider, model, cost type, token type, user, or agent. Each filter appears as a chip above the chart, and you can combine them. For example, filter to one Anthropic provider, drill into claude-opus-4-7, then limit the spend view to input tokens. Selecting an agent also narrows the provider options to the providers that agent used.

The date-range picker supports last 7 days, last 14 days, last 30 days, last 90 days, month to date, quarter to date, year to date, and custom ranges. The chart subtitle shows the selected date range and bucket size.

A custom range writes customStart and customEnd ISO-8601 timestamps to the page URL, so the view is shareable: copy the URL after picking a custom range and any teammate who opens it lands on the same window.
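If you generate these links from a script, for example in a weekly report, the sketch below shows one way to append the two parameters. The parameter names customStart and customEnd come from this page; the page URL, the function name, and the exact ISO-8601 variant (UTC with an explicit offset) are assumptions, so compare the result with a URL copied from the console before relying on it.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def custom_range_url(page_url: str, start: datetime, end: datetime) -> str:
    """Append customStart/customEnd query parameters to a page URL."""
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("use timezone-aware datetimes")
    if start >= end:
        raise ValueError("start must be before end")
    query = urlencode({
        # Assumed format: ISO-8601 in UTC, e.g. 2025-01-01T00:00:00+00:00.
        "customStart": start.astimezone(timezone.utc).isoformat(),
        "customEnd": end.astimezone(timezone.utc).isoformat(),
    })
    return f"{page_url}?{query}"
```

urlencode percent-encodes the colons and the plus sign in the timestamps, so the values survive being pasted into chat tools or email. The sketch assumes page_url has no existing query string.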

The chart renders empty buckets in the selected range as zero-height bars rather than gaps, so quiet days line up with their date label and the trend stays readable when traffic is bursty.
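If you export usage data and chart it yourself, you can reproduce the same zero-filling behavior. The sketch below assumes daily buckets and a dictionary of spend keyed by date; the function name and data shape are illustrative and do not reflect how the console computes its buckets.

```python
from datetime import date, timedelta
from typing import Dict, List, Tuple


def fill_daily_buckets(start: date, end: date,
                       spend_by_day: Dict[date, float]) -> List[Tuple[date, float]]:
    """Return one (day, spend) pair per day in [start, end], using 0.0 for gaps."""
    days = (end - start).days + 1  # inclusive range
    series = []
    for offset in range(days):
        day = start + timedelta(days=offset)
        series.append((day, spend_by_day.get(day, 0.0)))
    return series
```

A four-day range with traffic only on the first and last day yields four entries, with 0.0 for the two quiet days, so each bar stays aligned with its date label.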

The chart palette is colorblind-safe. When multiple providers of the same type exist (for example, two OpenAI providers), the chart renders each one with a distinct hatched pattern so the series stay visually distinguishable.

The spend chart footer summarizes the selected view by cost bucket, including total, input, output, cached, cache writes, and reasoning when the selected traffic includes those categories.

Edit, disable, or delete a provider

  • Edit: Click Edit on the detail page. You can change any field except the Resource ID (shown as Name in the Summary panel) and the provider type, which are immutable. Model lists, credential references, and the enabled state can all change.

  • Disable: Click Disable on the detail page. The provider remains in the list, but requests to its proxy URL are rejected until you enable it again. Use this when you want to pause traffic without losing configuration.

  • Delete: Scroll to the Delete this provider section at the bottom of the detail page and click Delete. The action is permanent. In-flight requests fail, and downstream clients receive errors until you reconfigure them to use another provider.

Troubleshooting

| Symptom | What to check |
| --- | --- |
| secret "<NAME>" not found | Confirm the secret exists in your dataplane’s secret store and the reference in the provider configuration is spelled identically (UPPER_SNAKE_CASE, no typos). |
| Bedrock returns AccessDenied or region errors | Verify the AWS region field matches the region where your Bedrock models are enabled. Bedrock model availability varies by region. Confirm the IAM principal has bedrock:InvokeModel and bedrock:InvokeModelWithResponseStream on the foundation-model and inference-profile ARNs you use. See AWS Bedrock: Inference profiles and IAM. |
| Bedrock returns "Invocation of model ID … with on-demand throughput isn’t supported" | You called a Claude 4.6+ model with a bare foundation-model ID. Switch to an inference profile (for example, us.anthropic.claude-sonnet-4-6 instead of anthropic.claude-sonnet-4-6). See AWS Bedrock: Inference profiles and IAM. |
| Anthropic returns 401 when passthrough is enabled | Confirm the client is sending its own Authorization header and the API key reference field on the provider is empty. |
| Gemini returns 401 | Gemini uses the x-goog-api-key header, not Authorization. If you’re seeing 401s on Gemini, check that the client is sending the correct header. See Connect your app to AI Gateway. |
| Provider list empty or 403 | Confirm your account has the dataplane_adp_llmprovider_* permissions in ADP. The Reader built-in role is the minimum required to list providers. The Writer role is required to create one. See LLM provider permissions. |

Limitations

AI Gateway does not provide these capabilities. For current status, consult the ADP release notes.

  • Multi-provider routing, failover, and retries across providers. A synthetic provider that fans requests to multiple upstreams is not part of AI Gateway.

  • Rate limits. Requests-per-second, per-minute, or per-day limits are not available. To cap spend rather than request rate, use budgets, which enforce a per-agent hard cap.

  • Managed MCP aggregation at the gateway. Register MCP tool servers separately under MCP Servers in ADP.