Agentic Data Plane

Configure an LLM Provider

Create an LLM provider to give your applications a managed proxy URL: Redpanda handles the upstream API keys, forwards requests to the provider, and records usage for you. Create a provider for each upstream you use, whether that’s OpenAI, Anthropic, Google AI, AWS Bedrock, or an OpenAI-compatible endpoint.

After reading this page, you will be able to:

Create an LLM provider for OpenAI, Anthropic, Google AI, AWS Bedrock, or an OpenAI-compatible endpoint
Select the models you want to expose through the provider
Verify the provider is reachable using the built-in Test Connection control

Prerequisites

An API key (or AWS credentials for Bedrock) for the upstream provider you want to configure.
One or more secrets already created in your dataplane’s secret store for the provider’s credentials. Secret references must use UPPER_SNAKE_CASE. For example: OPENAI_API_KEY, ANTHROPIC_API_KEY, AWS_ACCESS_KEY_ID.
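A pattern check can catch naming mistakes before you create the secrets. The sketch below is illustrative only: the exact rule the secret store enforces is not stated on this page, so the regular expression is an assumption, and `is_valid_secret_ref` is a hypothetical helper rather than part of any Redpanda tooling.

```python
import re

# Assumed shape of UPPER_SNAKE_CASE: uppercase words or digits joined by single
# underscores. The secret store's real validation rule may differ.
SECRET_REF_PATTERN = re.compile(r"[A-Z][A-Z0-9]*(?:_[A-Z0-9]+)*")


def is_valid_secret_ref(name: str) -> bool:
    """Return True if name looks like an UPPER_SNAKE_CASE secret reference."""
    return SECRET_REF_PATTERN.fullmatch(name) is not None


for candidate in ("OPENAI_API_KEY", "AWS_ACCESS_KEY_ID", "openai-api-key"):
    print(candidate, is_valid_secret_ref(candidate))
```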

Open the Create LLM provider page

Open LLM Providers in the sidebar.
Click Add provider.

Fill in the provider card

The first card on the page collects identity fields. Enter a Display name; the form auto-derives the Resource ID from it as you type.

Field	Required	Notes
`Display name`	Yes	Human-readable label shown in dashboards and model selectors. Up to 253 characters. The form auto-derives the `Resource ID` from this value.
`Resource ID`	Yes	Machine identifier used in API calls and CLI commands. Lowercase letters, numbers, and hyphens only (`^[a-z][a-z0-9-]*$`), up to 63 characters. Immutable after creation. Appears in the proxy URL (`/llm/v1/providers/<resource-id>/…`). Auto-derived from the `Display name`; edit it if you want something different.

The Summary panel labels the Resource ID as Name.
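If you generate Resource IDs outside the form (for example, in a provisioning script), you can check them against the documented constraints first. In this sketch, `is_valid_resource_id` applies the pattern and length limit from the table above. `derive_resource_id` is a hypothetical slug function; how the form actually derives the ID from the Display name is not documented here, so treat that part as an assumption.

```python
import re

RESOURCE_ID_PATTERN = re.compile(r"^[a-z][a-z0-9-]*$")  # pattern from the field table
MAX_RESOURCE_ID_LENGTH = 63


def derive_resource_id(display_name: str) -> str:
    """Hypothetical derivation: lowercase, then turn other characters into hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", display_name.lower()).strip("-")
    # The ID must start with a letter, so drop any leading digits or hyphens.
    slug = re.sub(r"^[^a-z]+", "", slug)
    return slug[:MAX_RESOURCE_ID_LENGTH].rstrip("-")


def is_valid_resource_id(resource_id: str) -> bool:
    """Check the documented constraints: pattern match and at most 63 characters."""
    return (
        len(resource_id) <= MAX_RESOURCE_ID_LENGTH
        and RESOURCE_ID_PATTERN.fullmatch(resource_id) is not None
    )


candidate = derive_resource_id("OpenAI Production")
print(candidate, is_valid_resource_id(candidate))
```

Because the Resource ID is immutable after creation, validating it before you submit the form avoids having to delete and recreate the provider.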

Choose a provider type

The Provider type card offers five options. Pick the one that matches your upstream.

Type	Use when
OpenAI	Proxy GPT, o-series, and embeddings through the OpenAI API. Best when you already hold an OpenAI API key or want the broadest GPT model catalog.
Anthropic	Call Claude Opus, Sonnet, and Haiku directly. Strong at coding, long-context reasoning, and tool use. Supports forwarding client `Authorization` headers to Anthropic for enterprise and Max-plan subscription passthrough (see Anthropic: Authorization passthrough).
Google AI	Reach Gemini Pro, Flash, and multimodal models through Google AI Studio. Ideal for long-context workloads and image/video inputs.
AWS Bedrock	Invoke foundation models (Claude, Llama, Titan, Nova, Mistral, AI21 Jamba, Gemma) hosted inside your AWS account. Requires an AWS region and credentials (static, STS-assumed role, or the default credential chain). Supports the native Bedrock APIs (`InvokeModel`, `Converse`) and an OpenAI-compatible Chat Completions endpoint for `gpt-oss` and Gemma models. See AWS Bedrock: Inference profiles and IAM for picking the right model identifier, and Set up AWS Bedrock as an LLM provider for a step-by-step IAM and access-key walkthrough.
OpenAI-compatible	Point at any OpenAI-compatible endpoint that ships `/v1/chat/completions` (vLLM, Ollama, LM Studio, LocalAI, Together, Groq, OpenRouter). Useful for self-hosted models and aggregator gateways. Requires a `Base URL`. Authentication is optional.

Selecting a type reveals the type-specific configuration fields.

Fill in the type-specific configuration

Each API key reference and credential field points at a secret-store entry, not the secret value itself. Use the Existing tab to pick a secret already in your dataplane’s secret store, or the New tab to create one inline.

The fields for each provider type follow in this order: OpenAI, Anthropic, Google AI, AWS Bedrock, and OpenAI-compatible.

OpenAI

Field	Notes
`Base URL`	Optional. Leave empty for the standard OpenAI API (`https://api.openai.com/v1`). Override for Azure OpenAI or other OpenAI-hosted endpoints.
`API key reference`	Required. Secret-store reference for the OpenAI API key. Must be `UPPER_SNAKE_CASE`, for example `OPENAI_API_KEY`.

Anthropic

Field	Notes
`Base URL`	Optional. Leave empty for the standard Anthropic API (`https://api.anthropic.com`).
`API key reference`	Required unless `Authorization passthrough` is on. `UPPER_SNAKE_CASE`, for example `ANTHROPIC_API_KEY`.
`Authorization passthrough`	Optional toggle. When on, AI Gateway forwards the client’s `Authorization` header to Anthropic instead of using a server-side API key. Used for enterprise and Max-plan OAuth passthrough: each client authenticates with its own Anthropic subscription. Leave the API key reference empty when using passthrough.

Google AI

Field	Notes
`Base URL`	Optional. Leave empty for the standard Google AI API (`https://generativelanguage.googleapis.com`).
`API key reference`	Required. Secret-store reference for the Google AI API key. `UPPER_SNAKE_CASE`, for example `GOOGLE_AI_API_KEY`.

Gemini uses the x-goog-api-key header for authentication, not Authorization: Bearer. This matters when you wire up clients. See Connect your app to AI Gateway.
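The header difference is easy to encode in client code. The following is a minimal sketch, not the gateway's client library: the `provider_type` strings are hypothetical labels, and treating every non-Gemini type as a bearer-token client is an assumption drawn from the contrast in the note above.

```python
def auth_headers(provider_type: str, credential: str) -> dict[str, str]:
    """Return the authentication header a client sends for a provider type."""
    if provider_type == "google-ai":
        # Gemini authenticates with x-goog-api-key rather than a bearer token.
        return {"x-goog-api-key": credential}
    # Assumption: other provider types take a bearer token.
    return {"Authorization": f"Bearer {credential}"}


print(auth_headers("google-ai", "example-credential"))
print(auth_headers("openai", "example-credential"))
```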

AWS Bedrock

Field	Notes
`Region`	Required. AWS region where the Bedrock endpoint is deployed, for example `us-east-1`.
`Base URL`	Optional. Override the default regional Bedrock endpoint.
`Credential type`	How AI Gateway authenticates to Bedrock: Default chain, Static keys, or Assume IAM role. The fields below depend on the mode you pick.
`Access key ID reference`	Static keys only. Secret-store reference for the AWS access key ID, `UPPER_SNAKE_CASE` (typically `AWS_ACCESS_KEY_ID`).
`Secret access key reference`	Static keys only. Secret-store reference for the AWS secret access key, `UPPER_SNAKE_CASE` (typically `AWS_SECRET_ACCESS_KEY`).
`Role ARN`	Assume IAM role only. Required. ARN of the IAM role AI Gateway assumes through AWS STS, for example `arn:aws:iam::123456789012:role/BedrockRole`.
`External ID`	Assume IAM role only. Optional. External ID for cross-account role assumption. Set it only when the role’s trust policy mandates an external ID.
`Session name`	Assume IAM role only. Optional. Session name that appears in AWS CloudTrail audit logs, for example `redpanda-adp`.
`Guardrail`	Optional. Name of a guardrail to attach to this provider, or empty for none. Only the Bedrock provider type exposes this setting. AI Gateway validates the name when you save: it rejects a guardrail that doesn’t exist or is being deleted, so set the field to an existing guardrail or leave it empty. See Create a guardrail.

Pick a Credential type to control how AI Gateway authenticates to Bedrock:

Default chain (default): Leave the credentials unset to use the AWS SDK’s default provider chain (environment variables, shared config, EKS Pod Identity, IRSA, or instance profile). Use this when the gateway already runs with an AWS identity.
Static keys: An access key pair stored in the secret store. Use this when no ambient AWS identity is available. This is the path the Bedrock setup guide walks through.
Assume IAM role: AI Gateway assumes an IAM role through AWS STS. Use this for cross-account access or when your security policy requires short-lived credentials.
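The dependency between the credential type and the fields it needs can be sketched as a small check. The field and type names below are hypothetical stand-ins for the form fields, not an API schema. Requiring both key references for static keys is an assumption based on the description of an access key pair; the Role ARN requirement comes from the field table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BedrockCredentials:
    """Hypothetical in-memory mirror of the Bedrock credential fields."""

    credential_type: str = "default_chain"  # or "static_keys", "assume_role"
    access_key_id_ref: Optional[str] = None
    secret_access_key_ref: Optional[str] = None
    role_arn: Optional[str] = None
    external_id: Optional[str] = None
    session_name: Optional[str] = None


def missing_fields(creds: BedrockCredentials) -> list[str]:
    """List the required fields that are unset for the chosen credential type."""
    required = {
        "default_chain": [],  # relies on the ambient AWS identity
        "static_keys": ["access_key_id_ref", "secret_access_key_ref"],
        "assume_role": ["role_arn"],  # external ID and session name are optional
    }
    if creds.credential_type not in required:
        raise ValueError(f"unknown credential type: {creds.credential_type}")
    return [name for name in required[creds.credential_type] if not getattr(creds, name)]


partial = BedrockCredentials(credential_type="static_keys", access_key_id_ref="AWS_ACCESS_KEY_ID")
print(missing_fields(partial))
```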

OpenAI-compatible

Field	Notes
`Base URL`	Required. URL of your OpenAI-compatible endpoint, for example `http://vllm.internal:8000/v1`, `http://ollama.local:11434/v1`, or an aggregator like Together / Groq / OpenRouter.
`API key reference`	Optional. Leave empty for endpoints with no authentication (common for local runtimes). `UPPER_SNAKE_CASE` if set.

OpenAI-compatible endpoints can serve any model. Enter the exact model identifiers your upstream server exposes (for example, meta-llama/Llama-3.3-70B-Instruct or qwen3:8b).

For the OpenAI, Google AI, and AWS Bedrock provider types, AI Gateway validates that the credential references resolve before it accepts the create or update. AI Gateway rejects a missing or empty secret reference at save time instead of failing at first call. The OpenAI-compatible type does not require a credential reference, so it can be created with no authentication for local runtimes such as Ollama or vLLM.

Select models

Models you select on this form become the catalog the provider exposes. Leave the list empty to allow every model the upstream catalog returns.

For OpenAI, Anthropic, Google AI, and AWS Bedrock, the form shows a picker backed by the provider’s catalog. Each model in the picker shows its input and output price per million tokens. Pick from the list, or type a model identifier the catalog doesn’t show. For OpenAI-compatible, the form takes a freeform list: type the exact identifiers your upstream serves.

Redpanda maintains the catalog of available models in the picker. When an upstream provider publishes a new model, it usually appears in the picker within a day or two; admins don’t have to wait for a Redpanda release. New models aren’t enabled automatically: an admin still selects the model in the catalog to make it callable through this provider.

For Bedrock, the picker exposes inference profiles, not raw foundation-model IDs. See AWS Bedrock: Inference profiles and IAM.

Redpanda stores models as structured ProviderModel entries (one entry per model, with the model name as the only required field). Each model can carry custom pricing overrides that replace the catalog rates for that model in cost reporting; see Override per-model pricing. The legacy flat models field still works on writes for backward compatibility.

Beyond pricing, the catalog carries each model’s capabilities and context-window limits. The model discovery API (the ListModels and GetModel methods on ModelService) reports max_input_tokens (the largest context the model accepts) and max_output_tokens (the most it can generate in a single response) for each model. Both are read-only catalog metadata: a limit the catalog doesn’t declare stays unset rather than reported as zero. Clients read max_input_tokens to show how full an agent’s context window is, such as the context-window indicator on the agent’s Inspector tab.
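A client that shows context-window usage has to handle the unset case explicitly. The sketch below assumes the client has already read `max_input_tokens` from the model discovery API and represents an undeclared limit as `None`; `context_fill_ratio` is a hypothetical helper, not part of ModelService.

```python
from typing import Optional


def context_fill_ratio(used_input_tokens: int, max_input_tokens: Optional[int]) -> Optional[float]:
    """Return the fraction of the context window in use, or None if the limit is unset."""
    if max_input_tokens is None or max_input_tokens <= 0:
        # The catalog leaves undeclared limits unset, so there is nothing to divide by.
        return None
    return min(used_input_tokens / max_input_tokens, 1.0)


print(context_fill_ratio(50_000, 200_000))
print(context_fill_ratio(50_000, None))
```

Returning `None` rather than `0.0` keeps "limit unknown" distinct from "window empty", mirroring how the catalog leaves undeclared limits unset.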

Override per-model pricing

Cost reporting prices each call at the catalog rates for the model. If your organization negotiates non-standard rates, or you track spend against an internal chargeback rate, override the rates per model on this provider.

In the model picker, each selected model carries a pencil icon (Override pricing). Click it to open the pricing dialog for that model. The dialog lists one field per billing bucket, in the same order as the provider’s published rate card:

Bucket	What it bills
Input	Per 1M input tokens. Tool-use input also bills at this rate.
Output	Per 1M output tokens. Reasoning tokens also bill at this rate.
Cached input	Per 1M tokens read from prompt cache.
Cache write (5-minute TTL)	Per 1M tokens written to a 5-minute prompt cache.
Cache write (1-hour TTL)	Per 1M tokens written to a 1-hour prompt cache.

Enter rates in dollars per million tokens. Each field is independent:

Leave a field blank to keep the catalog rate for that bucket. The catalog rate shows as the field’s placeholder.
Enter a positive value to replace the catalog rate for that bucket only.
Enter 0 to make that bucket explicitly free, which is different from leaving it blank.

Cache writes with an unknown TTL always bill at the catalog rate; they have no override field.

Use the reset control on a field to clear a single override, or clear every field to drop all overrides for the model. Overrides are scoped to this provider and model, and they change what Agentic Data Plane’s cost reporting computes, not what the upstream provider actually charges you.
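The blank-versus-zero distinction can be sketched in a few lines. This is an illustration of the rules above, not the gateway's cost engine: the bucket keys are hypothetical names, a blank field is modeled as `None`, and the rates in the example are made-up numbers rather than real catalog prices.

```python
from typing import Optional

# Hypothetical bucket keys; rates are in dollars per million tokens.
BUCKETS = ("input", "output", "cached_input", "cache_write_5m", "cache_write_1h")


def effective_rates(catalog: dict[str, float], overrides: dict[str, Optional[float]]) -> dict[str, float]:
    """Apply per-bucket overrides: None (blank) keeps the catalog rate, 0 means free."""
    rates = {}
    for bucket in BUCKETS:
        override = overrides.get(bucket)
        rates[bucket] = catalog[bucket] if override is None else override
    return rates


def call_cost(tokens: dict[str, int], rates: dict[str, float]) -> float:
    """Price one call: tokens per bucket times the rate per million tokens."""
    return sum(tokens.get(bucket, 0) * rates[bucket] / 1_000_000 for bucket in BUCKETS)


# Illustrative numbers only, not real catalog rates.
catalog_rates = {"input": 3.0, "output": 15.0, "cached_input": 0.3,
                 "cache_write_5m": 3.75, "cache_write_1h": 6.0}
rates = effective_rates(catalog_rates, {"input": 2.5, "cached_input": 0.0})
print(rates)
print(call_cost({"input": 1_000_000, "output": 100_000}, rates))
```

Checking `override is None` instead of truthiness is the important design choice here: a truthiness test would treat an explicit `0` as blank and silently restore the catalog rate.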

Explore the provider detail page

After you create the provider, its detail page organizes everything about it into tabs: Overview, Models, Connect, Activity, and Settings.

Overview

The Overview tab carries two elements. The Last 7 days KPI strip shows TOTAL SPEND, REQUESTS, and TOKENS cards, each with a sparkline and a View more link that opens a metric detail drawer. The Connection card shows the provider type, status, authentication passthrough state, proxy URL, upstream base URL, and the API key secret reference. The Delete this provider control sits at the bottom of this tab. For analysis across providers, use the Cost & Usage page under Governance (see View cost and usage).

Models

The Models tab lists the chat models available on this provider. Each row shows the model’s name and identifier, icons for the capabilities it supports (such as vision, tools, JSON, or reasoning), its context-window limit, and its input and output prices per million tokens. Hover over a capability icon to read what that capability means. Use the toggle on a row to enable or disable that model on the provider. Select a model to open its detail page (see View a model’s detail page).

Enabled models sort to the top. Use the search box to find a model by display name or identifier, and the Status and Capability filters to narrow the list.

Connect

The Connect tab generates ready-made client configuration for this provider: a gateway-token step, ready-to-run rpk ai setup steps, setup instructions for popular clients such as Claude Code, and code examples in several languages, all with the provider’s proxy URL prefilled. For a Bedrock provider, the Claude Code snippets run against the gateway without local AWS credentials, because the gateway signs the upstream AWS requests for you. See Connect your app to AI Gateway for the underlying flow.

Activity

The Activity tab shows recent agent activity scoped to this provider. It ranks the agents that sent the most traffic through the provider over a time window you select, and, where transcripts are recorded, lets you open an agent’s recent requests and drill into individual transcripts. Transcripts are recorded per agent rather than per provider, so an agent’s transcript list can include requests it sent to other providers.

Settings

The Settings tab shows read-only identification (the provider’s display name and resource identifier) and the Secret Store references it uses to authenticate. From here you can open the Secret Store to rotate or replace those secret values.

View a model’s detail page

From the Models tab, select a model to open its detail page. The page gathers what the catalog knows about that model on this provider in one place:

An overview strip with the model’s context window (largest input it accepts), maximum output (most it can generate in a single response), and price (input and output rates per million tokens). Each value appears only when the catalog declares it.
Model details: The model ID and the provider’s proxy URL, each with a copy control.
Pricing: A read-only rate card of the model’s effective per-bucket rates, that is, the catalog rates with any per-model overrides applied (see Override per-model pricing). This section appears only when rates are known, and links to the upstream provider’s pricing reference.
Capabilities: The model’s capabilities from the catalog, with a link to the upstream provider’s capabilities reference. This section appears only when the catalog declares capabilities.
Usage: Spend, Requests, and Tokens cards for traffic routed to this model through this provider over the last 7 days. Select View more on a card to open a metric detail drawer.

Configure transcript logging

The Transcripts card controls whether AI Gateway records the message bodies this provider proxies. It has two independent toggles, both off by default:

Toggle	What it captures
`Record inputs`	Captures the full request body (prompt content and tool-call arguments) on observability traces.
`Record outputs`	Captures the full response body (completion content and tool-call results) on observability traces.

Because both toggles default to off, AI Gateway does not retain message bodies for a new provider until you turn them on. Enable them to power turn-by-turn investigation and per-conversation drill-down in the Transcripts view. Leave them off for workloads where the message body must not be retained, such as regulated PII or customer secrets.

These are per-provider settings, not per-request: applications cannot opt in or out at call time. To split sensitive from non-sensitive traffic, create one provider with recording on and another with it off, and route each application to whichever proxy URL matches its data class.
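On the application side, the split described above amounts to a lookup from data class to provider. The sketch below is one way to express it; the resource IDs and data-class labels are hypothetical, and falling back to the non-recording provider for unknown classes is a design assumption, not gateway behavior.

```python
# Hypothetical resource IDs for two providers that differ only in transcript recording.
PROVIDER_BY_DATA_CLASS = {
    "sensitive": "anthropic-no-transcripts",
    "general": "anthropic-with-transcripts",
}


def provider_for(data_class: str) -> str:
    """Pick the provider resource ID for an application's data class.

    Unknown classes fall back to the non-recording provider, the safer default.
    """
    return PROVIDER_BY_DATA_CLASS.get(data_class, PROVIDER_BY_DATA_CLASS["sensitive"])


print(provider_for("general"))
print(provider_for("unclassified"))
```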

Recording settings do not affect cost and usage telemetry. Token counts, latency, and provider/model attribution are always recorded, so the Cost & Usage page reports spend for traffic on the provider regardless of these toggles; only the message bodies are withheld when the toggles are off.

Changing a toggle takes effect for new requests. Transcripts already captured under the previous setting are not retroactively redacted; delete or rotate the provider if you need to purge historical content.

Save and verify

Click Create provider. The button activates after Name and Type are both set. The Summary panel checks them off as you fill them in.
On the provider’s detail page, the Connection card shows your Proxy URL, Discovery URL, Base URL, and API key ref. Copy the Proxy URL: this is where your applications point.
Scroll to the Verify connection section. Pick a model from the dropdown and click Test Connection. The status updates from Not tested yet to a pass/fail indicator. Use the Show commands disclosure if you want to see the equivalent curl or SDK call.
To wire up an application, open Connect your app further down the page or follow Connect your app to AI Gateway.

A successful Test Connection result confirms that the provider’s credentials, region (Bedrock), and network path are all correct. If the call fails, see Troubleshooting.

AWS Bedrock: Inference profiles and IAM

Bedrock has three concepts that affect how you configure a provider: foundation models, cross-region inference profiles, and IAM. Get these right and the Test Connection check passes. Get them wrong and you see AccessDenied or ValidationException errors.

Foundation models versus inference profiles

A foundation model is the base model AWS exposes (for example, anthropic.claude-sonnet-4-6). It runs in the AWS region you call.

A cross-region inference profile wraps a foundation model with a geography prefix that routes requests across multiple regions for higher availability and throughput. The prefix tells AWS which geography the request should run in:

Prefix	Geography
`us.`	US regions
`eu.`	EU regions
`apac.`	Asia-Pacific regions
`au.`	Australia regions
`jp.`	Japan regions
`global.`	Any region; routes for lowest cost

Examples: us.anthropic.claude-sonnet-4-6 (Claude Sonnet 4.6 routed across US regions), eu.anthropic.claude-haiku-4-5 (Haiku 4.5 routed across EU regions).

Anthropic Claude 4.6+ models (Sonnet 4.6, Opus 4.6, Opus 4.7) cannot be invoked with the bare foundation-model ID; they require an inference profile. If you try the bare ID, Bedrock returns:

"Invocation of model ID … with on-demand throughput isn’t supported. Retry your request with the ID or ARN of an inference profile that contains this model."

Claude 4.5 and earlier models still accept bare IDs.

Pricing varies by profile. The bare foundation-model ID and the global. profile share AWS’s headline rate; geo profiles (us., eu., apac., au., jp.) carry approximately a 10% cross-region inference premium. Use global. when you want the headline rate and don’t need a specific geography. Use us. / eu. / apac. when data residency matters.

AI Gateway preserves the regional prefix end to end when it records spend, so the Cost & Usage page attributes usage to the correct regional rate. A call to eu.anthropic.claude-haiku-4-5 is billed at the EU Haiku rate, not the headline foundation-model rate.
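To make the prefix convention concrete, the sketch below splits a model identifier into its profile prefix and foundation-model ID, using the prefixes from the table above. Both helpers are hypothetical, and the premium check only restates the approximate 10% rule for geography-bound profiles; it does not look up real prices.

```python
from typing import Optional

GEO_PREFIXES = ("us", "eu", "apac", "au", "jp")  # geography-bound profiles
ALL_PREFIXES = GEO_PREFIXES + ("global",)


def split_profile(model_id: str) -> tuple[Optional[str], str]:
    """Split an identifier into (profile prefix or None, foundation-model ID)."""
    head, _, rest = model_id.partition(".")
    if head in ALL_PREFIXES and rest:
        return head, rest
    return None, model_id


def carries_geo_premium(model_id: str) -> bool:
    """True for geography-bound profiles, which carry the cross-region premium."""
    prefix, _ = split_profile(model_id)
    return prefix in GEO_PREFIXES


for model_id in ("us.anthropic.claude-sonnet-4-6", "global.anthropic.claude-sonnet-4-6",
                 "anthropic.claude-sonnet-4-6"):
    print(model_id, split_profile(model_id), carries_geo_premium(model_id))
```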

IAM policy patterns

Bedrock IAM resources have different ARN structures depending on whether you reference a foundation model, a system-defined inference profile, or an account-scoped application inference profile. The provider’s IAM principal needs bedrock:InvokeModel and bedrock:InvokeModelWithResponseStream on every resource it calls.

Resource type	ARN shape
Foundation model	`arn:aws:bedrock:{region}::foundation-model/{model-id}` (no account ID; AWS-owned)
System-defined inference profile	`arn:aws:bedrock:{region}:*:inference-profile/{profile-id}` (wildcard account; system-defined)
Application inference profile (account-scoped)	`arn:aws:bedrock:{region}:{account-id}:application-inference-profile/{profile-id}`

A minimal policy granting access to all foundation models plus all cross-region profiles:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": ["bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream"],
    "Resource": [
      "arn:aws:bedrock:*::foundation-model/*",
      "arn:aws:bedrock:*:*:inference-profile/*"
    ]
  }]
}

For production, scope to specific models and regions instead of using wildcards.
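One way to produce a scoped policy is to generate it from the ARN shapes listed earlier. The sketch below does this for a single region; `scoped_policy` is a hypothetical helper, and the region and model identifiers in the example are placeholders. Because a cross-region profile can route requests to several regions, you may also need the foundation-model ARN for each destination region, which this single-region sketch does not cover.

```python
import json


def scoped_policy(region: str, model_ids: list[str], profile_ids: list[str]) -> dict:
    """Build an IAM policy limited to named foundation models and system-defined profiles."""
    resources = [
        f"arn:aws:bedrock:{region}::foundation-model/{model_id}" for model_id in model_ids
    ] + [
        f"arn:aws:bedrock:{region}:*:inference-profile/{profile_id}" for profile_id in profile_ids
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream"],
            "Resource": resources,
        }],
    }


policy = scoped_policy(
    "us-east-1",
    model_ids=["anthropic.claude-sonnet-4-6"],
    profile_ids=["us.anthropic.claude-sonnet-4-6"],
)
print(json.dumps(policy, indent=2))
```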

Anthropic: Authorization passthrough

If you want each client to authenticate against Anthropic with its own subscription (Claude Pro, Max, Team, or enterprise), enable Authorization passthrough instead of configuring a server-side API key. In this mode:

Leave the API key reference field empty.
Clients must send their own Anthropic Authorization header with every request. AI Gateway forwards it unchanged.
Use this when you want to aggregate individual client subscriptions rather than share a single API account.

The provider detail page shows whether Authorization passthrough is enabled in the Connection card.

Browse providers in the list view

The LLM Providers list page is the at-a-glance home for every provider in your dataplane. Open it from the sidebar’s LLM Providers entry.

Column	What it shows
`Provider`	The provider-type icon (OpenAI, Anthropic, Google, AWS Bedrock, or OpenAI-compatible), the display name, and the resource identifier beneath it. Select the provider name to open its detail page.
`Status`	`Active` for an enabled provider. A disabled provider rejects requests to its proxy URL until you enable it again.
`Models`	The model identifiers configured on the provider, shown as chips. A provider with no models configured shows a dash.
`24h requests`	Request count over the last 24 hours.
`30d spend`	Spend over the last 30 days. For longer-range or cross-provider analysis, use the Cost & Usage page under Governance (see View cost and usage).

Use the search box to find a provider by name, and the Filter button to narrow the list by provider type, model, or status. The Add provider button opens the create flow described in Open the Create LLM provider page. Each row’s actions menu can activate or deactivate the provider, copy its proxy or base URL, or delete it. The list paginates, with a rows-per-page selector in the footer.

View cost and usage

The Cost & Usage page tracks spend, request volume, and token volume over time across providers and models. Open Cost & Usage under Governance in the sidebar. Use it when you want to understand which provider or model generated usage during a selected time window.

The page includes these charts:

Spend over time: Estimated spend in USD for the selected range.
Requests over time: Request count for the selected range.
Tokens over time: Token count for the selected range.

Use Group by to switch the chart breakdown between providers, models, token type, agents, and users. Group by provider to see which upstream consumed the most budget. Group by model to see which model drove spend inside one or more providers. Group by token type to separate input, output, cached, cache-write, and reasoning usage where those buckets apply. Group by agent or user to attribute spend to the agent or the identified caller that drove it.

Use Filter to narrow the charts by provider, model, cost type, token type, user, or agent. Each filter appears as a chip above the chart, and you can combine them. For example, filter to one Anthropic provider, drill into claude-opus-4-7, then limit the spend view to input tokens. Selecting an agent also narrows the provider options to the providers that agent used.

The date-range picker supports last 7 days, last 14 days, last 30 days, last 90 days, month to date, quarter to date, year to date, and custom ranges. The chart subtitle shows the selected date range and bucket size.

A custom range writes customStart and customEnd ISO-8601 timestamps to the page URL, so the view is shareable: copy the URL after picking a custom range and any teammate who opens it lands on the same window.
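If you want to generate such links from a script (for example, in a weekly report), the sketch below builds one. It assumes `customStart` and `customEnd` are carried as URL query parameters and that UTC timestamps are accepted; neither detail is confirmed on this page, and the page URL in the example is a placeholder.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def custom_range_url(page_url: str, start: datetime, end: datetime) -> str:
    """Append customStart and customEnd as ISO-8601 query parameters (assumed encoding)."""
    if end <= start:
        raise ValueError("end must be after start")
    query = urlencode({
        "customStart": start.astimezone(timezone.utc).isoformat(),
        "customEnd": end.astimezone(timezone.utc).isoformat(),
    })
    return f"{page_url}?{query}"


# Placeholder URL; pass timezone-aware datetimes so the conversion to UTC is unambiguous.
print(custom_range_url(
    "https://console.example.com/cost-usage",
    datetime(2025, 1, 1, tzinfo=timezone.utc),
    datetime(2025, 1, 8, tzinfo=timezone.utc),
))
```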

The chart renders empty buckets in the selected range as zero-height bars rather than gaps, so quiet days line up with their date label and the trend stays readable when traffic is bursty.

The chart palette is colorblind-safe. When multiple providers of the same type exist (for example, two OpenAI providers), the chart renders each one with a distinct hatched pattern so the series stay visually distinguishable.

The spend chart footer summarizes the selected view by cost bucket, including total, input, output, cached, cache writes, and reasoning when the selected traffic includes those categories.

Download the report as CSV

To export the data behind the charts, use the download controls in the toolbar:

Click Download full report to download the complete, untruncated dataset for the current filters and date range as a CSV file. Every cost type and token type is a separate column.
Use the report options control (the sliders icon next to Download full report) to choose how the exported rows are grouped. Under Time bucket, pick Hourly, Daily, Monthly, or Total (no time column). Under Break down by, select any combination of Provider, Model, User, and Agent.

Each chart also has its own export that downloads the data as currently shown, including any truncation the chart applies to keep the top series readable. Use Download full report when you need the whole dataset rather than the charted view.
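Once you have the full report, you can aggregate it with a few lines of code. The following sketch sums cost columns per provider using Python's standard `csv` module. The column names and sample rows are hypothetical; check the header row of your own export and adjust the names to match.

```python
import csv
import io
from collections import defaultdict

# Hypothetical export; real column names may differ from these.
SAMPLE = """time_bucket,provider,model,input_cost,output_cost
2025-01-01,openai-prod,model-a,1.50,2.25
2025-01-01,anthropic-prod,model-b,0.50,1.00
2025-01-02,openai-prod,model-a,2.00,0.25
"""


def total_by(csv_text, key, cost_columns):
    """Sum the given cost columns for each distinct value of `key`."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row[key]] += sum(float(row[c]) for c in cost_columns)
    return dict(totals)


spend = total_by(SAMPLE, "provider", ["input_cost", "output_cost"])
```

To process a downloaded file instead of the inline sample, read its contents into a string first and pass that as `csv_text`.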

Edit, disable, or delete a provider

Edit: Click Edit on the detail page. You can change any field except Name and Type, which are immutable. Model lists, credential references, and the enabled state can all change.
Disable: Click Disable on the detail page. The provider remains in the list, but requests to its proxy URL are rejected until you enable it again. Use this when you want to pause traffic without losing configuration.
Delete: Scroll to the Delete this provider section at the bottom of the detail page and click Delete. The action is permanent. In-flight requests fail, and downstream clients receive errors until you reconfigure them to use a different provider.

Troubleshooting

Symptom	What to check
`secret "<NAME>" not found`	Confirm the secret exists in your dataplane’s secret store and the reference in the provider configuration is spelled identically (`UPPER_SNAKE_CASE`, no typos).
Bedrock returns `AccessDenied` or region errors	Verify the AWS region field matches the region where your Bedrock models are enabled. Bedrock model availability varies by region. Confirm the IAM principal has `bedrock:InvokeModel` on the foundation-model and inference-profile ARNs you use. See AWS Bedrock: Inference profiles and IAM.
Bedrock returns "Invocation of model ID … with on-demand throughput isn’t supported"	You called a Claude 4.6+ model with a bare foundation-model ID. Switch to an inference profile (for example, `us.anthropic.claude-sonnet-4-6` instead of `anthropic.claude-sonnet-4-6`). See AWS Bedrock: Inference profiles and IAM.
Anthropic returns 401 when passthrough is enabled	Confirm the client is sending its own `Authorization` header and the `API key` field on the provider is empty.
Gemini returns 401	Gemini uses the `x-goog-api-key` header, not `Authorization`. Check that the client is sending the correct header. See Connect your app to AI Gateway.
Provider list empty or 403	Confirm your account has the `dataplane_adp_llmprovider_*` permissions in Agentic Data Plane. The Reader built-in role is the minimum required to list providers. The Writer role is required to create one. See LLM provider permissions.
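Two of the checks in the table lend themselves to a quick pre-flight script. The sketch below tests whether a secret reference is UPPER_SNAKE_CASE and whether a Bedrock model ID looks like an inference profile rather than a bare foundation-model ID. Both rules are assumptions: the regular expression is one reasonable reading of UPPER_SNAKE_CASE, and the prefix list is illustrative and may not cover every inference profile available to your account.

```python
import re

# Assumed rule: starts with an uppercase letter, then uppercase letters or
# digits in groups separated by single underscores.
UPPER_SNAKE = re.compile(r"^[A-Z][A-Z0-9]*(?:_[A-Z0-9]+)*$")

# Assumed, non-exhaustive list of geographic inference-profile prefixes.
ASSUMED_PROFILE_PREFIXES = ("us.", "eu.", "apac.", "global.")


def is_upper_snake_case(name):
    """Return True if `name` matches the assumed UPPER_SNAKE_CASE rule."""
    return bool(UPPER_SNAKE.fullmatch(name))


def looks_like_inference_profile(model_id):
    """Return True if `model_id` starts with one of the assumed prefixes."""
    return model_id.startswith(ASSUMED_PROFILE_PREFIXES)


secret_ok = is_upper_snake_case("OPENAI_API_KEY")
secret_bad = is_upper_snake_case("openai-api-key")
profile_ok = looks_like_inference_profile("us.anthropic.claude-sonnet-4-6")
bare_id = looks_like_inference_profile("anthropic.claude-sonnet-4-6")
```

A `False` result from either helper is only a hint to look closer; the gateway and AWS remain the source of truth for what is accepted.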

Limitations

AI Gateway does not provide the following capabilities. For current status, see the Agentic Data Plane release notes.

Multi-provider routing, failover, and retries across providers. A synthetic provider that fans requests to multiple upstreams is not part of AI Gateway.
Rate limits. Requests-per-second, per-minute, or per-day limits are not available. To cap spend rather than request rate, use budgets, which enforce a per-agent hard cap.
Managed MCP aggregation at the gateway. Register MCP tool servers separately under MCP Servers in Agentic Data Plane.

Next steps
