Agentic Data Plane

How AI Gateway Works

AI Gateway is Redpanda Agentic Data Plane’s managed proxy for LLM APIs. Instead of giving every application a provider API key and letting it call the upstream directly, you create an LLM provider in Agentic Data Plane and point your applications at a Redpanda-hosted proxy URL. Redpanda handles the upstream credentials, forwards the request, and records usage. Your code continues to use the provider’s native SDK.

After reading this page, you will be able to:

Describe what AI Gateway is and how a managed proxy differs from direct upstream calls
Explain how LLM providers, secrets, and OIDC authentication fit together in AI Gateway
Identify use cases where AI Gateway fits, and use cases where it does not

The problem AI Gateway solves

Teams adopting LLMs can quickly hit operational problems:

Credential sprawl: Every team that touches an LLM gets its own API key. Rotation is manual, offboarding is manual, and it’s hard to know who’s using what.
SDK lock-in and switching cost: Each provider has its own SDK, authentication scheme, and model catalog. Swapping OpenAI for Anthropic means a code change, not a configuration change.
No shared view of usage: Provider dashboards tell you what a single API key spent. They don’t tell you what your organization spent, broken down by team or application.

What AI Gateway gives you

AI Gateway consolidates provider access behind the following capabilities.

Traffic stays in your VPC

LLM requests are proxied through your dataplane’s AI Gateway. API keys are stored in your dataplane’s secret store and never leave your infrastructure. Upstream calls leave your VPC only when the LLM provider is third-party (OpenAI, Anthropic, Google AI). Self-hosted OpenAI-compatible endpoints stay entirely inside your network.

Centralized secrets

The upstream API key (or AWS credentials for Bedrock) lives in the Redpanda secret store and is attached to the provider at configuration time. Your application never sees it; rotation happens in one place.

A managed proxy URL per provider

Every provider you create has its own URL of the form <gateway-base>/llm/v1/providers/<provider-name>/<upstream-path>. Your application points its SDK at this URL instead of the upstream, continues to use the provider’s native API, and authenticates to Redpanda with a short-lived OIDC access token. The gateway base is a cluster-specific subdomain (for example, aigw.<cluster-id>.clusters.rdpa.co). Copy the exact value from the Proxy URL field on any provider’s detail page.

Native SDK compatibility

Use the provider’s own SDK: OpenAI, Anthropic, Google AI, AWS Bedrock, or any OpenAI-compatible client (vLLM, Ollama, LM Studio, LocalAI, Together, Groq, OpenRouter). AI Gateway does not require a single unified SDK. It forwards native requests to the native upstream.

Managed authentication

Applications authenticate to Agentic Data Plane with OIDC service accounts instead of long-lived provider API keys. Service accounts use the same role and audit model as every other Agentic Data Plane resource, and mint short-lived tokens that are easy to revoke. For local command-line workflows, use rpk ai to sign in (rpk ai auth login) and talk to the gateway. CI and programmatic clients use the OIDC client-credentials grant directly. See Connect your app to AI Gateway.

Per-provider observability

The provider’s detail page in Agentic Data Plane records spend, request counts, and token counts for the last 7 days. The Cost & Usage page under Governance expands that view with time-series charts, provider and model grouping, date ranges, and filters for provider, model, cost type, token type, user, and agent.

What’s in the UI

In Agentic Data Plane (ai.redpanda.com) you’ll find these areas:

Home: The landing page after sign-in. A snapshot of items that need attention (such as a budget over its cap or a disabled resource, with quick actions to resolve them), recent request and spend activity, budget status, and counts of your Agentic Data Plane resources with quick links into each area.
Agents: Create and manage Redpanda-hosted agents and registered self-managed agents.
Guardrails: Define content policies, word filters, and PII rules that apply to traffic on a Bedrock provider.
LLM Providers: Create, edit, enable, and delete providers. This is the home of AI Gateway configuration.
MCP Servers: Register MCP tool servers for agents. Separate from the AI Gateway proxy URL.
My Connections: Each user’s own per-user OAuth connections to the configured providers.
Integrations setup: Admin home for the OAuth plumbing behind MCP tools. The Outbound providers tab registers upstream identity providers for user-delegated MCP authentication (for example, GitHub or Google); the Inbound clients tab registers external tools (Claude.ai, ChatGPT, Cursor) that request access tokens from the gateway.
Secrets Store: Create and manage the secrets that providers, MCP servers, and agents reference by name.
Governance: The Cost & Usage page for spend, request, and token analysis across providers, models, and agents, and the Budgets page for per-agent spend caps.

LLM Providers is where you configure provider settings. The others are covered by their own docs.

Supported providers

AI Gateway supports the following provider types. The UI labels and short descriptions match the picker on the Create LLM provider page.

Type Typical upstream

Type	Typical upstream
OpenAI	Proxy GPT, o-series, and embeddings through the OpenAI API. Best when you already hold an OpenAI API key or want the broadest GPT model catalog.
Anthropic	Call Claude Opus, Sonnet, and Haiku directly. Optionally forwards the client’s `Authorization` header for enterprise and Max-plan subscription passthrough.
Google AI	Reach Gemini Pro, Flash, and multimodal models through Google AI Studio. Ideal for long-context workloads and image/video inputs.
AWS Bedrock	Invoke foundation models (Claude, Llama, Titan, Nova) hosted inside your AWS account. Use when data residency, IAM, or VPC egress matter more than raw feature parity. Signed with SigV4 server-side by AI Gateway.
OpenAI-compatible	Point at any OpenAI-compatible endpoint (vLLM, Ollama, LM Studio, LocalAI, Together, Groq, OpenRouter). Useful for self-hosted models and aggregator gateways that ship `/v1/chat/completions`.

OpenAI

Proxy GPT, o-series, and embeddings through the OpenAI API. Best when you already hold an OpenAI API key or want the broadest GPT model catalog.

Anthropic

Call Claude Opus, Sonnet, and Haiku directly. Optionally forwards the client’s Authorization header for enterprise and Max-plan subscription passthrough.

Google AI

Reach Gemini Pro, Flash, and multimodal models through Google AI Studio. Ideal for long-context workloads and image/video inputs.

AWS Bedrock

Invoke foundation models (Claude, Llama, Titan, Nova) hosted inside your AWS account. Use when data residency, IAM, or VPC egress matter more than raw feature parity. Signed with SigV4 server-side by AI Gateway.

OpenAI-compatible

Point at any OpenAI-compatible endpoint (vLLM, Ollama, LM Studio, LocalAI, Together, Groq, OpenRouter). Useful for self-hosted models and aggregator gateways that ship /v1/chat/completions.

See Configure an LLM provider for the full form reference for each type.

When to use AI Gateway

AI Gateway is a good fit when you want to:

Pull provider API keys out of application code and manage them centrally.
Keep LLM traffic inside your dataplane’s VPC and your secrets out of application code.
Authenticate applications to LLMs using the same OIDC identity you use for other Agentic Data Plane resources.
Run a self-hosted OpenAI-compatible endpoint (vLLM, Ollama, LM Studio) alongside 1P providers behind a single management plane.
Separate operator and developer roles. Operators configure providers and credentials; developers point at proxy URLs.

It is not the right fit when you:

Only ever call a single provider with a single API key and are happy managing that key inline.
Need routing, failover, or cross-provider load balancing across providers. AI Gateway does not provide these capabilities.

Limitations

AI Gateway does not provide these capabilities. For current status, see the Agentic Data Plane release notes.

Multi-provider routing, failover, and retries. A synthetic provider that fans requests to multiple upstreams is not part of AI Gateway.
Rate limits. Requests-per-second, per-minute, or per-day caps are not available. To cap spend rather than request rate, use budgets, which enforce a per-agent hard cap.
Managed MCP aggregation at the gateway. Register MCP tool servers separately under MCP Servers in Agentic Data Plane.

Next steps

Was this helpful?

group Ask in the community

mail Share your feedback

group_add Make a contribution