# AI Gateway Architecture

Redpanda Agentic Data Plane is supported on BYOC clusters running on AWS with Redpanda version 25.3 and later. It is currently in a limited availability release.

This page provides technical details about AI Gateway's architecture, request processing, and capabilities. For an overview of AI Gateway, see What is an AI Gateway?

## Architecture overview

AI Gateway consists of a control plane for configuration and management, a data plane for request processing and routing, and an observability plane for monitoring and analytics.

### Control plane

The control plane manages gateway configuration and policy definition:

- Workspace management: Multi-tenant isolation with separate namespaces for different teams or environments
- Provider configuration: Enable and configure LLM providers (such as OpenAI and Anthropic)
- Gateway creation: Define gateways with specific routing rules, budgets, and rate limits
- Policy definition: Create CEL-based routing policies, spend limits, and rate limits
- MCP server registration: Configure which MCP servers are available to agents

### Data plane

The data plane handles all runtime request processing:

- Request ingestion: Accept requests via OpenAI-compatible API endpoints
- Authentication: Validate API keys and gateway access
- Policy evaluation: Apply rate limits, spend limits, and routing policies
- Provider pool management: Select primary or fallback providers based on availability
- MCP proxy: Aggregate tools from multiple MCP servers with deferred loading
- Response transformation: Normalize provider-specific responses to OpenAI format
- Metrics collection: Record token usage, latency, and cost for every request

### Observability plane

The observability plane provides monitoring and analytics:

- Request logs: Store full request/response history with prompt and completion content
- Metrics
aggregation: Calculate token usage, costs, latency percentiles, and error rates
- Dashboard UI: Display real-time and historical analytics per gateway, model, or provider
- Cost tracking: Estimate spend based on provider pricing and token consumption

## Request lifecycle

When a request flows through AI Gateway, it passes through several policy and routing stages before reaching the LLM provider. Understanding this lifecycle helps you configure policies effectively and troubleshoot issues:

1. Application sends a request to the gateway endpoint.
2. Gateway authenticates the request.
3. Rate limit policy evaluates (allow/deny).
4. Spend limit policy evaluates (allow/deny).
5. Routing policy evaluates (which model/provider to use).
6. Provider pool selects a backend (primary/fallback).
7. Request is forwarded to the LLM provider.
8. Response is returned to the application.
9. Request is logged with tokens, cost, latency, and status.

Each policy evaluation happens synchronously in the request path. If rate limits or spend limits reject the request, the gateway returns an error immediately without calling the LLM provider, which helps you control costs.

## MCP tool request lifecycle

For MCP tool requests, the lifecycle differs slightly to support deferred tool loading:

1. Application discovers tools via the `/mcp` endpoint.
2. Gateway aggregates tools from approved MCP servers.
3. Application receives search and orchestrator tools (deferred loading).
4. Application invokes a specific tool.
5. Gateway routes to the appropriate MCP server.
6. Tool execution result is returned.
7. Request is logged with execution time and status.

The gateway loads and exposes specific tools only when requested, which dramatically reduces token overhead compared to loading all tools upfront.

## Next steps

- AI Gateway Quickstart: Route your first request through AI Gateway
- MCP Gateway: Configure MCP server aggregation for AI agents
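The request lifecycle described above can be sketched as a short, synchronous policy pipeline: each policy is checked in order, a rejection short-circuits before any provider is called, and the provider pool falls back from primary to secondary. This is a hypothetical illustration only, not AI Gateway's implementation; the `Gateway` and `Request` classes, field names, and error strings are all assumptions made for the sketch.

```python
from dataclasses import dataclass

@dataclass
class Request:
    api_key: str
    model: str
    estimated_cost: float  # estimated USD cost, computed before the provider call

@dataclass
class Gateway:
    valid_keys: set
    rate_limit: int        # max requests allowed in the current window
    spend_limit: float     # max cumulative USD spend
    providers: list        # ordered pool: primary first, then fallbacks
    requests_seen: int = 0
    spend_so_far: float = 0.0

    def handle(self, req: Request, available: set) -> str:
        # Steps 1-2: authenticate the request.
        if req.api_key not in self.valid_keys:
            return "error: unauthorized"
        # Step 3: rate limit policy (allow/deny), evaluated before any provider call.
        if self.requests_seen >= self.rate_limit:
            return "error: rate limit exceeded"
        # Step 4: spend limit policy (allow/deny).
        if self.spend_so_far + req.estimated_cost > self.spend_limit:
            return "error: spend limit exceeded"
        # Steps 5-6: provider pool selection -- first available provider wins.
        for provider in self.providers:
            if provider in available:
                # Steps 7-9: forward the request, then record usage for logging.
                self.requests_seen += 1
                self.spend_so_far += req.estimated_cost
                return f"routed to {provider}"
        return "error: no provider available"
```

A denied request never increments usage counters or reaches a provider, which mirrors the cost-control behavior described above: policy rejections are returned immediately from the gateway itself.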