# AI Gateway Quickstart

NOTE: Redpanda Agentic Data Plane is supported on BYOC clusters running on AWS with Redpanda version 25.3 or later. It is currently in a limited availability release.

Redpanda AI Gateway keeps your AI-powered applications running and your costs under control by routing all LLM and MCP traffic through a single managed layer with automatic failover and budget enforcement. This quickstart walks you through configuring your first gateway and routing requests through it.

## Prerequisites

Before starting, ensure you have:

- Access to the AI Gateway UI (provided by your administrator)
- Admin permissions to configure providers and models
- An API key for at least one LLM provider (OpenAI, Anthropic, or Google AI)
- Python 3.8+, Node.js 18+, or cURL (for testing)

## Configure a provider

Providers represent upstream LLM services and their associated credentials. Providers are disabled by default and must be enabled explicitly.

1. Navigate to **Providers**.
2. Select a provider (for example, OpenAI, Anthropic, or Google AI).
3. On the **Configuration** tab, click **Add configuration** and enter your API key.
4. Verify that the provider status shows **Active**.

## Enable models

After enabling a provider, enable the specific models you want to make available through your gateways.

1. Navigate to **Models**.
2. Enable the models you want to use (for example, `gpt-5.2-mini`, `claude-sonnet-4.5`, `claude-opus-4.6`).
3. Verify that the models appear as **Enabled** in the model catalog.

Different providers have different reliability and cost characteristics. When choosing models, consider your use case's requirements for quality, speed, and cost.

### Model naming convention

Requests through AI Gateway must use the `vendor/model_id` format. For example:

- OpenAI models: `openai/gpt-5.2`, `openai/gpt-5.2-mini`
- Anthropic models: `anthropic/claude-sonnet-4.5`, `anthropic/claude-opus-4.6`
- Google Gemini models: `google/gemini-2.0-flash`, `google/gemini-2.0-pro`

This format lets the gateway route each request to the correct provider.

## Create a gateway

A gateway is a logical configuration boundary that defines routing policies, rate limits, spend limits, and observability scope. Common gateway patterns include:

- **Environment separation**: Separate gateways for staging and production
- **Team isolation**: One gateway per team for budget tracking
- **Customer multi-tenancy**: One gateway per customer for isolated policies

1. Navigate to **Gateways**.
2. Click **Create Gateway**.
3. Configure the gateway:
   - **Display name**: A descriptive name (for example, `my-first-gateway`)
   - **Workspace**: A workspace (conceptually similar to a resource group)
   - **Description**: Context about this gateway's purpose
   - Optional metadata for documentation
4. After creation, copy the gateway endpoint from the overview page. You need it to send requests.

The gateway ID is embedded in the endpoint URL. For example:

- Endpoint: `https://example/gateways/d633lffcc16s73ct95mg/v1`
- Gateway ID: `d633lffcc16s73ct95mg`
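Before wiring up an application, you can sanity-check the endpoint and your enabled models programmatically. The sketch below is hedged: it assumes the gateway exposes the OpenAI-compatible `/models` endpoint, which this quickstart doesn't confirm. The model catalog in the UI remains the authoritative view.

```python
from openai import OpenAI

client = OpenAI(
    base_url="<your-gateway-endpoint>",  # gateway endpoint from the overview page
    api_key="<your-redpanda-api-key>",
)

# List every model the gateway currently exposes (assumes an
# OpenAI-compatible /models endpoint; IDs use the vendor/model format).
for model in client.models.list():
    print(model.id)  # for example: openai/gpt-5.2, anthropic/claude-sonnet-4.5
```

If a model you enabled doesn't appear, recheck the provider status and the model catalog in the UI.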
## Send your first request

Now that you've configured a provider and created a gateway, send a test request to verify that everything works.

**Python**

```python
from openai import OpenAI

client = OpenAI(
    base_url="<your-gateway-endpoint>",
    api_key="<your-redpanda-api-key>",  # Or use gateway's auth
)

response = client.chat.completions.create(
    model="openai/gpt-5.2",  # Use vendor/model format
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
)

print(response.choices[0].message.content)
```

Expected output:

```
Hello! How can I help you today?
```

**Node.js**

```javascript
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: '<your-gateway-endpoint>',
  apiKey: '<your-redpanda-api-key>', // Or use gateway's auth
});

const response = await client.chat.completions.create({
  model: 'anthropic/claude-sonnet-4.5', // Use vendor/model format
  messages: [
    { role: 'user', content: 'Hello!' }
  ],
});

console.log(response.choices[0].message.content);
```

Expected output:

```
Hello! How can I help you today?
```

**cURL**

```bash
curl <your-gateway-endpoint>/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <your-redpanda-api-key>" \
  -d '{
    "model": "openai/gpt-5.2",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
```

Expected output:

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "openai/gpt-5.2",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 9,
    "total_tokens": 18
  }
}
```

### Troubleshooting

If your request fails, check these common issues:

- **401 Unauthorized**: Verify that your API key is valid.
- **404 Not Found**: Confirm that the base URL matches your gateway endpoint.
- **Model not found**: Ensure the model is enabled in the model catalog and that you're using the correct `vendor/model` format.

## Verify in the gateway overview

Confirm that your request was routed through AI Gateway. On the **Overview** tab, check the aggregate metrics:

- **Total Requests**: Should have incremented
- **Total Tokens**: Combined input and output tokens
- **Total Cost**: Estimated spend across all requests
- **Avg Latency**: Average response time in milliseconds

Scroll to the **Models** table to see per-model statistics. The model you used in your request should appear with its request count, token usage (input/output), estimated cost, latency, and error rate.

## Configure LLM routing (optional)

Configure rate limits, spend limits, and provider pools with failover. On the **Gateways** page, select the **LLM** tab to configure routing policies. The LLM routing pipeline represents the request lifecycle:

1. **Rate limit**: Control request throughput (for example, 100 requests/second).
2. **Spend limit**: Set monthly budget caps (for example, $15K/month with blocking enforcement).
3. **Provider pools**: Define primary and fallback providers.

### Configure a provider pool with fallback

For high availability, configure a fallback provider that activates when the primary fails:

1. Add a second provider (for example, Anthropic).
2. In your gateway's LLM routing configuration, set:
   - **Primary pool**: OpenAI (preferred for quality)
   - **Fallback pool**: Anthropic (activates on rate limits, timeouts, or errors)
3. Save the configuration.

The gateway automatically routes to the fallback when it detects:

- A rate limit exceeded
- A request timeout
- 5xx server errors from the primary provider
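Failover happens inside the gateway, so client code needs no special handling for it. Requests blocked by the gateway's own rate or spend limits are different: they come back to the caller as errors. Below is a minimal client-side sketch, assuming limit rejections surface as standard HTTP errors that the OpenAI SDK raises as `RateLimitError` or `APIStatusError` (the exact status codes the gateway returns are an assumption, not documented here).

```python
import time
from openai import OpenAI, RateLimitError, APIStatusError

client = OpenAI(
    base_url="<your-gateway-endpoint>",
    api_key="<your-redpanda-api-key>",
)

def chat_with_retry(prompt, attempts=3):
    """Send a chat request, backing off if the gateway's rate limit rejects it."""
    for attempt in range(attempts):
        try:
            response = client.chat.completions.create(
                model="openai/gpt-5.2",  # vendor/model format
                messages=[{"role": "user", "content": prompt}],
            )
            return response.choices[0].message.content
        except RateLimitError:
            # Gateway rate limit hit: wait and retry with exponential backoff.
            time.sleep(2 ** attempt)
        except APIStatusError as err:
            # A blocking spend limit likely surfaces as a non-retryable 4xx error
            # (assumed behavior); retrying won't help until the budget is raised.
            raise RuntimeError(f"Gateway rejected request: {err.status_code}") from err
    raise RuntimeError("Rate limited on every attempt")

print(chat_with_retry("Hello!"))
```

A blocking spend limit typically requires raising the budget in the gateway configuration rather than retrying from the client.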
## Configure MCP tools (optional)

If you're using AI agents, configure Model Context Protocol (MCP) tool aggregation. On the **Gateways** page, select the **MCP** tab to configure tool discovery and execution. The MCP proxy aggregates multiple MCP servers behind a single endpoint, allowing agents to discover and call tools through the gateway.

Configure the MCP settings:

- **Display name**: A descriptive name for the provider pool
- **Model**: The model that handles tool execution
- **Load balancing**: If multiple providers are available, a strategy (for example, round robin)

### Available MCP tools

The gateway provides these built-in MCP tools:

- **Data catalog API**: Query your data catalog
- **Memory store**: Persistent storage for agent state
- **Vector search**: Semantic search over embeddings
- **MCP Orchestrator**: Built-in tool for programmatic multi-tool workflows

The MCP Orchestrator enables agents to generate JavaScript code that calls multiple tools in a single orchestrated step, reducing round trips. For example, a workflow requiring 47 file reads can be reduced from 49 round trips to just one.

To add external tools (for example, Slack or GitHub), add their MCP server endpoints to your gateway configuration.

### Deferred tool loading

When many tools are aggregated, listing all of them upfront can consume significant tokens. With deferred tool loading, the MCP gateway initially returns only:

- A tool search capability
- The MCP Orchestrator

Agents then search for the specific tools they need, retrieving only that subset. This can reduce token usage by 80-90% when you have many tools configured.

## Configure a CEL routing rule (optional)

Use CEL (Common Expression Language) expressions to route requests dynamically based on headers, content, or other request properties. AI Gateway uses CEL for flexible routing without code changes. Use CEL to:

- Route premium users to better models
- Apply different rate limits based on user tiers
- Enforce policies based on request content

### Add a routing rule

1. In your gateway's routing configuration, add a CEL expression to route based on user tier:

   ```
   // Route based on user tier header
   request.headers["x-user-tier"] == "premium" ? "openai/gpt-5.2" : "openai/gpt-5.2-mini"
   ```

2. Save the rule.

The gateway editor helps you discover available request fields (headers, path, body, and so on).

### Test the routing rule

Send requests with different headers to verify routing.

Premium user request:

```python
response = client.chat.completions.create(
    model="openai/gpt-5.2",  # Will be routed based on the CEL rule
    messages=[{"role": "user", "content": "Hello"}],
    extra_headers={"x-user-tier": "premium"},
)
# Should route to gpt-5.2 (premium model)
```

Free user request:

```python
response = client.chat.completions.create(
    model="openai/gpt-5.2-mini",
    messages=[{"role": "user", "content": "Hello"}],
    extra_headers={"x-user-tier": "free"},
)
# Should route to gpt-5.2-mini (cost-effective model)
```

### Common CEL patterns

Route based on model family:

```
request.body.model.startsWith("anthropic/")
```

Apply a rule to all requests:

```
true
```

Guard for field existence:

```
has(request.body.max_tokens) && request.body.max_tokens > 1000
```

For more CEL examples, see the CEL Routing Cookbook.

## Connect AI tools to your gateway

AI Gateway provides standardized endpoints that work with various AI development tools. This section shows how to configure popular tools.

### MCP endpoint

If you've configured MCP tools in your gateway, AI agents can connect to the aggregated MCP endpoint:

- MCP endpoint URL: `<your-gateway-endpoint>/mcp`
- Required header: `Authorization: Bearer <your-api-key>`

This endpoint aggregates all MCP servers configured in your gateway.
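In practice an agent framework or MCP SDK handles this connection, but you can probe the endpoint by hand. The following is a rough sketch, not a documented client: it assumes the gateway implements MCP's streamable HTTP transport (a JSON-RPC `initialize` handshake, then `tools/list`) and that responses come back as plain JSON rather than an SSE stream. Verify both assumptions against your gateway before relying on it.

```python
import requests

MCP_URL = "<your-gateway-endpoint>/mcp"
headers = {
    "Authorization": "Bearer <your-api-key>",
    "Content-Type": "application/json",
    # Streamable HTTP servers may answer with JSON or an SSE stream.
    "Accept": "application/json, text/event-stream",
}

# 1. MCP requires an initialize handshake before any other call.
init = requests.post(MCP_URL, headers=headers, json={
    "jsonrpc": "2.0", "id": 1, "method": "initialize",
    "params": {
        "protocolVersion": "2025-03-26",  # assumed; use the version your gateway supports
        "capabilities": {},
        "clientInfo": {"name": "quickstart-probe", "version": "0.1"},
    },
})
init.raise_for_status()

# 2. Echo back the session ID if the server issued one (per the MCP spec).
if session_id := init.headers.get("Mcp-Session-Id"):
    headers["Mcp-Session-Id"] = session_id

# 3. Signal that initialization is complete, then list the aggregated tools.
requests.post(MCP_URL, headers=headers, json={
    "jsonrpc": "2.0", "method": "notifications/initialized",
})
tools = requests.post(MCP_URL, headers=headers, json={
    "jsonrpc": "2.0", "id": 2, "method": "tools/list", "params": {},
})

# With deferred tool loading enabled, expect only the tool-search capability
# and the MCP Orchestrator in this first listing.
print(tools.json())  # assumes a plain-JSON response, not SSE
```

If the gateway streams responses as SSE, swap `tools.json()` for an SSE parser or use an MCP client library instead of raw HTTP.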
### Environment variables

For consistent configuration, set these environment variables:

```bash
export REDPANDA_GATEWAY_URL="<your-gateway-endpoint>"
export REDPANDA_API_KEY="<your-api-key>"
```

### Claude Code

Configure Claude Code using HTTP transport for the MCP connection:

```bash
claude mcp add --transport http redpanda-aigateway <your-gateway-endpoint>/mcp \
  --header "Authorization: Bearer <your-api-key>"
```

Alternatively, edit `~/.claude/config.json`:

```json
{
  "mcpServers": {
    "redpanda-ai-gateway": {
      "transport": "http",
      "url": "<your-gateway-endpoint>/mcp",
      "headers": {
        "Authorization": "Bearer <your-api-key>"
      }
    }
  },
  "apiProviders": {
    "redpanda": {
      "baseURL": "<your-gateway-endpoint>"
    }
  }
}
```

### Continue.dev

Edit your Continue config file (`~/.continue/config.json`):

```json
{
  "models": [
    {
      "title": "Redpanda AI Gateway - GPT-5.2",
      "provider": "openai",
      "model": "openai/gpt-5.2",
      "apiBase": "<your-gateway-endpoint>",
      "apiKey": "<your-api-key>"
    },
    {
      "title": "Redpanda AI Gateway - Claude",
      "provider": "anthropic",
      "model": "anthropic/claude-sonnet-4.5",
      "apiBase": "<your-gateway-endpoint>",
      "apiKey": "<your-api-key>"
    },
    {
      "title": "Redpanda AI Gateway - Gemini",
      "provider": "google",
      "model": "google/gemini-2.0-flash",
      "apiBase": "<your-gateway-endpoint>",
      "apiKey": "<your-api-key>"
    }
  ]
}
```

### Cursor IDE

Configure Cursor in **Settings** (**Cursor → Settings** or `Cmd+,`):

```json
{
  "cursor.ai.providers.openai.apiBase": "<your-gateway-endpoint>"
}
```

### Custom applications

For custom applications using the OpenAI, Anthropic, or Google Gemini SDKs:

Python with the OpenAI SDK:

```python
from openai import OpenAI

client = OpenAI(
    base_url="<your-gateway-endpoint>",
    api_key="<your-api-key>",
)
```

Python with the Anthropic SDK:

```python
from anthropic import Anthropic

client = Anthropic(
    base_url="<your-gateway-endpoint>",
    api_key="<your-api-key>",
)
```

Node.js with the OpenAI SDK:

```javascript
import OpenAI from 'openai';

const openai = new OpenAI({
  baseURL: '<your-gateway-endpoint>',
  apiKey: process.env.REDPANDA_API_KEY,
});
```

## Next steps

Explore advanced AI Gateway features:

- **CEL Routing Cookbook**: Advanced CEL routing patterns for traffic distribution and cost optimization
- **MCP Gateway**: Configure MCP server aggregation and deferred tool loading

Learn about the architecture:

- **AI Gateway Architecture**: Technical architecture, request lifecycle, and deployment models
- **What is an AI Gateway?**: Problems AI Gateway solves and common use cases