Monitor Agent Activity

Use monitoring to track agent performance, analyze conversation patterns, debug execution issues, and optimize token costs.

After reading this page, you will be able to:

Verify agent behavior using the Inspector tab
Track token usage and performance metrics
Debug agent execution using Transcripts

For conceptual background on traces and observability, see How Observability Works.

Prerequisites

You must have a running agent. If you do not have one, see Agentic Data Plane Quickstart for Agent Builders.

Debug agent execution with Transcripts

An agent’s Transcripts tab shows each conversation with timing, errors, and token usage. Use it to debug issues, verify agent behavior, and monitor performance.

Navigate an agent’s transcripts

Open Agents in the sidebar and select your agent.
Open the Transcripts tab.

The tab lists the agent’s recent conversations, one row per conversation:

Conversation: The conversation ID, with the conversation title when one exists.
Started: When the conversation began.
Duration: End-to-end wall-clock time.
Turns: Number of turns in the conversation.
Status: Completed, Error, or Running.
Tokens: Total tokens across the conversation.

Use the search box to match a conversation ID or title, and the status dropdown to narrow the list to Completed, Error, or Running conversations. The list loads in pages; click Load more to fetch older conversations.

Conversation detail

Click a row to open the conversation. The header shows the conversation ID with a status badge, the start time, duration, and turn count, plus a total-tokens chip. Toggle between two views:

Chat: The user-visible exchange, as a conversation.
Detailed: Adds per-turn metadata: latency, input/output token splits, and each tool call with its arguments, result, latency, and status.

If any turns were rebuilt from LLM message context after their original spans were evicted, those turns carry a reconstructed marker. For the mechanics, see Reconstructed transcript history.

Check agent health

Use the Transcripts tab to verify your agent is healthy. Recent conversations should show Completed status, duration within your expected range, and stable token usage without unexpected growth.

Several warning signs indicate problems. Investigate any conversation with Error status. If duration increases over time, the context window may be growing or tool calls may be slowing down. A high number of LLM calls for a simple request often signals that the agent is stuck in a loop or making unnecessary iterations. If no new transcripts appear, the agent may be stopped or failing to deploy.

Pay attention to patterns across multiple conversations. When all recent transcripts show errors, start by checking agent status, MCP server connectivity, and system prompt configuration. A list that alternates between success and error typically points to intermittent tool failures or external API issues. If duration increases steadily over a session, your context window is likely filling up. Clear the conversation history to reset it. High token usage combined with relatively few LLM calls usually means tool results are large or your system prompts are verbose.
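The pattern checks above can be sketched as a small triage helper. This is only an illustration: it assumes you have copied the Status and Duration values from the list into plain Python lists by hand. The function names and the three-conversation minimum are hypothetical and not part of the product.

```python
def diagnose_statuses(statuses):
    """Map recent conversation statuses (oldest first) to a first thing to check."""
    finished = [s for s in statuses if s != "Running"]
    if not finished:
        return "no finished conversations: check whether the agent is running"
    errors = sum(1 for s in finished if s == "Error")
    if errors == len(finished):
        return "all errors: check agent status, MCP server connectivity, system prompt"
    if errors > 0:
        return "mixed results: look for intermittent tool or external API failures"
    return "all completed: no action needed"


def duration_is_climbing(durations_s):
    """True when each duration is longer than the previous one.

    Requires at least three values (an arbitrary, hypothetical minimum)
    so that a single slow conversation does not count as a trend.
    """
    pairs = zip(durations_s, durations_s[1:])
    return len(durations_s) >= 3 and all(later > earlier for earlier, later in pairs)


print(diagnose_statuses(["Error", "Completed", "Error", "Completed"]))
print(duration_is_climbing([4.1, 5.0, 6.3, 8.9]))
```

A steadily climbing duration matches the case described above where the context window is filling up, so clearing the conversation history is the first thing to try.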

Debug with Transcripts

Use the Transcripts tab to diagnose specific issues:

If the agent is not responding:

Check the list for recent conversations. If none appear, the agent may be stopped.
Verify agent status in the main Agents view.
Look for error transcripts with deployment or initialization failures.

If the agent fails during execution:

Set the status dropdown to Error and open the failed conversation.
Switch to the Detailed view and find the turn carrying the error.
Check the tool call’s arguments and result for error messages.
Cross-reference with MCP server status.

If performance is slow:

Compare the Duration column across recent conversations.
Open a slow conversation in the Detailed view and scan per-turn latency to find the bottleneck.
Check if LLM calls are taking longer than expected.
Verify tool execution time on the nested tool calls.
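As a rough sketch of the slow-performance steps above, the following helper picks the slowest turn and splits its latency into tool time and everything else. The record layout (`turn`, `latency_ms`, `tool_calls`) and the tool name `lookup_order` are hypothetical, chosen to mirror what the Detailed view displays. It also assumes tool calls run one after another and that their latency is included in the turn latency, which may not hold for your agent.

```python
def find_bottleneck(turns):
    """Return the slowest turn with its latency split into tool and non-tool time."""
    slowest = max(turns, key=lambda t: t["latency_ms"])
    tool_ms = sum(call["latency_ms"] for call in slowest["tool_calls"])
    return {
        "turn": slowest["turn"],
        "latency_ms": slowest["latency_ms"],
        "tool_ms": tool_ms,
        # Assumption: whatever is not tool time is mostly LLM time.
        "other_ms": slowest["latency_ms"] - tool_ms,
    }


# Values transcribed by hand from a hypothetical slow conversation.
turns = [
    {"turn": 1, "latency_ms": 1200, "tool_calls": []},
    {"turn": 2, "latency_ms": 9400,
     "tool_calls": [{"name": "lookup_order", "latency_ms": 7900}]},
    {"turn": 3, "latency_ms": 1500,
     "tool_calls": [{"name": "lookup_order", "latency_ms": 300}]},
]

report = find_bottleneck(turns)
print(report)
```

If most of the slow turn is tool time, look at the tool or its MCP server. If it is mostly non-tool time, the LLM call itself is the more likely bottleneck.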

Track token usage and costs

View token consumption in the conversation detail view. The Detailed view breaks each turn into input tokens (everything sent to the LLM including system prompt, conversation history, and tool results) and output tokens (what the LLM generates in agent responses); the header chip shows the conversation total.

Calculate cost per request:

Cost = (input_tokens × input_price) + (output_tokens × output_price)

Example: GPT-5.2 with 4,302 input tokens and 1,340 output tokens, at $0.00000175 per input token and $0.000014 per output token, costs (4,302 × $0.00000175) + (1,340 × $0.000014) = $0.0075 + $0.0188, or about $0.026 per request.
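The cost formula can be checked with a few lines of Python. This is a minimal sketch: `request_cost` is a hypothetical helper name, and the prices are the per-token figures from the example above, passed as strings so that `Decimal` keeps them exact.

```python
from decimal import Decimal


def request_cost(input_tokens, output_tokens, input_price, output_price):
    """Cost = (input_tokens * input_price) + (output_tokens * output_price)."""
    return input_tokens * Decimal(input_price) + output_tokens * Decimal(output_price)


# Token counts and per-token prices from the example above.
cost = request_cost(4302, 1340, "0.00000175", "0.000014")
print(f"${cost:.3f} per request")
```

Using `Decimal` instead of floats avoids rounding noise when you sum costs over many requests.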

For cost optimization strategies, see Cost calculation.

Test agent behavior with Inspector

The Inspector tab provides real-time conversation testing. Use it to test agent responses interactively and verify behavior before deploying changes.

Access Inspector

Open Agents in the sidebar.
Click your agent name.
Open the Inspector tab.
Enter test queries and review responses.
Check the conversation panel to see tool calls.
Start a new session to test fresh conversations or click Clear context to reset history.

Context-window usage

As you test, the composer tracks how much of the model’s context window the conversation consumes. When the agent uses a model from Redpanda’s catalog, a context-usage indicator appears in the composer after the first response and shows the share of the context window in use. Click it to open a breakdown that shows the window size; the input, output, reasoning, and cached tokens for the session; the Session total; and the estimated Total cost.

The indicator counts input tokens plus cached tokens, so it reflects the full context occupancy under prompt caching, not fresh input alone. As a conversation grows and the indicator approaches the window size, start a new session or click Clear context to reset the history before responses slow down or the agent starts dropping earlier context.
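The occupancy arithmetic described above can be sketched as follows. The function name, the sample token counts, the 200,000-token window, and the 80% warning threshold are all hypothetical. Only the rule that input and cached tokens are added together comes from the description above.

```python
def context_share(input_tokens, cached_tokens, window_size):
    """Share of the context window in use: (input + cached) / window size."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    return (input_tokens + cached_tokens) / window_size


WARN_AT = 0.8  # hypothetical threshold for starting a new session

# Hypothetical numbers read from the breakdown.
share = context_share(input_tokens=30_000, cached_tokens=90_000, window_size=200_000)
print(f"{share:.0%} of the context window in use")
if share >= WARN_AT:
    print("Consider starting a new session or clearing context.")
```

Counting fresh input alone would report 15% here, which understates how full the window actually is under prompt caching.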

For models that aren’t in the catalog, such as OpenAI-compatible providers or hand-entered model IDs, the composer shows a plain input and output token count instead, because the context-window size isn’t known.

Long-running tasks

The Inspector live view streams a test run as it happens. If a task is still running after about five minutes, Inspector stops the live view and shows a Still running notice. The task keeps running in the background, so you don’t need to keep Inspector open. To follow it to completion, open the Activity tab of the Cost & Usage page under Governance.

Testing best practices

Test your agent systematically, covering edge cases and likely failure scenarios. Start with boundary testing: send requests at the edge of the agent’s capabilities to verify that scope enforcement works. To check error handling, request data that is unavailable and observe whether the agent degrades gracefully. Even with well-constrained system prompts, only testing confirms that the agent responds appropriately to edge cases.

Monitor iteration counts during complex requests to ensure they complete within your configured limits. Ambiguous or vague queries reveal whether the agent asks clarifying questions or makes risky assumptions. Throughout testing, track token usage per request to estimate costs and identify which query patterns consume the most resources.
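One way to see which query patterns consume the most tokens is to note the token counts for each test query and average them per pattern. The sketch below assumes you record the counts by hand. The pattern labels and record fields are hypothetical and do not correspond to any export format.

```python
from collections import defaultdict


def mean_tokens_by_pattern(runs):
    """Average total tokens per query pattern, most expensive pattern first."""
    totals = defaultdict(list)
    for run in runs:
        totals[run["pattern"]].append(run["input_tokens"] + run["output_tokens"])
    means = {pattern: sum(values) / len(values) for pattern, values in totals.items()}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


# Hypothetical token counts noted during an Inspector test session.
runs = [
    {"pattern": "ambiguous", "input_tokens": 5200, "output_tokens": 800},
    {"pattern": "boundary", "input_tokens": 2100, "output_tokens": 300},
    {"pattern": "ambiguous", "input_tokens": 6800, "output_tokens": 1200},
    {"pattern": "unavailable-data", "input_tokens": 3000, "output_tokens": 400},
]

ranking = mean_tokens_by_pattern(runs)
for pattern, mean_tokens in ranking:
    print(f"{pattern}: {mean_tokens:.0f} tokens per request on average")
```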
