Monitor Agent Activity
Use monitoring to track agent performance, analyze conversation patterns, debug execution issues, and optimize token costs.
After reading this page, you will be able to:
- Verify agent behavior using the Inspector tab
- Track token usage and performance metrics
- Debug agent execution using Transcripts
For conceptual background on traces and observability, see How Observability Works.
Prerequisites
You must have a running agent. If you do not have one, see Redpanda ADP Quickstart.
Debug agent execution with Transcripts
An agent’s Transcripts tab shows each conversation with timing, errors, and token usage. Use it to debug issues, verify agent behavior, and monitor performance.
Navigate an agent’s transcripts
- Open Agents in the sidebar and select your agent.
- Open the Transcripts tab.
The tab lists the agent’s recent conversations, one row per conversation:
- Conversation: The conversation ID, with the conversation title when one exists.
- Started: When the conversation began.
- Duration: End-to-end wall-clock time.
- Turns: Number of turns in the conversation.
- Status: Completed, Error, or Running.
- Tokens: Total tokens across the conversation.
Use the search box to match a conversation ID or title, and the status dropdown to narrow the list to Completed, Error, or Running conversations. The list loads in pages; click Load more to fetch older conversations.
Conversation detail
Click a row to open the conversation. The header shows the conversation ID with a status badge, the start time, duration, and turn count, plus a total-tokens chip. Toggle between two views:
- Chat: The user-visible exchange, as a conversation.
- Detailed: Adds per-turn metadata: latency, input/output token splits, and each tool call with its arguments, result, latency, and status.
If any turns were rebuilt from LLM message context after their original spans were evicted, those turns carry a reconstructed marker. For the mechanics, see Reconstructed transcript history.
Check agent health
Use the Transcripts tab to verify your agent is healthy. Recent conversations should show Completed status, duration within your expected range, and stable token usage without unexpected growth.
Several warning signs indicate problems. Conversations with Error status need investigation. When duration increases over time, your context window may be growing or tool calls could be slowing down. Many LLM calls for simple requests often signal that the agent is stuck in loops or making unnecessary iterations. If you see no new transcripts, the agent may be stopped or encountering deployment issues.
Pay attention to patterns across multiple conversations. When all recent transcripts show errors, start by checking agent status, MCP server connectivity, and system prompt configuration. A list that alternates between success and error typically points to intermittent tool failures or external API issues. If duration increases steadily over a session, your context window is likely filling up. Clear the conversation history to reset it. High token usage combined with relatively few LLM calls usually means tool results are large or your system prompts are verbose.
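The pattern checks above can be sketched as a small triage helper. This is a minimal illustration, not a product API: it assumes you have copied the Status and Duration columns from the Transcripts list by hand, and the function names and return strings are hypothetical. It treats any mix of Completed and Error as the "intermittent" case rather than checking for strict alternation.

```python
def triage(statuses):
    """Map the Status column of recent conversations to a first thing to check.

    `statuses` is a hand-copied list such as ["Completed", "Error"].
    The function name and return strings are illustrative only.
    """
    if not statuses:
        return "no transcripts: check whether the agent is stopped or failed to deploy"
    finished = [s for s in statuses if s != "Running"]
    if not finished:
        return "only running conversations: nothing to judge yet"
    errors = finished.count("Error")
    if errors == len(finished):
        return "all errors: check agent status, MCP server connectivity, and system prompt"
    if errors:
        return "mixed results: look for intermittent tool failures or external API issues"
    return "healthy: all finished conversations completed"


def duration_is_growing(durations_s):
    """True when each conversation in the session took longer than the previous one."""
    pairs = list(zip(durations_s, durations_s[1:]))
    return len(pairs) >= 2 and all(later > earlier for earlier, later in pairs)


print(triage(["Error", "Completed", "Error", "Completed"]))
print(duration_is_growing([4.1, 5.0, 6.3, 8.2]))
```

A steadily growing duration is only a hint that the context window is filling up. Confirm it against the per-turn input tokens in the Detailed view before clearing history.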
Debug with Transcripts
Use the Transcripts tab to diagnose specific issues:
If the agent is not responding:
- Check the list for recent conversations. If none appear, the agent may be stopped.
- Verify agent status in the main Agents view.
- Look for error transcripts with deployment or initialization failures.
If the agent fails during execution:
- Set the status dropdown to Error and open the failed conversation.
- Switch to the Detailed view and find the turn carrying the error.
- Check the tool call’s arguments and result for error messages.
- Cross-reference with MCP server status.
If performance is slow:
- Compare the Duration column across recent conversations.
- Open a slow conversation in the Detailed view and scan per-turn latency to find the bottleneck.
- Check if LLM calls are taking longer than expected.
- Verify tool execution time on the nested tool calls.
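As a rough illustration of the bottleneck search, the sketch below picks the slowest turn and splits its latency into tool time and everything else. The turn records, field names, and tool names are hypothetical values typed in by hand from the Detailed view. The split also assumes that a turn's latency includes the latency of its nested tool calls, which this page does not state.

```python
# Hypothetical per-turn numbers copied by hand from the Detailed view.
turns = [
    {"turn": 1, "latency_ms": 900, "tool_calls": []},
    {"turn": 2, "latency_ms": 7400,
     "tool_calls": [{"name": "search_orders", "latency_ms": 6100}]},
    {"turn": 3, "latency_ms": 1200,
     "tool_calls": [{"name": "get_customer", "latency_ms": 300}]},
]


def slowest_turn(turns):
    """Return the turn record with the highest end-to-end latency."""
    return max(turns, key=lambda t: t["latency_ms"])


def latency_split(turn):
    """Split a turn's latency into tool time and the remainder.

    Assumes tool-call latency is counted inside the turn latency, so the
    remainder approximates LLM time plus overhead.
    """
    tool_ms = sum(call["latency_ms"] for call in turn["tool_calls"])
    return {"tool_ms": tool_ms, "remaining_ms": turn["latency_ms"] - tool_ms}


worst = slowest_turn(turns)
print(worst["turn"], latency_split(worst))
```

If most of the time sits in `tool_ms`, look at the tool or the MCP server behind it. If it sits in `remaining_ms`, the LLM call itself is the more likely bottleneck.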
Track token usage and costs
View token consumption in the conversation detail view. The Detailed view breaks each turn into input tokens (everything sent to the LLM including system prompt, conversation history, and tool results) and output tokens (what the LLM generates in agent responses); the header chip shows the conversation total.
Calculate cost per request:
Cost = (input_tokens x input_price) + (output_tokens x output_price)
Example: GPT-5.2 with 4,302 input tokens and 1,340 output tokens at $0.00000175 per input token and $0.000014 per output token costs about $0.026 per request ($0.0075 for input plus $0.0188 for output).
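The same calculation as a minimal Python sketch. The function name `request_cost` is illustrative, and the prices are the per-token figures from the example above, not values read from any API.

```python
def request_cost(input_tokens, output_tokens, input_price, output_price):
    """Return the cost of one request in dollars, given per-token prices."""
    return input_tokens * input_price + output_tokens * output_price


# Figures from the example: 4,302 input tokens and 1,340 output tokens.
cost = request_cost(4302, 1340, 0.00000175, 0.000014)
print(f"${cost:.4f}")  # $0.0263
```

Multiply the result by your expected request volume to estimate a daily or monthly budget.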
For cost optimization strategies, see Cost calculation.
Test agent behavior with Inspector
The Inspector tab provides real-time conversation testing. Use it to test agent responses interactively and verify behavior before deploying changes.
Access Inspector
- Open Agents in the sidebar.
- Click your agent name.
- Open the Inspector tab.
- Enter test queries and review responses.
- Check the conversation panel to see tool calls.
- Start a new session to test fresh conversations or click Clear context to reset history.
Testing best practices
Test your agents systematically by exploring edge cases and potential failure scenarios. Begin with boundary testing: send requests at the edge of the agent’s capabilities to verify that scope enforcement works correctly. To check error handling, request unavailable data and observe whether the agent degrades gracefully. Even with proper system prompt constraints, testing confirms that your agent responds appropriately to edge cases.
Monitor iteration counts during complex requests to ensure they complete within your configured limits. Ambiguous or vague queries reveal whether the agent asks clarifying questions or makes risky assumptions. Throughout testing, track token usage per request to estimate costs and identify which query patterns consume the most resources.