Fix token count mismatch in Agents dashboard for nested LangGraph setups #195

Open
hectorhdzg wants to merge 1 commit into microsoft:main from hectorhdzg:hectorhdzg/nestedagent

@hectorhdzg

Problem

In multi-agent LangGraph setups (e.g., graphs with sub-graphs), the Agents dashboard in Azure Monitor shows significantly lower token counts than what appears in the trace spans. For example, a trace showing 6,816 total tokens would only display ~2.2K in the Token Consumption chart.

Root Cause

_is_agent_run() only checked the direct parent when preventing nested agent detection:

if run.parent_run_id and run.parent_run_id in self._agent_run_ids:

In LangGraph, agent-like sub-graphs are often separated from the top-level agent by intermediate chain nodes:

TopAgent (agent) -> node_step (chain) -> SubGraph (wrongly detected as agent) -> LLM

Since SubGraph's direct parent is node_step (not in _agent_run_ids), it was incorrectly treated as a separate agent. LLM tokens under it were aggregated into SubGraph's bucket instead of the top-level agent's. When SubGraph ended, its tokens were discarded, because agent runs don't roll up to parent agents.
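
To make the misclassification concrete, here is a minimal, self-contained sketch of the direct-parent check applied to the run tree above. The `Run` dataclass, the `looks_like_agent` flag, and the function name are hypothetical stand-ins for illustration; they are not the tracer's real types or its actual agent heuristic.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Run:
    """Hypothetical stand-in for a traced run (not the real LangChain class)."""
    id: str
    name: str
    parent_run_id: Optional[str] = None
    looks_like_agent: bool = False  # assumed result of the agent heuristic


def is_agent_run_direct_parent(run: Run, agent_run_ids: Set[str]) -> bool:
    """Pre-fix behaviour as described above: only the direct parent is checked."""
    if run.parent_run_id and run.parent_run_id in agent_run_ids:
        return False  # directly under a known agent, so treated as internal
    return run.looks_like_agent


# TopAgent (agent) -> node_step (chain) -> SubGraph (agent-like)
top = Run("1", "TopAgent", None, True)
step = Run("2", "node_step", "1", False)
sub = Run("3", "SubGraph", "2", True)

agent_run_ids: Set[str] = set()
for run in (top, step, sub):
    if is_agent_run_direct_parent(run, agent_run_ids):
        agent_run_ids.add(run.id)

# SubGraph's parent is node_step, which is not an agent, so the check misses it.
detected = sorted(agent_run_ids)
print(detected)
```

In this sketch both "1" (TopAgent) and "3" (SubGraph) end up registered as agents, which is the situation that splits the token counts.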

Fix

Changed the direct-parent check to walk the full ancestor chain using the existing _find_agent_ancestor() method:

if self._find_agent_ancestor(run) is not None:
    return False

This ensures any agent-like chain nested anywhere under an existing agent is treated as internal, and all LLM token counts aggregate to the single top-level invoke_agent wrapper span.
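
The following sketch illustrates the ancestor walk and its effect on token aggregation. It is a simplified model under stated assumptions, not the code in `_tracer.py`: the `TracerSketch` class, its attribute names, and the token numbers are hypothetical, and tokens are attributed when a run starts purely to keep the example short.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass
class Run:
    """Hypothetical stand-in for a traced run."""
    id: str
    parent_run_id: Optional[str] = None
    looks_like_agent: bool = False  # assumed result of the agent heuristic
    tokens: int = 0                 # non-zero only for LLM runs (illustrative)


class TracerSketch:
    """Simplified model of agent detection with a full ancestor walk."""

    def __init__(self) -> None:
        self.runs: Dict[str, Run] = {}
        self.agent_run_ids: Set[str] = set()
        self.agent_tokens: Dict[str, int] = {}

    def find_agent_ancestor(self, run: Run) -> Optional[str]:
        # Walk parent links upward until a registered agent is found.
        parent_id = run.parent_run_id
        while parent_id is not None:
            if parent_id in self.agent_run_ids:
                return parent_id
            parent = self.runs.get(parent_id)
            parent_id = parent.parent_run_id if parent else None
        return None

    def is_agent_run(self, run: Run) -> bool:
        # Any run nested anywhere under an agent is treated as internal.
        if self.find_agent_ancestor(run) is not None:
            return False
        return run.looks_like_agent

    def on_start(self, run: Run) -> None:
        self.runs[run.id] = run
        if self.is_agent_run(run):
            self.agent_run_ids.add(run.id)
            self.agent_tokens[run.id] = 0
        elif run.tokens:
            owner = self.find_agent_ancestor(run)
            if owner is not None:
                self.agent_tokens[owner] += run.tokens


tracer = TracerSketch()
for run in (
    Run("top", None, True),
    Run("node_step", "top"),
    Run("sub", "node_step", True),    # agent-like, but nested under "top"
    Run("llm_a", "sub", tokens=100),  # illustrative token counts
    Run("llm_b", "top", tokens=50),
):
    tracer.on_start(run)

totals = tracer.agent_tokens
print(totals)
```

With the ancestor walk, only "top" is registered as an agent in this model, and the tokens from both LLM runs land in its bucket.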

Copilot AI review requested due to automatic review settings June 8, 2026 20:43
@github-actions

github-actions Bot commented Jun 8, 2026

Performance comparison

Threshold: regressions >15.0% on gating scenarios fail the build. Higher ops/s is better; positive Δ means the PR is slower.

| Scenario | Gating | Baseline (ops/s) | Candidate (ops/s) | Δ % | Status |
| --- | --- | --- | --- | --- | --- |
| azure_monitor_log | yes | 45,531.1 | 45,324.8 | +0.46% | |
| azure_monitor_span | yes | 209,095.7 | 209,753.5 | -0.31% | |
| otel_log | no | 55,132.9 | 55,163.3 | -0.06% | |
| otel_span | no | 56,652.4 | 56,315.8 | +0.60% | |
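
As a reading aid, the Δ % column can be reproduced from the two throughput columns. The formula below is inferred from the four rows of the table (it matches each to two decimal places) and is an assumption, not taken from the benchmark tool's source; the function names are hypothetical.

```python
def delta_percent(baseline_ops: float, candidate_ops: float) -> float:
    """Relative change in time per operation, in percent.

    Positive means the candidate (the PR) is slower than the baseline.
    Assumed formula, inferred from the table: baseline / candidate - 1.
    """
    return (baseline_ops / candidate_ops - 1.0) * 100.0


def fails_gate(baseline_ops: float, candidate_ops: float,
               gating: bool, threshold: float = 15.0) -> bool:
    """A gating scenario fails when its regression exceeds the threshold."""
    return gating and delta_percent(baseline_ops, candidate_ops) > threshold


# (scenario, gating, baseline ops/s, candidate ops/s) copied from the table
rows = [
    ("azure_monitor_log", True, 45531.1, 45324.8),
    ("azure_monitor_span", True, 209095.7, 209753.5),
    ("otel_log", False, 55132.9, 55163.3),
    ("otel_span", False, 56652.4, 56315.8),
]

deltas = {name: round(delta_percent(b, c), 2) for name, _, b, c in rows}
failures = [name for name, gating, b, c in rows if fails_gate(b, c, gating)]
print(deltas, failures)
```

Under this reading, every scenario is well inside the 15.0% threshold, so no gating scenario fails.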

Copilot AI left a comment

Pull request overview

This PR fixes incorrect token aggregation in the Azure Monitor Agents dashboard for multi-agent / nested LangGraph setups by ensuring agent detection treats any agent-like chain under an existing agent (even when separated by intermediate non-agent chain nodes) as internal rather than a new agent.

Changes:

  • Update _is_agent_run() to detect nested agents by walking the full ancestor chain via _find_agent_ancestor() instead of only checking the direct parent.
  • Add a regression unit test covering a deeply nested “agent-like” sub-graph separated from the top-level agent by an intermediate non-agent chain.
  • Minor test formatting cleanup in an existing message/tool-call test case.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/microsoft/opentelemetry/_genai/_langchain/_tracer.py | Prevents incorrectly classifying nested agent-like sub-graphs as separate agents by checking all ancestors for an existing agent. |
| tests/langchain/test_tracer.py | Adds a regression test ensuring deeply nested agent-like chains under an existing agent are not treated as new agents. |
