fix: export thoughts_token_count to OpenTelemetry trace spans by brucearctor · Pull Request #4835 · google/adk-python

brucearctor · 2026-03-14T01:20:20Z

Description

ADK's OpenTelemetry tracing does not export thoughts_token_count to span attributes. When using Gemini models with ThinkingConfig, the usage_metadata in LlmResponse correctly contains thoughts_token_count, but this field is never written to spans by trace_generate_content_result() or trace_inference_result().

Interestingly, trace_call_llm() already exports this field (as gen_ai.usage.experimental.reasoning_tokens). This PR adds the same export to the two remaining functions that were missing it.

Changes

`src/google/adk/telemetry/tracing.py`

- Added thoughts_token_count → gen_ai.usage.experimental.reasoning_tokens span attribute export in trace_generate_content_result() (~line 746)
- Added the same export in trace_inference_result() (~line 789)
- Uses the same try/except AttributeError guard pattern as trace_call_llm() for backward compatibility with older SDK versions
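The guard pattern listed above can be sketched roughly as follows. This is a minimal illustration, not the code in tracing.py: `record_reasoning_tokens` and `FakeSpan` are hypothetical names, and the metadata objects are plain stand-ins. Only the attribute key and the try/except AttributeError idea come from this description.

```python
from types import SimpleNamespace

REASONING_TOKENS_KEY = "gen_ai.usage.experimental.reasoning_tokens"


class FakeSpan:
    """Hypothetical stand-in with the set_attribute(key, value) method of a span."""

    def __init__(self):
        self.attributes = {}

    def set_attribute(self, key, value):
        self.attributes[key] = value


def record_reasoning_tokens(span, usage_metadata):
    """Copy thoughts_token_count onto the span when it is present and not None."""
    if usage_metadata is None:
        return
    try:
        thoughts = usage_metadata.thoughts_token_count
    except AttributeError:
        # Older SDK versions may not define the field at all; skip silently.
        return
    if thoughts is not None:
        span.set_attribute(REASONING_TOKENS_KEY, thoughts)


span = FakeSpan()
record_reasoning_tokens(span, SimpleNamespace(thoughts_token_count=50))

legacy_span = FakeSpan()
record_reasoning_tokens(legacy_span, SimpleNamespace())  # field missing entirely
```

With this shape, a metadata object that lacks the field and one that carries None both leave the span untouched, which is the backward-compatible behaviour the guard is meant to give.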

`tests/unittests/telemetry/test_spans.py`

- Added test_trace_inference_result_with_thinking_tokens: verifies the attribute is exported when thoughts_token_count is non-None
- Added test_trace_inference_result_without_thinking_tokens: verifies no attribute is set when thoughts_token_count is None

Testing Plan

Unit Tests

All 23 telemetry tests pass:

$ pytest tests/unittests/telemetry/test_spans.py -v
23 passed in 1.08s

New tests specifically verify:

- thoughts_token_count=50 → span attribute gen_ai.usage.experimental.reasoning_tokens=50 is set
- thoughts_token_count=None → no gen_ai.usage.experimental.reasoning_tokens attribute on span
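For readers who want the shape of those two checks, here is a hedged sketch using unittest.mock. `export_reasoning_tokens` is a hypothetical helper standing in for the traced functions, which are not reproduced here, and the test bodies are an assumption about how such tests could be written rather than a copy of test_spans.py.

```python
from types import SimpleNamespace
from unittest import mock

KEY = "gen_ai.usage.experimental.reasoning_tokens"


def export_reasoning_tokens(span, usage_metadata):
    # Hypothetical helper: export only when the count is present and not None.
    thoughts = getattr(usage_metadata, "thoughts_token_count", None)
    if thoughts is not None:
        span.set_attribute(KEY, thoughts)


def test_with_thinking_tokens():
    span = mock.MagicMock()
    export_reasoning_tokens(span, SimpleNamespace(thoughts_token_count=50))
    span.set_attribute.assert_called_once_with(KEY, 50)


def test_without_thinking_tokens():
    span = mock.MagicMock()
    export_reasoning_tokens(span, SimpleNamespace(thoughts_token_count=None))
    span.set_attribute.assert_not_called()
```

Both functions follow pytest naming, so they would be collected by a plain `pytest` run if saved in a test module.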

Verification

Before fix — Event.usage_metadata.thoughts_token_count is non-zero but Cloud Trace spans only show gen_ai.usage.input_tokens and gen_ai.usage.output_tokens.

After fix — gen_ai.usage.experimental.reasoning_tokens appears alongside the existing token attributes in all three tracing functions.

gemini-code-assist · 2026-03-14T01:20:24Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

rohityan · 2026-03-17T23:22:13Z

Hi @brucearctor, Thank you for your contribution! We appreciate you taking the time to submit this pull request. Please fix the mypy-diff errors.

brucearctor · 2026-03-17T23:23:02Z

@rohityan -- will do

brucearctor · 2026-03-17T23:48:03Z

@rohityan I'm going to let CI run, but I think I've addressed the new mypy-diff error and fixed a couple of others :-)

Let me know if other concerns. Cheers -

rohityan · 2026-05-12T02:39:59Z

Hi @brucearctor, can you please resolve the branch conflicts?

Add thoughts_token_count as gen_ai.usage.experimental.reasoning_tokens span attribute in trace_generate_content_result() and trace_inference_result(), matching the existing pattern in trace_call_llm(). Fixes google#4829

- Refactor trace_inference_result() to use otel_span local variable, eliminating all 4 union-attr mypy errors (1 new + 3 pre-existing)
- Add tests for trace_generate_content_result() thinking tokens
- Import trace_generate_content_result in test_spans.py

brucearctor · 2026-05-12T03:05:49Z

@rohityan Rebased onto latest main and resolved the conflict in tests/unittests/telemetry/test_spans.py (upstream added _safe_json_serialize circular dict tests at the same location as the thinking token tests — kept both). Ready for CI. 👍

wojcikm · 2026-06-02T13:06:28Z

Does the scope of this issue (and PR #4835) also cover the BigQueryAgentAnalyticsPlugin? The BQ plugin has a separate code path from tracing.py and currently does not write thoughts_token_count to the BigQuery table either.

We have LookML dashboards consuming agent_events from BQ and our cost formulas are ready to include thinking tokens (via COALESCE on usage_metadata.thoughts_token_count), but the column stays NULL because the BQ plugin never extracts it.

If the BQ plugin is out of scope for this PR, I'm happy to open a separate issue.
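To make the COALESCE point above concrete, here is a small Python sketch of the same idea. It is illustrative only: `coalesce`, `thinking_token_cost`, the row layout, and the per-token price are all hypothetical and are not taken from the BQ plugin or from any LookML model.

```python
def coalesce(*values):
    """Return the first value that is not None, like SQL COALESCE."""
    for value in values:
        if value is not None:
            return value
    return None


def thinking_token_cost(row, price_per_token):
    """Cost of thinking tokens for one event row, treating NULL as zero."""
    # usage_metadata itself may be missing or None in this hypothetical row shape.
    usage = row.get("usage_metadata") or {}
    thoughts = coalesce(usage.get("thoughts_token_count"), 0)
    return thoughts * price_per_token


# Hypothetical rows: one as written today (column NULL), one once the field is populated.
row_today = {"usage_metadata": {"thoughts_token_count": None}}
row_fixed = {"usage_metadata": {"thoughts_token_count": 50}}
```

The formula runs without error on both rows, but the first one contributes nothing to cost, which is why a column that stays NULL silently undercounts.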

brucearctor · 2026-06-03T01:55:15Z

I went ahead and got conflicts resolved again.

@wojcikm: I don't mind doing it, but this has also been open for quite a while, and it would be good to close it based on my original understanding. Happy to address it [ or for someone else to ] if we want to include BigQueryAgentAnalyticsPlugin as in-scope.

But it looks like we need @rohityan, @jawoszek (as assigned reviewer), or someone else to take a look. I'm not sure who is in charge of determining scope.

brucearctor · 2026-06-03T01:56:25Z

ah ... actually pushing -->

Resolve conflict in tests/unittests/telemetry/test_spans.py by keeping both our thinking token tests and upstream's new tests for error detection, error_type parameter, and extra generate content attributes.

adk-bot · 2026-06-08T03:02:27Z

🔍 ADK Pull Request Analysis: PR #4835

Title: fix: export thoughts_token_count to OpenTelemetry trace spans
Author: @brucearctor
Status: open
Impact: 128 additions, 5 deletions across 2 changed files

Executive Summary

Core Objective: Add OpenTelemetry span attribute export for reasoning/thinking token counts (thoughts_token_count) under 'gen_ai.usage.experimental.reasoning_tokens' in trace_generate_content_result() and trace_inference_result().
Justification & Value: Justified Fix - Fills a critical observability gap where thinking token usage is correctly captured in usage_metadata for Gemini 2.0+ models but left unexported in two of the major model telemetry spans.
Alignment with Principles: Pass - Implementation is highly decoupled, maintains clean typing, avoids breaking modifications, and handles backwards safety via targeted AttributeError exception catching.
Recommendation: Approve - The changes are clean, address the stated issues perfectly, clear compilation errors, and provide exhaustive test coverage.

Detailed Findings & Analysis

1. Objectives & Impact ("What does it do?")

Context & Background: Tracing configurations for Gemini models utilizing reasoning/thinking configurations (e.g. ThinkingConfig) output thoughts_token_count metrics within LlmResponse.usage_metadata. Initially, only trace_call_llm exported this field correctly. Linked Issue #4829 highlighted that telemetry spans generated via trace_generate_content_result() or trace_inference_result() lacked the corresponding gen_ai.usage.experimental.reasoning_tokens attribute.
Implementation Mechanism:
- Exposes thoughts_token_count from metadata and pushes it to OpenTelemetry trace spans utilizing the 'gen_ai.usage.experimental.reasoning_tokens' attribute block.
- Implements an AttributeError try-except handler wrapper to gracefully handle older GenAI SDK environments where thoughts_token_count may not be configured.
- Prevents type-narrowing static compilation issues during testing or lint checks in trace_inference_result by renaming the in-place parameter reference span to otel_span. This elegantly resolves strict mypy issues.
Affected Surface: Telemetry traces generated via the GeneratorContentSpan flow. There is no public API breaking change.
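The otel_span point above can be illustrated with a small sketch. The real parameter types of trace_inference_result() are not shown in this thread, so the `Span | SpanHolder` union, both classes, and `record_tokens` are assumptions chosen only to show how resolving a union-typed parameter into a local variable of one concrete type avoids mypy's union-attr error.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Span:
    """Hypothetical stand-in for an OpenTelemetry span."""

    attributes: dict = field(default_factory=dict)

    def set_attribute(self, key: str, value: int) -> None:
        self.attributes[key] = value


@dataclass
class SpanHolder:
    """Hypothetical wrapper that carries a span but has no set_attribute itself."""

    span: Span


def record_tokens(span: Span | SpanHolder, reasoning_tokens: int) -> None:
    # Calling span.set_attribute(...) directly here would be flagged by mypy
    # as union-attr, because SpanHolder has no such method.
    otel_span: Span = span.span if isinstance(span, SpanHolder) else span
    otel_span.set_attribute(
        "gen_ai.usage.experimental.reasoning_tokens", reasoning_tokens
    )


direct = Span()
record_tokens(direct, 7)

holder = SpanHolder(Span())
record_tokens(holder, 9)
```

Keeping the parameter untouched and doing all attribute writes through the local variable means the type checker only ever sees one concrete type at each call site.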

2. Justification & Value ("Is it a valid and useful change?")

Workspace Verification:
- Investigated tracing.py: verified that while trace_call_llm indeed has the logic to retrieve and record experimental reasoning tokens, trace_generate_content_result and trace_inference_result were completely omitting it.
- Verification confirms that the issue reported in Issue #4829 represents a genuine bug that limits cost calculation, trace dashboards, and token monitoring pipelines.
Value Assessment: Highly valuable. Tracking thinking token consumption is essential for developers using reasoning models (like gemini-2.5-flash) to trace operational costs and analyze token usage spikes within cloud aggregators like Cloud Trace.
Alternative Approaches: No cleaner alternative structure exists. The PR follows the exact architectural pattern used for the other token parameters. Using a string literal for the experimental reasoning key is consistent with standard practice, since OpenTelemetry's incubating attributes do not yet define a stable reasoning-token convention.
Scope & Depth: Symptom / Systematic Fix
- This is a systematic fix for trace-based spans.
- Recommendation Note: As pointed out by community contributors, other telemetry subsystems (specifically bigquery_agent_analytics_plugin.py) do not extract or record thoughts_token_count values into structured tables. While the current PR fully addresses the trace span scope, a subsequent task or issue should be raised to update analytics plugins.

3. Principle & Style Alignment Checklist ("Does it follow rules?")

Public API & Visibility Boundaries:
- Status: Pass
- Analysis: No changes made to public method signatures or namespaces. Standard structures and parameters are fully preserved with backward compatibility.
Code Quality, Typing & Conventions:
- Status: Pass
- Analysis: Complies with from __future__ import annotations styling. The type-hint error introduced by in-place variable rebinding was perfectly addressed by moving the parameter span into the localized otel_span typing variable.
Robustness & Edge Cases:
- Status: Pass
- Analysis: Robust boundary/null checks prevent crashing when usage_metadata or individual token counters are absent.
Test Integrity & Quality:
- Status: Pass
- Analysis: Four new comprehensive unit test functions are added within test_spans.py focusing on both active thinking tokens versus cases where token values are None. Tests conform strictly to standard mock assertions and follow the structured AAA pattern.

Phase Summary & Suggested Action

I recommend merging this patch to resolve OpenTelemetry reasoning-token exporting gaps. To address community concerns raised in peer reviews, we should additionally track the BigQueryAgentAnalyticsPlugin integration in a separate enhancement issue.

boyangsvl · 2026-06-16T21:43:24Z

Thought tokens are supported in the newer version of ADK: src/google/adk/telemetry/_token_usage.py
Closing this PR because it uses the old experimental key gen_ai.usage.experimental.reasoning_tokens instead of the current standard key, gen_ai.usage.reasoning.output_tokens.
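For anyone reading spans that may carry either key, a lookup that prefers the current key and falls back to the old one could look like the sketch below. The two key strings are the ones named in this comment; `reasoning_tokens` and the plain-dict attribute shape are hypothetical and are not part of ADK.

```python
CURRENT_KEY = "gen_ai.usage.reasoning.output_tokens"
LEGACY_KEY = "gen_ai.usage.experimental.reasoning_tokens"


def reasoning_tokens(attributes):
    """Return the reasoning token count from span attributes, or None if absent."""
    # Prefer the current key; fall back to the experimental one.
    for key in (CURRENT_KEY, LEGACY_KEY):
        value = attributes.get(key)
        if value is not None:
            return value
    return None


new_style = {CURRENT_KEY: 120, "gen_ai.usage.output_tokens": 40}
old_style = {LEGACY_KEY: 50}
neither = {"gen_ai.usage.input_tokens": 10}
```

Checking for None rather than truthiness keeps a legitimate count of 0 from being mistaken for a missing attribute.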

brucearctor · 2026-06-16T21:45:33Z

So #4829 is closed [ or should be ]?

What's the PR that closed it? should that get linked to the issue?

boyangsvl · 2026-06-16T21:51:02Z

It's addressed internally so there's no PR associated with it. The code is here: https://github.com/google/adk-python/blob/main/src/google/adk/telemetry/_token_usage.py#L34

brucearctor · 2026-06-16T22:18:03Z

Looks like this PR: #6022 ?

boyangsvl · 2026-06-16T22:37:42Z

Thanks! I've added it to the original issue.

adk-bot added the tracing [Component] This issue is related to OpenTelemetry tracing label Mar 14, 2026

brucearctor mentioned this pull request Mar 14, 2026

Thinking tokens not in traces #4829

Closed

rohityan self-assigned this Mar 17, 2026

rohityan added the request clarification [Status] The maintainer need clarification or more information from the author label Mar 17, 2026

jawoszek approved these changes Apr 17, 2026


brucearctor added 2 commits May 11, 2026 20:04

fix: export thoughts_token_count to OpenTelemetry trace spans

ab3f86d

Add thoughts_token_count as gen_ai.usage.experimental.reasoning_tokens span attribute in trace_generate_content_result() and trace_inference_result(), matching the existing pattern in trace_call_llm(). Fixes google#4829

brucearctor force-pushed the fix/thinking-tokens-in-traces branch from bfb9457 to 3a2cef7 Compare May 12, 2026 03:04

brucearctor added 2 commits June 2, 2026 18:59

Merge upstream main into fix/thinking-tokens-in-traces

4cb57b9

Resolve conflict in tests/unittests/telemetry/test_spans.py by keeping both our thinking token tests and upstream's new tests for error detection, error_type parameter, and extra generate content attributes.

Merge branch 'main' into fix/thinking-tokens-in-traces

53472d0

boyangsvl assigned boyangsvl and unassigned rohityan Jun 16, 2026

boyangsvl closed this Jun 16, 2026
