Skip to content

fix(anthropic): report real token usage on blocked responses#3

Closed
seph-barker wants to merge 1 commit into
mainfrom
joseph/fix-blocked-token-counts-main
Closed

fix(anthropic): report real token usage on blocked responses#3
seph-barker wants to merge 1 commit into
mainfrom
joseph/fix-blocked-token-counts-main

Conversation

@seph-barker

Copy link
Copy Markdown
Collaborator

The ModifyResponseException handler in the /v1/messages endpoint synthesizes a "blocked" response reporting zero input and output tokens, even though the request consumed real input tokens and the synthetic block message carries real content. Callers relying on usage (billing, quotas, metrics) under-count every blocked response.

Compute input_tokens from the original request messages (carried on the exception's request_data) and output_tokens from the block message text via litellm.token_counter. Counting is best-effort and falls back to zero on failure so a blocked response is always returned. The streaming synthesis path reuses the same response object, so both paths are fixed by one change.

Adds tests asserting nonzero, correct counts and graceful fallback. The endpoint's test file passes (5 tests).

The ModifyResponseException handler in the /v1/messages endpoint
synthesizes a "blocked" response with hardcoded usage of zero input
and output tokens, despite the request having consumed real input
tokens and the block message carrying real content.

Compute input_tokens from the original request messages (carried on
the exception's request_data) and output_tokens from the block message
text via litellm.token_counter. Token counting is best-effort and falls
back to zero on failure so a blocked response is always returned. The
streaming synthesis path reuses the same response object, so both paths
are fixed consistently.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@seph-barker

Copy link
Copy Markdown
Collaborator Author

Superseded by upstream BerriAI#31217 — opening directly against the main litellm repo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant