chore(weave): Add db and server support for cached tokens #6507
andrewtruong wants to merge 13 commits into master from
Conversation
Preview this PR with FeatureBee: https://beta.wandb.ai/?betaVersion=68df4ca0de199e1eae21559fe58b3c0af6cb9bca
Codecov Report ❌ Patch coverage is
HiveMind Sessions: 4 sessions · 25h 44m · $29
🔴 filter_out_current_costs ignores cache cost fields, causing updated cache pricing to never be seeded
The filter_out_current_costs function at weave/trace_server/costs/insert_costs.py:116-150 determines whether a cost entry already exists in the DB by comparing only prompt_token_cost (mapped from cost["input"]), completion_token_cost (mapped from cost["output"]), and effective_date. It does not compare the new cache_read_input or cache_creation_input fields. Similarly, get_current_costs at weave/trace_server/costs/insert_costs.py:22-39 only queries llm_id, prompt_token_cost, completion_token_cost, effective_date — it doesn't fetch cache cost columns at all.
This means if cost_checkpoint.json is updated to add cache pricing for a model that already has matching prompt/completion costs and effective_date in the database, the entry will be incorrectly filtered out as a duplicate, and the new cache costs will never be inserted.
(Refers to lines 130-143)
Prompt for agents
In weave/trace_server/costs/insert_costs.py, update get_current_costs (lines 22-39) to also SELECT cache_read_input_token_cost and cache_creation_input_token_cost from llm_token_prices. Then update filter_out_current_costs (lines 116-150) to unpack those additional columns from the current_costs tuples and include them in the comparison at lines 132-135. Add two additional math.isclose checks: one comparing cache_read_input_token_cost with cost.get('cache_read_input', 0) and another comparing cache_creation_input_token_cost with cost.get('cache_creation_input', 0).
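A minimal sketch of the extended duplicate check the prompt describes, using illustrative dict rows and field names rather than the actual weave tuples and schema:

```python
import math

def is_duplicate(row: dict, cost: dict) -> bool:
    """Decide whether a checkpoint entry already exists in the DB.

    `row` stands in for a record from llm_token_prices and `cost` for an
    entry in cost_checkpoint.json; names here are illustrative.
    """
    return (
        math.isclose(row["prompt_token_cost"], cost["input"])
        and math.isclose(row["completion_token_cost"], cost["output"])
        # Without the two checks below, an entry that only changes cache
        # pricing is treated as a duplicate and never inserted.
        and math.isclose(
            row.get("cache_read_input_token_cost", 0),
            cost.get("cache_read_input", 0),
        )
        and math.isclose(
            row.get("cache_creation_input_token_cost", 0),
            cost.get("cache_creation_input", 0),
        )
        and row["effective_date"] == cost["effective_date"]
    )
```

Note that `math.isclose(0.0, x)` is False for any nonzero `x` under the default tolerances, so a row without cache columns will not mask a checkpoint entry that adds them.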
Cost metrics are computed post-query by multiplying token counts by prices from llm_token_prices:

```
- input_cost: input_tokens * prompt_token_cost
```
🟡 total_cost docstring not updated to reflect inclusion of cache costs
The UsageMetric docstring at weave/trace_server/trace_server_interface.py:3063-3071 states total_cost: input_cost + output_cost, but the actual implementation in _compute_costs_for_buckets (weave/trace_server/clickhouse_trace_server_batched.py:1137-1146) now computes total_cost = input_cost + output_cost + cache_read_total + cache_creation_total. Users relying on the documented formula will have incorrect expectations about what total_cost includes.
(Refers to lines 3067-3071)
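For reference, the formula the implementation now computes, as an illustrative Python sketch (not the actual `_compute_costs_for_buckets` body):

```python
def total_cost(input_cost: float, output_cost: float,
               cache_read_total: float, cache_creation_total: float) -> float:
    # What the code computes today; the docstring still documents only
    # input_cost + output_cost.
    return input_cost + output_cost + cache_read_total + cache_creation_total
```

The docstring should be updated to this four-term sum so downstream users don't under-count cache spend.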
```
'"prompt_tokens_total_cost":', toString(prompt_tokens * prompt_token_cost), ',',
'"cache_read_input_token_cost":', toString(cache_read_input_token_cost), ',',
'"cache_creation_input_token_cost":', toString(cache_creation_input_token_cost), ',',
'"prompt_tokens_total_cost":', toString((prompt_tokens - cache_read_input_tokens) * prompt_token_cost), ',',
```
🔴 prompt_tokens_total_cost double-charges cache_creation_input_tokens in ClickHouse SQL
The prompt_tokens_total_cost formula subtracts cache_read_input_tokens from prompt_tokens but does not subtract cache_creation_input_tokens. Since providers like Anthropic include both cache-read and cache-creation tokens in the total input_tokens count, cache_creation_input_tokens are double-charged: once at the regular prompt rate (included in prompt_tokens_total_cost) and again at the cache-creation rate (in cache_creation_input_tokens_total_cost). The comment in the SQLite path (sqlite_trace_server.py:875-876) confirms the intent: "Subtract cached tokens: they are billed at the cache rate, not the regular input rate" — but only one of the two cache token types is subtracted.
```diff
- '"prompt_tokens_total_cost":', toString((prompt_tokens - cache_read_input_tokens) * prompt_token_cost), ',',
+ '"prompt_tokens_total_cost":', toString((prompt_tokens - cache_read_input_tokens - cache_creation_input_tokens) * prompt_token_cost), ',',
```
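A quick arithmetic check of the overcharge, using made-up token counts and a made-up rate:

```python
import math

# Anthropic-style accounting: prompt_tokens already includes both
# cache-read and cache-creation tokens.
prompt_tokens = 1000
cache_read_input_tokens = 600
cache_creation_input_tokens = 300

prompt_token_cost = 3e-6  # $/token, illustrative

# Buggy: only cache reads are carved out, so the 300 creation tokens are
# billed at the prompt rate here AND at the creation rate elsewhere.
buggy = (prompt_tokens - cache_read_input_tokens) * prompt_token_cost
# Fixed: both cache token classes are excluded from the prompt rate.
fixed = (
    prompt_tokens - cache_read_input_tokens - cache_creation_input_tokens
) * prompt_token_cost

overcharge = buggy - fixed  # the 300 creation tokens billed twice
```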
```
"prompt_tokens_total_cost": (
    prompt_tokens - cache_read_input_tokens
)
* prompt_cost,
```
🔴 prompt_tokens_total_cost double-charges cache_creation_input_tokens in SQLite path
Same issue as the ClickHouse SQL path: the SQLite cost calculation at sqlite_trace_server.py:877-880 computes prompt_tokens_total_cost as (prompt_tokens - cache_read_input_tokens) * prompt_cost, but fails to also subtract cache_creation_input_tokens. This causes cache-creation tokens to be billed at both the regular prompt rate and the cache-creation rate.
```diff
- "prompt_tokens_total_cost": (
-     prompt_tokens - cache_read_input_tokens
- )
- * prompt_cost,
+ "prompt_tokens_total_cost": (
+     prompt_tokens
+     - cache_read_input_tokens
+     - cache_creation_input_tokens
+ )
+ * prompt_cost,
```
https://coreweave.atlassian.net/browse/WB-32599
Relevant wiring to add cached-token support to the backend.