docs: add prompt caching FAQ (RTK does not break cache)#560
Open
FlorianBruniaux wants to merge 2 commits into develop from
Conversation
Users (e.g., Mathieu GRENIER) asked whether RTK breaks Claude's prompt cache and makes input 10x more expensive. Short answer: no. This commit documents why, with a table, a turn-by-turn example, and cost math.

- Add docs/PROMPT_CACHING.md: canonical reference with component table, cache turn-by-turn walkthrough, cost comparison (118k vs 23.9k tokens), and `rtk gain` as self-serve proof point
- Add TROUBLESHOOTING.md section at top: Problem/Short Answer/Details following existing pattern, links to full doc
- Add README.md FAQ blockquote after "How It Works": one-liner answer with link for anyone scanning

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Florian BRUNIAUX <florian@bruniaux.com>
Matches main.rs, which already had 64 modules. Fixes pre-push validation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Florian BRUNIAUX <florian@bruniaux.com>
Pull request overview
Adds a dedicated FAQ-style explanation to address whether RTK interferes with Claude prompt caching, and cross-links it from the main docs and README so the answer is easy to find.
Changes:
- Added a new canonical doc (docs/PROMPT_CACHING.md) explaining what RTK changes, how caching works turn-by-turn, and how to verify via `rtk gain`.
- Added a new top troubleshooting entry linking to the full prompt-caching explanation.
- Added a one-line README FAQ blurb near "How It Works", plus corrected the architecture module total.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| docs/TROUBLESHOOTING.md | Adds a new top “Problem/Short Answer/Details” entry and links to the full prompt caching doc. |
| docs/PROMPT_CACHING.md | New standalone explanation of prompt caching behavior and RTK’s impact, with examples and a cost comparison. |
| README.md | Adds a prominent FAQ blockquote linking to the new prompt caching doc. |
| ARCHITECTURE.md | Updates the module count total line. |
Comments suppressed due to low confidence (1)
ARCHITECTURE.md:301
- The updated total (64 = 38 command + 26 infrastructure) conflicts with the breakdown immediately below (Command Modules: 34, Infrastructure Modules: 20, which sums to 54). Please reconcile these counts so the section is internally consistent.
**Total: 64 modules** (38 command modules + 26 infrastructure modules)
### Module Count Breakdown
- **Command Modules**: 34 (directly exposed to users)
- **Infrastructure Modules**: 20 (utils, filter, tracking, tee, config, init, gain, toml_filter, verify_cmd, etc.)
```
→ Cache WRITE on turn2 result (1.25x, one time)
```

The key point: RTK filters `cargo test` output from 25,000 tokens to 2,500 tokens. That result goes into history at 2,500 tokens, gets written to cache once (1.25x on 2,500), and read from cache every subsequent turn (0.1x on 2,500). Without RTK, the same slot costs 12.5x more to write and 10x more to read.
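The cost math above can be sketched in a few lines. This is an illustrative calculation, not code from the PR: it assumes Anthropic's published multipliers (cache writes bill at 1.25x the base input rate, cache reads at 0.1x), and the 25,000 vs 2,500 token counts are the PR's example numbers for one `cargo test` output slot.

```python
# Assumed multipliers for Anthropic prompt caching (not defined in this PR):
CACHE_WRITE_MULT = 1.25  # one-time cost to write a prompt prefix to cache
CACHE_READ_MULT = 0.10   # cost to re-read that prefix on each later turn

def slot_cost(tokens: int, turns: int) -> float:
    """Token-equivalents billed for one history slot over `turns` turns:
    one cache write, then a cached read on every subsequent turn."""
    return tokens * CACHE_WRITE_MULT + tokens * CACHE_READ_MULT * (turns - 1)

raw = slot_cost(25_000, turns=10)      # unfiltered `cargo test` output
filtered = slot_cost(2_500, turns=10)  # RTK-filtered output

print(f"raw:      {raw:,.0f} token-equivalents")       # 53,750
print(f"filtered: {filtered:,.0f} token-equivalents")  # 5,375
print(f"savings:  {raw / filtered:.0f}x")              # 10x
```

The per-slot savings track the filter ratio: because both the write and read multipliers scale linearly with token count, a 10x reduction in filtered output is a 10x reduction in what that slot costs over the whole session.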
Summary
Closes a recurring question (raised publicly by Mathieu GRENIER): does RTK break Claude's prompt cache and make input 10x more expensive?
- docs/PROMPT_CACHING.md: canonical reference with a component table (what RTK touches vs what it doesn't), a turn-by-turn cache example, cost comparison based on the README's 30-min session numbers, and `rtk gain` as self-serve proof
- docs/TROUBLESHOOTING.md: Problem/Short Answer/Details pattern, links to full doc
- README.md after "How It Works": one-liner answer with link for scanners
- ARCHITECTURE.md module count (60 → 64) to pass pre-push validation

Test plan

- docs/PROMPT_CACHING.md renders correctly on GitHub (tables, code blocks)
- docs/TROUBLESHOOTING.md new section appears before the first existing problem
- README.md blockquote renders as expected

🤖 Generated with Claude Code