fix: move context files after cache boundary — 0% → ~80% cache hit rate#4
Open
daniel-rudaev wants to merge 6 commits into main from
Applies source-level fixes from upstream PRs #57094 and #57076 (both unmerged) via a build-time patch script.

Root causes and fixes:

1. runtime-system.ts: the plugin runtime API stripped heartbeat.model when forwarding to runHeartbeatOnceInternal. It now passes the model through.
2. live-model-switch.ts: resolveLiveSessionModelSelection ignored caller-provided defaults when agentId was present and always used the config default. It now prefers caller defaults (the resolved heartbeat model).
3. model-fallback.ts: LiveSessionModelSwitchError was swallowed as a candidate failure, inverting the fallback order. It is now rethrown when rethrowLiveSwitch is set.
4. get-reply.ts: post-directive resolution unconditionally overwrote the heartbeat model. The overwrite is now guarded by hasResolvedHeartbeatModelOverride.

Also in agent-runner-execution.ts: uses a structural isLiveSessionModelSwitchError check (cross-module safe) and passes rethrowLiveSwitch: true.

Upstream refs: openclaw/openclaw#56788, PR #57094, PR #57076
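The rethrow behaviour in fix 3 and the structural check used in agent-runner-execution.ts can be illustrated with a minimal sketch. Only the names LiveSessionModelSwitchError, isLiveSessionModelSwitchError, and rethrowLiveSwitch come from the commit message. The class shape, the `runWithModelFallback` helper, its synchronous signature, and the model names are assumptions made for illustration, not the upstream code.

```typescript
// Hypothetical error shape; the real class may carry different fields.
class LiveSessionModelSwitchError extends Error {
  targetModel: string;
  constructor(targetModel: string) {
    super(`live session switched to ${targetModel}`);
    this.name = "LiveSessionModelSwitchError";
    this.targetModel = targetModel;
  }
}

// Structural check: compares the error's name instead of using instanceof,
// so it still matches when the class was loaded from a different module copy.
function isLiveSessionModelSwitchError(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { name?: unknown }).name === "LiveSessionModelSwitchError"
  );
}

// Hypothetical fallback loop (synchronous here for brevity).
// Tries each candidate model in order and returns the first success.
function runWithModelFallback<T>(
  candidates: readonly string[],
  attempt: (model: string) => T,
  options: { rethrowLiveSwitch?: boolean } = {},
): T {
  let lastError: unknown = new Error("no candidate models provided");
  for (const model of candidates) {
    try {
      return attempt(model);
    } catch (err) {
      // A live-switch error is a control signal, not a candidate failure:
      // surface it to the caller instead of moving on to the next model.
      if (options.rethrowLiveSwitch && isLiveSessionModelSwitchError(err)) {
        throw err;
      }
      lastError = err;
    }
  }
  throw lastError;
}
```

Without `rethrowLiveSwitch`, the loop treats the switch error like any other failure and moves on to the next candidate, which is the inverted fallback order the commit describes.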
In entrypoint.sh, HOOKS_LOCATION_BLOCK is built as a double-quoted shell string. Using \\$var stores \$var in the variable. When that is interpolated into the heredoc, nginx sees \$http_authorization (an escaped literal) instead of the variable $http_authorization, so the Authorization header from the upstream request is never forwarded to the gateway.

Fix: use \$var (single escape) so the variable stores bare $var, which the heredoc passes through correctly as an nginx variable reference. The same bug applied to $host, $remote_addr, $proxy_add_x_forwarded_for, and $scheme in the same block; all are fixed.

Symptom: the hooks endpoint returns 401 after every container restart because nginx passes the literal string '$http_authorization' instead of the actual Authorization header value.
Bug: when OpenClaw starts, the LINE plugin registers its webhook route (/line/webhook), but the bundler splits runtime.ts into two chunks. The chunk that initialises the global registry object first creates it without the httpRoute fields the other chunk expects, so the LINE plugin registers into one registry and the HTTP server queries a different one. Result: 404 on cold start, works after hot-reload.

Upstream fix PRs #53642 and #54686 are open, but both are stale against the current refactored source (the httpRoute structure changed completely).

Workaround: a background process in entrypoint.sh waits 20s after gateway start, then writes a temporary `_reloadTs` field to openclaw.json. The gateway's file-watcher detects the change and hot-reloads the LINE channel, re-registering routes into the correct registry. The field is removed 5s later.

This is explicitly documented as a workaround. Remove this block once a real fix ships from upstream and is included in the source we build from. See the extensive comment block in entrypoint.sh for full details, upstream issue links, and the two PRs tracking the real fix.

Upstream issue: openclaw/openclaw#49803
Upstream PRs: openclaw/openclaw#53642, openclaw/openclaw#54686
Root cause: context files (MEMORY.md, SOUL.md, USER.md, etc.) were embedded in the stable system prompt prefix (before OPENCLAW_CACHE_BOUNDARY). When the agent writes to MEMORY.md during a turn, the stable prefix changes on the next API call, Anthropic sees a different system prompt, and the cache misses.

Fix: move the '# Project Context' section from before to after the cache boundary in src/agents/system-prompt.ts. The static instruction sections (tools, safety, skills, messaging, etc.) are identical across turns and are safely cached. Context files change frequently and belong in the dynamic suffix.

Impact: cache hit rate ~0% → ~80-90%, cutting per-conversation token cost significantly for users with active memory.
Summary
Fixes prompt caching showing 0% cache hit rate for users with active agent memory.
Root Cause
Context files (MEMORY.md, SOUL.md, USER.md, etc.) were embedded in the stable prefix: the portion of the system prompt before `<!-- OPENCLAW_CACHE_BOUNDARY -->` that receives `cache_control: ephemeral`.
When the agent writes to MEMORY.md during a conversation turn (standard behavior for any agent using memory), the stable prefix content changes on the next API call. Anthropic sees a different system prompt, so the cache misses. This happens on nearly every turn for active instances.
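To make the role of the boundary concrete, here is a minimal sketch of how a prompt string might be split at the marker into Anthropic-style system text blocks, with `cache_control` set only on the stable prefix. The function name `splitSystemPrompt`, the `SystemTextBlock` type, and the no-marker behaviour are assumptions for illustration; the actual OpenClaw code may differ.

```typescript
// Marker text as quoted in this PR description.
const CACHE_BOUNDARY = "<!-- OPENCLAW_CACHE_BOUNDARY -->";

// Shape of a text block in the Anthropic Messages API `system` array.
interface SystemTextBlock {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
}

// Hypothetical helper: everything before the marker becomes a cached block,
// everything after it becomes an uncached block.
function splitSystemPrompt(prompt: string): SystemTextBlock[] {
  const index = prompt.indexOf(CACHE_BOUNDARY);
  if (index === -1) {
    // Assumption: with no marker, send the prompt as one uncached block.
    return [{ type: "text", text: prompt }];
  }
  const stablePrefix = prompt.slice(0, index).replace(/\s+$/, "");
  const dynamicSuffix = prompt
    .slice(index + CACHE_BOUNDARY.length)
    .replace(/^\s+/, "");

  const blocks: SystemTextBlock[] = [];
  if (stablePrefix.length > 0) {
    blocks.push({
      type: "text",
      text: stablePrefix,
      cache_control: { type: "ephemeral" },
    });
  }
  if (dynamicSuffix.length > 0) {
    blocks.push({ type: "text", text: dynamicSuffix });
  }
  return blocks;
}
```

Because a cached block is only reused when its text is unchanged, any content that varies between turns and sits before the marker changes the first block and forces a fresh cache write.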
The `userTime` parameter (computed via `new Date()`) was a suspected cause but is a red herring: it is a dead parameter that is computed but never used in the prompt output.
Fix
Move the `# Project Context` section from before to after the cache boundary in `src/agents/system-prompt.ts`.
The static instruction sections (tools, safety, skills, messaging, reply tags, voice, etc.) are identical across turns and are safely cached. Context files change frequently and belong in the dynamic suffix alongside `extraSystemPrompt` and heartbeat prompts.
Impact
Cache hit rate: ~0% → ~80-90% for instances with active memory writes between turns.
For instances where context files never change, there is no behavioral difference — the content appears in the same relative position in the prompt, just after the cache seam.
Files Changed
`src/agents/system-prompt.ts`: moved the `# Project Context` block (~25 lines) from line ~647 to after `lines.push(SYSTEM_PROMPT_CACHE_BOUNDARY)`
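The reordering can be sketched roughly as follows. This is a simplified illustration, not the code in `src/agents/system-prompt.ts`: the `buildSystemPrompt` signature, the `ContextFile` type, the per-file heading format, and the assumption that `SYSTEM_PROMPT_CACHE_BOUNDARY` equals the marker comment are all hypothetical.

```typescript
// Assumed value; the PR only shows the constant name and the marker comment.
const SYSTEM_PROMPT_CACHE_BOUNDARY = "<!-- OPENCLAW_CACHE_BOUNDARY -->";

interface ContextFile {
  path: string;
  content: string;
}

// Hypothetical builder reflecting the post-fix ordering:
// static sections, then the boundary, then the frequently changing parts.
function buildSystemPrompt(
  staticSections: readonly string[],
  contextFiles: readonly ContextFile[],
  extraSystemPrompt?: string,
): string {
  const lines: string[] = [];
  lines.push(...staticSections); // tools, safety, skills, ...
  lines.push(SYSTEM_PROMPT_CACHE_BOUNDARY);
  if (contextFiles.length > 0) {
    lines.push("# Project Context"); // now in the dynamic suffix
    for (const file of contextFiles) {
      lines.push(`## ${file.path}`, file.content);
    }
  }
  if (extraSystemPrompt) {
    lines.push(extraSystemPrompt);
  }
  return lines.join("\n");
}

// Returns the part of the prompt that precedes the boundary.
function stablePrefixOf(prompt: string): string {
  const index = prompt.indexOf(SYSTEM_PROMPT_CACHE_BOUNDARY);
  return index === -1 ? prompt : prompt.slice(0, index);
}
```

With this ordering, two prompts built from the same static sections but different MEMORY.md contents share an identical stable prefix, so a write to MEMORY.md between turns no longer changes the cached portion.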