Eliminate session_token transcription drift via MCP-client interceptor CLI #112

@m2ux

Background

PR #1466 documented a class of bug where the agent (Claude Code's LLM) emitted a session_token in an MCP tool call's arguments field with a few characters wrong — specifically the psid UUID was mangled by inserting two extra hex characters. HMAC verification failed server-side and the workflow couldn't continue.

The root cause is structural: every tool call after start_session requires the LLM to re-type the entire ~480-character HMAC-signed token into the next call's arguments. The LLM is reliable but not infallible at long opaque-string transcription, and corruption probability scales with token length.

Investigation summary

A prior work package (enhancement/session-token-size-optimization, branch tip f7a4cd8) explored shrinking the wire token. Two commits landed on the branch:

  • 1cd7d56 feat(session): add SessionStore, CBOR wire codec, state_hash modules
  • f7a4cd8 feat(session): switch wire format to CBOR; move state to SessionStore

These commits change the wire format from a ~480-char JSON payload with a hex HMAC to a ~140-char CBOR payload with a base64url HMAC, backed by a server-side SessionRecord keyed by sid and a 16-byte truncated SHA-256 state attestation on the wire. Phases 3-6 and the E2E test were planned but not implemented. The work addresses the original transcription problem only indirectly: a smaller token gives an estimated ~75% lower corruption probability, but does not eliminate it.
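The size of that reduction can be sanity-checked with a simple model. The sketch below assumes each character is mistyped independently with a fixed probability; that model and the per-character rate are illustrative assumptions, not measurements from this issue.

```typescript
// Probability that at least one of `length` characters is transcribed wrongly,
// assuming each character fails independently with probability `perCharError`.
function corruptionProbability(length: number, perCharError: number): number {
  return 1 - Math.pow(1 - perCharError, length);
}

// Hypothetical per-character error rate; the issue does not state one.
const perCharError = 1e-5;

const before = corruptionProbability(480, perCharError); // ~480-char JSON + hex-HMAC token
const after = corruptionProbability(140, perCharError);  // ~140-char CBOR + base64url-HMAC token

// Relative reduction in corruption probability from shrinking the token.
const reduction = 1 - after / before;
console.log(`relative reduction: ${(reduction * 100).toFixed(1)}%`);
```

For small error rates the reduction is close to 1 - 140/480, or about 71%, which is in the same range as the ~75% estimate. Under this model the probability shrinks but never reaches zero.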

Pivot

A subsequent investigation surfaced a more direct fix: the Claude Code harness already supports PreToolUse hooks that can rewrite a tool call's arguments before dispatch. Combined with a PostToolUse hook that captures the token from each response's _meta, the harness can own the token lifecycle entirely — the LLM never types it. Transcription drift becomes structurally impossible, not just less likely.

The hook capability isn't Claude Code-specific. Five major MCP clients ship equivalent mechanisms today:

  • Claude Code: PreToolUse hook, updatedInput response key.
  • Claude Agent SDK: programmatic callback.
  • Cursor ≥1.7: beforeMCPExecution hook.
  • OpenCode: tool.execute.before plugin handler.
  • OpenAI Codex CLI: PreToolUse hook (PR #18385).

MCP protocol-level interceptors (SEP-2624, draft) are the future-proof standardization but are not yet merged and no SDK has shipped a reference implementation. Concrete prototypes exist only in third-party repos (mcp-hangar Python, tower-mcp Rust, sint-protocol).

Proposed work package

Ship workflow-server-interceptor — a TypeScript CLI bundled as a bin entry of the existing @m2ux/workflow-server npm package.

Functional spec

Two subcommands, both reading a single JSON event on stdin:

workflow-server-interceptor inject    # PreToolUse hook
workflow-server-interceptor capture   # PostToolUse hook

inject reads stdin, reads ~/.claude/workflow-server-tokens/current.token, and emits { hookSpecificOutput: { hookEventName: 'PreToolUse', updatedInput: { ...tool_input, session_token: <token> } } } on stdout. Skips injection when:

  • tool_name === 'mcp__workflow-server__start_session' (caller may pass a saved token to resume).
  • session_token or checkpoint_handle already in tool_input (don't clobber).
  • State file missing or empty.
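The inject decision logic above can be sketched as a pure function, which also makes the skip paths easy to unit-test. This is a minimal sketch, not the planned implementation: the type names and buildInjectOutput are hypothetical, trimming the stored token is an assumption, and returning null stands in for "emit nothing", since the spec does not say what inject prints when it skips.

```typescript
type ToolInput = Record<string, unknown>;

// Subset of the PreToolUse event that inject needs (field names from the spec above).
interface PreToolUseEvent {
  tool_name: string;
  tool_input: ToolInput;
}

interface InjectOutput {
  hookSpecificOutput: {
    hookEventName: "PreToolUse";
    updatedInput: ToolInput;
  };
}

const START_SESSION = "mcp__workflow-server__start_session";

// Returns the hook output to print on stdout, or null when injection is skipped.
// `token` is the content of the state file, or null if the file is missing.
function buildInjectOutput(event: PreToolUseEvent, token: string | null): InjectOutput | null {
  // start_session may carry a saved token supplied by the caller to resume.
  if (event.tool_name === START_SESSION) return null;

  const input = event.tool_input ?? {};
  // Never clobber a token or checkpoint handle that is already present.
  if ("session_token" in input || "checkpoint_handle" in input) return null;

  // Missing or empty state file: nothing to inject.
  // Assumption: surrounding whitespace (e.g. a trailing newline) is not part of the token.
  const trimmed = token === null ? "" : token.trim();
  if (trimmed === "") return null;

  return {
    hookSpecificOutput: {
      hookEventName: "PreToolUse",
      updatedInput: { ...input, session_token: trimmed },
    },
  };
}
```

Keeping file and stdin I/O outside this function would let tests/hooks-cli.test.ts cover each skip path without touching the filesystem.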

capture reads stdin, extracts tool_response._meta.session_token, writes it to ~/.claude/workflow-server-tokens/current.token with 0600 permissions.
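A minimal sketch of the capture side follows, using only Node's standard library. The function names are hypothetical, and ignoring a missing or non-string token is an assumption; the spec above only covers the case where the token is present.

```typescript
import { chmodSync, mkdirSync, writeFileSync } from "node:fs";
import { dirname } from "node:path";

// Subset of the PostToolUse event that capture needs.
interface PostToolUseEvent {
  tool_response?: { _meta?: { session_token?: unknown } };
}

// Pulls the token out of the event; null when it is absent or not a non-empty string.
function extractToken(event: PostToolUseEvent): string | null {
  const token = event.tool_response?._meta?.session_token;
  return typeof token === "string" && token.length > 0 ? token : null;
}

// Writes the token so that only the owning user can read it.
function writeToken(path: string, token: string): void {
  // Create the state directory if needed (0700 is an assumption, not from the spec).
  mkdirSync(dirname(path), { recursive: true, mode: 0o700 });
  writeFileSync(path, token, { mode: 0o600 });
  // `mode` only applies when the file is created, so tighten a pre-existing file too.
  chmodSync(path, 0o600);
}
```

The explicit chmodSync matters because the mode option of writeFileSync has no effect on a file that already exists, so a token file created earlier with looser permissions would otherwise keep them.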

Deliverables

  • src/hooks/cli.ts — single-file CLI implementing both subcommands (~70 LOC pure stdlib).
  • package.json — new bin entry: "workflow-server-interceptor": "dist/hooks/cli.js".
  • tests/hooks-cli.test.ts — unit tests covering inject/capture/skip-paths.
  • docs/interceptor-recipe.md — settings.json snippets for Claude Code + cross-harness equivalents (Cursor's beforeMCPExecution, OpenCode's plugin, Codex's PreToolUse, Claude Agent SDK's callback).
  • examples/interceptor/ — copy-pasteable config samples per harness.
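As an illustration of what the Claude Code snippet in docs/interceptor-recipe.md might contain, the sketch below builds a settings.json hooks fragment that wires both subcommands to the workflow-server tools. The matcher pattern is an assumption derived from the mcp__workflow-server__ tool-name prefix used above, and the commands assume the bin entry is on PATH.

```typescript
// Shape of one hook group in Claude Code's settings.json.
interface HookGroup {
  matcher: string;
  hooks: { type: "command"; command: string }[];
}

// Assumed regex matcher limiting the hooks to workflow-server MCP tools.
const MATCHER = "mcp__workflow-server__.*";

const settings: { hooks: Record<"PreToolUse" | "PostToolUse", HookGroup[]> } = {
  hooks: {
    // Inject the stored token before each matching tool call is dispatched.
    PreToolUse: [
      { matcher: MATCHER, hooks: [{ type: "command", command: "workflow-server-interceptor inject" }] },
    ],
    // Capture the fresh token from each matching tool response.
    PostToolUse: [
      { matcher: MATCHER, hooks: [{ type: "command", command: "workflow-server-interceptor capture" }] },
    ],
  },
};

// Print the fragment to merge into the project's or user's settings.json.
console.log(JSON.stringify(settings, null, 2));
```

Scoping the matcher to the server's own tools keeps the interceptor from running on unrelated tool calls, which also limits the per-call Node startup cost.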

Out of scope (v1)

  • Multi-session keying. The single-file state design supports one workflow-server session at a time. Users running concurrent workflows in one Claude Code conversation will need a sid-keyed variant (deferred to v2, once usage data is available).
  • Native (Rust / Bun-compiled) build. Node startup (~50 ms per call) is acceptable for v1. Revisit if real-world latency complaints accumulate.
  • Workflow-server-side changes. The server's wire format and SessionStore stay as-is.

Status of the prior work-package branch

Branch enhancement/session-token-size-optimization (workflow-server) and enhancement/session-token-size-optimization-meta (workflows submodule) hold the tier-C commits. They are not merged. Options for the new work package's plan-prepare:

  1. Revert — drop tier-C entirely; ship only the interceptor.
  2. Keep as defense-in-depth — finish tier-C alongside the interceptor; users on hook-less harnesses (Claude Desktop, Continue.dev, Cline, Roo Code, Zed) benefit from the smaller wire token even without the interceptor.
  3. Park — leave the branch alone; revisit later if/when SEP-2624 lands.

Decision deferred to the new work-package's plan-prepare activity.
