F2-flip — confer + orchestrate + audit honor ctx.cheap_mode#105

Merged: fxspeiser merged 1 commit into main from feature/f2-flip-cheap-routing on Jun 7, 2026

Conversation

@fxspeiser

Owner

Summary

Re-opened after #101 auto-closed when its base branch (#100's `feature/f2-prep-callcontext`) was deleted. Same diff as #101, rebased onto current main (which now has F2-prep squashed in).

The payoff to F2-prep (#100, now merged): closes the ~96%-scaffolding-cost leak the 2026-06-07 incident review uncovered in `create_cheap`. `ctx.cheap_mode` now threads through `confer`, `orchestrate`, and `audit`, biasing each picker toward the appropriate tier.

What lands

`src/tools/confer.ts` — the substantial change

Cheap-mode tier-aware panel narrowing right after `resolveProviders`. Triggers when ALL three hold:

  • `opts.ctx?.cheap_mode === true`
  • Caller did NOT explicitly pin a `providers` list (explicit > inherited)
  • `selected.length > 1` (no narrowing needed for a single-provider panel)
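
The three conditions above can be read as a single predicate. The following is a minimal sketch, not the code in `src/tools/confer.ts`: the `CallContext` and `ConferOpts` shapes, the `callerPinnedProviders` flag, and the function name `shouldNarrowPanel` are all assumptions made for illustration.

```typescript
// Hypothetical shapes; the real types in src/tools/confer.ts may differ.
interface CallContext {
  cheap_mode?: boolean;
}
interface ConferOpts {
  ctx?: CallContext;
}

// Returns true only when all three trigger conditions hold.
function shouldNarrowPanel(
  opts: ConferOpts,
  callerPinnedProviders: boolean, // assumed flag: caller passed an explicit `providers` list
  selectedCount: number,
): boolean {
  return (
    opts.ctx?.cheap_mode === true && // strict check: undefined or false never narrows
    !callerPinnedProviders && // explicit > inherited
    selectedCount > 1 // a single-provider panel is already minimal
  );
}
```

The strict `=== true` comparison means callers that never set `ctx` keep the full panel, which matches the "no ctx" baseline described in the test plan.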

On a successful pick, `selectForDifficulty({tier: "low", ...})` chooses a cheap model, the base provider is retargeted to that model, and `selected` is replaced with the single retargeted entry. When pricing is unavailable or there is no low-tier candidate, the full panel is kept and the reason is surfaced in the envelope.

Envelope gains optional `cheap_mode_panel: {before, after, picked, reason}`.

`src/tools/orchestrate.ts` + `src/tools/audit.ts` — 1 line each

`args["cheap_mode"]` default now reads from `opts.ctx?.cheap_mode`. Explicit args still win.

Scoreboard

  • TS: 1,303 / 1,303
  • typecheck + build: clean

Test plan

  • `npm test` — full suite
  • `npm run build` — clean
  • 6 new threading tests in `call-context-threading.test.ts` cover the narrowing path + override semantics

🤖 Generated with Claude Code

The payoff to F2-prep: closes the ~96%-scaffolding-cost leak the
2026-06-07 incident review uncovered in create_cheap. Before F2-flip,
create_cheap's scope-opinion + peer-review steps ran on the full
premium panel regardless of cheap_mode; the cheap routing only kicked
in inside orchestrate's per-node executor. With F2-flip, ctx.cheap_mode
threads through confer, orchestrate, and audit and biases each of
their pickers toward the low/med tier as appropriate.

What lands

  src/tools/confer.ts — the substantial change:
    Cheap-mode tier-aware panel narrowing right after resolveProviders.
    Triggers when ALL three hold:
      - opts.ctx?.cheap_mode === true
      - caller did NOT explicitly pin a `providers` list (so we
        respect explicit caller intent; explicit > inherited)
      - selected.length > 1 (no narrowing needed for a single-provider panel)

    Path:
      1. Build availableSet from selected.
      2. selectForDifficulty({pricing, tier: "low", available, weights, allowOnly})
         — same ranker the orchestrate cheap-mode branch uses.
      3. On pick:
           - lookup the base provider in opts.providers
           - retargetProvider(base, pick.model) — the picked CHEAP
             model, not the provider's default
           - replace `selected` with [retargeted]
      4. On miss (no pricing / no low-tier candidate):
           - keep the full panel
           - surface the reason in `cheap_mode_panel.reason` so
             operators see we tried

    Envelope gains an optional `cheap_mode_panel` object:
      { before, after, picked: {provider, model} | null, reason }
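
The path above could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's implementation: `Provider`, `Ranker`, `retarget`, and `narrowPanel` are hypothetical names, the ranker is a stand-in for `selectForDifficulty`, and the base provider is looked up in `selected` rather than `opts.providers` to keep the example self-contained. Only the `cheap_mode_panel` field names come from the description.

```typescript
// All names below are illustrative stand-ins, not the project's real API.
interface Provider {
  name: string;
  model: string;
}
interface Pick {
  provider: string;
  model: string;
}
interface CheapModePanel {
  before: number;
  after: number;
  picked: Pick | null;
  reason: string;
}
// Stand-in for the ranker: returns the cheapest low-tier pick, or null on a miss.
type Ranker = (available: Set<string>) => Pick | null;

// Copy the base provider, swapping in the picked model.
function retarget(base: Provider, model: string): Provider {
  return { ...base, model };
}

function narrowPanel(
  selected: Provider[],
  rank: Ranker | undefined, // undefined models "no pricing wired"
): { selected: Provider[]; meta: CheapModePanel } {
  const before = selected.length;
  // Fallback: keep the full panel and record why narrowing did not happen.
  const keep = (reason: string) => ({
    selected,
    meta: { before, after: before, picked: null, reason },
  });
  if (!rank) return keep("no pricing available; full panel kept");
  const available = new Set(selected.map((p) => p.name));
  const pick = rank(available);
  if (!pick) return keep("no low-tier candidate; full panel kept");
  const base = selected.find((p) => p.name === pick.provider);
  if (!base) return keep("picked provider not in panel; full panel kept");
  return {
    selected: [retarget(base, pick.model)],
    meta: { before, after: 1, picked: pick, reason: "narrowed to cheapest low-tier candidate" },
  };
}
```

Returning the metadata on both the hit and the miss path is what lets operators see that narrowing was attempted even when the full panel ran.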

  src/tools/orchestrate.ts (1 line of logic):
    args["cheap_mode"] default is now `opts.ctx?.cheap_mode ?? false`
    instead of plain `false`. Explicit args still win — caller pinning
    beats inherited context, so a sub-task can opt out of a cheap parent
    if it really needs the full premium DAG.

  src/tools/audit.ts (1 line of logic):
    Same pattern: args["cheap_mode"] default is
    `opts.ctx?.cheap_mode ?? true`. Audit's historical default was
    true (cheap auditor); ctx.cheap_mode=false from a premium parent
    macro now correctly biases toward a non-cheap auditor.
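
The precedence described for orchestrate and audit (explicit argument, then inherited context, then the tool's historical default) can be sketched with nullish coalescing. The helper name `resolveCheapMode` and its signature are hypothetical; the source only states the `opts.ctx?.cheap_mode ?? false` and `opts.ctx?.cheap_mode ?? true` defaults.

```typescript
// Hypothetical shape; the real context type may carry more fields.
interface CallContext {
  cheap_mode?: boolean;
}

// Explicit arg wins, then inherited ctx, then the tool's own default.
// `??` only falls through on null/undefined, so an explicit `false` is respected.
function resolveCheapMode(
  explicitArg: boolean | undefined,
  ctx: CallContext | undefined,
  toolDefault: boolean,
): boolean {
  return explicitArg ?? ctx?.cheap_mode ?? toolDefault;
}

// orchestrate-style default (false) and audit-style default (true):
const orchestrateInherited = resolveCheapMode(undefined, { cheap_mode: true }, false);
const auditFromPremiumParent = resolveCheapMode(undefined, { cheap_mode: false }, true);
```

Using `??` rather than `||` matters here: with `||`, an explicit `cheap_mode: false` would be treated as missing and the inherited value would win.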

Tests (full suite green — 1,303 passing, +2 over F2-prep's 1,301)

  test/core/call-context-threading.test.ts is rewritten from the
  F2-prep "no-op invariant" canary into the F2-flip "ctx is active"
  canary. Six tests:
    confer (4):
      - pricing wired → panel narrows to 1, retarget to cheap model,
        envelope surfaces cheap_mode_panel meta
      - no ctx → full panel runs, no meta (baseline / regression guard)
      - ctx but no pricing → graceful fallback, full panel kept,
        meta.reason explains
      - caller-supplied providers list overrides ctx.cheap_mode
        (explicit > inherited)
    orchestrate (2):
      - ctx.cheap_mode=true (no args.cheap_mode) → worker calls
        retarget to low-tier model
      - args.cheap_mode=false explicit override beats ctx.cheap_mode=true

  No churn elsewhere: every existing test (confer, orchestrate,
  audit, create, parity, etc.) stays green because:
    - The narrowing path requires ctx.cheap_mode=true; legacy callers
      don't set ctx, so they don't trigger it.
    - args["cheap_mode"] explicit values still win — no envelope
      drift for any test that pinned cheap_mode directly.

Operator visibility

  When create_cheap runs, confer's envelope now carries:
    "cheap_mode_panel": {
      "before": 3, "after": 1,
      "picked": {"provider": "gemini", "model": "gemini-2.5-flash-lite"},
      "reason": "ctx.cheap_mode=true; narrowed to cheapest low-tier candidate"
    }
  This metadata bubbles up through runCreate (review wraps confer; audit
  reports its own auditor pick), so the full cost-savings path is auditable
  in the resulting envelope without extra instrumentation.
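
As one way an operator might consume this field, the sketch below turns the metadata into a one-line summary. The `CheapModePanel` field names follow the envelope shown above; the helper `describeCheapModePanel` and the minimal envelope type are assumptions for illustration only.

```typescript
// Field names follow the envelope example; everything else is hypothetical.
interface CheapModePanel {
  before: number;
  after: number;
  picked: { provider: string; model: string } | null;
  reason: string;
}

// Summarise the narrowing decision for a log line or dashboard.
function describeCheapModePanel(envelope: { cheap_mode_panel?: CheapModePanel }): string {
  const meta = envelope.cheap_mode_panel;
  if (!meta) return "cheap-mode narrowing not attempted";
  if (!meta.picked) return `panel kept at ${meta.before}: ${meta.reason}`;
  return `panel ${meta.before} -> ${meta.after} via ${meta.picked.provider}/${meta.picked.model}: ${meta.reason}`;
}
```

Because the field is optional, the helper treats a missing `cheap_mode_panel` as "not attempted" instead of as an error.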

Scoreboard
  TS: 1,303 / 1,303
  typecheck + build: clean

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@fxspeiser fxspeiser merged commit 69701d6 into main Jun 7, 2026
8 checks passed
@fxspeiser fxspeiser deleted the feature/f2-flip-cheap-routing branch June 7, 2026 22:56