
feat(codex): advisory cross-model semantic probe in proof-of-work #60

Merged
arzafran merged 1 commit into main from feat/codex-proof-probe
Jun 19, 2026

@arzafran

Member

What this does

Adds an optional cross-model check to proof-of-work. The machine gate (bun run proof) proves a diff compiles, passes its tests, and lints clean — but not that it's correct; a bug that typechecks and passes the tests you wrote goes straight through. This lets you run a Codex review from a different model family as a semantic probe on top of that gate, so a whole class of "green but wrong" diffs gets a second look before a human spends attention on it.

It's advisory and opt-in: it never changes the review-ready verdict, and it's silent when Codex isn't available.
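The advisory, fail-open contract described above can be sketched roughly as follows. This is only an illustration, not the skill's actual code: `GateVerdict`, `ProbeResult`, `ProofReport`, and `buildReport` are hypothetical names, and the sketch assumes the probe outcome has already been collected elsewhere.

```typescript
// Hypothetical types for illustration; none of these names come from the repo.
type GateVerdict = "review-ready" | "not-ready";

type ProbeResult =
  | { status: "ok"; findings: string[] }
  | { status: "unavailable" }; // assumed shape for "Codex isn't available"

interface ProofReport {
  verdict: GateVerdict; // decided by the machine gate alone
  advisory: string[];   // probe findings, reported alongside the verdict
}

// Combine the gate verdict with an optional advisory probe result.
// The verdict is copied through untouched: the probe can only add notes.
function buildReport(gate: GateVerdict, probe?: ProbeResult): ProofReport {
  // Probe not requested (opt-in) or not available: stay silent, fail open.
  if (probe === undefined || probe.status !== "ok") {
    return { verdict: gate, advisory: [] };
  }
  return { verdict: gate, advisory: [...probe.findings] };
}
```

In this sketch a probe with findings leaves `verdict` exactly as the gate set it, and a missing or unavailable probe yields an empty `advisory` list rather than an error.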

Summary

  • New "Advisory: cross-model semantic probe" section in skills/proof-of-work/SKILL.md.
  • Probe is codex-run.ts review, treated exactly like the existing react-doctor/deslop advisory probes — reported alongside the verdict, never flips it.
  • Explicitly kept out of bun run proof: that gate is cheapest-first and runs constantly, so baking in a remote model call would slow every proof. Documented to run deliberately on non-trivial diffs.
  • This is the library-wide chokepoint complement to PR #59, "feat(codex): wire cross-model verification into verify, ship, fix, nuclear-review" (per-skill verifier wiring): any skill that ends in proof can pick up the cross-model signal here without editing each one.
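To make the "cheapest-first" reasoning concrete, here is a minimal sketch of such a gate, assuming each step carries a rough cost estimate and the gate stops at the first failure. `GateStep` and `runGate` are hypothetical names and the costs are made up; the real `bun run proof` script may be structured quite differently.

```typescript
// Hypothetical step shape; `cost` is an assumed relative estimate, not a real metric.
interface GateStep {
  name: string;
  cost: number;
  run: () => boolean; // true when the step passes
}

// Run steps from cheapest to most expensive and stop at the first failure,
// so a failing diff is rejected before the costly steps are reached.
function runGate(steps: GateStep[]): { passed: boolean; ran: string[] } {
  const ordered = [...steps].sort((a, b) => a.cost - b.cost);
  const ran: string[] = [];
  for (const step of ordered) {
    ran.push(step.name);
    if (!step.run()) {
      return { passed: false, ran };
    }
  }
  return { passed: true, ran };
}
```

Under this model a remote model call would be by far the most expensive step and would still run on every passing proof, which is the cost the PR avoids by keeping the probe outside the gate.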

Test Plan

  • bun run lint:skills — 35 skills, 0 errors/warnings
  • Diff scoped to one SKILL.md file
  • Reviewer: confirm the probe stays advisory (never blocks) and is not wired into the proof script itself

The mechanical battery (typecheck/test/lint) proves a diff is self-consistent,
not correct — a bug that compiles and passes the tests you wrote sails through.
Adds an opt-in Codex review as a semantic probe on top of the gate, treated like
react-doctor/deslop: advisory, reported alongside the verdict, never flips it.

Deliberately kept OUT of `bun run proof` — that gate is cheapest-first and runs
constantly, so a remote model call would slow every proof. Run on non-trivial
diffs only. Gated and fails open.
@arzafran arzafran merged commit fc90604 into main Jun 19, 2026
15 checks passed
@arzafran arzafran deleted the feat/codex-proof-probe branch June 19, 2026 16:30