feat(codex): wire cross-model verification into verify, ship, fix, nuclear-review by arzafran · Pull Request #59 · darkroomengineering/cc-settings

arzafran · 2026-06-19T16:19:39Z

What this does

Wires the Codex bridge into the six skills where Claude reviews its own work: verify, ship, fix, nuclear-review, review, and refactor. In each, an independent review from a different model family (Codex) runs alongside Claude's, so a blind spot shared by every Claude reviewer gets a second pair of eyes. It's opt-in by availability: if Codex isn't installed or logged in, every one of these skills runs exactly as before, Claude-only.

The point is the self-preferential-bias surface — Claude judging Claude tends to wave its own output through. A different family doesn't share those blind spots, so this is where the bridge earns its keep. Codex is the roomy pool (600–3,000 messages / 5h on Pro), so quota is not a reason to be sparing — the high-frequency review skills are wired in too.
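The "opt-in by availability" behaviour can be sketched as a small wrapper. This is a minimal illustration, not the code in this PR (the PR only edits SKILL.md files): `Finding`, `ReviewOutcome`, and `reviewFailOpen` are hypothetical names, and the reviews are modelled as plain synchronous callbacks, whereas the skills run them in parallel.

```typescript
// Hypothetical types and names; none of them come from the cc-settings repo.
type Finding = { severity: "HIGH" | "MEDIUM" | "LOW"; message: string };

interface ReviewOutcome {
  findings: Finding[];
  codexRan: boolean; // false means the skill ran Claude-only
}

// Always runs the Claude review. Runs the Codex review only when the bridge
// reports itself available, and treats any bridge error as "not available".
function reviewFailOpen(
  claudeReview: () => Finding[],
  codexReview: () => Finding[],
  codexAvailable: () => boolean,
): ReviewOutcome {
  // Errors from the Claude review propagate exactly as they did before.
  const claude = claudeReview();

  let codex: Finding[] | null = null;
  try {
    if (codexAvailable()) {
      codex = codexReview();
    }
  } catch {
    // Fail open: a missing or broken bridge never blocks the skill.
    codex = null;
  }

  return {
    findings: codex === null ? claude : [...claude, ...codex],
    codexRan: codex !== null,
  };
}
```

The design choice this illustrates is that the Codex path can only add findings; it cannot turn a previously working run into a failing one.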

Summary

verify — adds a non-Claude finder (codex-verifier) in parallel with the Claude Finder, merged into the issue superset before the Adversary stage. The skill's whole reason to exist is breaking self-review bias; its three-agent panel was 100% Claude.
ship — parallel cross-model review at Step 6, right before a commit lands. A Claude/Codex disagreement on a Critical/HIGH becomes a human gate before Step 7.
fix — cross-model review alongside the reviewer for security-sensitive fixes (auth, crypto, permissions, input validation).
nuclear-review — a whole-codebase ask pass folded into Phase 3 synthesis. Deliberately not the diff-scoped review — a repo-wide audit has no diff, so it uses codex-run.ts ask pointed at the tree.
review — runs as the reviewer agent (has Bash, no Agent tool), so it calls codex-run.ts review directly and reconciles findings into the verdict (HIGH→Critical, MEDIUM→Warning, LOW→Suggestion).
refactor — codex-verifier in parallel with the reviewer; refactors are a top source of subtle regressions, exactly where a second family pays off.
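Two rules in the list above lend themselves to a short sketch: the severity mapping used by review, and the human gate in ship. The mapping (HIGH→Critical, MEDIUM→Warning, LOW→Suggestion) is taken from the description. Everything else is an assumption made for illustration: the `ReviewFinding` shape, the `key` used to decide that two findings are "the same issue", and the reading of "disagreement" as a Critical reported by only one of the two models.

```typescript
type CodexSeverity = "HIGH" | "MEDIUM" | "LOW";
type Verdict = "Critical" | "Warning" | "Suggestion";

// Mapping stated in the PR description for the review skill.
const VERDICT_BY_SEVERITY: Record<CodexSeverity, Verdict> = {
  HIGH: "Critical",
  MEDIUM: "Warning",
  LOW: "Suggestion",
};

// Hypothetical finding shape; `key` is an assumed stable identifier
// (for example "file:rule") that lets findings from both models be compared.
interface ReviewFinding {
  key: string;
  verdict: Verdict;
}

function fromCodex(key: string, severity: CodexSeverity): ReviewFinding {
  return { key, verdict: VERDICT_BY_SEVERITY[severity] };
}

function criticalKeys(findings: ReviewFinding[]): Set<string> {
  return new Set(
    findings.filter((f) => f.verdict === "Critical").map((f) => f.key),
  );
}

// Assumption: a "disagreement" is a Critical raised by exactly one model.
// Returns the keys of those findings, sorted for stable output.
function criticalDisagreements(
  claude: ReviewFinding[],
  codex: ReviewFinding[],
): string[] {
  const a = criticalKeys(claude);
  const b = criticalKeys(codex);
  const onlyClaude = Array.from(a).filter((k) => !b.has(k));
  const onlyCodex = Array.from(b).filter((k) => !a.has(k));
  return onlyClaude.concat(onlyCodex).sort();
}

// True when a human should decide before the commit step proceeds.
function needsHumanGate(
  claude: ReviewFinding[],
  codex: ReviewFinding[],
): boolean {
  return criticalDisagreements(claude, codex).length > 0;
}
```

Under this reading, a Critical that both models report does not trigger the gate; it is already an agreed blocker and is handled by the normal review flow.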

Diff-based skills with Agent access use the codex-verifier agent; review (a forked reviewer) and nuclear-review (diff-less) call codex-run.ts directly. Every insertion states the bridge is gated and fails open.
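The routing rule in the paragraph above reduces to two questions: is there a diff, and does the skill have the Agent tool? A minimal sketch of that decision follows. The `SkillTraits` type and `chooseInvocation` function are hypothetical, the returned strings are labels rather than literal command lines, and the per-skill traits are a reading of this description, not something extracted from the SKILL.md files.

```typescript
type Invocation =
  | "codex-verifier agent"
  | "codex-run.ts review"
  | "codex-run.ts ask";

// Hypothetical description of a skill, as far as routing is concerned.
// Whole-tree audits carry no Agent-tool flag because the rule never needs it.
type SkillTraits =
  | { scope: "tree" }
  | { scope: "diff"; hasAgentTool: boolean };

function chooseInvocation(traits: SkillTraits): Invocation {
  if (traits.scope === "tree") {
    // No diff to review, so ask about the tree instead.
    return "codex-run.ts ask";
  }
  // Diff-based: delegate to the agent when possible, else call the script.
  return traits.hasAgentTool ? "codex-verifier agent" : "codex-run.ts review";
}

// Traits as read from the PR description (an interpretation, not repo data).
const SKILLS: Record<string, SkillTraits> = {
  verify: { scope: "diff", hasAgentTool: true },
  ship: { scope: "diff", hasAgentTool: true },
  fix: { scope: "diff", hasAgentTool: true },
  refactor: { scope: "diff", hasAgentTool: true },
  review: { scope: "diff", hasAgentTool: false }, // forked reviewer: Bash only
  "nuclear-review": { scope: "tree" },
};

for (const [name, traits] of Object.entries(SKILLS)) {
  console.log(`${name}: ${chooseInvocation(traits)}`);
}
```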

Test Plan

bun run lint:skills — 35 skills, 0 errors/warnings
Diff scoped to the six SKILL.md files only
Reviewer: confirm nuclear-review uses ask (correct for a diff-less audit) and review uses the direct codex-run.ts call (the reviewer fork has no Agent tool)

…clear-review

Slots the Codex bridge into the four skills where Claude judges its own output, the self-preferential-bias surface cross-model review exists to break. Each insertion is gated and fails open: if the bridge is unavailable, the skill runs Claude-only.

- verify: adds a non-Claude finder (codex-verifier) in parallel with the Claude Finder, merged into the superset before the Adversary stage; a different model family in an otherwise all-Claude panel.
- ship: parallel cross-model review at Step 6, right before commit; a Claude/Codex disagreement on a Critical becomes a human gate before Step 7.
- fix: cross-model review alongside the reviewer for security-sensitive fixes (auth, crypto, permissions, input validation).
- nuclear-review: a whole-codebase 'ask' pass (NOT diff-scoped review, since there is no diff in a repo-wide audit) folded into Phase 3 synthesis as a second opinion.

Verifier uses the codex-verifier agent for diff-based skills; nuclear-review uses codex-run.ts ask directly since it audits the whole tree, not a change.

Codex is the roomy pool (600–3000 msgs / 5h on Pro) — quota was never the constraint, so the two high-frequency review skills get the cross-model pass as well. review runs as the reviewer agent (Bash, no Agent tool), so it calls codex-run.ts review directly; refactor is in orchestration mode and uses the codex-verifier agent in parallel — refactors are a top source of subtle regressions, exactly where a second model family pays off. Both gated, fail open.

arzafran added 2 commits June 19, 2026 13:19

arzafran merged commit ac5f632 into main Jun 19, 2026
15 checks passed

arzafran mentioned this pull request Jun 19, 2026

feat(codex): advisory cross-model semantic probe in proof-of-work #60

Merged

3 tasks

arzafran deleted the feat/codex-verifier-integration branch June 19, 2026 20:37

arzafran merged 2 commits into main from feat/codex-verifier-integration
