Let ClawSweeper judge real behavior proof #48

Merged: pashpashpash merged 2 commits into main from codex/agent-led-proof-judgement on May 5, 2026

@pashpashpash
Contributor

ClawSweeper already records a structured `realBehaviorProof` judgement and uses it to block pass/automerge markers, but it did not yet own the positive label that tells maintainers the evidence was convincing.

This PR makes the proof review explicitly agent-led. The review prompt tells Codex to inspect PR bodies, comments, screenshots, videos, logs, terminal output, and links with its own tools and best judgement. Review runs now give Codex a scratch directory plus a separate read-only inspection token, while the deterministic wrapper keeps the write token for comments and labels. During apply, ClawSweeper syncs the `proof: sufficient` label: it adds the label when the structured judgement is sufficient and removes it when the evidence is missing, weak, mock-only, or no longer applicable.
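
The label sync described above can be sketched roughly as follows. This is a minimal illustration, not ClawSweeper's actual code: the `ProofStatus` values, the `LabelAction` type, and the `planProofLabelSync` function are hypothetical names chosen for the example. Only the `proof: sufficient` label text and the add/remove conditions come from the description.

```typescript
// Hypothetical status values; the PR description only names "sufficient"
// and lists missing, weak, mock-only, and no-longer-applicable evidence.
type ProofStatus = "sufficient" | "missing" | "weak" | "mock_only" | "not_applicable";

// Label text taken from the PR description.
const PROOF_LABEL = "proof: sufficient";

// Hypothetical plan type: what the write-token wrapper should do next.
type LabelAction =
  | { kind: "add"; label: string }
  | { kind: "remove"; label: string }
  | { kind: "none" };

// Decide how to bring the PR's labels in line with the structured judgement.
// The function is pure: it only plans the change, so the deterministic
// wrapper that holds the write token can apply it separately.
function planProofLabelSync(
  status: ProofStatus,
  currentLabels: readonly string[],
): LabelAction {
  const hasLabel = currentLabels.includes(PROOF_LABEL);
  if (status === "sufficient") {
    // Add the positive label only if it is not already present.
    return hasLabel ? { kind: "none" } : { kind: "add", label: PROOF_LABEL };
  }
  // Any non-sufficient judgement means the label must not remain.
  return hasLabel ? { kind: "remove", label: PROOF_LABEL } : { kind: "none" };
}
```

Keeping the decision separate from the API call mirrors the split in the description: the agent produces the judgement with read-only access, and only the wrapper performs writes.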

@pashpashpash merged commit 0fefca2 into main on May 5, 2026
5 checks passed
@pashpashpash deleted the codex/agent-led-proof-judgement branch on May 5, 2026, 23:10