fix(review-queue): block p0 p1 evidence dissent#8467
Conversation
Codex independent semantic review on head add383eReviewer harness: codex Verdict: PASS Blocking findings: none. I reviewed only the #8467 exact-head diff against current Validation performed:
Focused adversarial dogfood: I exercised the exact dissent-laundering surface with plain, bullet, numbered, blockquote, and heading high-priority marker forms plus an explicit changes-requested verdict. Every dissent probe returned |
Claude independent semantic review on head add383eReviewer harness: claude Verdict: PASS Blocking findings: none. Claude Code reviewed only the #8467 exact-head diff against current Focused adversarial dogfood: the independent review checked the P0/P1 and changes-requested dissent paths, quorum-level non-counting behavior for dissenting comments, and positive-evidence preservation for clean PASS comments. |
Aragora Code ReviewAdvisory-only review. No issues found. |
Forces off_limits=LEAVE on any PR touching merge-authority surfaces (review_queue*, swarm/quorum_evidence, review/evidence, settle_*/settlement, .github/workflows/aragora-merge-quorum*) regardless of tier/quorum, so the drain can never self-modify the gate. Surfaced after the first pass merged an evidence-parser change (#8467). +2 tests; file-paths fetched for classification. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…n default) (#8471) * feat(swarm): scripts/boss_drain_pass.py — runnable bounded drain pass (dry-run default) Wires the tested drain core (boss_drain/drain_pass/drain_policy) to real gh I/O. --dry-run (DEFAULT) prints the plan and executes nothing; --apply acts. MERGE re-confirms authorization with settle_one_pr at apply time (gate stays sole authority; tier-4/over-tier never auto-merge); CLOSE only on empty; off-limits (structex/, claude/fusion-, pinned PRs) LEFT untouched; REPAIR is label-only (drain-repair) in v1 and bounded by the pass caps. Lets the operator/boss loop drain the backlog over-cap instead of idling — no boss_loop.py edit required. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(swarm): drain merge-authority guard — never auto-merge gate logic Forces off_limits=LEAVE on any PR touching merge-authority surfaces (review_queue*, swarm/quorum_evidence, review/evidence, settle_*/settlement, .github/workflows/aragora-merge-quorum*) regardless of tier/quorum, so the drain can never self-modify the gate. Surfaced after the first pass merged an evidence-parser change (#8467). +2 tests; file-paths fetched for classification. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(swarm): REPAIR auto-dispatch v2 (scope-locked, double-flag-gated) Turns REPAIR from label-only into a bounded worker dispatch that fixes a red-but-useful PR's failing required checks on its own branch. Safety: - make_repair_order/make_repair_prompt (pure, tested): scope-locked directive — work only on the PR branch, minimal diff, NEVER touch gate-logic surfaces, bounded effort (STOP rather than thrash). - Dispatch requires BOTH --apply AND --enable-repair-dispatch; without the explicit enable flag it only prints the plan (autonomous repair can never start by accident). Bounded by drain-pass caps (<=2/pass). Isolated worktree on the PR branch; one failure never aborts the pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(swarm): wire drain into the boss cycle (over-cap -> drain, not idle) When the backlog gate reports `shepherd` (open PRs >= cap), the cycle now runs one bounded drain pass instead of only idling/manufacturing work — merge fully-green PRs through the settle gate, close empty ones, PLAN repairs. Repair is never auto-dispatched here (no --enable-repair-dispatch), so it only prints the bounded plan. Default OFF (ARAGORA_DRAIN_ENABLED=1 to turn on); dry-run unless ARAGORA_DRAIN_APPLY=1; off-limits structex/ + claude/. Best-effort — never fails the cycle. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(swarm): address quorum-review blockers on drain repair-dispatch Claude's merge-quorum review requested changes; fixed: - dispatch_repair: push the worktree to the PR branch ONLY if the agent run succeeded (run.returncode == 0), and report success only if the push landed. A broken / scope-violating / timed-out worker no longer propagates its leftovers to the branch. - build_candidates: restore the cheap pre-filter — off-limits-by-id/prefix, owned, and superseded PRs route to LEAVE/CLOSE without a `gh pr view`. Only path-authority candidates pay for the detail fetch (was +300 gh calls/pass at queue size 60-300 — every structex/* and claude/* branch). - dispatch_repair: shutil.rmtree backstop in finally so a failed `worktree add` doesn't leak the mkdtemp dir. - tests: off_limits/owned -> LEAVE regardless of has_changes (proves a repair worker can never spawn on another fleet's branch); cheap pre-filter skips the detail fetch for off-limits-by-prefix PRs. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…red authority (#8471 follow-ups) (#8473) * feat(swarm): scripts/boss_drain_pass.py — runnable bounded drain pass (dry-run default) Wires the tested drain core (boss_drain/drain_pass/drain_policy) to real gh I/O. --dry-run (DEFAULT) prints the plan and executes nothing; --apply acts. MERGE re-confirms authorization with settle_one_pr at apply time (gate stays sole authority; tier-4/over-tier never auto-merge); CLOSE only on empty; off-limits (structex/, claude/fusion-, pinned PRs) LEFT untouched; REPAIR is label-only (drain-repair) in v1 and bounded by the pass caps. Lets the operator/boss loop drain the backlog over-cap instead of idling — no boss_loop.py edit required. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(swarm): drain merge-authority guard — never auto-merge gate logic Forces off_limits=LEAVE on any PR touching merge-authority surfaces (review_queue*, swarm/quorum_evidence, review/evidence, settle_*/settlement, .github/workflows/aragora-merge-quorum*) regardless of tier/quorum, so the drain can never self-modify the gate. Surfaced after the first pass merged an evidence-parser change (#8467). +2 tests; file-paths fetched for classification. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(swarm): REPAIR auto-dispatch v2 (scope-locked, double-flag-gated) Turns REPAIR from label-only into a bounded worker dispatch that fixes a red-but-useful PR's failing required checks on its own branch. Safety: - make_repair_order/make_repair_prompt (pure, tested): scope-locked directive — work only on the PR branch, minimal diff, NEVER touch gate-logic surfaces, bounded effort (STOP rather than thrash). - Dispatch requires BOTH --apply AND --enable-repair-dispatch; without the explicit enable flag it only prints the plan (autonomous repair can never start by accident). Bounded by drain-pass caps (<=2/pass). Isolated worktree on the PR branch; one failure never aborts the pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(swarm): wire drain into the boss cycle (over-cap -> drain, not idle) When the backlog gate reports `shepherd` (open PRs >= cap), the cycle now runs one bounded drain pass instead of only idling/manufacturing work — merge fully-green PRs through the settle gate, close empty ones, PLAN repairs. Repair is never auto-dispatched here (no --enable-repair-dispatch), so it only prints the bounded plan. Default OFF (ARAGORA_DRAIN_ENABLED=1 to turn on); dry-run unless ARAGORA_DRAIN_APPLY=1; off-limits structex/ + claude/. Best-effort — never fails the cycle. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(swarm): harden boss-drain repair-apply, cheap pre-filter, anchored authority (PR #8471 follow-ups) Quorum-reviewer follow-ups to the drain stack, separate from the landing PR: P2 — correctness: - dispatch_repair (boss_drain_pass): the apply path ran the agent then did an unconditional `git push` and returned True on the AGENT's returncode, ignoring whether anything was committed or the push succeeded. Now require a new commit beyond the PR's remote tip, assert push returncode==0, and confirm the remote tip actually advanced to the new commit — return False otherwise. - build_candidates (boss_drain): restore the cheap pre-filter. Off-limits-by- id/prefix, owned, and explicitly-superseded PRs now classify from the list view alone (no per-PR `gh pr view`); only survivors fetch detail for the merge- authority probe + changedFiles. Avoids up to +300 gh calls/pass at --list-limit 300. P3 — hardening: - dispatch_repair worktree leak: worktree now lives at a fresh non-existent subpath of an mkdtemp base; finally adds a shutil.rmtree(base) backstop so a failed `git worktree add` (never registered, so `worktree remove` can't clean it) no longer leaks the temp dir. - _proxy_authorized: collapse-by-last-write over duplicate check names replaced with any-SUCCESS semantics, so a stale/later pending re-run row can't mask an earlier green (ordering-independent). settle_one_pr still re-confirms before merge. - MERGE_AUTHORITY_PATTERNS anchored to known path segments; bare "settlement" removed (it fenced unrelated modules like debate/settlement.py, markets/ credit_settlement.py). Real gate surfaces stay fenced. - The 5 required-check names single-sourced as REQUIRED_CHECK_NAMES in boss_drain; the repair prompt and boss_drain_pass._REQUIRED both derive from it (no drift). TDD: tests/swarm/test_boss_drain.py extended (+11 cases). mypy clean; pre-commit clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Co-authored-by: codex[bot] <codex[bot]@users.noreply.github.com>
Summary
[P0]and[P1]findings in structured review/evidence comments as blocking/negative verdicts.Validation
/Users/armand/.pyenv/versions/3.11.11/bin/python -m pytest tests/cli/commands/test_review_queue.py -k "quorum or evidence or reviewer or dogfood or dissent or blocking"ruff check aragora/cli/commands/review_queue_comment_verdicts.py tests/cli/commands/test_review_queue.pygit diff --checkbash scripts/automation_pr_preflight.sh origin/main HEADreview-queue evidence-lintprobes:[P0]and[P1]nowwould_count=false; positiveVerdict: PASSremainswould_count=trueScope
No #8461 comments, ready-state changes, settlements, workflow reruns, or merges were performed.