fix(review-queue): block p0 p1 evidence dissent by scarmani · Pull Request #8467 · synaptent/aragora

scarmani · 2026-06-16T03:03:14Z

Summary

Treat standalone [P0] and [P1] findings in structured review/evidence comments as blocking/negative verdicts.
Prevent those comments from counting as supportive reviewer signals or focused dogfood evidence.
Add evidence-lint and merge-quorum regressions while preserving positive PASS evidence parsing.

Validation

/Users/armand/.pyenv/versions/3.11.11/bin/python -m pytest tests/cli/commands/test_review_queue.py -k "quorum or evidence or reviewer or dogfood or dissent or blocking"
ruff check aragora/cli/commands/review_queue_comment_verdicts.py tests/cli/commands/test_review_queue.py
git diff --check
bash scripts/automation_pr_preflight.sh origin/main HEAD
review-queue evidence-lint probes: [P0] and [P1] now would_count=false; positive Verdict: PASS remains would_count=true

Scope

No #8461 comments, ready-state changes, settlements, workflow reruns, or merges were performed.

scarmani · 2026-06-16T03:51:04Z

Codex independent semantic review on head `add383e`

Reviewer harness: codex
Model family: openai
Model id: GPT-5 Codex
Receipt artifact: /private/tmp/aragora-8467-evidence-20260616T034501Z/codex-openai-review.md

Verdict: PASS

Blocking findings: none.

I reviewed only the #8467 exact-head diff against current origin/main. The change is narrowly scoped to the review-queue comment verdict helper plus focused review-queue tests. The implementation reuses the existing decoration stripper before checking high-priority finding markers, so decorated priority findings are rejected consistently with the parser's existing handling for headings, blockquotes, bullets, and numbered lists.

Validation performed:

pytest tests/cli/commands/test_review_queue.py -k 'quorum or evidence or reviewer or dogfood or dissent or blocking': 139 passed, 137 deselected.
ruff check aragora/cli/commands/review_queue_comment_verdicts.py tests/cli/commands/test_review_queue.py: passed.
git diff --check origin/main...HEAD: passed.

Focused adversarial dogfood: I exercised the exact dissent-laundering surface with plain, bullet, numbered, blockquote, and heading high-priority marker forms plus an explicit changes-requested verdict. Every dissent probe returned would_count=false, and a clean positive PASS probe still returned would_count=true.

scarmani · 2026-06-16T03:51:04Z

Claude independent semantic review on head `add383e`

Reviewer harness: claude
Model family: claude
Model id: Claude Code 2.1.178
Receipt artifact: /private/tmp/aragora-8467-evidence-20260616T034501Z/claude-review.md

Verdict: PASS

Blocking findings: none.

Claude Code reviewed only the #8467 exact-head diff against current origin/main and found the implementation minimal, scoped, and correctly routed through the existing decoration-stripping helper before high-priority marker detection. The review confirmed that plain, bullet, numbered, blockquote, heading, and bold marker forms route to blocking/non-counting evidence, while positive evidence without those dissent markers remains countable.

Focused adversarial dogfood: the independent review checked the P0/P1 and changes-requested dissent paths, quorum-level non-counting behavior for dissenting comments, and positive-evidence preservation for clean PASS comments.

github-actions · 2026-06-16T04:14:41Z

Aragora Code Review

Advisory-only review. No issues found.

Forces off_limits=LEAVE on any PR touching merge-authority surfaces (review_queue*, swarm/quorum_evidence, review/evidence, settle_*/settlement, .github/workflows/aragora-merge-quorum*) regardless of tier/quorum, so the drain can never self-modify the gate. Surfaced after the first pass merged an evidence-parser change (#8467). +2 tests; file-paths fetched for classification. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…n default) (#8471) * feat(swarm): scripts/boss_drain_pass.py — runnable bounded drain pass (dry-run default) Wires the tested drain core (boss_drain/drain_pass/drain_policy) to real gh I/O. --dry-run (DEFAULT) prints the plan and executes nothing; --apply acts. MERGE re-confirms authorization with settle_one_pr at apply time (gate stays sole authority; tier-4/over-tier never auto-merge); CLOSE only on empty; off-limits (structex/, claude/fusion-, pinned PRs) LEFT untouched; REPAIR is label-only (drain-repair) in v1 and bounded by the pass caps. Lets the operator/boss loop drain the backlog over-cap instead of idling — no boss_loop.py edit required. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(swarm): drain merge-authority guard — never auto-merge gate logic Forces off_limits=LEAVE on any PR touching merge-authority surfaces (review_queue*, swarm/quorum_evidence, review/evidence, settle_*/settlement, .github/workflows/aragora-merge-quorum*) regardless of tier/quorum, so the drain can never self-modify the gate. Surfaced after the first pass merged an evidence-parser change (#8467). +2 tests; file-paths fetched for classification. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(swarm): REPAIR auto-dispatch v2 (scope-locked, double-flag-gated) Turns REPAIR from label-only into a bounded worker dispatch that fixes a red-but-useful PR's failing required checks on its own branch. Safety: - make_repair_order/make_repair_prompt (pure, tested): scope-locked directive — work only on the PR branch, minimal diff, NEVER touch gate-logic surfaces, bounded effort (STOP rather than thrash). - Dispatch requires BOTH --apply AND --enable-repair-dispatch; without the explicit enable flag it only prints the plan (autonomous repair can never start by accident). Bounded by drain-pass caps (<=2/pass). Isolated worktree on the PR branch; one failure never aborts the pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(swarm): wire drain into the boss cycle (over-cap -> drain, not idle) When the backlog gate reports `shepherd` (open PRs >= cap), the cycle now runs one bounded drain pass instead of only idling/manufacturing work — merge fully-green PRs through the settle gate, close empty ones, PLAN repairs. Repair is never auto-dispatched here (no --enable-repair-dispatch), so it only prints the bounded plan. Default OFF (ARAGORA_DRAIN_ENABLED=1 to turn on); dry-run unless ARAGORA_DRAIN_APPLY=1; off-limits structex/ + claude/. Best-effort — never fails the cycle. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(swarm): address quorum-review blockers on drain repair-dispatch Claude's merge-quorum review requested changes; fixed: - dispatch_repair: push the worktree to the PR branch ONLY if the agent run succeeded (run.returncode == 0), and report success only if the push landed. A broken / scope-violating / timed-out worker no longer propagates its leftovers to the branch. - build_candidates: restore the cheap pre-filter — off-limits-by-id/prefix, owned, and superseded PRs route to LEAVE/CLOSE without a `gh pr view`. Only path-authority candidates pay for the detail fetch (was +300 gh calls/pass at queue size 60-300 — every structex/* and claude/* branch). - dispatch_repair: shutil.rmtree backstop in finally so a failed `worktree add` doesn't leak the mkdtemp dir. - tests: off_limits/owned -> LEAVE regardless of has_changes (proves a repair worker can never spawn on another fleet's branch); cheap pre-filter skips the detail fetch for off-limits-by-prefix PRs. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…red authority (#8471 follow-ups) (#8473) * feat(swarm): scripts/boss_drain_pass.py — runnable bounded drain pass (dry-run default) Wires the tested drain core (boss_drain/drain_pass/drain_policy) to real gh I/O. --dry-run (DEFAULT) prints the plan and executes nothing; --apply acts. MERGE re-confirms authorization with settle_one_pr at apply time (gate stays sole authority; tier-4/over-tier never auto-merge); CLOSE only on empty; off-limits (structex/, claude/fusion-, pinned PRs) LEFT untouched; REPAIR is label-only (drain-repair) in v1 and bounded by the pass caps. Lets the operator/boss loop drain the backlog over-cap instead of idling — no boss_loop.py edit required. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(swarm): drain merge-authority guard — never auto-merge gate logic Forces off_limits=LEAVE on any PR touching merge-authority surfaces (review_queue*, swarm/quorum_evidence, review/evidence, settle_*/settlement, .github/workflows/aragora-merge-quorum*) regardless of tier/quorum, so the drain can never self-modify the gate. Surfaced after the first pass merged an evidence-parser change (#8467). +2 tests; file-paths fetched for classification. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(swarm): REPAIR auto-dispatch v2 (scope-locked, double-flag-gated) Turns REPAIR from label-only into a bounded worker dispatch that fixes a red-but-useful PR's failing required checks on its own branch. Safety: - make_repair_order/make_repair_prompt (pure, tested): scope-locked directive — work only on the PR branch, minimal diff, NEVER touch gate-logic surfaces, bounded effort (STOP rather than thrash). - Dispatch requires BOTH --apply AND --enable-repair-dispatch; without the explicit enable flag it only prints the plan (autonomous repair can never start by accident). Bounded by drain-pass caps (<=2/pass). Isolated worktree on the PR branch; one failure never aborts the pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(swarm): wire drain into the boss cycle (over-cap -> drain, not idle) When the backlog gate reports `shepherd` (open PRs >= cap), the cycle now runs one bounded drain pass instead of only idling/manufacturing work — merge fully-green PRs through the settle gate, close empty ones, PLAN repairs. Repair is never auto-dispatched here (no --enable-repair-dispatch), so it only prints the bounded plan. Default OFF (ARAGORA_DRAIN_ENABLED=1 to turn on); dry-run unless ARAGORA_DRAIN_APPLY=1; off-limits structex/ + claude/. Best-effort — never fails the cycle. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(swarm): harden boss-drain repair-apply, cheap pre-filter, anchored authority (PR #8471 follow-ups) Quorum-reviewer follow-ups to the drain stack, separate from the landing PR: P2 — correctness: - dispatch_repair (boss_drain_pass): the apply path ran the agent then did an unconditional `git push` and returned True on the AGENT's returncode, ignoring whether anything was committed or the push succeeded. Now require a new commit beyond the PR's remote tip, assert push returncode==0, and confirm the remote tip actually advanced to the new commit — return False otherwise. - build_candidates (boss_drain): restore the cheap pre-filter. Off-limits-by- id/prefix, owned, and explicitly-superseded PRs now classify from the list view alone (no per-PR `gh pr view`); only survivors fetch detail for the merge- authority probe + changedFiles. Avoids up to +300 gh calls/pass at --list-limit 300. P3 — hardening: - dispatch_repair worktree leak: worktree now lives at a fresh non-existent subpath of an mkdtemp base; finally adds a shutil.rmtree(base) backstop so a failed `git worktree add` (never registered, so `worktree remove` can't clean it) no longer leaks the temp dir. - _proxy_authorized: collapse-by-last-write over duplicate check names replaced with any-SUCCESS semantics, so a stale/later pending re-run row can't mask an earlier green (ordering-independent). settle_one_pr still re-confirms before merge. - MERGE_AUTHORITY_PATTERNS anchored to known path segments; bare "settlement" removed (it fenced unrelated modules like debate/settlement.py, markets/ credit_settlement.py). Real gate surfaces stay fenced. - The 5 required-check names single-sourced as REQUIRED_CHECK_NAMES in boss_drain; the repair prompt and boss_drain_pass._REQUIRED both derive from it (no drift). TDD: tests/swarm/test_boss_drain.py extended (+11 cases). mypy clean; pre-commit clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Co-authored-by: codex[bot] <codex[bot]@users.noreply.github.com>

scarmani added 2 commits June 15, 2026 22:01

fix(review-queue): block p0 p1 evidence dissent

dd65758

fix(review-queue): block decorated priority dissent

add383e

scarmani marked this pull request as ready for review June 16, 2026 04:07

scarmani requested a review from an0mium as a code owner June 16, 2026 04:07

scarmani merged commit 1c291f7 into main Jun 16, 2026
157 of 161 checks passed

scarmani deleted the codex/review-queue-p1-dissent branch June 16, 2026 04:20

This was referenced Jun 16, 2026

[automation] Stage-Gate Conductor Log #7162

Open

[automation] Founder Brief #6106

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(review-queue): block p0 p1 evidence dissent#8467

fix(review-queue): block p0 p1 evidence dissent#8467
scarmani merged 2 commits into
mainfrom
codex/review-queue-p1-dissent

scarmani commented Jun 16, 2026

Uh oh!

scarmani commented Jun 16, 2026

Uh oh!

scarmani commented Jun 16, 2026

Uh oh!

github-actions Bot commented Jun 16, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

scarmani commented Jun 16, 2026

Summary

Validation

Scope

Uh oh!

scarmani commented Jun 16, 2026

Codex independent semantic review on head add383e

Uh oh!

scarmani commented Jun 16, 2026

Claude independent semantic review on head add383e

Uh oh!

github-actions Bot commented Jun 16, 2026

Aragora Code Review

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Codex independent semantic review on head `add383e`

Claude independent semantic review on head `add383e`