feat(automation): Tier-4 merge-train to serialize contention-surface PRs #8414
Adds scripts/tier4_merge_train.py, a single-lane serializer for PRs that
touch a Tier-4 merge-authority surface (scripts/settle_tier4_pr.py is the
worst offender). When many open PRs edit the same Tier-4 file, each must run
the full human-settlement dance, and each merge re-conflicts the others: the
fleet generates the very settlement toil the gate exists to prevent (a
six-way pile-up where each landing forces a rebase of all the rest).
The merge-train fixes the class:
- At most `cap` open PRs (default 1) may occupy a serialized surface; the
rest are queued in a deterministic order (oldest-first) behind the
head-of-train.
- `--check` returns exit 2 (queue) when a lane is full, so a pre-open hook
can refuse the (N+1)th PR; exit 0 when the lane has room.
- `--status` prints the current train per surface (HEAD + queued), so the
operator can land them head-first, rebasing each on the prior.
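The queueing rules above can be sketched as follows. This is a minimal illustration, not the PR's implementation: `build_train` is named in this PR, but its signature here, the `OpenPR` record, `check_exit_code`, `format_status`, and the status format are all assumptions. The sketch also assumes a lane counts as full once the number of open PRs on a surface reaches `cap`.

```python
from dataclasses import dataclass
from datetime import datetime

EXIT_OK, EXIT_QUEUE = 0, 2  # exit codes described for --check


@dataclass(frozen=True)
class OpenPR:
    """Hypothetical record for one open PR (not the PR's real type)."""
    number: int
    created_at: datetime
    paths: tuple  # file paths the PR touches


def build_train(prs, surface):
    """PRs touching `surface`, oldest first; ties broken by PR number."""
    riders = [pr for pr in prs if any(p.startswith(surface) for p in pr.paths)]
    return sorted(riders, key=lambda pr: (pr.created_at, pr.number))


def check_exit_code(prs, surface, cap=1):
    """Exit code a pre-open hook would see before opening one more PR."""
    return EXIT_QUEUE if len(build_train(prs, surface)) >= cap else EXIT_OK


def format_status(prs, surface, cap=1):
    """One status line per surface: the head-of-train, then the queue."""
    train = build_train(prs, surface)
    head = ", ".join(f"#{pr.number}" for pr in train[:cap]) or "(empty)"
    queued = ", ".join(f"#{pr.number}" for pr in train[cap:]) or "(none)"
    return f"{surface}: HEAD {head}; queued {queued}"
```

Sorting on `(created_at, number)` keeps the order deterministic even when two PRs share a creation timestamp, which is what lets every caller reconstruct the same train.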
Design:
- Pure core (evaluate_merge_train / build_train / serialized_paths_for) is
stdlib-only and unit-tested without a live gh; all GitHub I/O is at the
edges and App-token-routed via scripts/gh_app_env.py, so the check never
burns the operator PAT's GraphQL budget.
- The serialized-surface list mirrors review_queue.TIER_4_PREFIXES (the
canonical Tier-4 classifier). It is vendored to keep the module
dependency-light; a drift-guard test fails if the two diverge (and
confirms the vendored copy currently matches exactly).
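A rough sketch of the vendoring-plus-drift-guard idea, under stated assumptions: the prefix values below are placeholders (the real contents of `review_queue.TIER_4_PREFIXES` are not shown in this PR), and `VENDORED_PREFIXES`, `drift`, and the signature of `serialized_paths_for` are hypothetical.

```python
# Placeholder values only; the real vendored list mirrors
# review_queue.TIER_4_PREFIXES, whose contents are not reproduced here.
VENDORED_PREFIXES = (
    "scripts/settle_tier4_pr.py",
    "scripts/settle_one_pr.py",
)


def serialized_paths_for(changed_paths, prefixes=VENDORED_PREFIXES):
    """Subset of a PR's changed paths that fall on a serialized surface."""
    prefixes = tuple(prefixes)  # str.startswith accepts a tuple of prefixes
    return sorted({p for p in changed_paths if p.startswith(prefixes)})


def drift(vendored, canonical):
    """Entries present in exactly one list; an empty result means in sync."""
    return sorted(set(vendored) ^ set(canonical))


def assert_no_drift(canonical):
    """Drift guard. A real test would pass review_queue.TIER_4_PREFIXES."""
    diff = drift(VENDORED_PREFIXES, canonical)
    assert not diff, f"vendored Tier-4 prefixes drifted: {diff}"
```

Keeping the comparison in a plain function means the guard is the only place that needs to import the canonical classifier; the module itself stays free of that dependency.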
Integration:
- automation_pr_preflight.sh gains an opt-in block (ARAGORA_TIER4_MERGE_TRAIN=1,
default OFF) that runs the check before "preflight: ok". Best-effort: a
transport/tooling failure warns and does NOT block; only a clean queue
decision (exit 2) fails the preflight. Does not touch settle_tier4_pr.py,
settle_one_pr.py, or review_queue.py.
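The real integration is a shell block in automation_pr_preflight.sh; the Python sketch below only illustrates the best-effort decision rule described above (exit 2 blocks, exit 0 passes, anything else warns and passes). The function names, messages, and the 60-second timeout are assumptions made for illustration.

```python
import subprocess
import sys

EXIT_OK, EXIT_QUEUE = 0, 2


def preflight_verdict(returncode):
    """Map the check's exit code to (blocks_preflight, message)."""
    if returncode == EXIT_OK:
        return False, "merge-train: lane has room"
    if returncode == EXIT_QUEUE:
        return True, "merge-train: lane full, PR must queue"
    # Any other exit is treated as a tooling failure: warn, do not block.
    return False, f"merge-train: warning, check unavailable (exit {returncode})"


def run_check(argv, enabled, timeout=60):
    """Run the check only when opted in; never block on transport errors."""
    if not enabled:
        return False, ""  # default-off path: no output, no effect
    try:
        # List-form args, no shell, bounded runtime.
        proc = subprocess.run(argv, capture_output=True, timeout=timeout, check=False)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return False, f"merge-train: warning, check did not run ({exc})"
    return preflight_verdict(proc.returncode)
```

A caller would derive `enabled` from the environment, for example `os.environ.get("ARAGORA_TIER4_MERGE_TRAIN") == "1"`, so that an unset flag leaves the preflight untouched.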
Validation:
- pytest tests/scripts/test_tier4_merge_train.py -> 16 passed (incl. drift guard)
- ruff check + ruff format --check clean
- bash -n scripts/automation_pr_preflight.sh OK; default-off path unchanged
https://claude.ai/code/session_018jfJj5gb9VoLs6VBMnzhrP
Aragora Code Review
Advisory-only review. No issues found.
Claude independent model review
Reviewer: claude (anthropic), independent adversarial model review via the Aragora Claude reviewer, grounded on the exact PR head. Verdict: PASS. No blocking issues. Opt-in (
Security: subprocess invocations all use list-form args, no shell, timeouts set, env merge is
Regression risk: gated by env flag, default-off, non-fatal on tooling failure; safe to land.
dogfood: yes
Grok independent model review
Reviewer: grok (xai), independent adversarial model review via the Aragora Grok reviewer, grounded on the exact PR head. Verdict: PASS
dogfood: yes
Problem
When many open PRs edit the same Tier-4 merge-authority file, each must run the full human-settlement dance, and each merge re-conflicts the others. The fleet ends up generating the very settlement toil the gate exists to prevent. Live example from today: six concurrent open PRs all editing `scripts/settle_tier4_pr.py` (#8382, #8405, #8406, #8408→closed-dup, #8410, #8412). Every landing forces a rebase of all the rest, and the identical pattern produced a duplicate fix (#8408 vs #8412).
What this adds
`scripts/tier4_merge_train.py` is a single-lane serializer for PRs touching a Tier-4 surface:
- At most `cap` open PRs (default 1) may occupy a serialized surface; the rest are queued in a deterministic oldest-first order behind the head-of-train.
- `--check` → exit 2 when a lane is full (so a pre-open hook can refuse the (N+1)th PR), exit 0 when there's room.
- `--status` → prints the current train per surface (HEAD + queued), so PRs land head-first, each rebased on the prior.
Run against today's live cluster it reconstructs the train exactly:
Design
- Pure core (`evaluate_merge_train` / `build_train` / `serialized_paths_for`) is stdlib-only and unit-tested without a live `gh`. All GitHub I/O is at the edges (`fetch_open_prs`) and App-token-routed via `scripts/gh_app_env.py`, so the check never burns the operator PAT's GraphQL budget.
- The serialized-surface list mirrors `review_queue.TIER_4_PREFIXES` (the canonical Tier-4 classifier). It's vendored to keep the module dependency-light; a drift-guard test fails if the two diverge (and currently confirms they match exactly).
Integration (opt-in, default OFF)
`automation_pr_preflight.sh` gains a block gated on `ARAGORA_TIER4_MERGE_TRAIN=1` that runs the check before `preflight: ok`. Best-effort: a transport/tooling failure warns and does not block; only a clean queue decision (exit 2) fails the preflight. Mirrors the existing `ARAGORA_GH_APP_AUTH` / `ARAGORA_QUORUM_RECONCILER` opt-in pattern.
Does not touch `settle_tier4_pr.py`, `settle_one_pr.py`, or `review_queue.py`: no merge-authority surface is modified; this is a new standalone guard.
Validation
- `pytest tests/scripts/test_tier4_merge_train.py` → 16 passed (incl. the drift guard, which imported `review_queue.TIER_4_PREFIXES` and matched).
- `ruff check` + `ruff format --check` clean on both files.
- `bash -n scripts/automation_pr_preflight.sh` OK; default-off path emits zero merge-train lines (behavior unchanged until opted in).
Why this is the high-leverage fix
This addresses the class rather than any single PR landing: it would have prevented today's six-PR `settle_tier4_pr.py` pile-up and the #8408/#8412 duplicate. Tier classification: this PR itself touches only a new script + a Tier-2 automation shell script (`automation_pr_preflight.sh`), not a Tier-4 surface, so it should settle normally.
https://claude.ai/code/session_018jfJj5gb9VoLs6VBMnzhrP
Generated by Claude Code