
feat(automation): Tier-4 merge-train to serialize contention-surface PRs #8414

Merged: an0mium merged 1 commit into main from claude/tier4-merge-train on Jun 14, 2026
@an0mium (Collaborator) commented Jun 14, 2026

Problem

When many open PRs edit the same Tier-4 merge-authority file, each must run the full human-settlement dance, and every merge re-conflicts the others. The fleet ends up generating the very settlement toil the gate exists to prevent. Live example from today: six concurrent open PRs all editing scripts/settle_tier4_pr.py (#8382, #8405, #8406, #8408→closed-dup, #8410, #8412). Every landing forces a rebase of all the rest, and the same pattern produced a duplicate fix (#8408 vs #8412).

What this adds

scripts/tier4_merge_train.py — a single-lane serializer for PRs touching a Tier-4 surface:

  • At most cap open PRs (default 1) may occupy a serialized surface; the rest are queued in a deterministic oldest-first order behind the head-of-train.
  • --check → exit 2 when a lane is full (so a pre-open hook can refuse the (N+1)th PR), exit 0 when there's room.
  • --status → prints the current train per surface (HEAD + queued), so PRs land head-first, each rebased on the prior.

Run against today's live cluster it reconstructs the train exactly:

merge-train: scripts/settle_tier4_pr.py (cap=1)
  [HEAD] PR #8405
  [queued #1] PR #8406
  [queued #2] PR #8412
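
The lane rules above (cap, oldest-first queue, exit 2 when full) can be sketched roughly as follows. This is a minimal illustration and not the code in scripts/tier4_merge_train.py: the `OpenPR` record, the function names, the exact-path match (the real module classifies by prefix), and the PR numbers and timestamps in the usage lines are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpenPR:
    """Hypothetical record; the real module's PR shape is not shown in this PR."""
    number: int
    created_at: str  # ISO-8601 UTC timestamp, so string order equals time order
    files: tuple


def build_train_sketch(prs, surface):
    """Return the PRs touching `surface`, oldest first (ties broken by number)."""
    touching = [pr for pr in prs if surface in pr.files]
    return sorted(touching, key=lambda pr: (pr.created_at, pr.number))


def check_exit_code(prs, surface, cap=1):
    """Exit 2 when the lane already holds `cap` open PRs, otherwise 0."""
    return 2 if len(build_train_sketch(prs, surface)) >= cap else 0


def format_status(prs, surface, cap=1):
    """Render the train in the same shape as the --status output above."""
    lines = [f"merge-train: {surface} (cap={cap})"]
    for index, pr in enumerate(build_train_sketch(prs, surface)):
        label = "HEAD" if index < cap else f"queued #{index - cap + 1}"
        lines.append(f"  [{label}] PR #{pr.number}")
    return "\n".join(lines)


# Invented PRs for illustration only.
example_prs = [
    OpenPR(103, "2026-01-03T00:00:00Z", ("a.py",)),
    OpenPR(101, "2026-01-01T00:00:00Z", ("a.py", "b.py")),
    OpenPR(102, "2026-01-02T00:00:00Z", ("b.py",)),
]
print(format_status(example_prs, "a.py"))
print(check_exit_code(example_prs, "a.py"))
```

Sorting on `(created_at, number)` keeps the order deterministic even when two PRs share a timestamp, which is the property a pre-open hook needs to give the same answer on every run.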

Design

  • Pure core (evaluate_merge_train / build_train / serialized_paths_for) is stdlib-only and unit-tested without a live gh. All GitHub I/O is at the edges (fetch_open_prs) and App-token-routed via scripts/gh_app_env.py, so the check never burns the operator PAT's GraphQL budget.
  • The serialized-surface list mirrors review_queue.TIER_4_PREFIXES (the canonical Tier-4 classifier). It's vendored to keep the module dependency-light; a drift-guard test fails if the two diverge (and currently confirms they match exactly).
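
A drift guard of the kind described in the second bullet might look roughly like this. It is a sketch under stated assumptions: both tuples are stand-ins (the actual contents of review_queue.TIER_4_PREFIXES are not shown in this PR), the second prefix is invented, and comparing as sets (ignoring order) is a choice made for the example.

```python
# Stand-ins for illustration; the real test imports review_queue.TIER_4_PREFIXES.
CANONICAL_PREFIXES = ("scripts/settle_tier4_pr.py", "scripts/example_authority/")
VENDORED_PREFIXES = ("scripts/settle_tier4_pr.py", "scripts/example_authority/")


def prefix_drift(vendored, canonical):
    """Report which prefixes the vendored copy is missing or has in excess."""
    vendored_set, canonical_set = set(vendored), set(canonical)
    return {
        "missing": sorted(canonical_set - vendored_set),
        "extra": sorted(vendored_set - canonical_set),
    }


def test_vendored_prefixes_match_canonical():
    """Fail with the concrete difference so the fix is obvious from the log."""
    drift = prefix_drift(VENDORED_PREFIXES, CANONICAL_PREFIXES)
    assert drift == {"missing": [], "extra": []}, drift
```

Reporting the missing and extra entries separately makes a failure actionable: the log names the prefix to add or remove instead of only saying the two lists differ.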

Integration (opt-in, default OFF)

automation_pr_preflight.sh gains a block gated on ARAGORA_TIER4_MERGE_TRAIN=1 that runs the check before preflight: ok. Best-effort: a transport/tooling failure warns and does not block; only a clean queue decision (exit 2) fails the preflight. Mirrors the existing ARAGORA_GH_APP_AUTH / ARAGORA_QUORUM_RECONCILER opt-in pattern.
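
The best-effort contract (only exit 2 blocks; any other failure warns) is implemented in shell in this PR. Purely as an illustration of the decision table, and not a transcription of that shell block, the same logic can be sketched in Python; the function names and the 60-second timeout are assumptions.

```python
import subprocess
import sys


def preflight_decision(returncode):
    """Map the check's exit code to a preflight outcome."""
    if returncode == 0:
        return "pass"   # lane has room
    if returncode == 2:
        return "block"  # clean queue decision: lane is full
    return "warn"       # anything else is treated as a tooling failure


def run_check(cmd, timeout=60):
    """Run the check; a failure to even run it degrades to a warning."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except (OSError, subprocess.SubprocessError):
        # Missing binary, timeout, and similar transport problems never block.
        return "warn"
    return preflight_decision(proc.returncode)


# A child process that exits 2 stands in for a full lane.
print(run_check([sys.executable, "-c", "import sys; sys.exit(2)"]))
```

Keeping the blocking condition to a single, explicit exit code means a crash in the guard itself (exit 1, a traceback, a timeout) cannot be mistaken for a queue decision.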

Does not touch settle_tier4_pr.py, settle_one_pr.py, or review_queue.py — no merge-authority surface is modified; this is a new standalone guard.

Validation

  • pytest tests/scripts/test_tier4_merge_train.py → 16 passed (incl. the drift guard, which imported review_queue.TIER_4_PREFIXES and matched).
  • ruff check + ruff format --check clean on both files.
  • bash -n scripts/automation_pr_preflight.sh OK; default-off path emits zero merge-train lines (behavior unchanged until opted in).

Why this is the high-leverage fix

This addresses the class — it would have prevented today's six-PR settle_tier4_pr.py pile-up and the #8408/#8412 duplicate — rather than any single PR landing. Tier classification: this PR itself touches only a new script + a Tier-2 automation shell (automation_pr_preflight.sh), not a Tier-4 surface, so it should settle normally.

https://claude.ai/code/session_018jfJj5gb9VoLs6VBMnzhrP


Generated by Claude Code

Adds scripts/tier4_merge_train.py — a single-lane serializer for PRs that
touch a Tier-4 merge-authority surface (scripts/settle_tier4_pr.py is the
worst offender). When many open PRs edit the same Tier-4 file, each must run
the full human-settlement dance AND every merge re-conflicts the others: the
fleet generates the very settlement toil the gate exists to prevent (a
six-way pile-up where each landing forces a rebase of all the rest).

The merge-train fixes the class:
  - At most `cap` open PRs (default 1) may occupy a serialized surface; the
    rest are queued in a deterministic order (oldest-first) behind the
    head-of-train.
  - `--check` returns exit 2 (queue) when a lane is full, so a pre-open hook
    can refuse the (N+1)th PR; exit 0 when the lane has room.
  - `--status` prints the current train per surface (HEAD + queued), so the
    operator can land them head-first, rebasing each on the prior.

Design:
  - Pure core (evaluate_merge_train / build_train / serialized_paths_for) is
    stdlib-only and unit-tested without a live gh; all GitHub I/O is at the
    edges and App-token-routed via scripts/gh_app_env.py, so the check never
    burns the operator PAT's GraphQL budget.
  - The serialized-surface list mirrors review_queue.TIER_4_PREFIXES (the
    canonical Tier-4 classifier). It is vendored to keep the module
    dependency-light; a drift-guard test fails if the two diverge (and
    confirms the vendored copy currently matches exactly).

Integration:
  - automation_pr_preflight.sh gains an opt-in block (ARAGORA_TIER4_MERGE_TRAIN=1,
    default OFF) that runs the check before "preflight: ok". Best-effort: a
    transport/tooling failure warns and does NOT block; only a clean queue
    decision (exit 2) fails the preflight. Does not touch settle_tier4_pr.py,
    settle_one_pr.py, or review_queue.py.

Validation:
  - pytest tests/scripts/test_tier4_merge_train.py -> 16 passed (incl. drift guard)
  - ruff check + ruff format --check clean
  - bash -n scripts/automation_pr_preflight.sh OK; default-off path unchanged

https://claude.ai/code/session_018jfJj5gb9VoLs6VBMnzhrP
@aragora-automation-fable (Bot) marked this pull request as ready for review June 14, 2026 09:37

@github-actions (Contributor)

Aragora Code Review

Advisory-only review. No issues found.

@scarmani (Collaborator)

Claude independent model review

Reviewer: claude (anthropic) — independent adversarial model review via the Aragora Claude reviewer, grounded on the exact PR head.
Head: cf3af9b (cf3af9b), committed 2026-06-14T09:10:03Z.
PR: #8414.
Model family: claude

Verdict: PASS

No blocking issues. Opt-in (ARAGORA_TIER4_MERGE_TRAIN=1 default off), best-effort (transport errors warn rather than block), pure-core well-tested, App-token-routed so no PAT burn. Concrete notes:

  • [P2] scripts/tier4_merge_train.py:fetch_open_prs: gh pr list --json files truncates large PR file lists (gh caps the files field at roughly 100 files per PR). A monster PR whose Tier-4 path sits beyond that cap will silently bypass the lane. Worth a comment/log note, and consider falling back to gh pr view <n> --json files for the candidate set if --candidate-pr is set.
  • [P3] scripts/automation_pr_preflight.sh:235: paste -sd, - followed by splitting --changed-files on , corrupts paths containing literal commas. Real risk is near-zero, but --changed-files could be made repeatable (already is in argparse) and the shell could call it multiple times instead of CSV.
  • [P3] tier4_merge_train.py:_pr_number returns None/0 fallback; combined with _pr_sort_key casting failure to 0, two malformed PR records would sort-tie. Not exploitable; just noting.
  • [P3] tier4_merge_train.py:_app_token_env swallows OSError/SubprocessError and degrades to ambient gh auth — that's the intended best-effort behavior, but a one-line stderr breadcrumb ("App token mint failed, using ambient gh") would make field debugging painless.
  • [P3] Drift guard tests/scripts/test_tier4_merge_train.py:test_serialized_prefixes_match_canonical_tier4 uses importorskip — fine, but in CI where the import should succeed the skip would hide drift. Consider making it hard-fail when running inside the repo's main test env (e.g. when ARAGORA_TEST_FULL=1).
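
To make the comma-in-path note concrete, the sketch below contrasts CSV splitting with a repeatable flag. The parser is a hypothetical reconstruction: the review says --changed-files is already repeatable in argparse, but the real definition is not shown, so `action="append"` is an assumption, as is the example path.

```python
import argparse


def parse_csv(value):
    """CSV-style splitting, as the shell's comma-joined argument would be read."""
    return value.split(",")


def build_parser():
    """Hypothetical parser; the real --changed-files definition may differ."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--changed-files", action="append", default=[])
    return parser


paths = ["docs/a,b.md", "scripts/settle_tier4_pr.py"]

# Joining on commas and splitting again breaks the first path in two.
csv_result = parse_csv(",".join(paths))

# Passing the flag once per path keeps each path intact.
argv = []
for path in paths:
    argv += ["--changed-files", path]
repeat_result = build_parser().parse_args(argv).changed_files

print(csv_result)
print(repeat_result)
```

With the repeated flag, no delimiter has to be chosen at all, so no path character can collide with it.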

Security: subprocess invocations all use list-form args, no shell, timeouts set, env merge is {**os.environ, **_app_token_env()} (token overlays cleanly, never logged). Token leakage path looks clean.
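
The env-overlay pattern mentioned in the security note can be illustrated as follows. This is a sketch only: `app_token_env_stub`, the variable name `EXAMPLE_TOKEN`, and its value are invented for the example and say nothing about what scripts/gh_app_env.py actually exports.

```python
import os
import subprocess
import sys


def app_token_env_stub():
    """Invented stand-in for the module's _app_token_env()."""
    return {"EXAMPLE_TOKEN": "overlay-value"}


def run_with_overlay(args, overlay, timeout=30):
    """List-form args, no shell, a timeout, and the overlay winning on key clashes."""
    env = {**os.environ, **overlay}
    return subprocess.run(args, env=env, capture_output=True, text=True, timeout=timeout)


# The child reports only whether it saw the overlaid value, never the value itself.
child_code = "import os; print(os.environ.get('EXAMPLE_TOKEN') == 'overlay-value')"
result = run_with_overlay([sys.executable, "-c", child_code], app_token_env_stub())
print(result.stdout.strip())
```

Because later keys win in a dict merge, the overlay replaces any ambient variable of the same name for the child process only, and the parent's own environment is left untouched.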

Regression risk: gated by env flag, default-off, non-fatal on tooling failure — safe to land.

dogfood: yes

@scarmani (Collaborator)

Grok independent model review

Reviewer: grok (xai) — independent adversarial model review via the Aragora Grok reviewer, grounded on the exact PR head.
Head: cf3af9b (cf3af9b), committed 2026-06-14T09:10:03Z.
PR: #8414.
Model family: grok

Verdict: PASS

  • No blocking issues. (opt-in guard + pure core + tests + App-token isolation all reduce risk; no command injection, drift, or cap/exclusion bugs observed)

dogfood: yes

@an0mium merged commit adbe8da into main Jun 14, 2026
152 of 154 checks passed
@an0mium deleted the claude/tier4-merge-train branch June 14, 2026 15:06
3 participants