Skip to content

fix(tier4): add REST fallback for settlement check#8406

Open
an0mium wants to merge 28 commits into
mainfrom
codex/settle-tier4-rest-fallback-20260614
Open

fix(tier4): add REST fallback for settlement check#8406
an0mium wants to merge 28 commits into
mainfrom
codex/settle-tier4-rest-fallback-20260614

Conversation

@an0mium

@an0mium an0mium commented Jun 14, 2026

Copy link
Copy Markdown
Collaborator

Summary

  • add REST fallback for settle_tier4_pr.py --check when GraphQL-backed gh pr view/checks are transport-blocked
  • preserve Tier 4 fail-closed required-check gating through branch protection plus exact-head check-runs/statuses
  • carry authorAssociation through REST-normalized comments/reviews so operator settlement comments remain auditable

Validation

  • python3 -m pytest tests/scripts/test_settle_tier4_pr.py -q
  • python3 -m ruff check scripts/settle_tier4_pr.py aragora/cli/commands/review_queue_rest_fallback.py tests/scripts/test_settle_tier4_pr.py
  • git diff --check
  • bash scripts/automation_pr_preflight.sh origin/main HEAD
  • python3 scripts/settle_tier4_pr.py --check --pr 8395 --head b205c57 --repo synaptent/aragora --cwd /private/tmp/aragora-settle-tier4-rest-fallback-20260614T0430Z --trusted-operator-login scarmani --trusted-operator-login an0mium --json

Draft because this is governance tooling and should receive independent evidence before ready/merge.

@an0mium

an0mium commented Jun 14, 2026

Copy link
Copy Markdown
Collaborator Author

Claude independent model review

PR: #8406
Exact head SHA: d4c573e
Committed date: 2026-06-14T01:06:27-05:00
Reviewer harness: claude-code CLI
Model family: claude
Model id: claude-opus-4-7[1m]
Receipt artifact: claude-cli-session:8dea0498-3e30-40a1-b8ed-a35af64c4a2f
Files reviewed: aragora/cli/commands/review_queue_rest_fallback.py; scripts/settle_tier4_pr.py; tests/scripts/test_settle_tier4_pr.py
dogfood: yes

Findings-first verdict: SUPPORTIVE / APPROVE. I found no P1/P2 correctness issues, no auth bypass, no transport-error swallowing, and no blocking test gaps in the current-head diff.

Review notes:

  • REST fallback now fails closed for strict branch protection by surfacing required contexts as STRICT_BASE_REQUIRED instead of silently returning an empty required-check list.
  • Primary gh pr view and gh pr checks calls now pass --repo, avoiding cwd-derived repository ambiguity before fallback.
  • Numeric not_ready allowance is split from true Tier 4 human-settlement authorization. The draft-only diagnostic allowance still leaves the draft blocker in place, and the non-draft repair_or_wait case remains unexpected.
  • REST check-run rollup state now uses the canonical run-state normalizer.

Validation reviewed: the branch reports focused tests passed for tests/scripts/test_settle_tier4_pr.py, ruff passed on touched files, git diff --check passed, automation_pr_preflight.sh origin/main HEAD passed, pre-commit/pre-push hooks passed, and live PR #8406 shows required checks green at this head. I did not rerun CI or aragora-merge-quorum.

This is independent model-review evidence only. It is not Tier 4 settlement, not branch-protection authorization, and not merge authorization.

…st-fallback-20260614

# Conflicts:
#	scripts/settle_tier4_pr.py
@scarmani

Copy link
Copy Markdown
Collaborator

Claude independent model review

Reviewer: claude (anthropic) - independent adversarial model review via the Aragora Claude reviewer, grounded on the exact PR head.
Head: 6afec58 (6afec58), committed 2026-06-14T10:01:58Z.
PR: #8406.
Model family: claude
Model id: claude-opus-4-7[1m]
Receipt artifact: collect_quorum_evidence.py dry-run for PR #8406 at 6afec58 in Codex Desktop evidence pass
Files reviewed: scripts/settle_tier4_pr.py; aragora/cli/commands/review_queue_rest_fallback.py; tests/scripts/test_settle_tier4_pr.py
dogfood: yes

Verdict: PASS

  • [P3] scripts/settle_tier4_pr.py:1069 - STRICT_BASE_REQUIRED state name is misleading. It fires whenever strict is set AND either check-runs OR statuses fetch failed; it is really a fail-closed marker for "couldn't verify required checks under strict protection," not a literal "base branch behind" signal. Consider renaming to e.g. REQUIRED_FETCH_FAILED_STRICT or splitting the two failure modes so operators can disambiguate from the blocker text.
  • [P3] scripts/settle_tier4_pr.py:530 - _numeric_not_ready_allowed widens the allowed_not_ready set whenever the entry has requires_human_risk_settlement + requires_human_preapproval + repair_or_wait + draft, independent of whether the packet's canonical human_risk_settlement_required list contains the PR. The draft check downstream re-blocks today (test confirms), but the trust boundary is now "any entry that self-attests these four fields while draft" rather than "the packet says so". If a future code path consumes the same widened allowed_not_ready outside of this gate, this could become an authorization gap. Worth a one-line comment pinning the invariant.
  • [P3] scripts/settle_tier4_pr.py:541 - entry.get("requires_human_risk_settlement") etc. are truthy-checked via bool(...), so string "false" or "no" would still authorize. Low risk given producers are internal, but tightening to is True would be safer.

No blocking correctness, security, or regression issues identified. Tests cover both the draft-allowed diagnostic and non-draft-rejected paths, plus strict-protection fallback positive/negative cases.

Validation notes: I reviewed the current-head repair and existing validation reported for this head. I did not rerun CI, quorum, mark-ready, settlement, or merge.

This is independent model-review evidence only. It is not Tier 4 settlement, not Tier 4 preapproval, not mark-ready authorization, and not merge authorization.

@scarmani

Copy link
Copy Markdown
Collaborator

Grok independent model review

Reviewer: grok (xai) - independent adversarial model review via the Aragora Grok reviewer, grounded on the exact PR head.
Head: 6afec58 (6afec58), committed 2026-06-14T10:01:58Z.
PR: #8406.
Model family: grok
Receipt artifact: collect_quorum_evidence.py dry-run for PR #8406 at 6afec58 in Codex Desktop evidence pass
Files reviewed: current PR #8406 diff for scripts/settle_tier4_pr.py; aragora/cli/commands/review_queue_rest_fallback.py; tests/scripts/test_settle_tier4_pr.py
dogfood: yes

Verdict: PASS

  • No blocking issues found. Logic for draft numeric not_ready allowance and strict-protection REST fallback is narrowly scoped, well-tested, and preserves prior safety invariants.

Validation notes: I did not rerun CI, quorum, mark-ready, settlement, or merge in this evidence pass.

This is independent model-review evidence only. It is not Tier 4 settlement, not Tier 4 preapproval, not mark-ready authorization, and not merge authorization.

@scarmani scarmani marked this pull request as ready for review June 14, 2026 19:20
@scarmani scarmani self-requested a review as a code owner June 14, 2026 19:20
@github-actions

github-actions Bot commented Jun 14, 2026

Copy link
Copy Markdown
Contributor

Aragora Code Review

Advisory-only review. No issues found.

@scarmani

Copy link
Copy Markdown
Collaborator

Tier-4 Human Settlement Authorization

PR: #8406
Exact head: 6afec58
Authorized action: admin_squash_merge and branch_protection_reconcile, only if #8406 is non-draft and live exact-head checks/merge-packet remain otherwise green.

Human-risk settlement: I accept the Tier 4 risk for this PR.

Treat the exact-head Tier 4 settlement comment plus trusted human-settlement status as the merge-packet preapproval artifact so settle_tier4_pr.py and merge-packet agree.

Co-authored-by: codex[bot] <codex[bot]@users.noreply.github.com>
@scarmani

Copy link
Copy Markdown
Collaborator

Claude independent model review

Reviewer: claude (anthropic) - independent adversarial model review via the Aragora Claude reviewer, grounded on the exact PR head.
PR: #8406
Exact head SHA: 6ee264e
Committed date: 2026-06-15T02:21:41Z
Model family: claude
Receipt artifact: collect_quorum_evidence.py dry-run for PR #8406 at 6ee264e in Codex Desktop evidence pass
Files reviewed: aragora/cli/commands/review_queue.py; aragora/cli/commands/review_queue_rest_fallback.py; scripts/settle_tier4_pr.py; tests/scripts/test_settle_tier4_pr.py
dogfood: yes
Not merge authorization.

Findings-first verdict: SUPPORTIVE / PASS. Claude reported no blocking issues and no CHANGES-REQUESTED verdict. The reviewer included one P2-labeled policy-risk observation and explicitly framed it as non-blocking; the observation is preserved below.

Reviewer observations:

  • [P2] aragora/cli/commands/review_queue.py:3083-3110 - Mutating the input human_risk_settlement_recorded to True based purely on repo-visible status + creator pin is a deliberate policy loosening (previously required both local receipt AND repo signals; now repo signals + creator-verification alone suffice). Defensible because _human_settlement_status_creator_verified is the real security boundary in both paths, but worth confirming no downstream consumer treats human_risk_settlement_recorded=True as "local receipt was produced" rather than "trusted creator posted the status." If any downstream caller outside this hunk uses that flag as a stronger attestation than "repo-visible exact-head," it will now see the weaker form.
  • [P3] scripts/settle_tier4_pr.py:1069-1075 - STRICT_BASE_REQUIRED label is misleading: the early-return fires when protection.strict=True and there is a check-run or status-read error. The actual root cause for the operator is "we could not fetch check state under strict mode," not "base is behind." Fail-closed behavior is correct, but the gate message may confuse operators into rebasing when the right action is to retry or unblock GitHub API.
  • [P3] _numeric_not_ready_allowed in scripts/settle_tier4_pr.py:530-544 - Logic is correct and the new tests cover both draft-allowed and non-draft-unexpected paths. The draft-only carve-out cleanly demotes the message from "unexpected blockers" to "PR is draft" without changing gate verdict.
  • [P3] tests/scripts/test_settle_tier4_pr.py strict test test_rest_fallback_reports_strict_branch_protection_required_contexts does not mock the commits/{head}/check-runs endpoint and relies on upstream code to short-circuit before reaching it. Minor robustness nit only.
  • No security regression versus the old path: the trust anchor was already _trusted_settlement_creator() / creator-pin verification; this change removes a redundant local-receipt belt while keeping creator verification.
  • Tests adequately cover the new repo-visible-clears-preapproval path, the draft diagnostic, and the strict-protection fail-closed branch.

an0mium commented Jun 15, 2026

Copy link
Copy Markdown
Collaborator Author

Flagging the review_queue.py preapproval hunk for reconsideration (governance / merge-authority).

This PR resolves the real settle_tier4_pr.py --check ↔ merge-packet divergence by loosening merge-packet: the repo-visible settlement comment + aragora/human-settlement status pair now clears human_preapproval_recorded on their own, dropping the local-receipt requirement.

The concern: review_queue.py:_has_recorded_human_risk_settlement documents the local receipt's purpose explicitly — "This path is read-only and exact head bound; it must not infer approval from GitHub comments alone." The repo-visible comment + status can both be posted by automation acting as a trusted creator (the settle helper's --settle-only does exactly this). The local receipt is deliberately the one signal automation cannot forge from GitHub alone. Dropping it lets a receipt-less automated settlement self-clear Tier-4 — the conflict-of-interest the human-preapproval gate exists to prevent.

The safer resolution is the mirror image: raise the lax tool to the strict requirement rather than lowering the strict tool to the lax one. I've implemented that in #8382 (commit 2) — settle_tier4_pr.py --check now refuses when human_preapproval_recorded=false, pointing operators at the existing atomic writer (review-queue record-settlement --action approve --post-github-status, which writes receipt + status together). The two tools then agree on the strict contract, and the receipt stays intact as the human gate.

Recommend dropping this PR's review_queue.py preapproval hunk while keeping its genuinely-good parts — the REST fallback for transport-blocked checks and the STRICT_BASE_REQUIRED fail-closed branch-protection handling are both real hardening and worth landing on their own.


Generated by Claude Code

@scarmani

Copy link
Copy Markdown
Collaborator

Claude independent model review

Reviewer: claude (anthropic) — independent adversarial model review via the Aragora Claude reviewer, grounded on the exact PR head.
Head: c990c36 (c990c36), committed 2026-06-15T15:03:17Z.
PR: #8406.
Model family: claude

Verdict: PASS

Reviewed the diff against PR #8406 at head c990c36. The changes extract Tier-4 settlement trust helpers into a new module (review_queue_tier4_settlement.py) and tighten the trust model with three meaningful security improvements: (1) live /collaborators/{login}/permission admin check on top of GitHub authorAssociation, (2) target_url binding between the aragora/human-settlement status and the trusted comment URL, (3) comment freshness anchored to the head commit's committedDate. Fail-closed defaults are consistent throughout.

No blocking issues. Non-blocking observations:

  • [P3] aragora/cli/commands/review_queue.py:3146human_preapproval_recorded no longer requires the local human_risk_settlement_recorded receipt; repo-visible markers (status + trusted comment + admin pin + target_url binding) now suffice. This is a deliberate trust-model shift toward verifiable GitHub state and is well-tested (test_tier_four_repo_visible_helper_settlement_clears_preapproval), but worth flagging since it weakens the previous "both" requirement. Net security looks improved given the new admin-permission and target_url checks.
  • [P3] aragora/cli/commands/review_queue_tier4_settlement.py:225 and aragora/cli/commands/review_queue.py:_head_committed_at_from_pr — two near-duplicate _head_committed_at_from_pr implementations with slightly different fallback logic (the new one also tries commit.committer.date). Code duplication risk; consider consolidating.
  • [P3] tests/cli/test_cli_training.py — unrelated mock-path migration (aragora.cli.training.Xaragora.training.exporters.X) mixed into this PR. Doesn't affect correctness but makes review noisier.
  • [P3] aragora/cli/commands/review_queue_tier4_settlement.py:_trusted_comment_author_verifiedrepo_slug is interpolated into the API path without URL encoding, while login is quoted. Repo slugs in practice are safe alphanumerics/hyphens, so not exploitable, but inconsistent with the login handling.
  • [P3] scripts/settle_tier4_pr.py:_numeric_not_ready_allowed — adds a draft-only diagnostic path that lets evaluation proceed past the numeric "not_ready" check; the draft check downstream still blocks (confirmed by test_draft_tier4_preapproval_numeric_not_ready_is_diagnostic_only), but adds a new conditional surface to maintain.

dogfood: yes

@scarmani

Copy link
Copy Markdown
Collaborator

OpenAI independent model review

Reviewer: openai (openai) — independent adversarial model review via Codex CLI OpenAI harness, grounded on the exact PR head.
Head: c990c36 (c990c36), committed 2026-06-15T15:03:17Z.
PR: #8406.
Model family: openai

Verdict: PASS

  • No blocking correctness, security, or regression issues found in the reviewed Tier 4 settlement changes.
  • Residual risk: the latest-comment selection depends on createdAt, not edit time, but that matches the existing freshness model and fails closed when required metadata is missing.

dogfood: yes

@scarmani

Copy link
Copy Markdown
Collaborator

Tier-4 Human Settlement Authorization

PR: #8406
Exact head: c990c36
Authorized action: admin_squash_merge and branch_protection_reconcile, only if #8406 is non-draft and live exact-head checks/merge-packet remain otherwise green.

Human-risk settlement: I accept the Tier 4 risk for this PR.

@scarmani

Copy link
Copy Markdown
Collaborator

Claude independent model review

Reviewer: claude (anthropic) — independent adversarial model review via the Aragora Claude reviewer, grounded on the exact PR head.
Head: 60abefa (60abefa), committed 2026-06-15T21:46:45Z.
PR: #8406.
Model family: claude

Verdict: PASS

No blocking issues. Several P2/P3 observations worth recording before merge:

  • [P2] aragora/cli/commands/review_queue.py _build_model_review_quorum: the new repo-visible-only path lets human_preapproval_recorded become True without a local receipt (local_human_risk_settlement_recorded no longer a precondition). The safety net is now entirely the trusted-creator + admin-permission + status↔comment target_url binding. That binding is solid as long as the ARAGORA_SETTLEMENT_CREATOR env var stays trustworthy — anyone able to set that env var on a runner can pin a different "trusted" creator. Worth re-affirming in the operator runbook that this env override is itself a Tier-4 secret.
  • [P2] scripts/settle_tier4_pr.py _status_signal_items: iteration order swapped to ("commitStatuses", "statusCheckRollup"). First-match-wins, so if both lists carry the same context with different states/creators, commitStatuses now wins (previously statusCheckRollup). Intentional (only commitStatuses carries creator/target_url), but it's a subtle behavior change that affects every other caller of _human_settlement_status_is_success — fine here, just flagging for awareness.
  • [P3] review_queue_tier4_settlement.py _human_settlement_status_creator_verified: returns on the FIRST aragora/human-settlement status encountered (GitHub's /statuses?per_page=100 returns newest first, so this is the newest — good). If per_page=100 is ever exceeded by churn on the head SHA, an older non-matching status could be missed. Realistic risk is near-zero (>100 statuses on one SHA would be pathological).
  • [P3] tests/cli/test_cli_training.py: drive-by patch-target fixups unrelated to Tier-4. Patches now target the modules aragora.cli.training imports lazily at call time (from aragora.training.exporters import … inside each function), so the new targets are the correct ones — but it's a bundled-in cleanup with no commit-level traceability to the PR's stated intent.
  • [P3] _post_human_settlement_status correctly fails closed when target_url is empty; chain of trust through _trusted_recorded_settlement_comment_url (re-verifies live head before posting) is good.

Tests added cover the new binding (target_url mismatch → block, creator mismatch → block, bound-and-trusted → authorize, draft preapproval → diagnostic-only) — coverage looks adequate for the security-critical paths introduced.

dogfood: yes

@scarmani

Copy link
Copy Markdown
Collaborator

OpenAI independent model review

Reviewer: openai (openai) — independent adversarial model review via Codex CLI OpenAI harness, grounded on the exact PR head.
Head: 60abefa (60abefa), committed 2026-06-15T21:46:45Z.
PR: #8406.
Model family: openai

Verdict: PASS

  • No blocking correctness, security, or regression issues found in the supplied diff.
  • The Tier 4 settlement changes now fail closed on missing/untrusted comment binding, stale head, mismatched status target URL, and untrusted status creator, which addresses the main bypass surfaces I checked.

dogfood: yes

@scarmani

Copy link
Copy Markdown
Collaborator

Tier-4 Human Settlement Authorization

PR: #8406
Exact head: 60abefa
Authorized action: admin_squash_merge and branch_protection_reconcile, only if #8406 is non-draft and live exact-head checks/merge-packet remain otherwise green.

Human-risk settlement: I accept the Tier 4 risk for this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants