diff --git a/.github/workflows/crane.md b/.github/workflows/crane.md index 8c229044..b325404b 100644 --- a/.github/workflows/crane.md +++ b/.github/workflows/crane.md @@ -180,7 +180,7 @@ The pre-step fetches open issues with the `crane-migration` label via the GitHub When a migration is issue-based, `/tmp/gh-aw/crane.json` includes: - **`selected_issue`**: The issue number (e.g., `42`) if the selected migration came from an issue, or `null` if it came from a file. - **`issue_migrations`**: A mapping of migration name -> issue number for all issue-based migrations found. -- **`stale_completed_state`**: Issue-based migrations whose issue still has `crane-migration` even though repo-memory says `Completed: true`. +- **`stale_completed_state`**: Issue-based migrations whose completion state is not trustworthy. This includes active issues that still have `crane-migration` even though repo-memory says `Completed: true`, and completed-label issues whose current PR-head completion gate did not pass. ### Reading Migrations @@ -194,7 +194,7 @@ The pre-step has already determined which migration to run. Read `/tmp/gh-aw/cra - **`selected_strategy`**: The `strategy` value from the migration's frontmatter -- one of `"in-place"`, `"greenfield"`, or `"auto"`. If `"auto"`, the agent must pick on the first iteration and write the chosen strategy back into the state file's Machine State table. - **`state_file_size_bytes`** / **`state_file_max_bytes`**: For rolling-compaction decisions (see [Update Rules](#update-rules)). - **`issue_migrations`**: A mapping of migration name -> issue number for all discovered issue-based migrations. -- **`stale_completed_state`**: A list of active issue-based migrations where repo-memory still says `Completed: true`; treat this as stale memory, not as permission to finish. +- **`stale_completed_state`**: A list of issue-based migrations where repo-memory still says `Completed: true`, but the issue label state or PR-head completion gate says the migration is not safely complete; treat this as stale memory, not as permission to finish. - **`deferred`**: Other migrations that were due but will be handled in future runs. - **`unconfigured`**: Migrations that still have the sentinel or placeholder content. - **`skipped`**: Migrations not due yet based on their per-migration schedule, or completed/paused. @@ -209,7 +209,8 @@ If `selected` is not null: 3. Read the current state of all source and target paths. 4. Read the state file `{selected}.md` from the repo-memory folder. This contains the Machine State table, the Migration Plan, lessons, blockers, and iteration history. 5. If `selected_issue` is not null, also read the issue comments for any human steering input. -6. If `selected` appears in `stale_completed_state`, ignore any pre-existing `Completed: true`, `Completed Reason`, or target-satisfying `best_metric` as a completion signal. First run the current verification contract and only re-complete after a fresh accepted iteration satisfies the current halting condition. +6. If `selected` appears in `stale_completed_state`, ignore any pre-existing `Completed: true`, `Completed Reason`, target-satisfying `best_metric`, or `crane-completed` label as a completion signal. Restore the issue to active migration state by ensuring `crane-migration` is present and `crane-completed` is absent unless the deterministic completion gate passes later in this run. First run the current verification contract and only enter the completion-candidate path after a fresh accepted iteration satisfies the target metric. +7. Before making code changes, inspect the state file's `Completion Candidate` field. If it is `true`, run the deterministic completion gate in [Halting Condition](#halting-condition). Only finalize completion if the current PR head checks pass. If the gate is failing, fix the failing PR checks instead of starting unrelated migration work. If the gate is pending or unavailable, leave the issue active and update `Completion Gate Status`. ## Multiple Migrations @@ -549,7 +550,7 @@ If `status == "failure"`, **fix and retry -- do not revert, do not accept**: - Update **[docs] Lessons Learned** if this iteration revealed something new (e.g. a bridging trick, a parity surprise, a perf trap). - Update **[scope] Future Work** if this iteration opened new threads. 5. **Update the migration issue**: edit the status comment and post a per-iteration comment using the same shared accepted iteration summary. -6. **Check halting condition** (see [Halting Condition](#halting-condition)): if `target-metric` is set, compare the new `best_metric` from this accepted iteration against it. For `higher` direction: completed when `best_metric >= target-metric`. Never mark completed from stored `best_metric` alone or a pre-existing `Completed: true`; completion requires the current accepted iteration's fresh verification result. When the target is met, mark the migration as completed. +6. **Check halting condition** (see [Halting Condition](#halting-condition)): if `target-metric` is set, compare the new `best_metric` from this accepted iteration against it. For `higher` direction, the target is reached when `best_metric >= target-metric`. Reaching the target metric does **not** complete the migration in this run. It creates a completion candidate: set `Completion Candidate: true`, set `Completion Gate: pr-head-checks`, set `Completion Gate Status: pending`, keep `Completed: false`, keep `Completed Reason: --`, and leave the `crane-migration` label on the issue. Completion is finalized only by a later run after the pushed PR head's deterministic checks are observed green. **If the score did not improve**: 1. Discard the code changes (do not commit them to the long-running branch). @@ -652,29 +653,46 @@ After **every iteration** (accepted, rejected, or error), post a **new comment** - For issue-based migrations, the source issue body IS the migration definition -- do not modify it (the user owns it). - For file-based migrations, the migration issue body is informational and may be lightly updated, but the migration file (`migration.md`) remains the source of truth. - The `crane-migration` label must remain on the issue for the migration to be discovered. When a migration completes, the label is removed and replaced with `crane-completed`. -- If an issue has `crane-migration` but repo-memory says `Completed: true`, the active label wins. Treat the completed state as stale until the current verification contract passes in a fresh accepted iteration. +- If an issue has `crane-migration` but repo-memory says `Completed: true`, the active label wins. Treat the completed state as stale until the current verification contract passes in a fresh accepted iteration and the deterministic completion gate passes on the pushed PR head. +- If an issue has `crane-completed` but the scheduler selected it in `stale_completed_state`, the completed label is stale. Restore `crane-migration`, remove `crane-completed`, and work the migration until the deterministic completion gate passes. - Closing the migration issue stops the migration from being discovered. Do NOT close the migration issue when the PR is merged -- the branch continues to accumulate future iterations until the target metric is reached. - Migration issues are labeled `[crane-migration, automation, crane]`. ## Halting Condition -Migrations are usually **goal-oriented** -- you want to finish. Set `target-metric: 1.0` in the frontmatter and Crane stops the migration when the health score reaches 1.0 (which, with the recommended `correctness x progress` convention, means "fully migrated and verified"). +Migrations are usually **goal-oriented** -- you want to finish. Set `target-metric: 1.0` in the frontmatter to nominate a migration for completion once the health score reaches 1.0 (which, with the recommended `correctness x progress` convention, means "fully migrated and verified"). The metric is necessary but not sufficient: final completion always requires a deterministic PR-head completion gate. ### How It Works 1. Parse the `target-metric` value from the migration's YAML frontmatter (if present). 2. After each **accepted** iteration, compare the new `best_metric` from that iteration against the `target-metric`. -3. For `higher` direction (default): completed when `best_metric >= target-metric`. -4. For `lower` direction: completed when `best_metric <= target-metric`. -5. When completed: +3. For `higher` direction (default): the target is reached when `best_metric >= target-metric`. +4. For `lower` direction: the target is reached when `best_metric <= target-metric`. +5. When the target is reached, create a completion candidate instead of completing immediately: + - Set `Completion Candidate: true`. + - Set `Completion Gate: pr-head-checks`. + - Set `Completion Gate Status: pending`. + - Keep `Completed: false` and `Completed Reason: --`. + - For issue-based migrations, keep the `crane-migration` label and do not add `crane-completed`. + - Update the status comment to Active with a note that the migration is waiting on deterministic PR-head checks. +6. On the next run while `Completion Candidate: true`, run the deterministic completion gate before making code changes: + - Resolve the migration PR from `existing_pr` or the state file's `PR` field. + - Query the PR's current head SHA via the GitHub API. + - Query check-runs/check-suites for that exact PR head SHA, or use `gh pr checks "$PR" --json name,conclusion,state,startedAt,completedAt`. + - The gate passes only if every check for the current PR head is terminal success. Treat missing checks, pending checks, queued checks, failing checks, cancelled checks, timed-out checks, stale checks, and action-required checks as not passing. + - If the gate is pending or missing, leave `Completion Candidate: true`, keep `Completed: false`, ensure `crane-migration` is present, ensure `crane-completed` is absent, set `Completion Gate Status: pending:`, and end without completing. + - If the gate fails, leave `Completion Candidate: true`, keep `Completed: false`, ensure `crane-migration` is present, ensure `crane-completed` is absent, set `Completion Gate Status: failing:`, and fix the failing PR checks. +7. Only when the deterministic completion gate passes: - Set `Completed: true` in the Machine State table. - - Set `Completed Reason` to a human-readable message (e.g., `target metric 1.0 reached with value 1.0`). + - Set `Completed Reason` to a human-readable message that includes both the target metric and PR-head check evidence (e.g., `target metric 1.0 reached; PR #123 head abc1234 checks passed`). + - Set `Completion Candidate: false`. + - Set `Completion Gate Status: passed:`. - **For issue-based migrations**: remove the `crane-migration` label, add the `crane-completed` label. - Update the status comment to [+] Completed. - Post a celebratory per-iteration comment: `[+] **Migration complete!** {source} -> {target} finished after {N} iterations.` - The migration will not be selected for future runs. -Do not enter this path from repo-memory alone. A stored `Completed: true`, old `Completed Reason`, or historical `best_metric` is only evidence about a previous run; the current run must produce and accept a verification score that satisfies the current migration definition. +Do not enter final completion from repo-memory alone. A stored `Completed: true`, old `Completed Reason`, historical `best_metric`, or same-run sandbox score is only evidence about a previous or local run. Final completion requires deterministic evidence from the current PR head after safe outputs have pushed the branch and GitHub Actions has reported the head checks. ### Open-Ended Migrations @@ -724,6 +742,9 @@ When creating or updating a migration's state file, use this structure: | Pause Reason | -- | | Completed | false | | Completed Reason | -- | +| Completion Candidate | false | +| Completion Gate | pr-head-checks | +| Completion Gate Status | -- | | Consecutive Errors | 0 | | Recent Statuses | -- | @@ -820,8 +841,11 @@ All iterations in reverse chronological order (newest first). | Issue | `#number` or `--` | Single migration issue | | Paused | `true` or `false` | Whether the migration is paused | | Pause Reason | text or `--` | `manual`, `consecutive errors`, `ci-fix-exhausted: `, `stuck in CI fix loop: `, `ci-timeout` | -| Completed | `true` or `false` | Whether the target metric has been reached | -| Completed Reason | text or `--` | e.g., `target metric 1.0 reached with value 1.0` | +| Completed | `true` or `false` | Whether the deterministic completion gate has passed and the migration is final | +| Completed Reason | text or `--` | e.g., `target metric 1.0 reached; PR #123 head abc1234 checks passed` | +| Completion Candidate | `true` or `false` | Whether the target metric has been reached and the migration is waiting for deterministic PR-head checks | +| Completion Gate | text | Deterministic gate required for final completion. Default: `pr-head-checks` | +| Completion Gate Status | text or `--` | Latest gate result, such as `pending:`, `failing:`, or `passed:` | | Consecutive Errors | integer | Count of consecutive verification failures | | Recent Statuses | comma-separated | Last 10 outcomes: `accepted`, `rejected`, `error`, or `ci-fix-exhausted` | diff --git a/.github/workflows/scripts/crane_scheduler.py b/.github/workflows/scripts/crane_scheduler.py index 40648151..41929c5d 100644 --- a/.github/workflows/scripts/crane_scheduler.py +++ b/.github/workflows/scripts/crane_scheduler.py @@ -337,34 +337,59 @@ def _scan_bare_migrations(): return sorted(glob.glob(os.path.join(MIGRATIONS_DIR, "*.md"))) -def _fetch_issue_migrations(repo, github_token): - """Fetch open issues with the ``crane-migration`` label and write their - bodies to ``ISSUE_MIGRATIONS_DIR``. Returns ``(migration_files, issue_migrations)``. +def _issue_has_label(issue, label): + """Return True if a GitHub issue payload contains ``label``.""" + for raw_label in issue.get("labels") or []: + if isinstance(raw_label, dict): + name = raw_label.get("name") + else: + name = raw_label + if name == label: + return True + return False - Errors are swallowed (with a warning) so a transient API failure doesn't - block the run for non-issue-based migrations. - """ - migration_files = [] - issue_migrations = {} - os.makedirs(ISSUE_MIGRATIONS_DIR, exist_ok=True) + +def _fetch_open_issues_with_label(repo, github_token, label): + """Fetch open issues with one label. Returns issue payloads.""" next_url = ( "https://api.github.com/repos/{}/issues" - "?labels=crane-migration&state=open&per_page=100".format(repo) + "?labels={}&state=open&per_page=100".format(repo, urllib.parse.quote(label, safe="")) ) headers = { "Authorization": "token {}".format(github_token), "Accept": "application/vnd.github.v3+json", } issues = [] + while next_url: + req = urllib.request.Request(next_url, headers=headers) + with urllib.request.urlopen(req, timeout=30) as resp: + page = json.loads(resp.read().decode()) + link_header = resp.headers.get("link") or resp.headers.get("Link") + issues.extend(page) + next_url = parse_link_header(link_header) + return issues + + +def _fetch_issue_migrations(repo, github_token): + """Fetch open Crane-labelled issues and write their bodies to + ``ISSUE_MIGRATIONS_DIR``. Returns ``(migration_files, issue_migrations)``. + + Crane primarily runs issues with ``crane-migration``. It also reads open + ``crane-completed`` issues so a stale completion can be recovered when the + deterministic PR-head gate did not pass. + + Errors are swallowed (with a warning) so a transient API failure doesn't + block the run for non-issue-based migrations. + """ + migration_files = [] + issue_migrations = {} + os.makedirs(ISSUE_MIGRATIONS_DIR, exist_ok=True) try: - while next_url: - req = urllib.request.Request(next_url, headers=headers) - with urllib.request.urlopen(req, timeout=30) as resp: - page = json.loads(resp.read().decode()) - link_header = resp.headers.get("link") or resp.headers.get("Link") - issues.extend(page) - next_url = parse_link_header(link_header) - for issue in issues: + issues_by_number = {} + for label in ("crane-migration", "crane-completed"): + for issue in _fetch_open_issues_with_label(repo, github_token, label): + issues_by_number[issue.get("number")] = issue + for issue in issues_by_number.values(): if issue.get("pull_request"): continue # skip PRs body = issue.get("body") or "" @@ -383,8 +408,21 @@ def _fetch_issue_migrations(repo, github_token): with open(issue_file, "w") as f: f.write(body) migration_files.append(issue_file) - issue_migrations[slug] = {"issue_number": number, "file": issue_file, "title": title} - print(" Found issue-based migration: '{}' (issue #{})".format(slug, number)) + active = _issue_has_label(issue, "crane-migration") + completed_label = _issue_has_label(issue, "crane-completed") + issue_migrations[slug] = { + "issue_number": number, + "file": issue_file, + "title": title, + "active": active, + "completed_label": completed_label, + } + label_status = "active" if active else "completed-label" + print( + " Found issue-based migration: '{}' (issue #{}, {})".format( + slug, number, label_status + ) + ) except Exception as e: # noqa: BLE001 -- best-effort; logged below print(" Warning: could not fetch issue-based migrations: {}".format(e)) return migration_files, issue_migrations @@ -484,6 +522,62 @@ def find_existing_pr_for_branch(repo, migration_name, github_token, http_get_jso return None +def get_pr_head_check_gate(repo, pr_number, github_token, http_get_json=_http_get_json): + """Return ``(passed, reason)`` for the deterministic PR-head check gate. + + ``passed`` is: + True - the current PR head has at least one check run and all are success + False - the PR exists, but checks are missing, pending, or failing + None - the API could not provide enough data to decide + """ + if not repo or not pr_number or not github_token: + return None, "missing-pr-or-token" + headers = { + "Authorization": "token {}".format(github_token), + "Accept": "application/vnd.github.v3+json", + } + pr_url = "https://api.github.com/repos/{}/pulls/{}".format(repo, pr_number) + pr_body, _ = http_get_json(pr_url, headers) + if not isinstance(pr_body, dict): + return None, "pr-unavailable" + head = pr_body.get("head") or {} + head_sha = head.get("sha") if isinstance(head, dict) else None + if not head_sha: + return None, "pr-head-sha-unavailable" + + check_runs = [] + next_url = "https://api.github.com/repos/{}/commits/{}/check-runs?per_page=100".format( + repo, head_sha + ) + while next_url: + body, link_header = http_get_json(next_url, headers) + if not isinstance(body, dict): + return None, "checks-unavailable:{}".format(head_sha[:12]) + page_runs = body.get("check_runs") + if not isinstance(page_runs, list): + return None, "checks-malformed:{}".format(head_sha[:12]) + check_runs.extend(page_runs) + next_url = parse_link_header(link_header) + + if not check_runs: + return False, "missing-checks:{}".format(head_sha[:12]) + + not_success = [] + for run in check_runs: + if not isinstance(run, dict): + not_success.append("unknown:malformed") + continue + name = run.get("name") or "unknown" + status = run.get("status") + conclusion = run.get("conclusion") + if status != "completed" or conclusion != "success": + not_success.append("{}:{}:{}".format(name, status or "unknown", conclusion or "none")) + + if not_success: + return False, "failing:{}:{}".format(head_sha[:12], ";".join(not_success[:5])) + return True, "passed:{}".format(head_sha[:12]) + + # --------------------------------------------------------------------------- # Selection # --------------------------------------------------------------------------- @@ -629,7 +723,9 @@ def main(): for pf in migration_files: name = get_migration_name(pf) - issue_active = name in issue_migrations + issue_info = issue_migrations.get(name) or {} + issue_active = bool(issue_info.get("active")) + issue_completed_label = bool(issue_info.get("completed_label")) all_migrations[name] = pf with open(pf) as f: content = f.read() @@ -670,12 +766,51 @@ def main(): print(" {}: no state found (first run)".format(name)) has_stale_completed_state = issue_active and is_completed_state(state) + recovered_completed_issue = False + if issue_completed_label and is_completed_state(state) and not issue_active: + existing_pr_for_recovery = find_existing_pr_for_branch(repo, name, github_token) + if existing_pr_for_recovery: + gate_passed, gate_reason = get_pr_head_check_gate( + repo, existing_pr_for_recovery, github_token + ) + if gate_passed is False: + has_stale_completed_state = True + recovered_completed_issue = True + print( + " {}: crane-completed label is stale; PR #{} gate is {}".format( + name, existing_pr_for_recovery, gate_reason + ) + ) + elif gate_passed is True: + print( + " {}: crane-completed label confirmed by PR #{} gate {}".format( + name, existing_pr_for_recovery, gate_reason + ) + ) + else: + print( + " {}: could not evaluate completed-label recovery for PR #{} ({})".format( + name, existing_pr_for_recovery, gate_reason + ) + ) + else: + print( + " {}: crane-completed label present, but no open migration PR was found".format( + name + ) + ) if has_stale_completed_state: stale_completed_state.append(name) - print( - f" {name}: issue still has crane-migration label; treating " - "Completed=true as stale until fresh verification passes" - ) + if issue_active: + print( + f" {name}: issue still has crane-migration label; treating " + "Completed=true as stale until fresh verification passes" + ) + if recovered_completed_issue: + print( + f" {name}: completed label will be treated as stale until " + "the deterministic completion gate passes" + ) last_run = None lr = state.get("last_run") @@ -685,11 +820,26 @@ def main(): except ValueError: pass - should_skip, reason = check_skip_conditions(state, issue_active=issue_active) + should_skip, reason = check_skip_conditions( + state, + issue_active=issue_active or recovered_completed_issue, + ) if should_skip: skipped.append({"name": name, "reason": reason}) continue + if has_stale_completed_state: + due.append({ + "name": name, + "last_run": lr, + "file": pf, + "target_metric": target_metric, + "metric_direction": metric_direction, + "strategy": strategy, + "stale_completed_state": has_stale_completed_state, + }) + continue + # Check if due based on per-migration schedule if schedule_delta and last_run and now - last_run < schedule_delta: skipped.append( diff --git a/tests/unit/test_crane_scheduler.py b/tests/unit/test_crane_scheduler.py index 55c373a5..13ef75c7 100644 --- a/tests/unit/test_crane_scheduler.py +++ b/tests/unit/test_crane_scheduler.py @@ -42,3 +42,62 @@ def test_active_issue_does_not_override_pause() -> None: def test_machine_state_completed_string_is_recognized() -> None: assert crane_scheduler.is_completed_state({"completed": "true"}) is True + + +def test_issue_label_detection_accepts_github_label_payloads() -> None: + issue = {"labels": [{"name": "crane-completed"}, "automation"]} + + assert crane_scheduler._issue_has_label(issue, "crane-completed") is True + assert crane_scheduler._issue_has_label(issue, "automation") is True + assert crane_scheduler._issue_has_label(issue, "crane-migration") is False + + +def test_pr_head_gate_fails_when_any_check_is_not_success() -> None: + def fake_http_get_json(url, _headers, timeout=30): + del timeout + if url.endswith("/pulls/102"): + return {"head": {"sha": "abcdef1234567890"}}, None + if "/check-runs" in url: + return { + "check_runs": [ + {"name": "Go Tests", "status": "completed", "conclusion": "success"}, + {"name": "Parity Gate", "status": "completed", "conclusion": "failure"}, + ] + }, None + raise AssertionError(f"unexpected URL: {url}") + + passed, reason = crane_scheduler.get_pr_head_check_gate( + "githubnext/apm", + 102, + "token", + http_get_json=fake_http_get_json, + ) + + assert passed is False + assert reason.startswith("failing:abcdef123456") + assert "Parity Gate:completed:failure" in reason + + +def test_pr_head_gate_passes_only_when_all_checks_succeed() -> None: + def fake_http_get_json(url, _headers, timeout=30): + del timeout + if url.endswith("/pulls/102"): + return {"head": {"sha": "abcdef1234567890"}}, None + if "/check-runs" in url: + return { + "check_runs": [ + {"name": "Go Tests", "status": "completed", "conclusion": "success"}, + {"name": "Parity Gate", "status": "completed", "conclusion": "success"}, + ] + }, None + raise AssertionError(f"unexpected URL: {url}") + + passed, reason = crane_scheduler.get_pr_head_check_gate( + "githubnext/apm", + 102, + "token", + http_get_json=fake_http_get_json, + ) + + assert passed is True + assert reason == "passed:abcdef123456" diff --git a/tests/unit/test_crane_workflow_prompt.py b/tests/unit/test_crane_workflow_prompt.py index c88afdc3..f7f7c48e 100644 --- a/tests/unit/test_crane_workflow_prompt.py +++ b/tests/unit/test_crane_workflow_prompt.py @@ -36,5 +36,29 @@ def test_crane_prompt_blocks_stale_completed_state_from_finishing() -> None: assert "stale_completed_state" in text assert "active label wins" in text - assert "Never mark completed from stored `best_metric` alone" in text - assert "the current run must produce and accept a verification score" in text + assert "ignore any pre-existing `Completed: true`" in text + assert "Restore the issue to active migration state" in text + assert "`crane-completed` but the scheduler selected it in `stale_completed_state`" in text + assert "deterministic completion gate passes on the pushed PR head" in text + assert "same-run sandbox score is only evidence" in text + + +def test_crane_completion_is_two_phase_and_pr_head_gated() -> None: + text = _workflow_text() + + assert "Reaching the target metric does **not** complete the migration in this run" in text + assert "Completion Candidate: true" in text + assert "Completion Gate: pr-head-checks" in text + assert "leave the `crane-migration` label on the issue" in text + assert "current PR head checks pass" in text + assert "every check for the current PR head is terminal success" in text + assert "Completion Gate Status: passed:" in text + + +def test_crane_state_template_tracks_completion_candidate_gate() -> None: + text = _workflow_text() + + assert "| Completion Candidate | false |" in text + assert "| Completion Gate | pr-head-checks |" in text + assert "| Completion Gate Status | -- |" in text + assert "Whether the target metric has been reached and the migration is waiting" in text