diff --git a/instructions/night-watch-pr-reviewer.md b/instructions/night-watch-pr-reviewer.md index 3b5e95ac..6d493372 100644 --- a/instructions/night-watch-pr-reviewer.md +++ b/instructions/night-watch-pr-reviewer.md @@ -1,8 +1,9 @@ -You are the Night Watch PR Reviewer agent. Your job is to check open PRs for three things: +You are the Night Watch PR Reviewer agent. Your job is to implement a **review-first, fix-later** workflow: -1. Merge conflicts -- rebase onto the base branch and resolve them. -2. Review comments with a score below 80 -- address the feedback. -3. Failed CI jobs -- diagnose and fix the failures. +1. **No review yet** → Post a review (score the PR), exit without fixing +2. **Review exists, score < threshold** → Fix ALL flagged issues (bugs, code quality, performance, CI, merge conflicts), push, exit +3. **After fixing** → Exit. Next scheduled run (or GH Actions on push) re-scores +4. **Score >= threshold** → Skip (unchanged) ## Context @@ -10,8 +11,6 @@ The repo can have multiple PR checks/workflows (project CI plus Night Watch auto Common examples include `typecheck`, `lint`, `test`, `build`, `verify`, `executor`, `qa`, and `audit`. Treat `gh pr checks --json name,state,conclusion` as the source of truth for which checks failed. -A PR needs attention if **any** of the following: merge conflicts present, review score below 80, or any CI job failed. - ## PRD Context The cron wrapper may append a `## PRD Context` section with linked issue bodies and/or PRD file excerpts. @@ -21,8 +20,7 @@ If current PR code or review feedback conflicts with the PRD context, call out t ## Important: Early Exit - If there are **no open PRs** on `night-watch/` or `feat/` branches, **stop immediately** and report "No PRs to review." -- If all open PRs have **no merge conflicts**, **passing CI**, and **review score >= 80**, **stop immediately** and report "All PRs are in good shape." -- If a PR has no review score yet, it needs a first review — do NOT skip it. +- If all open PRs have **review score >= threshold** (or no review yet - you'll post one), **stop immediately** after processing. - Do **NOT** loop or retry. Process each PR **once** per run. After processing all PRs, stop. - Do **NOT** re-check PRs after pushing fixes -- the CI will re-run automatically on the next push. @@ -36,223 +34,188 @@ If current PR code or review feedback conflicts with the PRD context, call out t Filter for PRs on `night-watch/` or `feat/` branches. -2. **For each PR**, check three things: +2. **For each PR**, determine the next action based on the **review-first, fix-later** flow: + +### Step A: Check Review Status -### A. Check for Merge Conflicts +Fetch the **comments** (the bot posts as a regular issue comment): ``` -gh pr view --json mergeStateStatus --jq '.mergeStateStatus' +gh pr view --json comments --jq '.comments[].body' ``` -If the result is `DIRTY` or `CONFLICTING`, the PR has merge conflicts that **must** be resolved before anything else. - -### B. Check CI Status - -Fetch the CI check status for the PR: +If that returns nothing, also try: ``` -gh pr checks --json name,state,conclusion +gh api repos/{owner}/{repo}/issues//comments --jq '.[].body' ``` -If any check has `conclusion` of `failure` (or `state` is not `completed`/`success`), the PR has CI failures that need fixing. +Parse the review score from the comment body. Look for patterns like: -To get details on why a CI job failed, fetch the workflow run logs: +- `**Overall Score:** XX/100` +- `**Score:** XX/100` +- `Overall Score:** XX/100` -``` -gh run list --branch --limit 1 --json databaseId,conclusion,status -``` +Extract the numeric score. If multiple comments have scores, use the **most recent** one. -Then view the failed job logs: +### Step B: Determine Action Based on Review Status -``` -gh run view --log-failed -``` +**Case 1: No review yet** → **REVIEW MODE** (post a review, don't fix) +- Exit early without fixing anything +- The GitHub Actions workflow will post a review automatically +- Log: `No review yet for PR #, exiting review-first, fix-later flow early` -### C. Check Review Score +**Case 2: Review exists, score >= threshold** → **SKIP** (PR is in good shape) +- Log: `PR # review score >= threshold , skipping` +- Continue to next PR -Fetch the **comments** (NOT reviews -- the bot posts as a regular issue comment): +**Case 3: Review exists, score < threshold** → **FIX MODE** (fix all issues) +- Continue to Step C to fix ALL flagged issues -``` -gh pr view --json comments --jq '.comments[].body' -``` +### Step C: Fix ALL Flagged Issues (when review score < threshold) -If that returns nothing, also try: +When fixing, address issues in **priority order**: -``` -gh api repos/{owner}/{repo}/issues//comments --jq '.[].body' -``` +1. **CI failures** (highest priority) - failing checks block everything +2. **Merge conflicts** - must be resolved before merging +3. **Critical bugs** - crashes, data loss, security vulnerabilities +4. **Code quality issues** - error handling, edge cases, maintainability +5. **Performance issues** - inefficiencies, slow operations +6. **Test coverage** - missing tests, inadequate coverage +7. **Documentation** - unclear comments, missing docs -Parse the review score from the comment body. Look for patterns like: +#### C.1: Check Out the PR Branch -- `**Overall Score:** XX/100` -- `**Score:** XX/100` -- `Overall Score:** XX/100` - Extract the numeric score. If multiple comments have scores, use the **most recent** one. +Use the current runner worktree and check out the PR branch (do **not** create additional worktrees): -3. **Determine if PR needs work**: - - If no merge conflicts **AND** score >= 80 **AND** all CI checks pass --> skip this PR. - - If **no review score exists yet** --> this PR needs its first review (see Mode: Review below). - - If merge conflicts present **OR** score < 80 **OR** any CI check failed --> fix the issues (see Mode: Fix below). +``` +git fetch origin +git checkout +git pull origin +``` -## Mode: Review (when no review score exists yet) +The reviewer cron wrapper already runs you inside an isolated worktree and performs cleanup. +Stay in the current directory and run package install (npm install, yarn install, or pnpm install as appropriate). -When a PR has no review score, post an initial review instead of fixing issues: +#### C.2: Resolve Merge Conflicts -1. Fetch the PR diff: `gh pr diff ` -2. Review the code using these criteria: - - **Correctness**: logic errors, edge cases, off-by-one errors - - **Code quality**: readability, naming, dead code, complexity - - **Tests**: missing tests, inadequate coverage - - **Performance**: obvious bottlenecks, unnecessary work - - **Security**: injection, XSS, secrets in code, unsafe patterns - - **Conventions**: follows project CLAUDE.md / coding standards -3. Post a review comment in this exact format (so the score can be parsed): +Check if merge conflicts exist: ``` -gh pr comment --body "## PR Review - -### Summary -<1-2 sentence summary of what the PR does> +gh pr view --json mergeStateStatus --jq '.mergeStateStatus' +``` -### Issues Found +If the result is `DIRTY` or `CONFLICTING`: +- Get the base branch: `gh pr view --json baseRefName --jq '.baseRefName'` +- Rebase the PR branch onto the latest base branch: + ``` + git fetch origin + git rebase origin/ + ``` +- For each conflicted file, examine the conflict markers carefully. Preserve the PR's intended changes while incorporating upstream updates. Resolve each conflict, then stage it: + ``` + git add + ``` +- Continue the rebase: `git rebase --continue` +- Repeat until the rebase completes without conflicts. +- Push the clean branch: `git push --force-with-lease origin ` +- **Do NOT leave any conflict markers (`<<<<<<<`, `=======`, `>>>>>>>`) in any file.** + +#### C.3: Fix CI Failures + +Check CI status and identify failing checks: -| Category | Confidence | Issue | -|----------|-----------|-------| -| | High/Medium/Low | | +``` +gh pr checks --json name,state,conclusion +``` -### Strengths -- +Filter for checks with `conclusion` of `failure`. -**Overall Score:** /100 +To get details on why a CI job failed: -> 🔍 Reviewed by " +``` +RUN_ID=$(gh run list --branch --limit 1 --json databaseId --jq '.[0].databaseId') +gh run view "${RUN_ID}" --log-failed ``` -4. **Do NOT fix anything** — just post the review and stop. The next reviewer run will address the issues. +Fix checks based on their actual names and errors (for example: `typecheck`, `lint`, `test`, `build`, `verify`, `executor`, `qa`, `audit`). -## Mode: Fix (when review score < threshold) +#### C.4: Address Review Feedback -When the cron script injects `- action: fix` in the ## Target Scope section, follow the fix steps in section 4 below. Read the injected review body from `## Latest Review Feedback` to know what to address. +Read the review comments carefully. Extract actionable fix items: -4. **Fix the PR**: +Look for categories like: +- **Bugs**: "bug found", "error", "crash", "incorrect" +- **Code quality**: "unclear", "hard to read", "should be", "missing" +- **Performance**: "slow", "inefficient", "N+1", "memory" +- **Security**: "vulnerability", "injection", "sanitize" +- **Testing**: "missing test", "no coverage", "untested" - a. **Use the current runner worktree** and check out the PR branch (do **not** create additional worktrees): +For each issue: +- If you agree, implement the fix +- If you disagree, note the technical reason for the PR comment - ``` - git fetch origin - git checkout - git pull origin - ``` +#### C.5: Run Verification - The reviewer cron wrapper already runs you inside an isolated worktree and performs cleanup. - Stay in the current directory and run package install (npm install, yarn install, or pnpm install as appropriate). - - b. **Resolve merge conflicts** (if `mergeStateStatus` was `DIRTY` or `CONFLICTING`): - - Get the base branch: `gh pr view --json baseRefName --jq '.baseRefName'` - - Rebase the PR branch onto the latest base branch: - ``` - git fetch origin - git rebase origin/ - ``` - - For each conflicted file, examine the conflict markers carefully. Preserve the PR's intended changes while incorporating upstream updates. Resolve each conflict, then stage it: - ``` - git add - ``` - - Continue the rebase: `git rebase --continue` - - Repeat until the rebase completes without conflicts. - - Push the clean branch: `git push --force-with-lease origin ` - - **Do NOT leave any conflict markers (`<<<<<<<`, `=======`, `>>>>>>>`) in any file.** - - c. **Address review feedback** (if score < 80): - - Read the review comments carefully. Extract areas for improvement, bugs found, issues found, and specific file/line suggestions. - - For each review suggestion: - - If you agree, implement the change. - - If you do not agree, do not implement it blindly. Capture a short technical reason and include that reason in the PR comment. - - Fix bugs identified. - - Improve error handling if flagged. - - Add missing tests if coverage was noted. - - Refactor code if structure was criticized. - - Follow all project conventions from AI assistant documentation files (e.g., CLAUDE.md, AGENTS.md, or similar). - - d. **Address CI failures** (if any): - - Check CI status and identify non-passing checks: - ``` - gh pr checks --json name,state,conclusion - ``` - - First enumerate all checks/jobs from GitHub (source of truth): - ``` - gh pr checks --json name,state,conclusion --jq '.[] | [.name, .state, .conclusion] | @tsv' - ``` - - To inspect the latest workflow run's job list in detail: - ``` - RUN_ID=$(gh run list --branch --limit 1 --json databaseId --jq '.[0].databaseId') - gh run view "${RUN_ID}" --json jobs --jq '.jobs[] | [.name, .status, .conclusion] | @tsv' - gh run view "${RUN_ID}" --log-failed - ``` - - Read the failed job logs carefully to understand the root cause. - - Fix checks based on their actual names and errors (for example: `typecheck`, `lint`, `test`, `build`, `verify`, `executor`, `qa`, `audit`). - - Do not assume only a fixed set of CI job names. - - Re-run local equivalents of the failing jobs before pushing to confirm the CI issues are fixed. - - e. **Run verification**: Run the project's test/lint commands (e.g., `npm test`, `npm run lint`, `npm run verify` or equivalent). Fix until it passes. - - f. **Commit and push** the fixes (only if there are staged changes beyond the rebase): +Run the project's test/lint commands (e.g., `npm test`, `npm run lint`, `npm run verify` or equivalent). Fix until it passes. - ``` - git add - git commit -m "fix: address PR review feedback and CI failures +#### C.6: Commit and Push - - +Commit and push the fixes (only if there are staged changes): - Rebased onto and resolved merge conflicts. - Review score was /100. - CI failures fixed: , . +``` +git add +git commit -m "fix: address PR review feedback - Addressed: - - - - +- - Co-Authored-By: Claude Opus 4.6 " +Rebased onto and resolved merge conflicts. +Review score was /100. +CI failures fixed: , . - git push origin - ``` +Addressed: +- +- + +Co-Authored-By: Claude Opus 4.6 " - Note: if the only change was a conflict-free rebase, the `--force-with-lease` push from step (b) is sufficient -- no extra commit needed. +git push origin +``` - g. **Comment on the PR** summarizing what was addressed: +Note: if the only change was a conflict-free rebase, the `--force-with-lease` push from step C.2 is sufficient -- no extra commit needed. - ``` - gh pr comment --body "## Night Watch PR Fix +#### C.7: Comment on the PR + +Summarize what was addressed: - ### Merge Conflicts Resolved: - Rebased onto ``. Resolved conflicts in: , . +``` +gh pr comment --body "## Night Watch PR Fix - Previous review score: **/100** +### Merge Conflicts Resolved: +Rebased onto \`\`. Resolved conflicts in: , . - ### Changes made: - - - - +Previous review score: **/100** - ### Review Feedback Not Applied: - - : +### Changes made: +- +- - ### CI Failures Fixed: - - : +### Review Feedback Not Applied: +- : - Verification passes locally. Ready for re-review. +### CI Failures Fixed: +- : - Night Watch PR Reviewer" - ``` +\`npm run verify\` passes locally. Ready for re-review. - h. **Do not manage worktrees directly**: - - Do **not** run `git worktree add`, `git worktree remove`, or `git worktree prune`. - - The cron wrapper handles worktree lifecycle. +Night Watch PR Reviewer" +``` -5. **Repeat** for all open PRs that need work. +3. **Repeat** for all open PRs that need work. -6. When done, return to ${DEFAULT_BRANCH}: `git checkout ${DEFAULT_BRANCH}` +4. When done, return to ${DEFAULT_BRANCH}: `git checkout ${DEFAULT_BRANCH}` -Start now. Check for open PRs that need merge conflicts resolved, review feedback addressed, or CI failures fixed. +Start now. Check for open PRs that need review-first, fix-later processing. --- diff --git a/packages/cli/src/__tests__/scripts/night-watch-helpers.test.ts b/packages/cli/src/__tests__/scripts/night-watch-helpers.test.ts index b974c5c1..7338cacb 100644 --- a/packages/cli/src/__tests__/scripts/night-watch-helpers.test.ts +++ b/packages/cli/src/__tests__/scripts/night-watch-helpers.test.ts @@ -124,10 +124,14 @@ exit 0 { mode: 0o755 }, ); - const result = runShell(`source "${helpersScript}"; find_executor_resume_pr "night-watch"`, tempDir, { - ...process.env, - PATH: `${fakeBinDir}:${process.env.PATH}`, - }); + const result = runShell( + `source "${helpersScript}"; find_executor_resume_pr "night-watch"`, + tempDir, + { + ...process.env, + PATH: `${fakeBinDir}:${process.env.PATH}`, + }, + ); expect(result.status).toBe(0); expect(result.stdout.trim()).toBe(''); diff --git a/scripts/night-watch-pr-reviewer-cron.sh b/scripts/night-watch-pr-reviewer-cron.sh index 0b0f9259..00a76089 100755 --- a/scripts/night-watch-pr-reviewer-cron.sh +++ b/scripts/night-watch-pr-reviewer-cron.sh @@ -10,6 +10,8 @@ set -euo pipefail # NW_REVIEWER_MAX_RUNTIME=3600 - Maximum runtime in seconds (1 hour) # NW_PROVIDER_CMD=claude - AI provider CLI to use (claude, codex, etc.) # NW_DRY_RUN=0 - Set to 1 for dry-run mode (prints diagnostics only) +# NW_AUTO_MERGE=0 - Set to 1 to enable auto-merge +# NW_AUTO_MERGE_METHOD=squash - Merge method: squash, merge, or rebase PROJECT_DIR="${1:?Usage: $0 /path/to/project}" PROJECT_NAME=$(basename "${PROJECT_DIR}") @@ -21,6 +23,8 @@ PROVIDER_CMD="${NW_PROVIDER_CMD:-claude}" PROVIDER_LABEL="${NW_PROVIDER_LABEL:-}" MIN_REVIEW_SCORE="${NW_MIN_REVIEW_SCORE:-80}" BRANCH_PATTERNS_RAW="${NW_BRANCH_PATTERNS:-feat/,night-watch/}" +AUTO_MERGE="${NW_AUTO_MERGE:-0}" +AUTO_MERGE_METHOD="${NW_AUTO_MERGE_METHOD:-squash}" TARGET_PR="${NW_TARGET_PR:-}" PARALLEL_ENABLED="${NW_REVIEWER_PARALLEL:-1}" WORKER_MODE="${NW_REVIEWER_WORKER_MODE:-0}" @@ -63,11 +67,6 @@ SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" # shellcheck source=night-watch-helpers.sh source "${SCRIPT_DIR}/night-watch-helpers.sh" -# Ensure provider CLI is on PATH (nvm, fnm, volta, common bin dirs) -if ! ensure_provider_on_path "${PROVIDER_CMD}"; then - echo "ERROR: Provider '${PROVIDER_CMD}' not found in PATH or common installation locations" >&2 - exit 127 -fi PROJECT_RUNTIME_KEY=$(project_runtime_key "${PROJECT_DIR}") PROVIDER_MODEL_DISPLAY=$(resolve_provider_model_display "${PROVIDER_CMD}" "${PROVIDER_LABEL}") GLOBAL_LOCK_FILE="/tmp/night-watch-pr-reviewer-${PROJECT_RUNTIME_KEY}.lock" @@ -78,6 +77,9 @@ else LOCK_FILE="${GLOBAL_LOCK_FILE}" fi +# ── Global Job Queue Gate ──────────────────────────────────────────────────── +# Acquire global gate before per-project lock to serialize jobs across projects. +# When gate is busy, enqueue the job and exit cleanly. SCRIPT_TYPE="reviewer" READY_FOR_REVIEW_LABEL="${NW_READY_FOR_REVIEW_LABEL:-ready-for-review}" READY_FOR_REVIEW_MARKER_NAME="night-watch-ready-for-review" @@ -101,6 +103,45 @@ extract_review_score_from_text() { | grep -oP '\d+(?=/100)' || echo "" } +# Extract the full body of the most recent review comment containing a score. +# Uses jq to process complete JSON strings, correctly handling multi-line bodies. +# Returns the review comment text (up to 8000 chars, truncated for prompt injection). +get_pr_latest_review_body() { + local pr_number="${1:-}" + local repo="${2:-}" + local review_body="" + # jq regex to match score patterns like "Score: 72/100" or "**Overall Score:** 85/100" + local score_regex='(?:Overall\s+)?Score:\*?\*?\s*[0-9]+/100' + + if [ -z "${pr_number}" ]; then + echo "" + return + fi + + # Use jq to select the last comment body containing a score pattern. + # jq processes the full JSON string (including embedded newlines), so multi-line + # review bodies are matched correctly. "m" flag enables multi-line mode in jq regex. + review_body=$( + gh pr view "${pr_number}" --json comments \ + --jq "[.comments[].body | select(test(\"${score_regex}\"; \"m\"))] | last // empty" 2>/dev/null || true + ) + + # Fallback to gh api if pr view returned nothing (e.g. auth scope differences) + if [ -z "${review_body}" ] && [ -n "${repo}" ]; then + review_body=$( + gh api "repos/${repo}/issues/${pr_number}/comments" \ + --jq "[.[].body | select(test(\"${score_regex}\"; \"m\"))] | last // empty" 2>/dev/null || true + ) + fi + + # Truncate to 8000 chars to avoid prompt bloat + if [ ${#review_body} -gt 8000 ]; then + review_body="${review_body:0:8000}" + fi + + printf '%s' "${review_body}" +} + build_ready_for_review_marker() { local head_sha="${1:-}" [ -z "${head_sha}" ] && return 1 @@ -176,7 +217,7 @@ This PR is ready for human review and merge." } # ── Global Job Queue Gate ──────────────────────────────────────────────────── -# Atomically claim a DB slot or enqueue for later dispatch — no flock needed. +# Atomically claim a DB slot or enqueue for later dispatch. if [ "${NW_QUEUE_ENABLED:-0}" = "1" ]; then if [ "${NW_QUEUE_INHERITED_SLOT:-0}" = "1" ]; then : @@ -232,7 +273,7 @@ emit_final_status() { Provider (model): ${PROVIDER_MODEL_DISPLAY} Processed PRs: ${prs_summary} ${final_score_line} -No automated fixes needed — ready for human review & merge." +No automated fixes needed - ready for human review & merge." else send_telegram_status_message "🔍 Night Watch Reviewer: completed" "Project: ${PROJECT_NAME} Provider (model): ${PROVIDER_MODEL_DISPLAY} @@ -249,7 +290,7 @@ Auto-merge failed: ${auto_merge_failed_summary}" if [ -n "${final_score}" ]; then details="${details}|final_score=${final_score}" fi - log "TIMEOUT: PR reviewer timed out (runtime budget ${MAX_RUNTIME}s)" + log "TIMEOUT: PR reviewer killed after ${MAX_RUNTIME}s" if [ "${WORKER_MODE}" != "1" ]; then send_telegram_status_message "🔍 Night Watch Reviewer: timeout" "Project: ${PROJECT_NAME} Provider (model): ${PROVIDER_MODEL_DISPLAY} @@ -618,7 +659,7 @@ fi rotate_log log_separator log "RUN-START: reviewer invoked project=${PROJECT_DIR} provider=${PROVIDER_CMD} worker=${WORKER_MODE} target_pr=${TARGET_PR:-all} parallel=${PARALLEL_ENABLED}" -log "CONFIG: max_runtime=${MAX_RUNTIME}s min_review_score=${MIN_REVIEW_SCORE} branch_patterns=${BRANCH_PATTERNS_RAW}" +log "CONFIG: max_runtime=${MAX_RUNTIME}s min_review_score=${MIN_REVIEW_SCORE} auto_merge=${AUTO_MERGE} branch_patterns=${BRANCH_PATTERNS_RAW}" if ! acquire_lock "${LOCK_FILE}"; then emit_result "skip_locked" @@ -845,6 +886,10 @@ if [ -z "${TARGET_PR}" ] && [ "${WORKER_MODE}" != "1" ] && [ "${PARALLEL_ENABLED echo "Provider (model): ${PROVIDER_MODEL_DISPLAY}" echo "Branch Patterns: ${BRANCH_PATTERNS_RAW}" echo "Min Review Score: ${MIN_REVIEW_SCORE}" + echo "Auto-merge: ${AUTO_MERGE}" + if [ "${AUTO_MERGE}" = "1" ]; then + echo "Auto-merge Method: ${AUTO_MERGE_METHOD}" + fi echo "Max PRs Per Run: ${REVIEWER_MAX_PRS_PER_RUN}" echo "Open PRs needing work:${PRS_NEEDING_WORK}" echo "Default Branch: ${DEFAULT_BRANCH}" @@ -897,33 +942,13 @@ if [ -z "${TARGET_PR}" ] && [ "${WORKER_MODE}" != "1" ] && [ "${PARALLEL_ENABLED worker_pr="${WORKER_PRS[$idx]}" worker_output="${WORKER_OUTPUTS[$idx]}" - # Guard: abort the wait loop when the global budget is exhausted - PARENT_ELAPSED=$(( $(date +%s) - SCRIPT_START_TIME )) - PARENT_REMAINING=$(( MAX_RUNTIME - PARENT_ELAPSED )) - if [ "${PARENT_REMAINING}" -le 0 ]; then - log "PARALLEL: global timeout exhausted — killing remaining workers" - for remaining_idx in $(seq "${idx}" $(( ${#WORKER_PIDS[@]} - 1 ))); do - kill "${WORKER_PIDS[$remaining_idx]}" 2>/dev/null || true - done - EXIT_CODE=124 - break - fi - - # Watchdog: kill the worker if it outlives the remaining budget - ( sleep "${PARENT_REMAINING}" 2>/dev/null; kill "${worker_pid}" 2>/dev/null || true ) & - watchdog_pid=$! - worker_exit_code=0 - if wait "${worker_pid}" 2>/dev/null; then + if wait "${worker_pid}"; then worker_exit_code=0 else worker_exit_code=$? fi - # Cancel the watchdog — the worker finished in time - kill "${watchdog_pid}" 2>/dev/null || true - wait "${watchdog_pid}" 2>/dev/null || true - if [ -f "${worker_output}" ] && [ -s "${worker_output}" ]; then cat "${worker_output}" >> "${LOG_FILE}" fi @@ -1008,6 +1033,10 @@ if [ "${NW_DRY_RUN:-0}" = "1" ]; then echo "Provider (model): ${PROVIDER_MODEL_DISPLAY}" echo "Branch Patterns: ${BRANCH_PATTERNS_RAW}" echo "Min Review Score: ${MIN_REVIEW_SCORE}" + echo "Auto-merge: ${AUTO_MERGE}" + if [ "${AUTO_MERGE}" = "1" ]; then + echo "Auto-merge Method: ${AUTO_MERGE_METHOD}" + fi echo "Max Retries: ${REVIEWER_MAX_RETRIES}" echo "Retry Delay: ${REVIEWER_RETRY_DELAY}s" echo "Max PRs Per Run: ${REVIEWER_MAX_PRS_PER_RUN}" @@ -1022,6 +1051,14 @@ if [ "${NW_DRY_RUN:-0}" = "1" ]; then exit 0 fi +# Ensure provider CLI is on PATH only after the scan determines there is work +# that requires invoking the provider. Pure skip paths should not fail just +# because the runner lacks an AI CLI. +if ! ensure_provider_on_path "${PROVIDER_CMD}"; then + echo "ERROR: Provider '${PROVIDER_CMD}' not found in PATH or common installation locations" >&2 + exit 127 +fi + if ! prepare_detached_worktree "${PROJECT_DIR}" "${REVIEW_WORKTREE_DIR}" "${DEFAULT_BRANCH}" "${LOG_FILE}"; then log "FAIL: Unable to create isolated reviewer worktree ${REVIEW_WORKTREE_DIR}" exit 1 @@ -1077,18 +1114,22 @@ if [ -n "${TARGET_PR}" ]; then else TARGET_SCOPE_PROMPT+=$'- failing checks: none detected\n' fi + if [ -n "${TARGET_SCORE}" ]; then TARGET_SCOPE_PROMPT+=$'- latest review score: '"${TARGET_SCORE}"$'/100\n' - TARGET_SCOPE_PROMPT+=$'- action: fix\n' - # Inject the latest review comment body for the fix prompt - REVIEW_BODY=$(gh api "repos/$(gh repo view --json nameWithOwner --jq '.nameWithOwner' 2>/dev/null)/issues/${TARGET_PR}/comments" --jq '[.[] | select(.body | test("Overall Score|Score:.*[0-9]+/100"))] | last | .body // ""' 2>/dev/null || echo "") - if [ -n "${REVIEW_BODY}" ]; then - TRUNCATED_REVIEW=$(printf '%s' "${REVIEW_BODY}" | head -c 6000) - TARGET_SCOPE_PROMPT+=$'\n## Latest Review Feedback\n'"${TRUNCATED_REVIEW}"$'\n' + # Review-first, fix-later flow: inject review body when score < threshold + if [ "${TARGET_SCORE}" -lt "${MIN_REVIEW_SCORE}" ]; then + TARGET_REVIEW_BODY=$(get_pr_latest_review_body "${TARGET_PR}" "${REPO}") + if [ -n "${TARGET_REVIEW_BODY}" ]; then + TARGET_SCOPE_PROMPT+=$'\n\n## Latest Review Feedback\nThe following review was posted for this PR. Address ALL issues mentioned:\n'"${TARGET_REVIEW_BODY}"$'\n' + else + TARGET_SCOPE_PROMPT+=$'\n\n## Latest Review Feedback\n- action: fix (review score below threshold)\n' + fi fi else + # No review yet - instruct Claude to post a review TARGET_SCOPE_PROMPT+=$'- latest review score: not found\n' - TARGET_SCOPE_PROMPT+=$'- action: review\n' + TARGET_SCOPE_PROMPT+=$'\n\n## Action Required\n- action: review (no review exists yet)\n- Post a review comment with a score using the criteria from .github/prompts/pr-review.md\n- Do NOT fix anything - just review and exit\n' fi fi @@ -1291,7 +1332,7 @@ for ATTEMPT in $(seq 1 "${TOTAL_ATTEMPTS}"); do continue fi log "RETRY: No review score found for PR #${TARGET_PR} after ${TOTAL_ATTEMPTS} attempts; labeling needs-human-review and failing run." - gh pr edit "${TARGET_PR}" --add-label "needs-human-review" 2>/dev/null || true + gh pr edit "${TARGET_PR}" --add-label "needs-human-review" 2>/dev/null || true EXIT_CODE=1 break fi @@ -1332,9 +1373,78 @@ if [ "${EXIT_CODE}" -eq 0 ] && [ -n "${TARGET_PR}" ] && [ -n "${PR_BRANCH_HEAD_B fi fi +# ── Auto-merge eligible PRs ───────────────────────────────────────────────────── +# After the reviewer completes, check for PRs that are merge-ready and queue them +# for auto-merge if enabled. Uses gh pr merge --auto to respect GitHub branch protection. AUTO_MERGED_PRS="" AUTO_MERGE_FAILED_PRS="" +if [ "${AUTO_MERGE}" = "1" ] && [ ${EXIT_CODE} -eq 0 ]; then + log "AUTO-MERGE: Checking for merge-ready PRs..." + + while IFS=$'\t' read -r pr_number pr_branch; do + if [ -z "${pr_number}" ] || [ -z "${pr_branch}" ]; then + continue + fi + + if [ -n "${TARGET_PR}" ] && [ "${pr_number}" != "${TARGET_PR}" ]; then + continue + fi + + # Only process PRs matching branch patterns + if [ -z "${TARGET_PR}" ] && ! printf '%s\n' "${pr_branch}" | grep -Eq "${BRANCH_REGEX}"; then + continue + fi + + # Check CI status - must have ALL checks passing (not just "no failures") + # gh pr checks exits 0 if all pass, 8 if pending, non-zero otherwise + if ! gh pr checks "${pr_number}" --required >/dev/null 2>&1; then + log "AUTO-MERGE: PR #${pr_number} has pending or failed CI checks" + continue + fi + + # Check review score - must have score >= threshold + ALL_COMMENTS=$( + { + gh pr view "${pr_number}" --json comments --jq '.comments[].body' 2>/dev/null || true + if [ -n "${REPO}" ]; then + gh api "repos/${REPO}/issues/${pr_number}/comments" --jq '.[].body' 2>/dev/null || true + fi + } | awk '!seen[$0]++' + ) + LATEST_SCORE=$(extract_review_score_from_text "${ALL_COMMENTS}") + + # Skip PRs without a score + if [ -z "${LATEST_SCORE}" ]; then + continue + fi + + # Skip PRs with score below threshold + if [ "${LATEST_SCORE}" -lt "${MIN_REVIEW_SCORE}" ]; then + continue + fi + + # PR is merge-ready - queue for auto-merge + log "AUTO-MERGE: PR #${pr_number} (${pr_branch}) — score ${LATEST_SCORE}/100, CI passing" + + if gh pr merge "${pr_number}" --"${AUTO_MERGE_METHOD}" --auto --delete-branch 2>>"${LOG_FILE}"; then + log "AUTO-MERGE: Successfully queued merge for PR #${pr_number}" + if [ -z "${AUTO_MERGED_PRS}" ]; then + AUTO_MERGED_PRS="#${pr_number}" + else + AUTO_MERGED_PRS="${AUTO_MERGED_PRS},#${pr_number}" + fi + else + log "WARN: Auto-merge failed for PR #${pr_number}" + if [ -z "${AUTO_MERGE_FAILED_PRS}" ]; then + AUTO_MERGE_FAILED_PRS="#${pr_number}" + else + AUTO_MERGE_FAILED_PRS="${AUTO_MERGE_FAILED_PRS},#${pr_number}" + fi + fi + done < <(gh pr list --state open --json number,headRefName --jq '.[] | [.number, .headRefName] | @tsv' 2>/dev/null || true) +fi + REVIEWER_TOTAL_ELAPSED=$(( $(date +%s) - SCRIPT_START_TIME )) log "OUTCOME: exit_code=${EXIT_CODE} total_elapsed=${REVIEWER_TOTAL_ELAPSED}s prs=${PRS_NEEDING_WORK_CSV:-none} attempts=${ATTEMPTS_MADE}" emit_final_status "${EXIT_CODE}" "${PRS_NEEDING_WORK_CSV}" "${AUTO_MERGED_PRS}" "${AUTO_MERGE_FAILED_PRS}" "${ATTEMPTS_MADE}" "${FINAL_SCORE}" "${NO_CHANGES_NEEDED}" "${NO_CHANGES_PRS}"