fix: avoid thread-only review autofix dispatches by seonghobae · Pull Request #270 · ContextualWisdomLab/.github

seonghobae · 2026-07-01T06:14:52Z

Summary

stop PR review autofix dispatches that are triggered only by unresolved review threads
keep autofix limited to current-head OpenCode change requests that are classified as safe to mutate
extend scheduler self-test coverage for unresolved-thread-only PRs

Why

In ContextualWisdomLab/newsdom-api#253, the scheduler kept dispatching autofix work after the actual review issue had already been addressed. The only remaining signal was an unresolved reviewer thread. The approval/merge gates should enforce thread resolution, but an unresolved thread alone is too broad a signal to let an automated fixer mutate the PR branch.

Verification

py -3 scripts/ci/pr_review_fix_scheduler.py --self-test
git diff --check

Copilot

Pull request overview

This PR tightens the PR review autofix scheduler so it only dispatches autofix work when there is an autofixable OpenCode “changes requested” review on the current head, and not merely because there are unresolved review threads. It also extends the scheduler’s self-test coverage to validate “unresolved-thread-only” PR scenarios.

Changes:

Remove unresolved-review-thread counting from needs_autofix, limiting eligibility to current-head OpenCode change requests that are classified as safe to mutate.
Update inspect_pr skip messaging to reflect the narrower autofix trigger condition.
Extend --self-test cases to include unresolved-thread-only PRs (no reviews, unresolved threads present).
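
The narrowed trigger described in the changes above can be sketched roughly as follows. This is an illustration only, not the scheduler's actual code: the `Review` and `PullRequest` dataclasses, their field names, the `safe_to_mutate` flag, and the default reviewer login are assumptions made for the example. Only the name `needs_autofix` comes from the PR.

```python
from dataclasses import dataclass, field


@dataclass
class Review:
    """Hypothetical review record; field names are assumptions."""
    author: str
    state: str            # e.g. "CHANGES_REQUESTED" or "APPROVED"
    commit_sha: str
    safe_to_mutate: bool  # assumed output of the safety classification


@dataclass
class PullRequest:
    """Hypothetical PR snapshot used only for this sketch."""
    head_sha: str
    reviews: list[Review] = field(default_factory=list)
    unresolved_threads: int = 0


def needs_autofix(pr: PullRequest, reviewer: str = "opencode-agent") -> bool:
    """Dispatch only for a current-head change request that is safe to mutate.

    Unresolved threads are deliberately ignored here; the approval/merge
    gates remain responsible for them.
    """
    return any(
        r.author == reviewer
        and r.state == "CHANGES_REQUESTED"
        and r.commit_sha == pr.head_sha
        and r.safe_to_mutate
        for r in pr.reviews
    )


# Unresolved-thread-only PR: no reviews, threads present, so no dispatch.
thread_only = PullRequest(head_sha="abc", unresolved_threads=3)
# Current-head change request classified as safe, so dispatch.
actionable = PullRequest(
    head_sha="abc",
    reviews=[Review("opencode-agent", "CHANGES_REQUESTED", "abc", True)],
)
# Change request left on an older commit, so no dispatch.
stale = PullRequest(
    head_sha="abc",
    reviews=[Review("opencode-agent", "CHANGES_REQUESTED", "old", True)],
)
```

The `thread_only` case mirrors the "unresolved-thread-only" scenario that the extended self-test is said to cover.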


opencode-agent · 2026-07-01T07:06:55Z

OpenCode Review Overview

Head SHA: bc4a90fdcc9bc97eb3ede85cacb2e835abe4b3c8
Workflow run: 28508508411
Workflow attempt: 1
Gate result: REQUEST_CHANGES (approval step)

Pull request overview

OpenCode exhausted the configured model pool without a usable current-head review conclusion. This is not approval evidence, so the PR is blocked until a source-backed review can establish approval sufficiency or identify concrete fixes.

Findings

1. HIGH scripts/ci/pr_review_fix_scheduler.py:1 - OpenCode could not establish approval sufficiency

Problem: every configured model path failed to produce a usable current-head control block.
Root cause: model execution, timeout, export, normalization, or approval-gate validation did not complete after exponential retry across the configured model pool.
Impact: approving from deterministic check state alone would miss PR-intent mismatches, missing files, edge-case bugs, robustness gaps, UX/DX regressions, security issues, and CodeGraph-backed base/head flow changes.
Fix: rerun OpenCode after model availability recovers, or update the PR with the missing files, tests, docs, generated artifacts, and verification evidence needed for a source-backed review conclusion.
Regression test: keep the approval gate posting REQUEST_CHANGES, not APPROVE or check-only failure, when no model produces a valid current-head review.
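
For readers unfamiliar with the failure mode in this finding, a model-pool runner with capped exponential backoff and a total retry budget might look like the sketch below. It is an assumption-laden illustration rather than the logic of `run_opencode_review_model_pool.sh`: the function name, the `run_review` callback, and the one-second backoff base are hypothetical, and the result dictionary only loosely echoes the `model_pool=exhausted; selected_model=none` wording in the summary.

```python
import time
from typing import Callable, Optional, Sequence


def run_model_pool(
    candidates: Sequence[str],
    run_review: Callable[[str], Optional[str]],
    attempts_per_model: int = 1,
    backoff_base_seconds: float = 1.0,   # assumed starting delay
    backoff_max_seconds: float = 30.0,
    total_budget_seconds: float = 360.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Try each candidate until one returns a usable control block.

    run_review returns the control block text, or None on any failure
    (execution error, timeout, export or normalization problem).
    """
    start = clock()
    delay = backoff_base_seconds
    for model in candidates:
        for _ in range(attempts_per_model):
            # Stop early once the overall retry budget is spent.
            if clock() - start >= total_budget_seconds:
                return {"model_pool": "exhausted", "selected_model": None}
            block = run_review(model)
            if block:
                return {
                    "model_pool": "ok",
                    "selected_model": model,
                    "control_block": block,
                }
            # Double the wait after every failure, up to the cap.
            sleep(min(delay, backoff_max_seconds))
            delay *= 2
    return {"model_pool": "exhausted", "selected_model": None}
```

An exhausted result carries no review content, which is why the gate treats it as a blocker rather than as a pass.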

Summary

Result: REQUEST_CHANGES
Reason: coverage-evidence passed and peer GitHub Checks completed without failures, but no model produced a valid review control block.
Deterministic evidence checked but not used for approval: current-head changed-file evidence (.github/workflows/opencode-review.yml, scripts/ci/opencode_review_approve_gate.sh, scripts/ci/pr_review_fix_scheduler.py, scripts/ci/test_strix_quick_gate.sh, tests/test_opencode_agent_contract.py, tests/test_pr_review_fix_scheduler.py); coverage-evidence result success; peer checks from statusCheckRollup excluding this OpenCode check.
Model outcome: model_pool=exhausted; selected_model=none.
Head SHA: bc4a90fdcc9bc97eb3ede85cacb2e835abe4b3c8
Workflow run: 28508508411
Workflow attempt: 1

No PR approval was posted because model-output failure is not evidence that the PR has no blockers.
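
The rule stated here, that passing deterministic checks cannot stand in for a missing model review, can be expressed as a small decision function. The sketch below is hypothetical: `decide_review_event`, its parameters, and the shape of the control block are assumptions for illustration and are not taken from `opencode_review_approve_gate.sh`.

```python
from typing import Optional


def decide_review_event(
    control_block: Optional[dict],
    coverage_result: str,
    peer_checks_failed: bool,
) -> str:
    """Return "APPROVE" or "REQUEST_CHANGES" for the current head."""
    # Deterministic evidence is necessary but never sufficient on its own.
    if coverage_result != "success" or peer_checks_failed:
        return "REQUEST_CHANGES"
    # No usable model review: absence of known blockers is not approval.
    if control_block is None:
        return "REQUEST_CHANGES"
    if control_block.get("result") == "APPROVE":
        return "APPROVE"
    return "REQUEST_CHANGES"


# The situation in this comment: checks passed, but no model review exists.
exhausted_case = decide_review_event(None, "success", peer_checks_failed=False)
```

Under these assumptions `exhausted_case` is `"REQUEST_CHANGES"`, matching the gate result posted above.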

Changed-File Evidence Map

flowchart LR
  PR["PR changed files"] --> Evidence["OpenCode bounded evidence"]
  Evidence --> S1["Workflow: opencode-review.yml"]
  S1 --> I1["GitHub Actions review job"]
  I1 --> R1["Review risk: Workflow: opencode-review.yml"]
  R1 --> V1["actionlint plus required checks"]
  Evidence --> S2["CI script (3 files)"]
  S2 --> I2["review and security gate shell path"]
  I2 --> R2["Review risk: CI script (3 files)"]
  R2 --> V2["bash -n plus Strix self-test"]
  Evidence --> S3["Test (2 files)"]
  S3 --> I3["regression suite"]
  I3 --> R3["Review risk: Test (2 files)"]
  R3 --> V3["targeted test run"]

opencode-agent

Pull request overview

OpenCode reviewed the current-head bounded evidence and found no blocking issues.

Findings

No blocking findings.

Summary

Approval sufficiency: bounded evidence supplied affirmative approval evidence for changed files, coverage/docstring posture, risk surfaces, and current-head verification; approval is not based merely on the absence of known blockers.
Verification posture: CodeGraph evidence was initialized and bounded current-head evidence reviewed for changed-file evidence including scripts/ci/pr_review_fix_scheduler.py, tests/test_pr_review_fix_scheduler.py.
Linter/static: workflow/static review evidence is bounded by the current-head GitHub Checks gate and changed-file evidence.
TDD/regression: coverage execution evidence and focused changed hunks were reviewed from bounded-review-evidence.md.
Coverage: coverage execution evidence reports supported repository test suites passed.
Docstring coverage: coverage execution evidence reports configured repository docstring gates passed or docstring coverage was advisory.
DAG: CodeGraph/source-backed behavior map connects scripts/ci/pr_review_fix_scheduler.py to the affected review, runtime, or workflow path and required checks.
PoC/execution: coverage-evidence job executed on the current head and reported PASS.
DDD/domain: workflow and repository-governance invariants were reviewed against changed files in bounded evidence.
CDD/context: CodeGraph evidence, changed-file history, and focused hunks were reviewed from bounded-review-evidence.md.
Similar issues: changed-file history evidence was reviewed for comparable local precedents.
Claim/concept check: bounded evidence, repository source, current-head workflow evidence, and, where numeric, scientific, statistical, or literature-backed claims are affected, original-paper/formula evidence and parameter-recovery expectations were used for claims.
Standards search: standards and external-source checks are delegated to configured OpenCode web_search/Context7/DeepWiki sources when applicable; no evidence-backed standards blocker is present in bounded evidence.
Compatibility/convention: changed workflow/script conventions, object naming, and reserved-word safety for schema/API/config/code surfaces were checked in bounded evidence.
Breaking-change/backcompat: deployment evidence and changed-file history were checked for backward-compatibility risk.
Performance: changed surfaces were checked for performance risk in bounded evidence.
Developer experience: changed automation, review, test, setup, and maintenance surfaces were checked for helpful or obstructive DX impact in bounded evidence.
User experience: connected user, operator, API, CLI, documentation, review-comment, status-check, rendering, and workflow-reader behavior was checked for contradictions against code, docs, and tests in bounded evidence.
Visual/DOM: Playwright visual, DOM locator, ARIA snapshot, console, and responsive evidence were checked when a web UI surface was present; for non-web surfaces, API/CLI/log/docs/workflow interaction evidence was reviewed instead.
Accessibility/i18n: accessibility, localization, and human-readable text surfaces were checked where UI, CLI, API message, docs, logs, or review text changed.
Supply-chain/license: dependency, package, model, container, and external-tool changes were checked in bounded evidence.
Packaging: package, build, test, lint, and security contracts were checked in bounded evidence.
Security/privacy: workflow-token, review-gate, and repository-automation security/privacy boundaries were checked in bounded evidence.

Result: APPROVE
Reason: No material issues found after thorough review
Head SHA: c5351ec63e701192503b411cc5855d7ce70e5d4d
Workflow run: 28499423326
Workflow attempt: 1

Changed-File Evidence Map

flowchart LR
  PR["PR changed files"] --> Evidence["OpenCode bounded evidence"]
  Evidence --> S1["CI script: pr_review_fix_scheduler.py"]
  S1 --> I1["review and security gate shell path"]
  I1 --> R1["Review risk: CI script: pr_review_fix_scheduler.py"]
  R1 --> V1["bash -n plus Strix self-test"]
  Evidence --> S2["Test: test_pr_review_fix_scheduler.py"]
  S2 --> I2["regression suite"]
  I2 --> R2["Review risk: Test: test_pr_review_fix_scheduler.py"]
  R2 --> V2["targeted test run"]

…only-autofix # Conflicts: # .github/workflows/opencode-review.yml

github-actions

Pull request overview

OpenCode cannot approve yet because required coverage evidence did not pass.

Review outcome

1. HIGH .github/workflows/opencode-review.yml:1 - Coverage evidence did not prove required test/docstring evidence

Problem: The required coverage-evidence job result was failure, so OpenCode cannot establish approval sufficiency for this head.
Root cause: Automated approval is only valid when the same-head coverage-evidence job proves supported repository test suites passed and configured docstring gates passed or were advisory, or reports not applicable because no supported source files or package manifests exist. Missing, failed, skipped, unavailable, or unsupported-tooling test evidence is a blocker.
Fix: Install or configure the repository test/docstring evidence tooling when source files or package manifests exist, rerun the current-head coverage-evidence job, and approve only after it reports success with required evidence or explicit no-source not-applicable evidence.
Regression test: Keep the approval branch checking needs.coverage-evidence.result == success before posting APPROVE, and publish REQUEST_CHANGES when coverage-evidence blocker states such as cancelled, skipped, failed, unsupported-tooling, or below-100 evidence are present.
Result: REQUEST_CHANGES
Reason: coverage-evidence result was failure, so required test/docstring evidence was not proven for current head 8c1ae449e33036c19e4fe018f67e41be1db83d5d.
Head SHA: 8c1ae449e33036c19e4fe018f67e41be1db83d5d
Workflow run: 28507409257
Workflow attempt: 1
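
As a rough illustration of the blocker states named in the regression-test note, the check could be modelled as a function that returns a reason string whenever approval must be withheld. Everything in this sketch is an assumption: the function name, the parameter names, and in particular the reading of "below-100 evidence" as a percentage under 100.

```python
from typing import Optional


def coverage_blocker(
    result: str,
    evidence_percent: Optional[float] = None,
) -> Optional[str]:
    """Return a blocker reason, or None when approval may proceed.

    `result` stands for the coverage-evidence job result. Any value other
    than "success" (cancelled, skipped, failure, unsupported-tooling, ...)
    is treated as a blocker.
    """
    if result != "success":
        return f"coverage-evidence result was {result}"
    # Assumed interpretation: reported evidence must reach 100 percent.
    if evidence_percent is not None and evidence_percent < 100.0:
        return f"evidence below 100 ({evidence_percent:.1f})"
    return None
```

A non-`None` return would map to REQUEST_CHANGES, with the returned string feeding the `Reason:` line.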

Coverage Evidence

Head SHA: 8c1ae449e33036c19e4fe018f67e41be1db83d5d
Required test evidence: supported repository test suites must pass.
Required docstring evidence: repository-owned docstring gates must pass when configured; otherwise docstring coverage is advisory.

Python project dependencies (.)

Using CPython 3.12.3 interpreter at: /usr/bin/python3
Creating virtual environment at: .venv
Resolved 17 packages in 119ms
Downloading pygments (1.2MiB)
 Downloaded pygments
Prepared 13 packages in 110ms
Installed 13 packages in 12ms
 + attrs==26.1.0
 + click==8.4.2
 + colorama==0.4.6
 + coverage==7.14.3
 + iniconfig==2.3.0
 + interrogate==1.7.0
 + packaging==26.2
 + pluggy==1.6.0
 + py==1.11.0
 + pygments==2.20.0
 + pytest==9.1.1
 + pytest-cov==7.1.0
 + tabulate==0.10.0

Result: PASS

Python coverage with missing-line report (.)

============================= test session starts ==============================
platform linux -- Python 3.12.3, pytest-9.1.1, pluggy-1.6.0
rootdir: /home/runner/work/.github/.github/pr-head
configfile: pyproject.toml
plugins: cov-7.1.0
collected 159 items

tests/test_assert_opencode_reasoning_effort.py .......                   [  4%]
tests/test_noema_review_gate.py ..........                               [ 10%]
tests/test_opencode_agent_contract.py ...F.......                        [ 17%]
tests/test_opencode_review_normalize_output.py ......................... [ 33%]
                                                                         [ 33%]
tests/test_opencode_workflow_shell_syntax.py .                           [ 33%]
tests/test_pr_governance_audit_contract.py ..                            [ 35%]
tests/test_pr_review_fix_scheduler.py ...................                [ 47%]
tests/test_pr_review_fix_scheduler_coverage.py ..                        [ 48%]
tests/test_pr_review_merge_scheduler.py ................................ [ 68%]
.............................                                            [ 86%]
tests/test_render_opencode_prompt_template.py ....                       [ 89%]
tests/test_review_execution_contracts.py ..                              [ 90%]
tests/test_sandboxed_verify.py .........                                 [ 96%]
tests/test_sandboxed_web_e2e.py ......                                   [100%]

=================================== FAILURES ===================================
___________ test_workflow_provisions_sandbox_tool_and_reviewer_agent ___________

    def test_workflow_provisions_sandbox_tool_and_reviewer_agent():
        """Guard the runtime OpenCode workspace, not only repo-local config."""
        workflow = Path(".github/workflows/opencode-review.yml").read_text(
            encoding="utf-8"
        )
    
        assert "code-reviewer-prompt.md" in workflow
        assert "sandboxed_verify.py" in workflow
        assert "sandboxed_web_e2e.py" in workflow
        assert "review_execution_contracts.py" in workflow
        assert "SANDBOXED_VERIFY_RESULT" in workflow
        assert "SANDBOXED_WEB_E2E_RESULT" in workflow
        assert "Docker Compose, devcontainer, Nix, or temporary package-install sandbox" in workflow
        assert "scientific, statistical, simulation" in workflow
        assert "skewed true" in workflow
        assert "object naming" in workflow
        assert "connected code paths, rendering paths" in workflow
        assert "CHECK_LOOKUP_GH_TOKEN" in workflow
        assert "retrying with workflow github token" in workflow
        assert 'review_write_token="$GH_TOKEN"' in workflow
        assert 'review_write_token="$OPENCODE_APP_TOKEN"' in workflow
        assert 'review_write_token="$CHECK_LOOKUP_GH_TOKEN"' in workflow
        assert 'review_write_token="${OPENCODE_APP_TOKEN:-$GH_TOKEN}"' not in workflow
        assert "Review execution contracts" in workflow
        assert "Accessibility/i18n:" in workflow
        assert "Supply-chain/license:" in workflow
        assert "Packaging:" in workflow
        assert 'gsub("`"; "\'")' not in workflow
        assert 'gsub("`"; "&apos;")' in workflow
        assert '"code-reviewer"' in workflow
        assert workflow.count('"reasoningEffort": "high"') >= 10
        assert '"task": "allow"' in workflow
        assert 'cat >"$prompt_file" <<EOF' not in workflow
        assert 'cat >"$prompt_file" <<\'EOF\'' not in workflow
        assert "Run OpenCode PR Review model pool" in workflow
        assert "opencode_review_model_pool" in workflow
        assert "run_opencode_review_model_pool.sh" in workflow
        assert "OPENCODE_MODEL_CANDIDATES" in workflow
        model_pool_runner = Path("scripts/ci/run_opencode_review_model_pool.sh").read_text(encoding="utf-8")
        assert "assert_reasoning_effort_for_candidate" in model_pool_runner
        assert "assert_opencode_reasoning_effort.py" in model_pool_runner
        assert "--config opencode.jsonc" in model_pool_runner
        reasoning_effort_guard = Path("scripts/ci/assert_opencode_reasoning_effort.py").read_text(encoding="utf-8")
        assert 'options.reasoningEffort=high' in reasoning_effort_guard
        assert 'variants.high.reasoningEffort=high' in reasoning_effort_guard
        assert "deepseek/deepseek-r1" in reasoning_effort_guard
        assert "--config \"$OPENCODE_REVIEW_WORKDIR/opencode.jsonc\"" in workflow
        assert 'timeout --kill-after=15s "${export_timeout_seconds}s" opencode export' in model_pool_runner
        assert "session export did not complete within %ss" in model_pool_runner
        assert "Read and follow the complete review contract" in model_pool_runner
        assert "compact launcher as a reduced review policy" in model_pool_runner
        assert "is_context_overflow_failure" in model_pool_runner
        assert "tokens_limit_reached" in model_pool_runner
        assert "skipping remaining attempts for this model" in model_pool_runner
        assert "approve_low_risk_review_fallback_after_model_exhaustion" not in workflow
        assert "changed_file_is_low_risk_review_fallback" not in workflow
        assert "production source 또는 package manifest 변경이 없습니다" not in workflow
        assert "request_changes_for_coverage_evidence_failure" in workflow
        assert '"## Review outcome"' in workflow
        assert '"## Check outcome"' not in workflow
        assert "publish REQUEST_CHANGES when coverage-evidence blocker states" in workflow
        assert 'timeout-minutes: 75' in workflow
        assert re.search(r"Run OpenCode PR Review model pool[\s\S]{0,240}timeout-minutes: 20", workflow)
        assert 'APPROVAL_CHECK_WAIT_ATTEMPTS: "81"' in workflow
        assert 'APPROVAL_CHECK_WAIT_SLEEP_SECONDS: "30"' in workflow
        assert 'OPENCODE_MODEL_CANDIDATES: "github-models/openai/gpt-5-nano"' in workflow
        assert 'OPENCODE_MODEL_ATTEMPTS: "1"' in workflow
        assert 'OPENCODE_RUN_TIMEOUT_SECONDS: "240"' in workflow
        assert 'OPENCODE_EXPORT_TIMEOUT_SECONDS: "120"' in workflow
        assert 'OPENCODE_TOTAL_RETRY_BUDGET_SECONDS: "360"' in workflow
        assert 'OPENCODE_BACKOFF_MAX_SECONDS: "30"' in workflow
        assert "${{ runner.temp }}/opencode-review-model-pool.md" in workflow
>       assert re.search(r'check-runs" \\\n\s+-f per_page=100 \\\n\s+--paginate \\\n\s+--slurp \|\n\s+jq -r "\$jq_filter"', workflow)
E       assert None
E        +  where None = <function search at 0x7f26a8c74220>('check-runs" \\\\\\n\\s+-f per_page=100 \\\\\\n\\s+--paginate \\\\\\n\\s+--slurp \\|\\n\\s+jq -r "\\$jq_filter"', 'name: Required OpenCode Review\n\non:\n  pull_request_target:\n    types: [opened, synchronize, reopened, ready_for_r... The scheduled and PR-event scheduler paths remain authoritative.\\n\' "$GH_REPOSITORY" "$base_branch"\n          fi\n')
E        +    where <function search at 0x7f26a8c74220> = re.search

tests/test_opencode_agent_contract.py:224: AssertionError
=============================== warnings summary ===============================
tests/test_assert_opencode_reasoning_effort.py::test_module_entrypoint_success
  <frozen runpy>:128: RuntimeWarning: 'scripts.ci.assert_opencode_reasoning_effort' found in sys.modules after import of package 'scripts.ci', but prior to execution of 'scripts.ci.assert_opencode_reasoning_effort'; this may result in unpredictable behaviour

tests/test_render_opencode_prompt_template.py::test_module_entrypoint
  <frozen runpy>:128: RuntimeWarning: 'scripts.ci.render_opencode_prompt_template' found in sys.modules after import of package 'scripts.ci', but prior to execution of 'scripts.ci.render_opencode_prompt_template'; this may result in unpredictable behaviour

tests/test_review_execution_contracts.py::test_discovers_package_managers_java_r_json_and_main
  <frozen runpy>:128: RuntimeWarning: 'scripts.ci.review_execution_contracts' found in sys.modules after import of package 'scripts.ci', but prior to execution of 'scripts.ci.review_execution_contracts'; this may result in unpredictable behaviour

tests/test_sandboxed_verify.py::test_module_main_entrypoint
  <frozen runpy>:128: RuntimeWarning: 'scripts.ci.sandboxed_verify' found in sys.modules after import of package 'scripts.ci', but prior to execution of 'scripts.ci.sandboxed_verify'; this may result in unpredictable behaviour

tests/test_sandboxed_web_e2e.py::test_module_import_and_main_entrypoint
  <frozen runpy>:128: RuntimeWarning: 'scripts.ci.sandboxed_web_e2e' found in sys.modules after import of package 'scripts.ci', but prior to execution of 'scripts.ci.sandboxed_web_e2e'; this may result in unpredictable behaviour

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED tests/test_opencode_agent_contract.py::test_workflow_provisions_sandbox_tool_and_reviewer_agent - assert None
 +  where None = <function search at 0x7f26a8c74220>('check-runs" \\\\\\n\\s+-f per_page=100 \\\\\\n\\s+--paginate \\\\\\n\\s+--slurp \\|\\n\\s+jq -r "\\$jq_filter"', 'name: Required OpenCode Review\n\non:\n  pull_request_target:\n    types: [opened, synchronize, reopened, ready_for_r... The scheduled and PR-event scheduler paths remain authoritative.\\n\' "$GH_REPOSITORY" "$base_branch"\n          fi\n')
 +    where <function search at 0x7f26a8c74220> = re.search
================== 1 failed, 158 passed, 5 warnings in 5.58s ===================

Result: FAIL (exit 1)
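
To make the failing assertion easier to read, the snippet below runs the same regular expression against two shell fragments. Both fragments are hypothetical and are not copied from `opencode-review.yml`; the `OWNER/REPO/SHA` path is a placeholder. The point is only to show the exact multi-line shape the pattern requires: four continuation lines in a fixed order, ending in a pipe to `jq -r "$jq_filter"`.

```python
import re

# Pattern copied from the failing assertion in the log above.
PATTERN = re.compile(
    r'check-runs" \\\n\s+-f per_page=100 \\\n\s+--paginate \\\n\s+--slurp \|\n\s+jq -r "\$jq_filter"'
)

# Hypothetical fragment with the shape the test expects.
MATCHING = r'''
gh api "repos/OWNER/REPO/commits/SHA/check-runs" \
  -f per_page=100 \
  --paginate \
  --slurp |
  jq -r "$jq_filter"
'''

# Hypothetical fragment without the --slurp line; the pattern rejects it.
NON_MATCHING = r'''
gh api "repos/OWNER/REPO/commits/SHA/check-runs" \
  -f per_page=100 \
  --paginate |
  jq -r "$jq_filter"
'''

matches = PATTERN.search(MATCHING) is not None
rejects = PATTERN.search(NON_MATCHING) is None
```

Because the pattern pins down spacing, line order, and the trailing backslashes, any reordering or reflow of that command in the workflow file is enough to make `re.search` return `None`, as it did in this run.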

Python docstring coverage advisory

RESULT: PASSED (minimum: 100.0%, actual: 100.0%)

Result: PASS

Coverage Decision

Result: FAIL
Test evidence: not proven passing
Docstring evidence: not proven passing when configured
Failure count: 1
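
The way the three step results above roll up into a single decision can be sketched as follows. The `StepResult` type, the `required` flag, and the choice to treat the advisory docstring step as non-blocking are assumptions for illustration; the step names and pass/fail values are taken from the log.

```python
from dataclasses import dataclass


@dataclass
class StepResult:
    """Hypothetical record for one coverage-evidence step."""
    name: str
    passed: bool
    required: bool = True  # advisory steps are assumed not to block


def coverage_decision(steps: list[StepResult]) -> dict:
    """Fail the decision if any required step failed."""
    failures = [s for s in steps if s.required and not s.passed]
    return {
        "result": "FAIL" if failures else "PASS",
        "failure_count": len(failures),
    }


steps = [
    StepResult("Python project dependencies", passed=True),
    StepResult("Python coverage with missing-line report", passed=False),
    StepResult("Python docstring coverage advisory", passed=True, required=False),
]
decision = coverage_decision(steps)
```

With the single failed pytest step, `decision` comes out as a FAIL with a failure count of 1, consistent with the Coverage Decision block.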

Changed-File Evidence Map

flowchart LR
  PR["PR changed files"] --> Evidence["OpenCode bounded evidence"]
  Evidence --> S1["Workflow: opencode-review.yml"]
  S1 --> I1["GitHub Actions review job"]
  I1 --> R1["Review risk: Workflow: opencode-review.yml"]
  R1 --> V1["actionlint plus required checks"]
  Evidence --> S2["CI script (2 files)"]
  S2 --> I2["review and security gate shell path"]
  I2 --> R2["Review risk: CI script (2 files)"]
  R2 --> V2["bash -n plus Strix self-test"]
  Evidence --> S3["Test: test_pr_review_fix_scheduler.py"]
  S3 --> I3["regression suite"]
  I3 --> R3["Review risk: Test: test_pr_review_fix_scheduler.py"]
  R3 --> V3["targeted test run"]

github-actions

Pull request overview

OpenCode cannot approve yet because required coverage evidence did not pass.

Review outcome

1. HIGH .github/workflows/opencode-review.yml:1 - Coverage evidence did not prove required test/docstring evidence

Problem: The required coverage-evidence job result was failure, so OpenCode cannot establish approval sufficiency for this head.
Root cause: Automated approval is only valid when the same-head coverage-evidence job proves supported repository test suites passed and configured docstring gates passed or were advisory, or reports not applicable because no supported source files or package manifests exist. Missing, failed, skipped, unavailable, or unsupported-tooling test evidence is a blocker.
Fix: Install or configure the repository test/docstring evidence tooling when source files or package manifests exist, rerun the current-head coverage-evidence job, and approve only after it reports success with required evidence or explicit no-source not-applicable evidence.
Regression test: Keep the approval branch checking needs.coverage-evidence.result == success before posting APPROVE, and publish REQUEST_CHANGES when coverage-evidence blocker states such as cancelled, skipped, failed, unsupported-tooling, or below-100 evidence are present.
Result: REQUEST_CHANGES
Reason: coverage-evidence result was failure, so required test/docstring evidence was not proven for current head 8c1ae449e33036c19e4fe018f67e41be1db83d5d.
Head SHA: 8c1ae449e33036c19e4fe018f67e41be1db83d5d
Workflow run: 28507409257
Workflow attempt: 2

Coverage evidence

Coverage Evidence

Head SHA: 8c1ae449e33036c19e4fe018f67e41be1db83d5d
Required test evidence: supported repository test suites must pass.
Required docstring evidence: repository-owned docstring gates must pass when configured; otherwise docstring coverage is advisory.

Python project dependencies (.)

Using CPython 3.12.3 interpreter at: /usr/bin/python3
Creating virtual environment at: .venv
Resolved 17 packages in 119ms
Downloading pygments (1.2MiB)
 Downloaded pygments
Prepared 13 packages in 110ms
Installed 13 packages in 12ms
 + attrs==26.1.0
 + click==8.4.2
 + colorama==0.4.6
 + coverage==7.14.3
 + iniconfig==2.3.0
 + interrogate==1.7.0
 + packaging==26.2
 + pluggy==1.6.0
 + py==1.11.0
 + pygments==2.20.0
 + pytest==9.1.1
 + pytest-cov==7.1.0
 + tabulate==0.10.0

Result: PASS

Python coverage with missing-line report (.)

============================= test session starts ==============================
platform linux -- Python 3.12.3, pytest-9.1.1, pluggy-1.6.0
rootdir: /home/runner/work/.github/.github/pr-head
configfile: pyproject.toml
plugins: cov-7.1.0
collected 159 items

tests/test_assert_opencode_reasoning_effort.py .......                   [  4%]
tests/test_noema_review_gate.py ..........                               [ 10%]
tests/test_opencode_agent_contract.py ...F.......                        [ 17%]
tests/test_opencode_review_normalize_output.py ......................... [ 33%]
                                                                         [ 33%]
tests/test_opencode_workflow_shell_syntax.py .                           [ 33%]
tests/test_pr_governance_audit_contract.py ..                            [ 35%]
tests/test_pr_review_fix_scheduler.py ...................                [ 47%]
tests/test_pr_review_fix_scheduler_coverage.py ..                        [ 48%]
tests/test_pr_review_merge_scheduler.py ................................ [ 68%]
.............................                                            [ 86%]
tests/test_render_opencode_prompt_template.py ....                       [ 89%]
tests/test_review_execution_contracts.py ..                              [ 90%]
tests/test_sandboxed_verify.py .........                                 [ 96%]
tests/test_sandboxed_web_e2e.py ......                                   [100%]

=================================== FAILURES ===================================
___________ test_workflow_provisions_sandbox_tool_and_reviewer_agent ___________

    def test_workflow_provisions_sandbox_tool_and_reviewer_agent():
        """Guard the runtime OpenCode workspace, not only repo-local config."""
        workflow = Path(".github/workflows/opencode-review.yml").read_text(
            encoding="utf-8"
        )
    
        assert "code-reviewer-prompt.md" in workflow
        assert "sandboxed_verify.py" in workflow
        assert "sandboxed_web_e2e.py" in workflow
        assert "review_execution_contracts.py" in workflow
        assert "SANDBOXED_VERIFY_RESULT" in workflow
        assert "SANDBOXED_WEB_E2E_RESULT" in workflow
        assert "Docker Compose, devcontainer, Nix, or temporary package-install sandbox" in workflow
        assert "scientific, statistical, simulation" in workflow
        assert "skewed true" in workflow
        assert "object naming" in workflow
        assert "connected code paths, rendering paths" in workflow
        assert "CHECK_LOOKUP_GH_TOKEN" in workflow
        assert "retrying with workflow github token" in workflow
        assert 'review_write_token="$GH_TOKEN"' in workflow
        assert 'review_write_token="$OPENCODE_APP_TOKEN"' in workflow
        assert 'review_write_token="$CHECK_LOOKUP_GH_TOKEN"' in workflow
        assert 'review_write_token="${OPENCODE_APP_TOKEN:-$GH_TOKEN}"' not in workflow
        assert "Review execution contracts" in workflow
        assert "Accessibility/i18n:" in workflow
        assert "Supply-chain/license:" in workflow
        assert "Packaging:" in workflow
        assert 'gsub("`"; "\'")' not in workflow
        assert 'gsub("`"; "&apos;")' in workflow
        assert '"code-reviewer"' in workflow
        assert workflow.count('"reasoningEffort": "high"') >= 10
        assert '"task": "allow"' in workflow
        assert 'cat >"$prompt_file" <<EOF' not in workflow
        assert 'cat >"$prompt_file" <<\'EOF\'' not in workflow
        assert "Run OpenCode PR Review model pool" in workflow
        assert "opencode_review_model_pool" in workflow
        assert "run_opencode_review_model_pool.sh" in workflow
        assert "OPENCODE_MODEL_CANDIDATES" in workflow
        model_pool_runner = Path("scripts/ci/run_opencode_review_model_pool.sh").read_text(encoding="utf-8")
        assert "assert_reasoning_effort_for_candidate" in model_pool_runner
        assert "assert_opencode_reasoning_effort.py" in model_pool_runner
        assert "--config opencode.jsonc" in model_pool_runner
        reasoning_effort_guard = Path("scripts/ci/assert_opencode_reasoning_effort.py").read_text(encoding="utf-8")
        assert 'options.reasoningEffort=high' in reasoning_effort_guard
        assert 'variants.high.reasoningEffort=high' in reasoning_effort_guard
        assert "deepseek/deepseek-r1" in reasoning_effort_guard
        assert "--config \"$OPENCODE_REVIEW_WORKDIR/opencode.jsonc\"" in workflow
        assert 'timeout --kill-after=15s "${export_timeout_seconds}s" opencode export' in model_pool_runner
        assert "session export did not complete within %ss" in model_pool_runner
        assert "Read and follow the complete review contract" in model_pool_runner
        assert "compact launcher as a reduced review policy" in model_pool_runner
        assert "is_context_overflow_failure" in model_pool_runner
        assert "tokens_limit_reached" in model_pool_runner
        assert "skipping remaining attempts for this model" in model_pool_runner
        assert "approve_low_risk_review_fallback_after_model_exhaustion" not in workflow
        assert "changed_file_is_low_risk_review_fallback" not in workflow
        assert "production source 또는 package manifest 변경이 없습니다" not in workflow
        assert "request_changes_for_coverage_evidence_failure" in workflow
        assert '"## Review outcome"' in workflow
        assert '"## Check outcome"' not in workflow
        assert "publish REQUEST_CHANGES when coverage-evidence blocker states" in workflow
        assert 'timeout-minutes: 75' in workflow
        assert re.search(r"Run OpenCode PR Review model pool[\s\S]{0,240}timeout-minutes: 20", workflow)
        assert 'APPROVAL_CHECK_WAIT_ATTEMPTS: "81"' in workflow
        assert 'APPROVAL_CHECK_WAIT_SLEEP_SECONDS: "30"' in workflow
        assert 'OPENCODE_MODEL_CANDIDATES: "github-models/openai/gpt-5-nano"' in workflow
        assert 'OPENCODE_MODEL_ATTEMPTS: "1"' in workflow
        assert 'OPENCODE_RUN_TIMEOUT_SECONDS: "240"' in workflow
        assert 'OPENCODE_EXPORT_TIMEOUT_SECONDS: "120"' in workflow
        assert 'OPENCODE_TOTAL_RETRY_BUDGET_SECONDS: "360"' in workflow
        assert 'OPENCODE_BACKOFF_MAX_SECONDS: "30"' in workflow
        assert "${{ runner.temp }}/opencode-review-model-pool.md" in workflow
>       assert re.search(r'check-runs" \\\n\s+-f per_page=100 \\\n\s+--paginate \\\n\s+--slurp \|\n\s+jq -r "\$jq_filter"', workflow)
E       assert None
E        +  where None = <function search at 0x7f26a8c74220>('check-runs" \\\\\\n\\s+-f per_page=100 \\\\\\n\\s+--paginate \\\\\\n\\s+--slurp \\|\\n\\s+jq -r "\\$jq_filter"', 'name: Required OpenCode Review\n\non:\n  pull_request_target:\n    types: [opened, synchronize, reopened, ready_for_r... The scheduled and PR-event scheduler paths remain authoritative.\\n\' "$GH_REPOSITORY" "$base_branch"\n          fi\n')
E        +    where <function search at 0x7f26a8c74220> = re.search

tests/test_opencode_agent_contract.py:224: AssertionError
=============================== warnings summary ===============================
tests/test_assert_opencode_reasoning_effort.py::test_module_entrypoint_success
  <frozen runpy>:128: RuntimeWarning: 'scripts.ci.assert_opencode_reasoning_effort' found in sys.modules after import of package 'scripts.ci', but prior to execution of 'scripts.ci.assert_opencode_reasoning_effort'; this may result in unpredictable behaviour

tests/test_render_opencode_prompt_template.py::test_module_entrypoint
  <frozen runpy>:128: RuntimeWarning: 'scripts.ci.render_opencode_prompt_template' found in sys.modules after import of package 'scripts.ci', but prior to execution of 'scripts.ci.render_opencode_prompt_template'; this may result in unpredictable behaviour

tests/test_review_execution_contracts.py::test_discovers_package_managers_java_r_json_and_main
  <frozen runpy>:128: RuntimeWarning: 'scripts.ci.review_execution_contracts' found in sys.modules after import of package 'scripts.ci', but prior to execution of 'scripts.ci.review_execution_contracts'; this may result in unpredictable behaviour

tests/test_sandboxed_verify.py::test_module_main_entrypoint
  <frozen runpy>:128: RuntimeWarning: 'scripts.ci.sandboxed_verify' found in sys.modules after import of package 'scripts.ci', but prior to execution of 'scripts.ci.sandboxed_verify'; this may result in unpredictable behaviour

tests/test_sandboxed_web_e2e.py::test_module_import_and_main_entrypoint
  <frozen runpy>:128: RuntimeWarning: 'scripts.ci.sandboxed_web_e2e' found in sys.modules after import of package 'scripts.ci', but prior to execution of 'scripts.ci.sandboxed_web_e2e'; this may result in unpredictable behaviour

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED tests/test_opencode_agent_contract.py::test_workflow_provisions_sandbox_tool_and_reviewer_agent - assert None
 +  where None = <function search at 0x7f26a8c74220>('check-runs" \\\\\\n\\s+-f per_page=100 \\\\\\n\\s+--paginate \\\\\\n\\s+--slurp \\|\\n\\s+jq -r "\\$jq_filter"', 'name: Required OpenCode Review\n\non:\n  pull_request_target:\n    types: [opened, synchronize, reopened, ready_for_r... The scheduled and PR-event scheduler paths remain authoritative.\\n\' "$GH_REPOSITORY" "$base_branch"\n          fi\n')
 +    where <function search at 0x7f26a8c74220> = re.search
================== 1 failed, 158 passed, 5 warnings in 5.58s ===================

Result: FAIL (exit 1)
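
The failing assertion is a text-level contract: it searches the workflow source for one specific multi-line `gh api ... check-runs` pipeline. The sketch below shows how that pattern behaves. The pattern is copied from the traceback; both workflow fragments and the helper name `has_check_runs_pipeline` are hypothetical stand-ins, not the contents of opencode-review.yml. The log above does not show which part of the real workflow drifted, so the second fragment is only one possible way for the match to fail.

```python
import re

# Pattern copied from the failing assertion in tests/test_opencode_agent_contract.py.
CHECK_RUNS_PIPELINE = re.compile(
    r'check-runs" \\\n\s+-f per_page=100 \\\n\s+--paginate \\\n\s+--slurp \|\n\s+jq -r "\$jq_filter"'
)

# Hypothetical fragment laid out the way the pattern expects.
matching_fragment = (
    'gh api "repos/$GH_REPOSITORY/commits/$head_sha/check-runs" \\\n'
    "  -f per_page=100 \\\n"
    "  --paginate \\\n"
    "  --slurp |\n"
    '  jq -r "$jq_filter"\n'
)

# Hypothetical fragment where the --slurp line is gone (assumed drift, not confirmed).
drifted_fragment = (
    'gh api "repos/$GH_REPOSITORY/commits/$head_sha/check-runs" \\\n'
    "  -f per_page=100 \\\n"
    "  --paginate |\n"
    '  jq -r "$jq_filter"\n'
)


def has_check_runs_pipeline(workflow_text: str) -> bool:
    """Return True when the exact pipeline layout is present in the text."""
    return CHECK_RUNS_PIPELINE.search(workflow_text) is not None
```

Because the pattern pins flag order and line continuations, any reshaping of the command makes `re.search` return `None`, even when the command still behaves the same. Either the workflow or the test pattern has to change for the contract to hold again.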

Python docstring coverage advisory

RESULT: PASSED (minimum: 100.0%, actual: 100.0%)

Result: PASS

Coverage Decision

Result: FAIL
Test evidence: not proven passing
Docstring evidence: not proven passing when configured
Failure count: 1

Changed-File Evidence Map

flowchart LR
  PR["PR changed files"] --> Evidence["OpenCode bounded evidence"]
  Evidence --> S1["Workflow: opencode-review.yml"]
  S1 --> I1["GitHub Actions review job"]
  I1 --> R1["Review risk: Workflow: opencode-review.yml"]
  R1 --> V1["actionlint plus required checks"]
  Evidence --> S2["CI script (2 files)"]
  S2 --> I2["review and security gate shell path"]
  I2 --> R2["Review risk: CI script (2 files)"]
  R2 --> V2["bash -n plus Strix self-test"]
  Evidence --> S3["Test: test_pr_review_fix_scheduler.py"]
  S3 --> I3["regression suite"]
  I3 --> R3["Review risk: Test: test_pr_review_fix_scheduler.py"]
  R3 --> V3["targeted test run"]

github-actions

Pull request overview

OpenCode exhausted the configured model pool without a usable current-head review conclusion. This is not approval evidence, so the PR is blocked until a source-backed review can establish approval sufficiency or identify concrete fixes.

Findings

1. HIGH scripts/ci/pr_review_fix_scheduler.py:1 - OpenCode could not establish approval sufficiency

Problem: every configured model path failed to produce a usable current-head control block.
Root cause: model execution, timeout, export, normalization, or approval-gate validation did not complete after exponential retry across the configured model pool.
Impact: approving from deterministic check state alone would miss PR-intent mismatches, missing files, edge-case bugs, robustness gaps, UX/DX regressions, security issues, and CodeGraph-backed base/head flow changes.
Fix: rerun OpenCode after model availability recovers, or update the PR with the missing files, tests, docs, generated artifacts, and verification evidence needed for a source-backed review conclusion.
Regression test: keep the approval gate posting REQUEST_CHANGES, not APPROVE or check-only failure, when no model produces a valid current-head review.
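
The root cause above mentions exponential retry across the model pool. As a rough illustration only, a capped backoff schedule with a total budget might look like the sketch below. The cap of 30 and budget of 360 in the usage note mirror `OPENCODE_BACKOFF_MAX_SECONDS` and `OPENCODE_TOTAL_RETRY_BUDGET_SECONDS` from the contract test; the function name, the base delay, and the doubling rule are assumptions, and the real retry loop (which presumably also counts model run time against the budget) is not shown in this thread.

```python
def backoff_delays(attempts, base_seconds, max_seconds, budget_seconds):
    """Return the sleep before each retry.

    Delays double from base_seconds, are capped at max_seconds, and stop
    as soon as the next sleep would exceed budget_seconds in total.
    """
    delays = []
    spent = 0
    for retry in range(attempts):
        delay = min(base_seconds * (2 ** retry), max_seconds)
        if spent + delay > budget_seconds:
            break  # budget exhausted: give up rather than overshoot
        delays.append(delay)
        spent += delay
    return delays
```

With a hypothetical base of 5 seconds, `backoff_delays(5, 5, 30, 360)` yields sleeps of 5, 10, 20, 30, and 30 seconds; a tighter budget of 40 seconds stops the schedule after the third sleep.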

Summary

Result: REQUEST_CHANGES
Reason: coverage-evidence passed and peer GitHub Checks completed without failures, but no model produced a valid review control block.
Deterministic evidence checked but not used for approval: current-head changed-file evidence (.github/workflows/opencode-review.yml, scripts/ci/opencode_review_approve_gate.sh, scripts/ci/pr_review_fix_scheduler.py, scripts/ci/test_strix_quick_gate.sh, tests/test_opencode_agent_contract.py, tests/test_pr_review_fix_scheduler.py); coverage-evidence result success; peer checks from statusCheckRollup excluding this OpenCode check.
Model outcome: model_pool=exhausted; selected_model=none.
Head SHA: bc4a90fdcc9bc97eb3ede85cacb2e835abe4b3c8
Workflow run: 28508508411
Workflow attempt: 1

No PR approval was posted because model-output failure is not evidence that the PR has no blockers.
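
The rule stated in this summary is that passing deterministic checks never substitute for a model verdict. A minimal sketch of that decision, assuming a hypothetical `GateInputs` record and `gate_result` function (neither name comes from the repository, whose gate is a shell script not shown here):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GateInputs:
    """Hypothetical inputs to the approval gate."""
    coverage_passed: bool
    peer_checks_passed: bool
    # "APPROVE", "REQUEST_CHANGES", or None when the model pool is exhausted.
    model_verdict: Optional[str]


def gate_result(inputs: GateInputs) -> str:
    """Return the review state the gate should post."""
    if inputs.model_verdict is None:
        # No usable control block: green checks alone are not approval evidence.
        return "REQUEST_CHANGES"
    if not (inputs.coverage_passed and inputs.peer_checks_passed):
        # Failing deterministic evidence blocks approval as well.
        return "REQUEST_CHANGES"
    return inputs.model_verdict
```

This matches the outcome reported above: coverage evidence and peer checks succeeded, `model_pool=exhausted`, and the result is still REQUEST_CHANGES.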

Changed-File Evidence Map

flowchart LR
  PR["PR changed files"] --> Evidence["OpenCode bounded evidence"]
  Evidence --> S1["Workflow: opencode-review.yml"]
  S1 --> I1["GitHub Actions review job"]
  I1 --> R1["Review risk: Workflow: opencode-review.yml"]
  R1 --> V1["actionlint plus required checks"]
  Evidence --> S2["CI script (3 files)"]
  S2 --> I2["review and security gate shell path"]
  I2 --> R2["Review risk: CI script (3 files)"]
  R2 --> V2["bash -n plus Strix self-test"]
  Evidence --> S3["Test (2 files)"]
  S3 --> I3["regression suite"]
  I3 --> R3["Review risk: Test (2 files)"]
  R3 --> V3["targeted test run"]

github-actions · 2026-07-01T09:56:43Z

    needs_fix, reasons = needs_autofix(pr)
    if not needs_fix:
-        return "skip", ("no current-head change request or active unresolved review thread",)
+        return "skip", ("no autofixable current-head OpenCode change request",)


HIGH OpenCode could not establish approval sufficiency

Problem: the model pool exhausted without a valid current-head review control block, so this changed line cannot be approved from deterministic check state alone.

Impact: PR-intent mismatches, missing files, robustness bugs, UX/DX regressions, and CodeGraph-backed flow changes could be missed.

Fix: rerun OpenCode after model availability recovers, or add the missing source/test/docs/generated verification evidence needed for a source-backed approval.

Verification: rerun the OpenCode Review workflow and confirm it emits APPROVE or source-backed REQUEST_CHANGES for this head SHA.
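
For context on the changed line, the narrowed trigger described in the PR summary could be sketched as follows. Only the `needs_autofix` call shape, the `"skip"` return, and the skip message come from the diff; the PR dictionary keys (`headRefOid`, `reviews`, `reviewThreads`, `safe_to_mutate`, and so on) and the `"dispatch"` return value are assumptions made for illustration, not the scheduler's actual data model.

```python
from typing import Any, Dict, Tuple


def needs_autofix(pr: Dict[str, Any]) -> Tuple[bool, Tuple[str, ...]]:
    """Return (eligible, reasons) using only current-head OpenCode change requests."""
    head = pr.get("headRefOid")
    reasons = []
    for review in pr.get("reviews", []):
        if (
            review.get("author") == "opencode-agent"
            and review.get("state") == "CHANGES_REQUESTED"
            and review.get("commit") == head
            and review.get("safe_to_mutate", False)
        ):
            reasons.append(f"autofixable OpenCode change request on {head}")
    # Unresolved review threads are deliberately not counted here; per the PR
    # description they are left to the approval and merge gates.
    return bool(reasons), tuple(reasons)


def inspect_pr(pr: Dict[str, Any]) -> Tuple[str, Tuple[str, ...]]:
    """Decide whether to dispatch autofix work for one PR."""
    needs_fix, reasons = needs_autofix(pr)
    if not needs_fix:
        return "skip", ("no autofixable current-head OpenCode change request",)
    return "dispatch", reasons
```

Under these assumptions, a PR with unresolved threads but no reviews is skipped, which is the unresolved-thread-only case the extended self-test is meant to cover.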

seonghobae · 2026-07-01T10:42:43Z

Pull request overview

OpenCode exhausted the configured model pool without a usable current-head review conclusion. This is not approval evidence, so the PR is blocked until a source-backed review can establish approval sufficiency or identify concrete fixes.

Findings

1. HIGH scripts/ci/pr_review_fix_scheduler.py:1 - OpenCode could not establish approval sufficiency

Problem: every configured model path failed to produce a usable current-head control block.

Root cause: model execution, timeout, export, normalization, or approval-gate validation did not complete after exponential retry across the configured model pool.

Impact: approving from deterministic check state alone would miss PR-intent mismatches, missing files, edge-case bugs, robustness gaps, UX/DX regressions, security issues, and CodeGraph-backed base/head flow changes.

Fix: rerun OpenCode after model availability recovers, or update the PR with the missing files, tests, docs, generated artifacts, and verification evidence needed for a source-backed review conclusion.

Regression test: keep the approval gate posting REQUEST_CHANGES, not APPROVE or check-only failure, when no model produces a valid current-head review.

Summary

Result: REQUEST_CHANGES

Reason: coverage-evidence passed and peer GitHub Checks completed without failures, but no model produced a valid review control block.

Deterministic evidence checked but not used for approval: current-head changed-file evidence (.github/workflows/opencode-review.yml, scripts/ci/opencode_review_approve_gate.sh, scripts/ci/pr_review_fix_scheduler.py, scripts/ci/test_strix_quick_gate.sh, tests/test_opencode_agent_contract.py, tests/test_pr_review_fix_scheduler.py); coverage-evidence result success; peer checks from statusCheckRollup excluding this OpenCode check.

Model outcome: model_pool=exhausted; selected_model=none.

Head SHA: bc4a90fdcc9bc97eb3ede85cacb2e835abe4b3c8

Workflow run: 28508508411

Workflow attempt: 1

No PR approval was posted because model-output failure is not evidence that the PR has no blockers.

Changed-File Evidence Map
flowchart LR
  PR["PR changed files"] --> Evidence["OpenCode bounded evidence"]
  Evidence --> S1["Workflow: opencode-review.yml"]
  S1 --> I1["GitHub Actions review job"]
  I1 --> R1["Review risk: Workflow: opencode-review.yml"]
  R1 --> V1["actionlint plus required checks"]
  Evidence --> S2["CI script (3 files)"]
  S2 --> I2["review and security gate shell path"]
  I2 --> R2["Review risk: CI script (3 files)"]
  R2 --> V2["bash -n plus Strix self-test"]
  Evidence --> S3["Test (2 files)"]
  S3 --> I3["regression suite"]
  I3 --> R3["Review risk: Test (2 files)"]
  R3 --> V3["targeted test run"]

@copilot Let's fix this.

Copilot AI review requested due to automatic review settings July 1, 2026 06:14

Copilot started reviewing on behalf of seonghobae July 1, 2026 06:15

Copilot AI reviewed Jul 1, 2026

Comment thread scripts/ci/pr_review_fix_scheduler.py

seonghobae force-pushed the codex/disable-thread-only-autofix branch from a2c527a to c5351ec July 1, 2026 06:17

opencode-agent Bot previously approved these changes Jul 1, 2026

github-actions Bot enabled auto-merge (squash) July 1, 2026 07:07

fix: avoid thread-only review autofix dispatches

b4e4056

seonghobae force-pushed the codex/disable-thread-only-autofix branch from c5351ec to b4e4056 July 1, 2026 07:22

opencode-agent Bot and others added 2 commits July 1, 2026 07:22

Merge branch 'main' into codex/disable-thread-only-autofix

71beed6

fix: support older gh check-run fallback

5cb185f

seonghobae dismissed opencode-agent[bot]’s stale review via 5cb185f July 1, 2026 08:55

opencode-agent Bot disabled auto-merge July 1, 2026 09:20

Merge remote-tracking branch 'origin/main' into codex/disable-thread-only-autofix

8c1ae44

Conflicts: .github/workflows/opencode-review.yml

github-actions Bot requested changes Jul 1, 2026

seonghobae added 2 commits July 1, 2026 18:41

fix: allow long OpenCode review runs

9770142

test: align OpenCode timeout contract

bc4a90f

github-actions Bot requested changes Jul 1, 2026

Copilot started work on behalf of seonghobae July 1, 2026 10:43

Copilot finished work on behalf of seonghobae July 1, 2026 10:46

ContextualWisdomLab deleted a comment from Copilot AI Jul 1, 2026

fix: avoid thread-only review autofix dispatches #270: seonghobae wants to merge 6 commits into main from codex/disable-thread-only-autofix
