fix(review): order review models by GitHub Models quota, not gpt-5 first by seonghobae · Pull Request #314 · ContextualWisdomLab/.github

seonghobae · 2026-07-05T10:25:54Z

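Problem (diagnosis)

#295 changed the first-priority entry in OPENCODE_MODEL_CANDIDATES to openai/gpt-5. On GitHub Models, gpt-5/o3 belong to the "Reasoning" tier, which has the smallest daily quota (8-12 req/day, 1-2/min), so rate limits and hangs are frequent. With gpt-5 first, ATTEMPTS=5, RUN_TIMEOUT=20400, and a 350-minute step, a stalled flagship consumes the entire step budget and the pool never falls back. Reviews run for ~6 hours and then fail, blocking merges across the whole organization.

Observed: the model pool step of the appguardrail 04:59 review stalled for more than 5 hours.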
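The budget arithmetic behind this diagnosis can be sketched as follows. This is only an illustration: it assumes RUN_TIMEOUT is expressed in seconds (the PR does not state the unit), and the helper names are hypothetical, not taken from the workflow.

```python
# Values quoted in the PR description.
# ASSUMPTION: RUN_TIMEOUT is in seconds; the step timeout is in minutes.
RUN_TIMEOUT_SECONDS = 20400
STEP_TIMEOUT_MINUTES = 350


def attempts_that_fit(run_timeout_s: int, step_timeout_min: int) -> int:
    """Count how many attempts that run to their full timeout fit in one step."""
    return (step_timeout_min * 60) // run_timeout_s


def minutes_left_after_one_stall(run_timeout_s: int, step_timeout_min: int) -> float:
    """Minutes remaining in the step after a single attempt hits its timeout."""
    return step_timeout_min - run_timeout_s / 60


if __name__ == "__main__":
    print(attempts_that_fit(RUN_TIMEOUT_SECONDS, STEP_TIMEOUT_MINUTES))             # 1
    print(minutes_left_after_one_stall(RUN_TIMEOUT_SECONDS, STEP_TIMEOUT_MINUTES))  # 10.0
```

Under that assumption, one stalled attempt uses 340 of the 350 minutes, so the remaining candidates never get a realistic turn regardless of ATTEMPTS.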

Change (candidate order only, as instructed)

Candidates are reordered by quota allowance, largest first:

Non-reasoning Low tier (150-450 req/day): deepseek-v3, mistral-medium, llama-4-maverick/scout
Mini reasoning (12-20/day): o4-mini, o3-mini, gpt-5-mini/nano/chat
DeepSeek-R1 (8-50/day): deepseek-r1-0528, deepseek-r1
Flagship (8-12/day, quality fallback): o3, gpt-5 (last)

ATTEMPTS=5, RUN_TIMEOUT=20400, and the step timeout-minutes=350 are left as they are (no settings changed, as requested).
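As a rough sketch of the ordering rule, the tiers can be sorted by their daily quota range. The quota figures come from the list above; the `Tier` type, the expanded model names (for example reading "gpt-5-mini/nano/chat" as three models), the omitted provider prefixes, and the tie-break on the upper bound are assumptions made for illustration, not the workflow's actual logic.

```python
from typing import NamedTuple


class Tier(NamedTuple):
    """A quota tier with its daily request range and member models (hypothetical type)."""
    name: str
    min_per_day: int
    max_per_day: int
    models: tuple[str, ...]


# Deliberately listed out of order to show that the sort restores the PR's order.
TIERS = [
    Tier("flagship", 8, 12, ("o3", "gpt-5")),
    Tier("deepseek-r1", 8, 50, ("deepseek-r1-0528", "deepseek-r1")),
    Tier("mini-reasoning", 12, 20,
         ("o4-mini", "o3-mini", "gpt-5-mini", "gpt-5-nano", "gpt-5-chat")),
    Tier("low", 150, 450,
         ("deepseek-v3", "mistral-medium", "llama-4-maverick", "llama-4-scout")),
]


def order_candidates(tiers: list[Tier]) -> list[str]:
    """Flatten tiers into one candidate list, largest quota first.

    ASSUMPTION: sort by the lower bound, then the upper bound, both descending.
    """
    ranked = sorted(tiers, key=lambda t: (t.min_per_day, t.max_per_day), reverse=True)
    return [model for tier in ranked for model in tier.models]


if __name__ == "__main__":
    candidates = order_candidates(TIERS)
    print(candidates[0])   # deepseek-v3
    print(candidates[-1])  # gpt-5
```

With this key the high-quota models are tried first and the flagships only act as a last-resort quality fallback, which matches the order described in the PR.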

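Tests

test_opencode_agent_contract.py is reconciled with the new order and with the current settings that #295 left stale in the test (ATTEMPTS 1→5, RUN_TIMEOUT 600→20400, step 285→350), so all 13 tests are green. The settings themselves are not changed; the test is brought in line with reality.

Caution

Review is currently broken (gpt-5 stall), so this PR's own review is blocked as well. A break-glass merge is planned.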

🤖 Generated with Claude Code

#295 put openai/gpt-5 first in OPENCODE_MODEL_CANDIDATES. On GitHub Models, gpt-5/o3 are the "Reasoning" tier with the smallest quota (8-12 requests/day, 1-2/min) and hang or rate-limit constantly, so with gpt-5 first the pool burned the entire 350-min step on a stalled flagship and never fell back; reviews ran ~6 h and failed org-wide (observed: appguardrail review stuck >5 h in the model pool step, PRs unmergeable).

Reorder candidates by quota allowance, largest first: non-reasoning "Low" tier (deepseek-v3, mistral-medium, llama-4: 150-450 req/day), then mini-reasoning (o4-mini, o3-mini, gpt-5-mini/nano/chat), then DeepSeek-R1, then the flagships (o3, gpt-5) last as quality fallback.

Only the candidate order changes; ATTEMPTS(5), RUN_TIMEOUT(20400), step timeout(350) are left as-is per request.

Reconcile test_opencode_agent_contract.py with the new order and with the already-current #295 settings it still asserted stale (ATTEMPTS 1->5, RUN_TIMEOUT 600->20400, step 285->350) so the guard test is green and accurate again.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01RTAMs4bpSZS77Xe3RQjv9P

github-actions

Pull request overview

OpenCode cannot approve yet because required coverage evidence did not pass.

Review outcome

1. HIGH .github/workflows/opencode-review.yml:1 - Coverage evidence did not prove required test/docstring evidence

Problem: The required coverage-evidence job result was failure, so OpenCode cannot establish approval sufficiency for this head.
Root cause: Automated approval is only valid when the same-head coverage-evidence job proves supported repository test suites passed and configured docstring gates passed or were advisory, or reports not applicable because no supported source files or package manifests exist. Missing, failed, skipped, unavailable, or unsupported-tooling test evidence is a blocker.
Fix: Install or configure the repository test/docstring evidence tooling when source files or package manifests exist, rerun the current-head coverage-evidence job, and approve only after it reports success with required evidence or explicit no-source not-applicable evidence.
Regression test: Keep the approval branch checking needs.coverage-evidence.result == success before posting APPROVE, and publish REQUEST_CHANGES when coverage-evidence blocker states such as cancelled, skipped, failed, unsupported-tooling, or below-100 evidence are present.
Result: REQUEST_CHANGES
Reason: coverage-evidence result was failure, so required test/docstring evidence was not proven for current head 38ac3ebe6f2ad5c6a2b6a697a343a9314ee20a8a.
Head SHA: 38ac3ebe6f2ad5c6a2b6a697a343a9314ee20a8a
Workflow run: 28737665901
Workflow attempt: 1
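The gate rule described under "Regression test" can be illustrated with a small sketch. The function name is hypothetical and only the coverage-evidence condition is modelled; the real workflow evaluates `needs.coverage-evidence.result` in GitHub Actions and may apply further conditions before approving.

```python
def coverage_gate_verdict(coverage_evidence_result: str) -> str:
    """Map a coverage-evidence job result to a review verdict.

    Hypothetical helper: only an explicit "success" allows approval;
    every other state (failure, cancelled, skipped, ...) is a blocker.
    """
    if coverage_evidence_result == "success":
        return "APPROVE"
    return "REQUEST_CHANGES"


if __name__ == "__main__":
    for state in ("success", "failure", "cancelled", "skipped"):
        print(state, "->", coverage_gate_verdict(state))
```

Treating anything other than success as a blocker keeps the gate fail-closed, which is why this head received REQUEST_CHANGES.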

Coverage evidence

Coverage Evidence

Head SHA: 38ac3ebe6f2ad5c6a2b6a697a343a9314ee20a8a
Required test evidence: supported repository test suites must pass.
Required docstring evidence: repository-owned docstring gates must pass when configured; otherwise docstring coverage is advisory.

Python project dependencies (.)

Using CPython 3.12.3 interpreter at: /usr/bin/python3
Creating virtual environment at: .venv
Resolved 17 packages in 118ms
Downloading pygments (1.2MiB)
 Downloaded pygments
Prepared 13 packages in 100ms
Installed 13 packages in 16ms
 + attrs==26.1.0
 + click==8.4.2
 + colorama==0.4.6
 + coverage==7.15.0
 + iniconfig==2.3.0
 + interrogate==1.7.0
 + packaging==26.2
 + pluggy==1.6.0
 + py==1.11.0
 + pygments==2.20.0
 + pytest==9.1.1
 + pytest-cov==7.1.0
 + tabulate==0.10.0

Result: PASS

Python coverage with missing-line report (.)

============================= test session starts ==============================
platform linux -- Python 3.12.3, pytest-9.1.1, pluggy-1.6.0
rootdir: /home/runner/work/.github/.github/pr-head
configfile: pyproject.toml
plugins: cov-7.1.0
collected 166 items

tests/test_assert_opencode_reasoning_effort.py ........                  [  4%]
tests/test_codeql_pr_workflow_contract.py .                              [  5%]
tests/test_noema_review_gate.py .......F...                              [ 12%]
tests/test_opencode_agent_contract.py .............                      [ 19%]
tests/test_opencode_review_normalize_output.py ......................... [ 34%]
                                                                         [ 34%]
tests/test_opencode_workflow_shell_syntax.py .                           [ 35%]
tests/test_pr_governance_audit_contract.py ...                           [ 37%]
tests/test_pr_review_fix_scheduler.py ...................                [ 48%]
tests/test_pr_review_fix_scheduler_coverage.py ..                        [ 50%]
tests/test_pr_review_merge_scheduler.py ................................ [ 69%]
..............................                                           [ 87%]
tests/test_render_opencode_prompt_template.py ....                       [ 89%]
tests/test_review_execution_contracts.py ..                              [ 90%]
tests/test_sandboxed_verify.py .........                                 [ 96%]
tests/test_sandboxed_web_e2e.py ......                                   [100%]

=================================== FAILURES ===================================
_______________ test_call_llm_handles_configuration_and_verdicts _______________

monkeypatch = <_pytest.monkeypatch.MonkeyPatch object at 0x7f77ba18bd40>

    def test_call_llm_handles_configuration_and_verdicts(monkeypatch):
        pr = make_pr()
        monkeypatch.delenv("NOEMA_LLM_API_URL", raising=False)
        monkeypatch.delenv("NOEMA_LLM_API_KEY", raising=False)
        assert noema.call_llm("owner/repo", 1, pr, "diff", False) is None
    
        monkeypatch.setenv("NOEMA_LLM_API_URL", "file:///etc/passwd")
        monkeypatch.setenv("NOEMA_LLM_API_KEY", "secret")
>       with pytest.raises(ValueError, match="must start with http:// or https://"):
E       AssertionError: Regex pattern did not match.
E         Expected regex: 'must start with http:// or https://'
E         Actual message: 'URL scheme must be http or https'

tests/test_noema_review_gate.py:209: AssertionError
----------------------------- Captured stdout call -----------------------------
Noema LLM review unavailable: NOEMA_LLM_API_URL or NOEMA_LLM_API_KEY is not configured.
=============================== warnings summary ===============================
tests/test_assert_opencode_reasoning_effort.py::test_module_entrypoint_success
  <frozen runpy>:128: RuntimeWarning: 'scripts.ci.assert_opencode_reasoning_effort' found in sys.modules after import of package 'scripts.ci', but prior to execution of 'scripts.ci.assert_opencode_reasoning_effort'; this may result in unpredictable behaviour

tests/test_render_opencode_prompt_template.py::test_module_entrypoint
  <frozen runpy>:128: RuntimeWarning: 'scripts.ci.render_opencode_prompt_template' found in sys.modules after import of package 'scripts.ci', but prior to execution of 'scripts.ci.render_opencode_prompt_template'; this may result in unpredictable behaviour

tests/test_review_execution_contracts.py::test_discovers_package_managers_java_r_json_and_main
  <frozen runpy>:128: RuntimeWarning: 'scripts.ci.review_execution_contracts' found in sys.modules after import of package 'scripts.ci', but prior to execution of 'scripts.ci.review_execution_contracts'; this may result in unpredictable behaviour

tests/test_sandboxed_verify.py::test_module_main_entrypoint
  <frozen runpy>:128: RuntimeWarning: 'scripts.ci.sandboxed_verify' found in sys.modules after import of package 'scripts.ci', but prior to execution of 'scripts.ci.sandboxed_verify'; this may result in unpredictable behaviour

tests/test_sandboxed_web_e2e.py::test_module_import_and_main_entrypoint
  <frozen runpy>:128: RuntimeWarning: 'scripts.ci.sandboxed_web_e2e' found in sys.modules after import of package 'scripts.ci', but prior to execution of 'scripts.ci.sandboxed_web_e2e'; this may result in unpredictable behaviour

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED tests/test_noema_review_gate.py::test_call_llm_handles_configuration_and_verdicts - AssertionError: Regex pattern did not match.
  Expected regex: 'must start with http:// or https://'
  Actual message: 'URL scheme must be http or https'
================== 1 failed, 165 passed, 5 warnings in 5.85s ===================

Result: FAIL (exit 1)
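The single failure above comes from a message mismatch, not from a missing exception: `pytest.raises(..., match=...)` applies `re.search` to the string form of the raised exception. A minimal reproduction of that check, using only the two strings shown in the log (the helper name is hypothetical):

```python
import re

# Strings copied from the failure output above.
EXPECTED_PATTERN = "must start with http:// or https://"
ACTUAL_MESSAGE = "URL scheme must be http or https"


def message_matches(pattern: str, message: str) -> bool:
    """Mimic the match= check of pytest.raises: a regex search over the message."""
    return re.search(pattern, message) is not None


if __name__ == "__main__":
    # The stale pattern is not found anywhere in the new message.
    print(message_matches(EXPECTED_PATTERN, ACTUAL_MESSAGE))  # False
    # A pattern aligned with the current wording would be found.
    print(message_matches("URL scheme must be http or https", ACTUAL_MESSAGE))  # True
```

The ValueError is still raised for the `file://` URL; only the wording differs, so aligning either the test pattern or the error message would make the assertion pass.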

Python docstring coverage advisory

RESULT: PASSED (minimum: 100.0%, actual: 100.0%)

Result: PASS

Coverage Decision

Result: FAIL
Test evidence: not proven passing
Docstring evidence: not proven passing when configured
Failure count: 1

Changed-File Evidence Map

flowchart LR
  PR["PR changed files"] --> Evidence["OpenCode bounded evidence"]
  Evidence --> S1["Workflow: opencode-review.yml"]
  S1 --> I1["GitHub Actions review job"]
  I1 --> R1["Review risk: Workflow: opencode-review.yml"]
  R1 --> V1["actionlint plus required checks"]
  Evidence --> S2["Test: test_opencode_agent_contract.py"]
  S2 --> I2["regression suite"]
  I2 --> R2["Review risk: Test: test_opencode_agent_contract.py"]
  R2 --> V2["targeted test run"]

github-actions · 2026-07-05T10:32:50Z

OpenCode Review Overview

Head SHA: 38ac3ebe6f2ad5c6a2b6a697a343a9314ee20a8a
Workflow run: 28737665901
Workflow attempt: 1
Gate result: REQUEST_CHANGES (approval step)


seonghobae merged commit b7a1dd0 into main Jul 5, 2026
16 of 17 checks passed

github-actions Bot requested changes Jul 5, 2026

seonghobae merged 1 commit into main from fix/review-model-order-by-quota
