fix(review): order review models by GitHub Models quota, not gpt-5 first #314

Merged: seonghobae merged 1 commit into main from fix/review-model-order-by-quota on Jul 5, 2026
Conversation

@seonghobae


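Problem (diagnosis)

#295 moved openai/gpt-5 to the top of OPENCODE_MODEL_CANDIDATES. On GitHub Models, gpt-5/o3 sit in the "Reasoning" tier, which has the smallest daily quota (8-12 req/day, 1-2/min), so rate limits and hangs are frequent. With gpt-5 first + ATTEMPTS=5 + RUN_TIMEOUT=20400 + a 350-minute step, the pool spends the entire step budget on a stalled flagship and never falls back → reviews run for ~6 hours and then fail, blocking merges across the whole organization.

Observed: the model pool step of the appguardrail 04:59 review stalled for 5+ hours.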
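
To make the budget problem concrete, here is a small arithmetic sketch using the three values quoted above. It assumes RUN_TIMEOUT is expressed in seconds and applies to each attempt separately; the PR does not state either point, so treat that reading as an assumption rather than the workflow's documented behavior.

```python
# Values quoted in the PR description.
ATTEMPTS = 5
RUN_TIMEOUT_S = 20400      # assumption: seconds, per attempt
STEP_TIMEOUT_MIN = 350     # step-level timeout-minutes

step_budget_s = STEP_TIMEOUT_MIN * 60                  # 21000 s
run_timeout_min = RUN_TIMEOUT_S // 60                  # 340 min
left_after_one_hang_s = step_budget_s - RUN_TIMEOUT_S  # 600 s
all_attempts_s = ATTEMPTS * RUN_TIMEOUT_S              # 102000 s

print(f"one stalled attempt: {run_timeout_min} of {STEP_TIMEOUT_MIN} min")
print(f"left for fallback:   {left_after_one_hang_s} s")
print(f"5 attempts need:     {all_attempts_s} s vs {step_budget_s} s budget")
```

Under that assumption, a single attempt that hangs until its timeout leaves only 10 minutes of the step, which is consistent with the observed reviews running for roughly 6 hours without reaching a fallback model.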

Change (candidate order only, as instructed)

Reordered by quota allowance, largest first:

  1. Non-reasoning Low tier (150-450 req/day): deepseek-v3, mistral-medium, llama-4-maverick/scout
  2. Mini reasoning (12-20/day): o4-mini, o3-mini, gpt-5-mini/nano/chat
  3. DeepSeek-R1 (8-50/day): deepseek-r1-0528, deepseek-r1
  4. Flagship (8-12/day, quality fallback): o3, gpt-5 (very last)
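
The ordering above can be reproduced mechanically. The sketch below is one possible rule, not the workflow's actual code: it sorts tiers by their lower quota bound, breaking ties with the upper bound. The `Tier` type, the tier labels, and the expanded shorthand model names (for example `llama-4-scout`, `gpt-5-nano`) are hypothetical; the real candidate identifiers in OPENCODE_MODEL_CANDIDATES may differ.

```python
from typing import NamedTuple


class Tier(NamedTuple):
    name: str                 # hypothetical label for illustration
    min_per_day: int          # lower bound of the quoted daily quota
    max_per_day: int          # upper bound of the quoted daily quota
    models: tuple[str, ...]   # shorthand names, not real identifiers


# Deliberately listed in the old "flagship first" spirit.
TIERS = [
    Tier("flagship", 8, 12, ("o3", "gpt-5")),
    Tier("deepseek-r1", 8, 50, ("deepseek-r1-0528", "deepseek-r1")),
    Tier("mini-reasoning", 12, 20,
         ("o4-mini", "o3-mini", "gpt-5-mini", "gpt-5-nano", "gpt-5-chat")),
    Tier("low", 150, 450,
         ("deepseek-v3", "mistral-medium", "llama-4-maverick", "llama-4-scout")),
]


def order_candidates(tiers: list[Tier]) -> list[str]:
    """Flatten tiers into one candidate list, largest quota first."""
    ranked = sorted(
        tiers,
        key=lambda t: (t.min_per_day, t.max_per_day),
        reverse=True,
    )
    return [model for tier in ranked for model in tier.models]


candidates = order_candidates(TIERS)
print(candidates[0], "...", candidates[-1])
```

Sorting on the upper bound alone would place DeepSeek-R1 (up to 50/day) ahead of the mini reasoning models (up to 20/day), so the lower bound is used as the primary key to match the order listed in this PR.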

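ATTEMPTS=5, RUN_TIMEOUT=20400, and the step's timeout-minutes=350 are left unchanged (settings untouched, as requested).

Tests

Aligned test_opencode_agent_contract.py with the new order and with the current settings that #295 left stale in the test (ATTEMPTS 1→5, RUN_TIMEOUT 600→20400, step 285→350) → all 13 tests green (the settings themselves are not changed; the test is brought in line with reality).

Note

Review is currently broken (gpt-5 stall), so this PR's own review is blocked as well → planning a break-glass merge.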

🤖 Generated with Claude Code

#295 put openai/gpt-5 first in OPENCODE_MODEL_CANDIDATES. On GitHub Models,
gpt-5/o3 are the "Reasoning" tier with the smallest quota (8-12 requests/day,
1-2/min) and hang or rate-limit constantly, so with gpt-5 first the pool burned
the entire 350-min step on a stalled flagship and never fell back — reviews ran
~6 h and failed org-wide (observed: appguardrail review stuck >5 h in the model
pool step, PRs unmergeable).

Reorder candidates by quota allowance, largest first: non-reasoning "Low" tier
(deepseek-v3, mistral-medium, llama-4: 150-450 req/day) then mini-reasoning
(o4-mini, o3-mini, gpt-5-mini/nano/chat) then DeepSeek-R1 then the flagships
(o3, gpt-5) last as quality fallback. Only the candidate order changes;
ATTEMPTS(5), RUN_TIMEOUT(20400), step timeout(350) are left as-is per request.

Reconcile test_opencode_agent_contract.py with the new order and with the
already-current #295 settings it still asserted stale (ATTEMPTS 1->5, RUN_TIMEOUT
600->20400, step 285->350) so the guard test is green and accurate again.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01RTAMs4bpSZS77Xe3RQjv9P
@seonghobae merged commit b7a1dd0 into main on Jul 5, 2026
16 of 17 checks passed

@github-actions github-actions Bot left a comment


Pull request overview

OpenCode cannot approve yet because required coverage evidence did not pass.

Review outcome

1. HIGH .github/workflows/opencode-review.yml:1 - Coverage evidence did not prove required test/docstring evidence

  • Problem: The required coverage-evidence job result was failure, so OpenCode cannot establish approval sufficiency for this head.

  • Root cause: Automated approval is only valid when the same-head coverage-evidence job proves supported repository test suites passed and configured docstring gates passed or were advisory, or reports not applicable because no supported source files or package manifests exist. Missing, failed, skipped, unavailable, or unsupported-tooling test evidence is a blocker.

  • Fix: Install or configure the repository test/docstring evidence tooling when source files or package manifests exist, rerun the current-head coverage-evidence job, and approve only after it reports success with required evidence or explicit no-source not-applicable evidence.

  • Regression test: Keep the approval branch checking needs.coverage-evidence.result == success before posting APPROVE, and publish REQUEST_CHANGES when coverage-evidence blocker states such as cancelled, skipped, failed, unsupported-tooling, or below-100 evidence are present.

  • Result: REQUEST_CHANGES

  • Reason: coverage-evidence result was failure, so required test/docstring evidence was not proven for current head 38ac3ebe6f2ad5c6a2b6a697a343a9314ee20a8a.

  • Head SHA: 38ac3ebe6f2ad5c6a2b6a697a343a9314ee20a8a

  • Workflow run: 28737665901

  • Workflow attempt: 1
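
The gate rule described in the finding above can be sketched as a single decision function. This is an illustration only: `review_verdict` is a hypothetical name, and the real check lives in the workflow's approval branch, not in Python.

```python
def review_verdict(coverage_evidence_result: str) -> str:
    """Approve only on an explicit success; any other state blocks."""
    if coverage_evidence_result == "success":
        return "APPROVE"
    # failure, cancelled, skipped, or an empty/missing result all block.
    return "REQUEST_CHANGES"


for state in ("success", "failure", "cancelled", "skipped", ""):
    print(f"{state!r:12} -> {review_verdict(state)}")
```

Treating everything other than `success` as a blocker keeps the gate fail-closed, which matches the statement that missing or unavailable evidence must not lead to approval.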

Coverage evidence

Coverage Evidence

  • Head SHA: 38ac3ebe6f2ad5c6a2b6a697a343a9314ee20a8a
  • Required test evidence: supported repository test suites must pass.
  • Required docstring evidence: repository-owned docstring gates must pass when configured; otherwise docstring coverage is advisory.

Python project dependencies (.)

Using CPython 3.12.3 interpreter at: /usr/bin/python3
Creating virtual environment at: .venv
Resolved 17 packages in 118ms
Downloading pygments (1.2MiB)
 Downloaded pygments
Prepared 13 packages in 100ms
Installed 13 packages in 16ms
 + attrs==26.1.0
 + click==8.4.2
 + colorama==0.4.6
 + coverage==7.15.0
 + iniconfig==2.3.0
 + interrogate==1.7.0
 + packaging==26.2
 + pluggy==1.6.0
 + py==1.11.0
 + pygments==2.20.0
 + pytest==9.1.1
 + pytest-cov==7.1.0
 + tabulate==0.10.0
  • Result: PASS

Python coverage with missing-line report (.)

============================= test session starts ==============================
platform linux -- Python 3.12.3, pytest-9.1.1, pluggy-1.6.0
rootdir: /home/runner/work/.github/.github/pr-head
configfile: pyproject.toml
plugins: cov-7.1.0
collected 166 items

tests/test_assert_opencode_reasoning_effort.py ........                  [  4%]
tests/test_codeql_pr_workflow_contract.py .                              [  5%]
tests/test_noema_review_gate.py .......F...                              [ 12%]
tests/test_opencode_agent_contract.py .............                      [ 19%]
tests/test_opencode_review_normalize_output.py ......................... [ 34%]
                                                                         [ 34%]
tests/test_opencode_workflow_shell_syntax.py .                           [ 35%]
tests/test_pr_governance_audit_contract.py ...                           [ 37%]
tests/test_pr_review_fix_scheduler.py ...................                [ 48%]
tests/test_pr_review_fix_scheduler_coverage.py ..                        [ 50%]
tests/test_pr_review_merge_scheduler.py ................................ [ 69%]
..............................                                           [ 87%]
tests/test_render_opencode_prompt_template.py ....                       [ 89%]
tests/test_review_execution_contracts.py ..                              [ 90%]
tests/test_sandboxed_verify.py .........                                 [ 96%]
tests/test_sandboxed_web_e2e.py ......                                   [100%]

=================================== FAILURES ===================================
_______________ test_call_llm_handles_configuration_and_verdicts _______________

monkeypatch = <_pytest.monkeypatch.MonkeyPatch object at 0x7f77ba18bd40>

    def test_call_llm_handles_configuration_and_verdicts(monkeypatch):
        pr = make_pr()
        monkeypatch.delenv("NOEMA_LLM_API_URL", raising=False)
        monkeypatch.delenv("NOEMA_LLM_API_KEY", raising=False)
        assert noema.call_llm("owner/repo", 1, pr, "diff", False) is None
    
        monkeypatch.setenv("NOEMA_LLM_API_URL", "file:///etc/passwd")
        monkeypatch.setenv("NOEMA_LLM_API_KEY", "secret")
>       with pytest.raises(ValueError, match="must start with http:// or https://"):
E       AssertionError: Regex pattern did not match.
E         Expected regex: 'must start with http:// or https://'
E         Actual message: 'URL scheme must be http or https'

tests/test_noema_review_gate.py:209: AssertionError
----------------------------- Captured stdout call -----------------------------
Noema LLM review unavailable: NOEMA_LLM_API_URL or NOEMA_LLM_API_KEY is not configured.
=============================== warnings summary ===============================
tests/test_assert_opencode_reasoning_effort.py::test_module_entrypoint_success
  <frozen runpy>:128: RuntimeWarning: 'scripts.ci.assert_opencode_reasoning_effort' found in sys.modules after import of package 'scripts.ci', but prior to execution of 'scripts.ci.assert_opencode_reasoning_effort'; this may result in unpredictable behaviour

tests/test_render_opencode_prompt_template.py::test_module_entrypoint
  <frozen runpy>:128: RuntimeWarning: 'scripts.ci.render_opencode_prompt_template' found in sys.modules after import of package 'scripts.ci', but prior to execution of 'scripts.ci.render_opencode_prompt_template'; this may result in unpredictable behaviour

tests/test_review_execution_contracts.py::test_discovers_package_managers_java_r_json_and_main
  <frozen runpy>:128: RuntimeWarning: 'scripts.ci.review_execution_contracts' found in sys.modules after import of package 'scripts.ci', but prior to execution of 'scripts.ci.review_execution_contracts'; this may result in unpredictable behaviour

tests/test_sandboxed_verify.py::test_module_main_entrypoint
  <frozen runpy>:128: RuntimeWarning: 'scripts.ci.sandboxed_verify' found in sys.modules after import of package 'scripts.ci', but prior to execution of 'scripts.ci.sandboxed_verify'; this may result in unpredictable behaviour

tests/test_sandboxed_web_e2e.py::test_module_import_and_main_entrypoint
  <frozen runpy>:128: RuntimeWarning: 'scripts.ci.sandboxed_web_e2e' found in sys.modules after import of package 'scripts.ci', but prior to execution of 'scripts.ci.sandboxed_web_e2e'; this may result in unpredictable behaviour

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED tests/test_noema_review_gate.py::test_call_llm_handles_configuration_and_verdicts - AssertionError: Regex pattern did not match.
  Expected regex: 'must start with http:// or https://'
  Actual message: 'URL scheme must be http or https'
================== 1 failed, 165 passed, 5 warnings in 5.85s ===================
  • Result: FAIL (exit 1)
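
The single failure is a message mismatch, not a missing exception: `pytest.raises(..., match=...)` checks its pattern with `re.search` against the exception text, and the two strings in the log share no matching span. The sketch below reproduces that check with the standard library only. The `looser` pattern is a hypothetical illustration of a pattern that accepts both wordings; the log does not show which side (test or code) is meant to change.

```python
import re

# Strings copied from the pytest failure output.
expected_pattern = "must start with http:// or https://"
actual_message = "URL scheme must be http or https"

# The same kind of search that match= performs: no hit, so the test fails.
strict_hit = re.search(expected_pattern, actual_message)
print("strict pattern matches:", strict_hit is not None)

# Hypothetical looser pattern that tolerates both message wordings.
looser = r"http(://)? or https"
print("looser vs actual:  ", re.search(looser, actual_message) is not None)
print("looser vs expected:", re.search(looser, expected_pattern) is not None)
```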

Python docstring coverage advisory

RESULT: PASSED (minimum: 100.0%, actual: 100.0%)
  • Result: PASS

Coverage Decision

  • Result: FAIL
  • Test evidence: not proven passing
  • Docstring evidence: not proven passing when configured
  • Failure count: 1

Changed-File Evidence Map

flowchart LR
  PR["PR changed files"] --> Evidence["OpenCode bounded evidence"]
  Evidence --> S1["Workflow: opencode-review.yml"]
  S1 --> I1["GitHub Actions review job"]
  I1 --> R1["Review risk: Workflow: opencode-review.yml"]
  R1 --> V1["actionlint plus required checks"]
  Evidence --> S2["Test: test_opencode_agent_contract.py"]
  S2 --> I2["regression suite"]
  I2 --> R2["Review risk: Test: test_opencode_agent_contract.py"]
  R2 --> V2["targeted test run"]

@github-actions

github-actions Bot commented Jul 5, 2026


OpenCode Review Overview

  • Head SHA: 38ac3ebe6f2ad5c6a2b6a697a343a9314ee20a8a
  • Workflow run: 28737665901
  • Workflow attempt: 1
  • Gate result: REQUEST_CHANGES (approval step)

