fix(review): validate model output on a copy so publish re-normalize succeeds by seonghobae · Pull Request #316 · ContextualWisdomLab/.github

seonghobae · 2026-07-05T12:57:39Z

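Correction (completes the incomplete fix in #315)

opencode_review_normalize_output.py rewrites its input file in place and is not idempotent. When the model pool judged an attempt successful, it normalized (mutated) that file and then handed the same file to the publish step. The publish step normalized the already-mutated content a second time and failed, so the review failed with "Selected successful OpenCode output did not include a valid control conclusion" (and the pool did not cycle to another model either).

Observed reproduction: deepseek-v3 attempts 1 and 2 failed; attempt 3 passed the pool's normalize (mutating the file), was recorded as a success, and the pool stopped; the publish step then re-validated the mutated file and failed.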
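The failure mode can be illustrated with a small stand-in. This is a minimal sketch, not the real opencode_review_normalize_output.py: the marker strings, `normalize_in_place`, and `demo_double_normalize` are hypothetical names chosen only to show how an in-place, non-idempotent normalizer passes once and then rejects its own output.

```python
import tempfile
from pathlib import Path

# Hypothetical markers; the real normalizer's input/output format is not shown in this PR.
RAW_MARKER = "RAW-CONCLUSION: "
NORMALIZED_MARKER = "Result: "


def normalize_in_place(path: Path) -> bool:
    """Stub normalizer: rewrite `path` in place and return True on success.

    Non-idempotent by construction: it only accepts the raw marker, and it
    consumes that marker on the first run, so a second run on the same file fails.
    """
    text = path.read_text(encoding="utf-8")
    if RAW_MARKER not in text:
        return False  # stands in for "no valid control conclusion"
    path.write_text(text.replace(RAW_MARKER, NORMALIZED_MARKER), encoding="utf-8")
    return True


def demo_double_normalize() -> tuple[bool, bool]:
    """Run the stub twice on one file, as the pool and publish steps did before the fix."""
    with tempfile.TemporaryDirectory() as tmp:
        output = Path(tmp) / "model_output.md"
        output.write_text("review body\nRAW-CONCLUSION: APPROVE\n", encoding="utf-8")
        pool_ok = normalize_in_place(output)     # pool's success check mutates the file
        publish_ok = normalize_in_place(output)  # publish re-normalizes the mutated file
        return pool_ok, publish_ok


if __name__ == "__main__":
    print(demo_double_normalize())
```

Under these assumptions the first call succeeds and the second fails, which mirrors the attempt-3 sequence described above.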

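Fix

normalize_opencode_output now runs the normalize + approve_gate validation on an ANSI-stripped copy (the probe) and leaves the original pristine. The publish step then performs the one and only normalize of that original, so its result matches the pool's decision. Verified locally with a non-idempotent normalizer stub (pool passes, original unmodified, publish passes).

Scope: cycling/validation consistency only. Model order, timeouts, and ATTEMPTS are unchanged.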
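The copy-based validation can be sketched as follows. This is an illustration under stated assumptions, not the actual change to run_opencode_review_model_pool.sh: `stub_normalize`, `approve_gate`, `validate_on_copy`, the marker strings, and the ANSI pattern (CSI sequences only) are all hypothetical stand-ins.

```python
import re
import tempfile
from pathlib import Path

# Matches ANSI CSI escape sequences such as "\x1b[32m"; assumed to be enough for this sketch.
ANSI_RE = re.compile(r"\x1b\[[0-9;?]*[ -/]*[@-~]")

RAW_MARKER = "RAW-CONCLUSION: "   # hypothetical raw-output marker
NORMALIZED_MARKER = "Result: "    # hypothetical normalized form


def stub_normalize(path: Path) -> bool:
    """Non-idempotent stand-in for the real normalizer: rewrites `path` in place."""
    text = path.read_text(encoding="utf-8")
    if RAW_MARKER not in text:
        return False
    path.write_text(text.replace(RAW_MARKER, NORMALIZED_MARKER), encoding="utf-8")
    return True


def approve_gate(path: Path) -> bool:
    """Hypothetical gate: accept only a normalized APPROVE conclusion."""
    return "Result: APPROVE" in path.read_text(encoding="utf-8")


def validate_on_copy(output: Path) -> bool:
    """Normalize and gate an ANSI-stripped probe copy; never touch `output`."""
    with tempfile.TemporaryDirectory() as tmp:
        probe = Path(tmp) / "probe.md"
        stripped = ANSI_RE.sub("", output.read_text(encoding="utf-8"))
        probe.write_text(stripped, encoding="utf-8")
        return stub_normalize(probe) and approve_gate(probe)


def demo_fixed_flow() -> tuple[bool, bool, bool]:
    """Return (pool passed, original unmodified, publish passed)."""
    with tempfile.TemporaryDirectory() as tmp:
        output = Path(tmp) / "model_output.md"
        original = "\x1b[32mreview body\x1b[0m\nRAW-CONCLUSION: APPROVE\n"
        output.write_text(original, encoding="utf-8")
        pool_ok = validate_on_copy(output)                        # pool decision on the probe
        pristine = output.read_text(encoding="utf-8") == original  # original left untouched
        publish_ok = stub_normalize(output)                       # the single real normalize
        return pool_ok, pristine, publish_ok


if __name__ == "__main__":
    print(demo_fixed_flow())
```

Because the probe is discarded, the publish step sees exactly the bytes the model produced, so its single normalize should agree with the pool's decision even though the normalizer is not idempotent.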

🤖 Generated with Claude Code

…succeeds

Follow-up to the pool-cycling fix. opencode_review_normalize_output.py rewrites its input in place and is NOT idempotent, and the publish step normalizes the selected output again. The pool was normalizing (mutating) the very file it then handed to the publish step, so a model whose FIRST normalize passed (recorded as the pool's success) failed the publish step's SECOND normalize, ending the run in "Selected successful OpenCode output did not include a valid control conclusion" instead of the review completing.

Normalize/approve-gate a throwaway ANSI-stripped copy and leave the model output pristine, so the publish step performs the one-and-only normalize of that content and its result matches the pool's decision.

Verified locally against a non-idempotent normalizer stub: pool passes, output stays pristine, publish normalize passes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01RTAMs4bpSZS77Xe3RQjv9P

opencode-agent

Pull request overview

OpenCode reviewed the current-head bounded evidence and found no blocking issues.

Findings

No blocking findings.

Summary

Approval sufficiency: bounded evidence supplied affirmative approval evidence for changed files, coverage/docstring posture, risk surfaces, and current-head verification; approval is not based merely on the absence of known blockers.
Verification posture: CodeGraph evidence was initialized and bounded current-head evidence reviewed for changed-file evidence including scripts/ci/run_opencode_review_model_pool.sh.
Linter/static: workflow/static review evidence is bounded by the current-head GitHub Checks gate and changed-file evidence.
TDD/regression: coverage execution evidence and focused changed hunks were reviewed from bounded-review-evidence.md.
Coverage: coverage execution evidence reports test coverage as not applicable because no supported changed source files or package manifests were found.
Docstring coverage: coverage execution evidence reports docstring coverage as not applicable because no supported changed source files or package manifests were found.
DAG: CodeGraph/source-backed behavior map connects scripts/ci/run_opencode_review_model_pool.sh to the affected review, runtime, or workflow path and required checks.
PoC/execution: coverage-evidence job executed on the current head and reported PASS.
DDD/domain: workflow and repository-governance invariants were reviewed against changed files in bounded evidence.
CDD/context: CodeGraph evidence, changed-file history, and focused hunks were reviewed from bounded-review-evidence.md.
Similar issues: changed-file history evidence was reviewed for comparable local precedents.
Claim/concept check: bounded evidence, repository source, current-head workflow evidence, and, where numeric, scientific, statistical, or literature-backed claims are affected, original-paper/formula evidence and parameter-recovery expectations were used for claims.
Standards search: standards and external-source checks are delegated to configured OpenCode web_search/Context7/DeepWiki sources when applicable; no evidence-backed standards blocker is present in bounded evidence.
Compatibility/convention: changed workflow/script conventions, object naming, and reserved-word safety for schema/API/config/code surfaces were checked in bounded evidence.
Breaking-change/backcompat: deployment evidence and changed-file history were checked for backward-compatibility risk.
Performance: changed surfaces were checked for performance risk in bounded evidence.
Developer experience: changed automation, review, test, setup, and maintenance surfaces were checked for helpful or obstructive DX impact in bounded evidence.
User experience: connected user, operator, API, CLI, documentation, review-comment, status-check, rendering, and workflow-reader behavior was checked for contradictions against code, docs, and tests in bounded evidence.
Visual/DOM: Playwright visual, DOM locator, ARIA snapshot, console, and responsive evidence were checked when a web UI surface was present; for non-web surfaces, API/CLI/log/docs/workflow interaction evidence was reviewed instead.
Accessibility/i18n: accessibility, localization, and human-readable text surfaces were checked where UI, CLI, API message, docs, logs, or review text changed.
Supply-chain/license: dependency, package, model, container, and external-tool changes were checked in bounded evidence.
Packaging: package, build, test, lint, and security contracts were checked in bounded evidence.
Security/privacy: workflow-token, review-gate, and repository-automation security/privacy boundaries were checked in bounded evidence.

Result: APPROVE
Reason: PR fixes in-place normalization issue by validating a throwaway copy
Head SHA: 62af044d3d7819dc983d9bb731235264c0c9f8ef
Workflow run: 28741550074
Workflow attempt: 1

Changed-File Evidence Map

flowchart LR
  PR["PR changed files"] --> Evidence["OpenCode bounded evidence"]
  Evidence --> S1["CI script: run_opencode_review_model_pool.sh"]
  S1 --> I1["review and security gate shell path"]
  I1 --> R1["Review risk: CI script: run_opencode_review_model_pool.sh"]
  R1 --> V1["bash -n plus Strix self-test"]

github-actions · 2026-07-05T13:09:50Z

OpenCode Review Overview

Head SHA: 62af044d3d7819dc983d9bb731235264c0c9f8ef
Workflow run: 28741550074
Workflow attempt: 1
Gate result: APPROVE (approval step)


seonghobae merged commit 2df3683 into main Jul 5, 2026
12 checks passed

opencode-agent Bot approved these changes Jul 5, 2026

seonghobae merged 1 commit into main from fix/review-pool-pristine-output
