feat: Goal Mode works for non-code agentic goals (research, analysis, explanation)#426
Conversation
…mplete
The code-only review gates (goal-diff-reviewer, goal-verifier) were always
required, so a research/analysis/explanation/planning goal — which produces a
text + evidence deliverable and no diff — could never satisfy them, and the
reviewers blocked on an empty git diff (clean working tree -> FAIL).
- gates.js: a goal is 'code-bearing' only once it is dirty, edits a file, or
records changed files; otherwise the diff/verifier gates are dropped and the
deliverable is gated on evidence by the always-on reviewers (prompt-auditor,
reviewer, final-auditor).
- New CONFIG/CONFIG_DOCS key requireCodeReview ('auto' | 'always' | 'never',
default auto) + GOAL_GUARD_REQUIRE_CODE_REVIEW to override the detection.
- Tests for code/non-code gate selection, the override, and a non-code goal
completing on the always-on gates alone.
- Reviewer prompts (prompt-auditor, reviewer, final-auditor, diff-reviewer, verifier): a clean working tree / empty git diff is expected for a non-code goal and is not, by itself, a FAIL; evaluate the recorded evidence and the delivered answer against the acceptance criteria instead. - goal.md: non-code goals are first-class — record the produced answer as evidence (with the criteria it covers); the code-only gates are not required. - README + ARCHITECTURE: document non-code agentic support and requireCodeReview.
devinoldenburg
left a comment
There was a problem hiding this comment.
Adversarial review: APPROVE (no enforcement weakening found)
Verified by running probes against gates.js/events.js/state.js and the full suite (node --test tests/*.test.mjs → 672 pass; npm run lint exit 0; validate-opencode-config.mjs passes).
1. No bypass on code goals — CONFIRMED
isCodeBearing (gates.js:27) is true if dirty OR lastEditSeq>0 OR changedFiles.length>0. The lastEditSeq>0 term is the lock: markEdit (events.js:79) sets it via store.nextSeq() which is seq+=1 (store.js:136) — strictly monotonic, restored as Math.max across persistence (store.js:202), and never reset. markEdit is the only writer of lastEditSeq and is called on every edit tool / bash mutation / file.edited event.
Edit-then-revert/commit scenario probed directly: after one edit lastEditSeq=1; then forcing dirty=false, changedFiles=[] (simulating a clean tree) → isCodeBearing STAYS true and requiredGates STILL returns goal-diff-reviewer + goal-verifier. A code goal cannot shed the code gates by cleaning its tree.
2. Non-code completion still gated — CONFIRMED
The 3 always-on gates (goal-prompt-auditor, goal-reviewer, goal-final-auditor) are never dropped — only CODE_GATES are conditionally filtered (gates.js:88). Probed a non-code goal: zero verdicts → blocked (3 missing); all FAIL → blocked; 2/3 PASS → blocked. So a non-code goal with no real deliverable/evidence cannot complete; the reviewers gate it.
3. Override correctness — CONFIRMED
gates.js:86-87: always→force on, never→force off, anything else→isCodeBearing (auto). Default requireCodeReview:"auto" (config.js). Coerced via coerceStr (config.js:269, string-typed default) — a number/object/null passed programmatically is String()-coerced to an unknown mode and safely falls through to auto; no crash (probed 123, {}, null, "AUTO", "Always" → all keep code gates for a code goal).
4. Reviewer prompts — no rubber-stamping
The prompt-auditor/reviewer/final-auditor/verifier additions only stop treating a clean tree as an automatic FAIL; they still direct the reviewer to judge the deliverable's correctness/completeness against goal_evidence_map and acceptance criteria. The one auto-PASS instruction (goal-diff-reviewer.md:45, PASS on empty git diff) is narrow: an empty working-tree diff was already produced by any committed work pre-PR, and committed code is still substantively reviewed by goal-reviewer ("actual files and behavior", git log access) plus the final-auditor — so it is not a new escape hatch for code goals.
5. Tests — meaningful, not weakened to pass
tests/gates.test.mjs adds real assertions (non-code drops only CODE_GATES but keeps the 3 always-on; editing restores code gates; override matrix; completion-on-always-on). tests/programmatic-review.test.mjs >=5→>=3 is correct: an evidence-only goal legitimately needs only the 3 always-on reviewers, and the test still asserts review runs AND uses a PASS-stubbed reviewer (completion reflects real passes, not an evaporated gate).
6. CI — green
Tests 672/672, lint exit 0 (only pre-existing warnings/infos; the 2 in changed test files are a pre-existing unused-var helper), config validation passes.
Nits (non-blocking): goal-diff-reviewer.md auto-PASS on empty diff is a pre-existing diff-tooling limitation, not introduced here, and is backstopped by the always-on reviewers — fine to leave.
Problem (observed in the field)
Goal Mode failed a non-code agentic goal — producing an explanation of a PR — because its review gates are hard-wired for code. The
goal-prompt-auditorblocked with "No working-tree changes...git diffis empty," and the always-requiredgoal-diff-reviewer/goal-verifiergates can never pass when the deliverable is a text answer + evidence rather than a diff. Every goal was treated as a code-implementation goal.Fix — task-aware gating
gates.js): a goal is "code-bearing" only once it is dirty, edits a file, or records changed files. The code-only gates (goal-diff-reviewer,goal-verifier) are required only for a code-bearing goal. A research / analysis / explanation / planning goal is gated on its recorded evidence by the always-on reviewers (goal-prompt-auditor,goal-reviewer,goal-final-auditor), so an emptygit diffno longer blocks it. An edit instantly restores the code gates (lastEditSeqis monotonic, so edit-then-revert cannot game it).requireCodeReview(auto|always|never, defaultauto) +GOAL_GUARD_REQUIRE_CODE_REVIEWto override the detection; documented inCONFIG_DOCS, thegoal-configtool, and the README.goal.mdmakes non-code goals first-class and tells the agent to record its produced answer as evidence (with the criteria it covers) so coverage is checkable.Verification
node --test "tests/*.test.mjs"→ 672 pass, 0 fail (new tests: code vs non-code gate selection, therequireCodeReviewoverride, and a non-code goal completing on the always-on gates alone; updated two tests that assumed a fresh state needs all five gates).npm run lint,validate-opencode-config.mjs, and thegoal-configno-drift test all green.