Skip to content

feat: Goal Mode works for non-code agentic goals (research, analysis, explanation)#426

Merged
devinoldenburg merged 2 commits into
mainfrom
feat/non-code-agentic-goals
Jun 21, 2026
Merged

feat: Goal Mode works for non-code agentic goals (research, analysis, explanation)#426
devinoldenburg merged 2 commits into
mainfrom
feat/non-code-agentic-goals

Conversation

@devinoldenburg

Copy link
Copy Markdown
Owner

Problem (observed in the field)

Goal Mode failed a non-code agentic goal — producing an explanation of a PR — because its review gates are hard-wired for code. The goal-prompt-auditor blocked with "No working-tree changes... git diff is empty," and the always-required goal-diff-reviewer / goal-verifier gates can never pass when the deliverable is a text answer + evidence rather than a diff. Every goal was treated as a code-implementation goal.

Fix — task-aware gating

  • Gate selection (gates.js): a goal is "code-bearing" only once it is dirty, edits a file, or records changed files. The code-only gates (goal-diff-reviewer, goal-verifier) are required only for a code-bearing goal. A research / analysis / explanation / planning goal is gated on its recorded evidence by the always-on reviewers (goal-prompt-auditor, goal-reviewer, goal-final-auditor), so an empty git diff no longer blocks it. An edit instantly restores the code gates (lastEditSeq is monotonic, so edit-then-revert cannot game it).
  • New config requireCodeReview (auto | always | never, default auto) + GOAL_GUARD_REQUIRE_CODE_REVIEW to override the detection; documented in CONFIG_DOCS, the goal-config tool, and the README.
  • Reviewer prompts: the five reviewers now treat a clean working tree as expected for a non-code goal and evaluate the recorded evidence/answer against the acceptance criteria instead of failing on a missing diff.
  • Goal agent: goal.md makes non-code goals first-class and tells the agent to record its produced answer as evidence (with the criteria it covers) so coverage is checkable.

Verification

  • node --test "tests/*.test.mjs"672 pass, 0 fail (new tests: code vs non-code gate selection, the requireCodeReview override, and a non-code goal completing on the always-on gates alone; updated two tests that assumed a fresh state needs all five gates).
  • npm run lint, validate-opencode-config.mjs, and the goal-config no-drift test all green.

Devin Oldenburg added 2 commits June 21, 2026 15:19
…mplete

The code-only review gates (goal-diff-reviewer, goal-verifier) were always
required, so a research/analysis/explanation/planning goal — which produces a
text + evidence deliverable and no diff — could never satisfy them, and the
reviewers blocked on an empty git diff (clean working tree -> FAIL).

- gates.js: a goal is 'code-bearing' only once it is dirty, edits a file, or
  records changed files; otherwise the diff/verifier gates are dropped and the
  deliverable is gated on evidence by the always-on reviewers (prompt-auditor,
  reviewer, final-auditor).
- New CONFIG/CONFIG_DOCS key requireCodeReview ('auto' | 'always' | 'never',
  default auto) + GOAL_GUARD_REQUIRE_CODE_REVIEW to override the detection.
- Tests for code/non-code gate selection, the override, and a non-code goal
  completing on the always-on gates alone.
- Reviewer prompts (prompt-auditor, reviewer, final-auditor, diff-reviewer,
  verifier): a clean working tree / empty git diff is expected for a non-code
  goal and is not, by itself, a FAIL; evaluate the recorded evidence and the
  delivered answer against the acceptance criteria instead.
- goal.md: non-code goals are first-class — record the produced answer as
  evidence (with the criteria it covers); the code-only gates are not required.
- README + ARCHITECTURE: document non-code agentic support and requireCodeReview.

@devinoldenburg devinoldenburg left a comment

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adversarial review: APPROVE (no enforcement weakening found)

Verified by running probes against gates.js/events.js/state.js and the full suite (node --test tests/*.test.mjs → 672 pass; npm run lint exit 0; validate-opencode-config.mjs passes).

1. No bypass on code goals — CONFIRMED

isCodeBearing (gates.js:27) is true if dirty OR lastEditSeq>0 OR changedFiles.length>0. The lastEditSeq>0 term is the lock: markEdit (events.js:79) sets it via store.nextSeq() which is seq+=1 (store.js:136) — strictly monotonic, restored as Math.max across persistence (store.js:202), and never reset. markEdit is the only writer of lastEditSeq and is called on every edit tool / bash mutation / file.edited event.

Edit-then-revert/commit scenario probed directly: after one edit lastEditSeq=1; then forcing dirty=false, changedFiles=[] (simulating a clean tree) → isCodeBearing STAYS true and requiredGates STILL returns goal-diff-reviewer + goal-verifier. A code goal cannot shed the code gates by cleaning its tree.

2. Non-code completion still gated — CONFIRMED

The 3 always-on gates (goal-prompt-auditor, goal-reviewer, goal-final-auditor) are never dropped — only CODE_GATES are conditionally filtered (gates.js:88). Probed a non-code goal: zero verdicts → blocked (3 missing); all FAIL → blocked; 2/3 PASS → blocked. So a non-code goal with no real deliverable/evidence cannot complete; the reviewers gate it.

3. Override correctness — CONFIRMED

gates.js:86-87: always→force on, never→force off, anything else→isCodeBearing (auto). Default requireCodeReview:"auto" (config.js). Coerced via coerceStr (config.js:269, string-typed default) — a number/object/null passed programmatically is String()-coerced to an unknown mode and safely falls through to auto; no crash (probed 123, {}, null, "AUTO", "Always" → all keep code gates for a code goal).

4. Reviewer prompts — no rubber-stamping

The prompt-auditor/reviewer/final-auditor/verifier additions only stop treating a clean tree as an automatic FAIL; they still direct the reviewer to judge the deliverable's correctness/completeness against goal_evidence_map and acceptance criteria. The one auto-PASS instruction (goal-diff-reviewer.md:45, PASS on empty git diff) is narrow: an empty working-tree diff was already produced by any committed work pre-PR, and committed code is still substantively reviewed by goal-reviewer ("actual files and behavior", git log access) plus the final-auditor — so it is not a new escape hatch for code goals.

5. Tests — meaningful, not weakened to pass

tests/gates.test.mjs adds real assertions (non-code drops only CODE_GATES but keeps the 3 always-on; editing restores code gates; override matrix; completion-on-always-on). tests/programmatic-review.test.mjs >=5>=3 is correct: an evidence-only goal legitimately needs only the 3 always-on reviewers, and the test still asserts review runs AND uses a PASS-stubbed reviewer (completion reflects real passes, not an evaporated gate).

6. CI — green

Tests 672/672, lint exit 0 (only pre-existing warnings/infos; the 2 in changed test files are a pre-existing unused-var helper), config validation passes.

Nits (non-blocking): goal-diff-reviewer.md auto-PASS on empty diff is a pre-existing diff-tooling limitation, not introduced here, and is backstopped by the always-on reviewers — fine to leave.

@devinoldenburg devinoldenburg merged commit 63f76fd into main Jun 21, 2026
11 checks passed
@devinoldenburg devinoldenburg deleted the feat/non-code-agentic-goals branch June 21, 2026 13:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant