From 4c86dc61d0a895ad2b15c9b7bd84ec436ea2a87e Mon Sep 17 00:00:00 2001 From: jfk Date: Wed, 20 May 2026 02:46:02 +0900 Subject: [PATCH] feat(start,ship): token-efficient flags for cascade gating (#64) Add four opt-in flags that adapt the cascade to issue/diff scope: - `auto-size` (/start): skip gate1 entirely for small issues per the label/body-length heuristic in gate1.size_heuristic. Default off. - `auto-skip` (/ship): skip docs-only gate2 advisors per the path-pattern rules in gate2.diff_scope_skip. Default off. - `--with-plan` (/start): invoke /superpowers:writing-plans after gate1 and persist the plan to the state file (step 12a + 14.5). - `--parallel` (/start): chain /superpowers:subagent-driven-development as the step 18 continue target. Requires --with-plan. Backward-compat: with no flags and no config opt-in, behavior is byte-identical to v0.8.0. State schema is additive (`plan`, `gate2.skipped_advisors`, `gate2.diff_scope` are all v2-compatible optional fields). Co-locates measurement procedure (CONTRIBUTING.md, tests/fixtures/typo-fix-issue.md) so future PRs can produce reproducible rtk gain numbers. Co-Authored-By: Claude Opus 4.7 (1M context) --- CONTRIBUTING.md | 27 +++++++ README.ja.md | 42 +++++++++++ README.md | 42 +++++++++++ commands/config.md | 48 +++++++++++- commands/doctor.md | 6 +- commands/ship.md | 62 ++++++++++++++- commands/start.md | 126 +++++++++++++++++++++++++++++-- tests/fixtures/typo-fix-issue.md | 37 +++++++++ 8 files changed, 375 insertions(+), 15 deletions(-) create mode 100644 tests/fixtures/typo-fix-issue.md diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 547c72d..c3f8e24 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -71,6 +71,33 @@ The reviewer skills (`/claude-c-suite:*`, `/claude-phd-panel:*`) are advisory by If you maintain a c-suite or phd-panel skill, **adding the Verdict line is the single highest-leverage change** for `gh-issue-driven` integration. 
+## Measuring token consumption (`rtk gain`) + +The token-efficiency flags (`auto-size`, `auto-skip`, `--with-plan`, `--parallel` — see [README](README.md#token-efficient-flags-auto-size-auto-skip---with-plan---parallel)) ship with a measurement obligation: each PR that touches them must capture before/after `rtk gain` numbers in the PR description. The procedure below makes the comparison reproducible. + +### Fixture-based measurement + +A fixed scenario is required so two runs are comparable. The repo ships one at `tests/fixtures/typo-fix-issue.md` — a representative "small issue" body that exercises the docs-only / auto-size paths. Add new fixtures for other scenarios as needed. + +### Procedure + +1. **Baseline** — check out the version to compare against (e.g. `git checkout v0.8.0`), run `rtk gain` once to record the current totals, then invoke `/gh-issue-driven:start` (or `/ship`) against the fixture and capture `rtk gain --history` for the resulting delta. Save the output. +2. **HEAD** — check out the PR branch, repeat step 1 with the same fixture. +3. **Diff table** — in the PR description, paste a table like: + + ``` + | Scenario | v0.8.0 baseline | PR HEAD | delta | + |---|---|---|---| + | /start typo-fix-issue auto-size | | | -N% | + | /ship docs-only-diff auto-skip | | | -N% | + ``` + +The fixture itself is markdown-only, so the measurement is not affected by network latency, model variance, or environment drift beyond what `rtk gain` already accounts for. + +### When the measurement is not required + +Only the four token-efficiency flags (and their config equivalents) carry this measurement obligation. PRs that don't change cascade gating or skill invocation paths can skip the table — `rtk gain` is for verifying claims about token impact, not a universal PR requirement. 
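The delta column of the diff table in the procedure above can be filled in mechanically. The following is a minimal sketch, assuming the baseline and HEAD token totals have already been read off the `rtk gain` output by hand. The helper names `pct_delta` and `delta_row` are hypothetical and are not part of this repo or of `rtk`.

```bash
# Hypothetical helper: percentage change from a baseline total to a HEAD total.
# Both arguments are plain numbers copied manually from the rtk gain output.
pct_delta() {
  awk -v baseline="$1" -v current="$2" 'BEGIN {
    # Guard against division by zero when the baseline recorded no tokens.
    if (baseline == 0) { print "n/a"; exit }
    printf "%+.1f%%\n", (current - baseline) / baseline * 100
  }'
}

# Hypothetical helper: one markdown table row in the layout used above.
# $1 = scenario label (free text), $2 = baseline total, $3 = HEAD total.
delta_row() {
  printf '| %s | %s | %s | %s |\n' "$1" "$2" "$3" "$(pct_delta "$2" "$3")"
}
```

For example, `delta_row "/start typo-fix-issue auto-size" 1000 800` prints a row whose last cell is `-20.0%`. The numbers 1000 and 800 are made up for illustration and are not measurements.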
+ ## Design principles A few load-bearing principles that shape what gets accepted into `commands/`: diff --git a/README.ja.md b/README.ja.md index 0e3453b..6b2048d 100644 --- a/README.ja.md +++ b/README.ja.md @@ -210,6 +210,48 @@ git worktree prune # 古い登録情報の掃除 --- +## Token 効率化フラグ (`auto-size`, `auto-skip`, `--with-plan`, `--parallel`) + +`/start` と `/ship` には、作業の実態に応じて cascade を軽量化する opt-in フラグがあります。フラグ未指定時の挙動は v0.8.0 と byte-identical です。 + +### `auto-size` — small issue で gate1 を完全スキップ (`/start`) + +`/gh-issue-driven:start auto-size` で **issue-size heuristic** を起動。primary issue のラベルが `gate1.size_heuristic.small_labels` (デフォルト: `good first issue`, `documentation`, `docs`, `tests`, `i18n`) に一致するか、本文が `small_body_max_chars` (デフォルト 500 字) 未満なら、gate1 は **完全にスキップ** されます。`/claude-c-suite:ask` も `/ceo` 昇格も走りません。state file には `gate1.reviewer="size-heuristic"` と synthetic verdict markdown が記録されます。 + +HITL 確認ゲートは引き続き発火するので、「I have feedback」でスキップを上書き可能。batch invocation (`/start 4 5 6`) は決して small 扱いされません — bundling 自体が coherence signal だからです。 + +毎回 `/start` でデフォルト ON にしたい場合は `~/.claude/gh-issue-driven-config.json` の `gate1.size_heuristic.enabled: true` に設定。 + +### `auto-skip` — 関係ない gate2 advisor をスキップ (`/ship`) + +`/gh-issue-driven:ship auto-skip` は diff を調べ、変更ファイル**すべて**が `gate2.diff_scope_skip.docs_only_patterns` (デフォルト `README*` / `CHANGELOG*` / `CONTRIBUTING*` / `docs/` / `.github/`) に一致したら、`docs_only_skip_advisors` (デフォルト `cso` と `qa-lead`) をスキップします。残りの advisor (例: `cto`) は走り、binary gate (`gate2.binary_gate`) は影響を受けません。 + +1ファイルでも非doc 変更があれば docs-only 判定は不成立 — partial skip はありません。このリポジトリの `commands/*.md` は markdown-as-code (実行 spec) なので、意図的にデフォルトパターンから除外されています。 + +永続化設定は `gate2.diff_scope_skip.enabled: true`。 + +### `--with-plan` — gate1 後に実装プランを生成 (`/start`) + +`/gh-issue-driven:start --with-plan` で、gate1 verdict 後・branch 作成前に `superpowers:writing-plans` を起動。plan markdown は会話から取得され、`~/.claude/cache/gh-issue-driven/.plan.md` に永続化、state file の `plan.path` が pointer 
になります。 + +`superpowers` プラグインが必要。未インストール時は warning を出して plan なしで継続 (abort しません)。 + +plan は `/start` 会話の **in-line で生成** されます — `/claude-c-suite:ask` や `/kagura-memory:session-start` と同じ実行モデルで、autonomous な sub-process は存在しません。 + +### `--parallel` — subagent dispatch で実装 (`/start`) + +`/gh-issue-driven:start --with-plan --parallel` で plan 生成後に `superpowers:subagent-driven-development` をチェーン: step 18 の continue target が「plan を入力として subagent dispatch を起動」になります (`/feature-dev:feature-dev` 起動や conversation 内 plan 起案の代わり)。 + +`--parallel` は `--with-plan` を要求します (subagent skill は plan を input として消費する仕様)。`--parallel` 単独は明示エラーで弾きます。 + +### こんなときは使わない + +- **`auto-size` + 複雑 issue**: グローバルに `auto-size` を ON にして heuristic が複雑 issue を small と誤判定した場合、gate1 はそのままスキップされます。HITL gate が safety net なので、override して再実行可能。 +- **`auto-skip` + security-sensitive docs**: `docs/` に脅威モデルなど security review 必須の文書がある場合は default `false` のまま、必要 PR で `cso` を手動で追加。 +- **`--parallel` + 些細な変更**: subagent dispatch には setup コストがあります。1ファイルの fix なら conversation 内 continue target (「plan を起案」) の方が安価。 + +--- + ## Copilot レビューループ `gh pr create` の後、`/gh-issue-driven:ship` はレビュアの add を発火し、polling loop に入る前に **HITL 確認ゲート** で立ち止まります(v0.3.0 以降): diff --git a/README.md b/README.md index f60e178..e0466bc 100644 --- a/README.md +++ b/README.md @@ -211,6 +211,48 @@ If you manually `rm -rf` the directory first (don't), `git worktree prune` alone --- +## Token-efficient flags (`auto-size`, `auto-skip`, `--with-plan`, `--parallel`) + +`/start` and `/ship` accept opt-in flags that adapt the cascade to the actual scope of the work. Defaults are unchanged — when no flags are passed, behavior is byte-identical to v0.8.0. 
+ +### `auto-size` — skip gate1 for small issues (`/start`) + +`/gh-issue-driven:start auto-size` invokes the **issue-size heuristic**: if the primary issue's labels match `gate1.size_heuristic.small_labels` (default `good first issue`, `documentation`, `docs`, `tests`, `i18n`) **or** the body is shorter than `small_body_max_chars` (default 500), gate1 is **skipped entirely**. No `/claude-c-suite:ask` invocation, no escalation. The state file records `gate1.reviewer="size-heuristic"` and a synthetic verdict markdown. + +The HITL confirmation gate still fires so you can override the skip with "I have feedback". Batch invocations (`/start 4 5 6`) are never small — bundling is itself a coherence signal. + +To make this the default for every `/start` run, set `gate1.size_heuristic.enabled: true` in `~/.claude/gh-issue-driven-config.json`. + +### `auto-skip` — skip irrelevant gate2 advisors (`/ship`) + +`/gh-issue-driven:ship auto-skip` probes the diff and, when every changed file matches `gate2.diff_scope_skip.docs_only_patterns` (default `README*`, `CHANGELOG*`, `CONTRIBUTING*`, `docs/`, `.github/`), skips the advisors listed in `docs_only_skip_advisors` (default `cso` and `qa-lead`). The remaining advisors (e.g. `cto`) still run, and the binary gate (`gate2.binary_gate`) is never affected. + +A diff that touches even one non-doc file disqualifies docs-only treatment — there is no partial skip. The `commands/*.md` files in this repo are markdown-as-code and are intentionally **not** in the default pattern list. + +Persistent equivalent: set `gate2.diff_scope_skip.enabled: true` in config. + +### `--with-plan` — generate an implementation plan after gate1 (`/start`) + +`/gh-issue-driven:start --with-plan` invokes `superpowers:writing-plans` after the gate1 verdict and before branch creation. The plan markdown is captured from the conversation and persisted to `~/.claude/cache/gh-issue-driven/.plan.md`; the state file's `plan.path` points to it. 
+ +Requires the `superpowers` plugin. Without it, `/start` logs a warning and continues without a plan (no abort). + +The plan is **produced in-line** as part of the `/start` conversation — the same execution model as `/claude-c-suite:ask` and `/kagura-memory:session-start`. There is no autonomous sub-process. + +### `--parallel` — implementation via subagent dispatch (`/start`) + +`/gh-issue-driven:start --with-plan --parallel` chains `superpowers:subagent-driven-development` after plan generation: step 18's continue target is set to invoke subagent dispatch on the plan content, rather than launching `/feature-dev:feature-dev` or drafting an implementation outline in the conversation. + +`--parallel` requires `--with-plan` (the subagent skill consumes a plan as its input). Setting `--parallel` alone is rejected with an error so the workflow stays explicit. + +### When NOT to use these flags + +- **`auto-size` + complex issue**: if you set `auto-size` globally and the heuristic misclassifies a complex issue as small, gate1 is skipped without ceremony. The HITL gate is your safety net — override the skip and re-run if needed. +- **`auto-skip` + security-sensitive docs**: if your `docs/` directory contains published threat models or anything that warrants security review on edit, keep the default `false` and re-add `cso` manually for those PRs. +- **`--parallel` for trivial changes**: subagent dispatch has its own setup cost. For one-file fixes, the in-conversation continue target ("draft a plan now") is cheaper. 
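The `--parallel` / `--with-plan` dependency described in this section amounts to a small precondition check. The following is a sketch, assuming the two flags have already been parsed into the strings `true` or `false`. The function name `validate_plan_flags` is hypothetical; the error text follows the flag interaction rule in `commands/start.md`.

```bash
# Hypothetical helper: enforce "--parallel requires --with-plan".
# $1 = WITH_PLAN ("true" or "false"), $2 = PARALLEL ("true" or "false").
validate_plan_flags() {
  with_plan=$1
  parallel=$2
  if [ "$parallel" = true ] && [ "$with_plan" != true ]; then
    # Reject rather than silently enabling --with-plan, so the workflow stays explicit.
    echo "error: --parallel requires --with-plan (subagent-driven-development consumes a plan as its input)" >&2
    return 1
  fi
  return 0
}
```

`validate_plan_flags true true` and `validate_plan_flags true false` both succeed; only `validate_plan_flags false true` returns a non-zero status.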
+ +--- + ## Copilot review loop After `gh pr create`, `/gh-issue-driven:ship` fires the reviewer add and pauses at a **HITL confirmation gate** (since v0.3.0) before entering the polling loop: diff --git a/commands/config.md b/commands/config.md index 40bb550..3ce4837 100644 --- a/commands/config.md +++ b/commands/config.md @@ -98,6 +98,40 @@ To trust a fork, change the URL: `"claude-c-suite": "https://github.com/yourfork **CI use**: run `doctor verbose` and `grep '^PLUGIN_CHECK'` to parse machine-readable `key=value` lines. Fail on `status=unexpected` to enforce origin pinning in CI. +### `gate1.size_heuristic` + +Opt-in mechanism that lets `/gh-issue-driven:start` skip the gate1 cascade entirely for issues that match a "small issue" heuristic. Default is **off** for backward compatibility — when disabled, `/start` runs gate1 (`/claude-c-suite:ask` → optional `/ceo` escalation) on every issue, exactly as in v0.8.0. + +| Key | Default | Meaning | +|---|---|---| +| `enabled` | `false` | Master switch. When `true`, the heuristic runs in `/start` step 7a. The CLI flag `auto-size` is the per-invocation equivalent. | +| `small_labels` | `["good first issue", "documentation", "docs", "tests", "i18n"]` | Case-insensitive label list. If the primary issue has any label in this list, the issue is small. | +| `small_body_max_chars` | `500` | If the primary issue body is shorter than this, the issue is small. | + +**Trigger logic**: an issue is small when **either** signal fires — label match OR short body. Both proxies capture the same "implementer can act without design review" intuition; the OR captures both labeled triage signals and short-body brevity signals. + +**Effect when small**: gate1 (`/claude-c-suite:ask`, optional `/ceo` escalation) is **not** invoked. `GATE1_REVIEWER` is set to the sentinel `"size-heuristic"`, `GATE1_VERDICT` to `green`, and the gate1 markdown captures a synthetic "skipped by policy" record. 
The HITL gate (`gate1.green_continue_requires_confirm`) still fires so the operator can override the size-heuristic decision by selecting "I have feedback". + +**Batch mode**: `IS_BATCH=true` invocations are **never** small. Bundling multiple issues is itself a coherence signal that warrants gate1 review. + +**Token savings**: the saved cost is one `/claude-c-suite:ask` invocation per matching issue (plus the rarer `/ceo` escalation when the verdict is decline). For a docs-only typo fix, this is the dominant cost in `/start` — order-of-magnitude reduction is realistic. + +### `gate2.diff_scope_skip` + +Opt-in mechanism that lets `/gh-issue-driven:ship` skip irrelevant gate2 advisors based on the changed-file scope. Default is **off** for backward compatibility — when disabled, `/ship` runs every advisor in `gate2.advisors`, exactly as in v0.8.0. + +| Key | Default | Meaning | +|---|---|---| +| `enabled` | `false` | Master switch. When `true`, ship.md step 4a probes the diff and may filter `ADVISORS`. The CLI flag `auto-skip` is the per-invocation equivalent. | +| `docs_only_patterns` | `["^README", "^CHANGELOG", "^CONTRIBUTING", "^docs/", "^\\.github/"]` | Regex list. A diff is **docs-only** when **every** changed file matches at least one pattern. | +| `docs_only_skip_advisors` | `["/claude-c-suite:cso", "/claude-c-suite:qa-lead"]` | Advisors skipped when the diff is docs-only. The remaining advisors (e.g. `cto`) still run. | + +**Important**: this plugin's `commands/*.md` files are markdown-as-code (the runtime spec) — they are intentionally **not** in the default `docs_only_patterns`. Editing `commands/start.md` is a behavioral change, not docs. If you customize the patterns, keep this invariant. + +**Effect when docs-only**: the matched advisors are excluded from the parallel battery in ship.md step 6. The binary gate (`gate2.binary_gate`) is never affected — it has its own opt-in via the `binary_gate` key. 
`SKIPPED_ADVISORS` is recorded in the gate2 state block (`gate2.skipped_advisors`) so `/gh-issue-driven:status` can surface what was skipped and why. + +**When NOT to enable**: if your repo treats `docs/` as security-sensitive (e.g. published threat models, secret-management docs that could leak credentials when edited), keep the default `false`. Skipping `cso` is appropriate only when docs changes carry no security review obligation. + ### `memory.context_id` Accepts three forms — in priority order of recommendation: @@ -167,7 +201,12 @@ Users without kagura-memory installed can ignore this field — recall is skippe "primary": "/claude-c-suite:ask", "fallback": "/claude-c-suite:ceo", "yellow_continue_requires_confirm": true, - "green_continue_requires_confirm": true + "green_continue_requires_confirm": true, + "size_heuristic": { + "enabled": false, + "small_labels": ["good first issue", "documentation", "docs", "tests", "i18n"], + "small_body_max_chars": 500 + } }, "gate2": { "binary_gate": null, @@ -178,7 +217,12 @@ Users without kagura-memory installed can ignore this field — recall is skippe ], "yellow_continue_requires_confirm": true, "green_continue_requires_confirm": true, - "run_tests_before_gate2": false + "run_tests_before_gate2": false, + "diff_scope_skip": { + "enabled": false, + "docs_only_patterns": ["^README", "^CHANGELOG", "^CONTRIBUTING", "^docs/", "^\\.github/"], + "docs_only_skip_advisors": ["/claude-c-suite:cso", "/claude-c-suite:qa-lead"] + } }, "review": { "provider": "copilot" diff --git a/commands/doctor.md b/commands/doctor.md index 946bd20..e4d6220 100644 --- a/commands/doctor.md +++ b/commands/doctor.md @@ -361,13 +361,13 @@ CI scripts can parse with `grep '^PLUGIN_CHECK'` and fail on `status=unexpected` ``` (Available from the Claude Code official marketplace — no marketplace add needed.) -11. 
**Worktree plugin: `superpowers`** (optional — enhances `/gh-issue-driven:start --worktree` with smart directory selection) - - Probe via plugin cache glob: `~/.claude/plugins/cache/superpowers*`. `start.md` step 13b uses the **identical glob probe** (`ls -d ~/.claude/plugins/cache/superpowers*`) so the two commands answer "is superpowers installed?" the same way — they MUST NOT drift. When changing the glob here, update `commands/start.md` step 13b in the same commit. +11. **Worktree plugin: `superpowers`** (optional — enhances `/gh-issue-driven:start --worktree`, `--with-plan`, and `--parallel` with smart directory selection, plan generation, and subagent-driven implementation respectively) + - Probe via plugin cache glob: `~/.claude/plugins/cache/superpowers*`. `start.md` step 13b uses the **identical glob probe** (`ls -d ~/.claude/plugins/cache/superpowers*`) so the two commands answer "is superpowers installed?" the same way — they MUST NOT drift. When changing the glob here, update `commands/start.md` step 13b in the same commit. Steps 12a (`--with-plan` → `/superpowers:writing-plans`) and 18b (`--parallel` → `/superpowers:subagent-driven-development`) rely on the same plugin; this single probe answers for all three flags. - Run the Plugin Metadata Resolution Procedure with: - `PMRP_GLOB=superpowers*` - `PMRP_SKILL=superpowers` - `PMRP_OFFICIAL=false` - - If `PLUGIN_FOUND=false`: emit ⚠️ `superpowers: not installed — /gh-issue-driven:start --worktree will fall back to direct git worktree add .worktrees/, which still works (no hard requirement)`. + - If `PLUGIN_FOUND=false`: emit ⚠️ `superpowers: not installed — /gh-issue-driven:start --worktree will fall back to direct git worktree add .worktrees/ (still works); --with-plan and --parallel will degrade with a warning and continue without invoking the missing skills`. - Otherwise: emit the status line per the procedure's output format. 
- When `fix` flag is set AND missing, append a 2-line `try:` block: ``` diff --git a/commands/ship.md b/commands/ship.md index 8f0704b..7243ff4 100644 --- a/commands/ship.md +++ b/commands/ship.md @@ -2,7 +2,7 @@ description: Phase 2 of gh-issue-driven — runs gate2 (audit + cso + qa-lead + cto in parallel), creates the PR, drives a Copilot review loop up to 5 iterations, and saves session knowledge to Kagura Memory. arguments: - name: flags - description: "Optional space-separated flags: 'dry-run' (skip push/PR/loop), 'force' (bypass red advisor verdicts — does NOT bypass audit fail), 'no-copilot' (skip the post-PR review entirely — legacy alias for review.provider=none), 'draft' (open the PR as draft), 'resume' (skip steps 3-12, jump to review on an already-open PR)." + description: "Optional space-separated flags: 'dry-run' (skip push/PR/loop), 'force' (bypass red advisor verdicts — does NOT bypass audit fail), 'no-copilot' (skip the post-PR review entirely — legacy alias for review.provider=none), 'draft' (open the PR as draft), 'resume' (skip steps 3-12, jump to review on an already-open PR), 'auto-skip' (skip gate2 advisors that don't apply to the diff scope — see `gate2.diff_scope_skip` config)." required: false --- @@ -126,7 +126,9 @@ This ensures all downstream steps can use `state.issues` uniformly regardless of ### 2. Load configuration and parse flags -Load `~/.claude/gh-issue-driven-config.json` over the defaults documented in `/gh-issue-driven:config`. Parse `$ARGUMENTS` into `DRY_RUN`, `FORCE`, `NO_COPILOT`, `DRAFT`, `RESUME` booleans. Reject unknown flags. +Load `~/.claude/gh-issue-driven-config.json` over the defaults documented in `/gh-issue-driven:config`. Parse `$ARGUMENTS` into `DRY_RUN`, `FORCE`, `NO_COPILOT`, `DRAFT`, `RESUME`, `AUTO_SKIP` booleans. Reject unknown flags. + +`AUTO_SKIP` opts in to gate2 diff-scope skipping for this invocation only. 
The config key `gate2.diff_scope_skip.enabled` (default `false`) is the persistent equivalent; the flag overrides the config to `true` for one run. Backward-compat: when neither the flag nor the config enables it, gate2 behavior is byte-identical to v0.8.0. Determine `REVIEW_PROVIDER` as follows: if the **user config** explicitly sets `review.provider`, use that value. Otherwise, for backward compatibility with v0.1.x configs, check legacy `copilot.enabled`: if the user config explicitly sets `copilot.enabled` to `false`, set `REVIEW_PROVIDER="none"`; otherwise default to `"copilot"`. Valid values: `copilot`, `code-review`, `both`, `none`. If `NO_COPILOT` is set, override `REVIEW_PROVIDER` to `"none"` for this invocation (backward compatibility). @@ -184,6 +186,56 @@ If tests fail, abort gate2 with the failure output. Default is **off** — gate2 is reviews, not test execution. +### 4a. Compute diff scope and optionally filter advisors + +This sub-step runs only when **either** `AUTO_SKIP` is true OR `gate2.diff_scope_skip.enabled` in the effective config is `true`. Otherwise skip entirely — `SKIPPED_ADVISORS=[]` and the full `ADVISORS` list from config is used in step 6 as before. + +When enabled, derive `CHANGED_FILES` from the diff: + +```bash +git diff --name-only "origin/$DEFAULT_BRANCH...HEAD" > /tmp/gh-issue-driven.changed-files +``` + +Read the config patterns: +- `DOCS_PATTERNS = gate2.diff_scope_skip.docs_only_patterns` (default `["^README", "^CHANGELOG", "^CONTRIBUTING", "^docs/", "^\\.github/"]`) +- `DOCS_SKIP_ADVISORS = gate2.diff_scope_skip.docs_only_skip_advisors` (default `["/claude-c-suite:cso", "/claude-c-suite:qa-lead"]`) + +Classification rule: a diff is **docs-only** when **every** path in `CHANGED_FILES` matches at least one pattern in `DOCS_PATTERNS`. A single non-matching path disqualifies the diff from docs-only treatment — there is no partial skip. + +The pattern set is intentionally narrow. 
Spec/command files under `commands/*.md`, `references/*.md`, or `tests/` are **not** docs-only in this plugin: those are markdown-as-code (`commands/*.md` IS the runtime spec). The defaults reflect this. + +```bash +DOCS_ONLY=true +while IFS= read -r f; do + [ -z "$f" ] && continue + match=false + for pat in "${DOCS_PATTERNS[@]}"; do + if echo "$f" | grep -qE "$pat"; then + match=true + break + fi + done + if [ "$match" = "false" ]; then + DOCS_ONLY=false + break + fi +done < /tmp/gh-issue-driven.changed-files +``` + +If `DOCS_ONLY=true`: set `SKIPPED_ADVISORS = DOCS_SKIP_ADVISORS` and filter the in-session `ADVISORS` list (read from config in step 6) to exclude every entry in `SKIPPED_ADVISORS`. Log a single one-line note: + +``` +gate2.diff_scope_skip: docs-only diff detected ( files all match patterns); skipping advisors +``` + +If `DOCS_ONLY=false`: set `SKIPPED_ADVISORS=[]`. No advisors are filtered. Log: + +``` +gate2.diff_scope_skip: diff is not docs-only ( changed files); running full advisor list +``` + +`SKIPPED_ADVISORS` is recorded in the state file (step 9) so the recap and PR body can surface what was skipped and why. The binary gate (`gate2.binary_gate`) is **never** skipped by this mechanism — only advisors. Binary gate skipping is governed by `gate2.binary_gate=null` (see config.md). + ### 5. Build the gate2 prompt block Construct one shared block all four reviewers will receive: @@ -220,7 +272,7 @@ the last `## Verdict:` line is what counts. Read `gate2.binary_gate` and `gate2.advisors` from the effective config. -First, read the advisor list from config: `ADVISORS = gate2.advisors` (from the effective config; the default list is `["/claude-c-suite:cso", "/claude-c-suite:qa-lead", "/claude-c-suite:cto"]` per config.md, but the user can override). Iterate `ADVISORS` to invoke; do **not** hardcode the default skill names in this section — operators who customize `gate2.advisors` must see their custom list invoked, not the defaults. 
+First, read the advisor list from config: `ADVISORS = gate2.advisors` (from the effective config; the default list is `["/claude-c-suite:cso", "/claude-c-suite:qa-lead", "/claude-c-suite:cto"]` per config.md, but the user can override). **If step 4a filtered `ADVISORS` (because `AUTO_SKIP` or `gate2.diff_scope_skip.enabled` matched a docs-only diff), use the post-filter list — `SKIPPED_ADVISORS` entries are NOT invoked.** Iterate `ADVISORS` to invoke; do **not** hardcode the default skill names in this section — operators who customize `gate2.advisors` must see their custom list invoked, not the defaults. **If `gate2.binary_gate` is `null`** (the v0.1.1 default — see config.md `gate2.binary_gate` notes for the rationale): gate2 runs in **advisor-only mode**. Skip the binary gate slot entirely. Set `AUDIT_OUT = null` (will become `AUDIT_VERDICT = "skipped"` in step 7). Invoke ONLY the advisors from `ADVISORS`: @@ -368,6 +420,8 @@ Update the state file: "": "", "": "" }, + "skipped_advisors": ["", ...], + "diff_scope": "docs-only | mixed | null", "verdict": "", "summary_path": "~/.claude/cache/gh-issue-driven/.gate2.md", "ran_at": "" @@ -375,6 +429,8 @@ Update the state file: "phase": "gated" ``` +`skipped_advisors` is the list set by step 4a (empty `[]` when the diff-scope skip mechanism was disabled or did not match; the configured `docs_only_skip_advisors` list when it did match). `diff_scope` records which classification 4a produced (`"docs-only"` when all changed paths matched docs patterns, `"mixed"` when at least one did not, `null` when the step was skipped entirely because diff-scope skipping was not enabled). Both fields are read by `/gh-issue-driven:status` to render an accurate gate2 summary post-hoc. + The `advisor_verdicts` map is keyed by the **full config string** (e.g. `"/claude-c-suite:cso"`, `"/claude-c-suite:qa-lead"`, `"/claude-c-suite:cto"` for the default config; any custom advisor strings for non-default configs). 
Using the full config string avoids key collisions between advisors from different namespaces. This replaces the v1 schema's hardcoded `cso`/`qa_lead`/`cto` fields. **Backward compatibility**: state files written by v0.1.0–v0.1.1 (v1 schema) have `gate2.cso`, `gate2.qa_lead`, `gate2.cto` as named fields instead of `advisor_verdicts`. Readers (`/gh-issue-driven:status`) must check for `advisor_verdicts` first; if absent, fall back to reading the v1 named fields and synthesizing the equivalent map: `{"cso": gate2.cso, "qa-lead": gate2.qa_lead, "cto": gate2.cto}`. See `commands/status.md` step 3 for the reader logic. diff --git a/commands/start.md b/commands/start.md index ae22f8b..b6ce372 100644 --- a/commands/start.md +++ b/commands/start.md @@ -5,7 +5,7 @@ arguments: description: "One or more GitHub issue identifiers (number, full URL, or owner/repo#number). Multiple IDs create a batch branch. Required (at least one)." required: true - name: flags - description: "Optional space-separated flags: 'dry-run' (skip branch creation, run gate1 only), 'force' (continue past a red gate1 verdict), 'no-memory' (skip Kagura Memory recall and session-start), '--worktree' (create an isolated git worktree at .worktrees/ instead of checking out in-place — delegates to superpowers:using-git-worktrees when installed), '--branch=' (override the derived branch name; combines with --worktree to place the worktree at .worktrees/)." 
+ description: "Optional space-separated flags: 'dry-run' (skip branch creation, run gate1 only), 'force' (continue past a red gate1 verdict), 'no-memory' (skip Kagura Memory recall and session-start), '--worktree' (create an isolated git worktree at .worktrees/ instead of checking out in-place — delegates to superpowers:using-git-worktrees when installed), '--branch=' (override the derived branch name; combines with --worktree to place the worktree at .worktrees/), 'auto-size' (skip gate1 entirely for issues that match the small-issue heuristic — see `gate1.size_heuristic` config), '--with-plan' (after gate1 verdict and before branch creation, invoke superpowers:writing-plans to generate an implementation plan persisted to the state file), '--parallel' (after the plan is generated, set the step 18 continue target to superpowers:subagent-driven-development for plan-driven parallel implementation; requires --with-plan)." required: false --- @@ -55,7 +55,7 @@ Iterate over `$ARGUMENTS` tokens left-to-right. Each token is classified by **po - URL form: `^https://github\.com/.+/.+/issues/[0-9]+$` - Short form: `^[^/]+/[^#]+#[0-9]+$` - **`--branch=` flag** — matches `^--branch=.+$`. Extract the value after `=` as `BRANCH_OVERRIDE`. -- **Known flag** — one of: `dry-run`, `force`, `no-memory`, `--worktree` +- **Known flag** — one of: `dry-run`, `force`, `no-memory`, `--worktree`, `auto-size`, `--with-plan`, `--parallel` - **Unknown** — reject with a clear error listing valid flags and the multi-issue syntax. All leading tokens that match the issue identifier pattern are collected into the `ISSUE_IDS` list (preserving order). The first token that does NOT match an issue identifier pattern marks the boundary — all remaining tokens are parsed as flags. 
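The position-based classification described above can be sketched as follows. This is an illustration only, not the plugin's implementation: the function names `is_issue_id` and `parse_start_args` are hypothetical, the bare-number pattern `^[0-9]+$` is an assumption (only the URL and short forms are spelled out here), and results are left in plain global variables for brevity.

```bash
# Hypothetical helper: does a token look like an issue identifier?
is_issue_id() {
  printf '%s\n' "$1" | grep -qE \
    -e '^[0-9]+$' \
    -e '^https://github\.com/.+/.+/issues/[0-9]+$' \
    -e '^[^/]+/[^#]+#[0-9]+$'
}

# Hypothetical helper: split arguments into leading issue IDs and trailing flags.
# Sets ISSUE_IDS and FLAGS (space-separated) and BRANCH_OVERRIDE.
parse_start_args() {
  ISSUE_IDS=""
  FLAGS=""
  BRANCH_OVERRIDE=""
  in_flags=false
  for tok in "$@"; do
    # Only leading tokens may be issue identifiers; the first non-ID ends that run.
    if [ "$in_flags" = false ] && is_issue_id "$tok"; then
      ISSUE_IDS="${ISSUE_IDS:+$ISSUE_IDS }$tok"
      continue
    fi
    in_flags=true
    case "$tok" in
      --branch=?*)
        BRANCH_OVERRIDE=${tok#--branch=} ;;
      dry-run|force|no-memory|--worktree|auto-size|--with-plan|--parallel)
        FLAGS="${FLAGS:+$FLAGS }$tok" ;;
      *)
        # Includes issue IDs that appear after the flag boundary.
        echo "error: unknown token '$tok'" >&2
        return 1 ;;
    esac
  done
  if [ -z "$ISSUE_IDS" ]; then
    echo "error: at least one issue identifier is required" >&2
    return 1
  fi
}
```

With this sketch, `parse_start_args 4 owner/repo#5 auto-size --branch=feat/x` collects two issue IDs, one flag, and a branch override, while `parse_start_args 4 auto-size 5` fails because `5` arrives after the flag boundary and is not a known flag.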
@@ -80,8 +80,17 @@ Set booleans from flag tokens: - `FORCE=true` if `force` is present - `NO_MEMORY=true` if `no-memory` is present - `WORKTREE=true` if `--worktree` is present (default: `false`) + - `AUTO_SIZE=true` if `auto-size` is present (default: `false`) + - `WITH_PLAN=true` if `--with-plan` is present (default: `false`) + - `PARALLEL=true` if `--parallel` is present (default: `false`) - `BRANCH_OVERRIDE` from `--branch=` if present (default: `null`) +Flag interaction rules (validated here, before any downstream step reads them): + +- `--parallel` requires `--with-plan`. If `PARALLEL=true` and `WITH_PLAN=false`, abort with `error: --parallel requires --with-plan (subagent-driven-development consumes a plan as its input)`. Auto-enabling `--with-plan` was considered and rejected — operators who set `--parallel` without `--with-plan` likely misunderstand the workflow, and silently enabling another flag would mask that. +- `auto-size` is per-invocation and overrides `gate1.size_heuristic.enabled` to `true` for this run only. The config key is the persistent equivalent; the flag is the one-off opt-in. +- `auto-size` and `dry-run` compose normally — when both are set, the size heuristic runs and may short-circuit gate1, but no branch is created. + #### 1a. Validate parsed arguments against allow-list Before any parsed argument touches a bash interpolation, validate each component against a strict regex allow-list. Abort immediately on the first mismatch — do not attempt to sanitize or auto-quote. @@ -397,8 +406,48 @@ Capture results as a list of `{summary, score}` pairs. The skip path at the top of this step already covered the "context_id couldn't be resolved" case (steps 2a/2b set `null` and step 7 skips entirely without calling recall), so by the time the recall call runs, `context_id` is a valid UUID. 
Any runtime error here is a true network/server issue, not a configuration problem — and the user's `skip_on_failure` choice is the right signal for how to handle it. +### 7a. Issue-size heuristic (optional gate1 short-circuit) + +This sub-step runs only when **either** `AUTO_SIZE` (CLI flag) is true OR `gate1.size_heuristic.enabled` in the effective config is `true`. Otherwise skip entirely — set `SIZE_HEURISTIC_SKIPPED=false` and proceed to step 8 normally. + +When enabled, classify the **primary issue** (`PRIMARY_ISSUE`, set in step 5) as `small` or `not_small`. Batch mode (`IS_BATCH=true`) is **never** small — bundling multiple issues is a coherence signal that warrants gate1 review. In batch mode, set `SIZE_HEURISTIC_SKIPPED=false` and proceed to step 8. + +For single-issue mode, read config: +- `SMALL_LABELS = gate1.size_heuristic.small_labels` (default `["good first issue", "documentation", "docs", "tests", "i18n"]`) +- `SMALL_BODY_MAX = gate1.size_heuristic.small_body_max_chars` (default `500`) + +Both signals are checked. The issue is **small** when **either**: +- Any label name (case-insensitive) is in `SMALL_LABELS`, OR +- `len(PRIMARY_ISSUE.body)` is less than `SMALL_BODY_MAX` + +The OR (not AND) is intentional: a labeled "good first issue" with a longer body is still small; a short body without a small label is still small. Both signals capture the same "implementer can act without deep design input" intuition through different proxies. 
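The classification rule above (batch is never small; otherwise label match OR short body) can be sketched as a single function. This is a minimal illustration under stated assumptions, not the plugin's code: the function name `classify_issue_size` is hypothetical, the label list is held as a newline-separated string, and `IS_BATCH` is read from the environment.

```bash
# Defaults mirroring gate1.size_heuristic (one label per line).
SMALL_LABELS='good first issue
documentation
docs
tests
i18n'
SMALL_BODY_MAX=500

# Hypothetical helper: print "small" or "not_small".
# $1 = issue body, remaining arguments = the issue's label names.
# The ( ) body runs in a subshell so the loop variables do not leak.
classify_issue_size() (
  body=$1
  shift
  # Batch invocations are never small.
  if [ "${IS_BATCH:-false}" = "true" ]; then
    echo not_small
    return 0
  fi
  # Signal 1: any label matches the list (whole-line, fixed-string, case-insensitive).
  for label in "$@"; do
    if printf '%s\n' "$SMALL_LABELS" | grep -qixF -- "$label"; then
      echo small
      return 0
    fi
  done
  # Signal 2: body strictly shorter than the threshold.
  # Note: ${#body} counts characters in bash but may count bytes in other shells.
  if [ "${#body}" -lt "$SMALL_BODY_MAX" ]; then
    echo small
    return 0
  fi
  echo not_small
)
```

A body of exactly `SMALL_BODY_MAX` characters with no matching label is classified as `not_small`, because the comparison is strict.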
+ +When the issue is **small**: + +- Set `SIZE_HEURISTIC_SKIPPED=true` +- Set `GATE1_VERDICT="green"` (the structural verdict — the issue is treated as design-approved by policy, not by reviewer) +- Set `GATE1_REVIEWER="size-heuristic"` (a sentinel value distinct from `ask`/`ceo`; readers must accept it as a valid `gate1.reviewer` value going forward) +- Set `GATE1_ESCALATED_TO=null` +- Set `GATE1_OUTPUT` to a one-paragraph synthetic markdown describing why the skip fired: + ``` + Gate1 was skipped by the issue-size heuristic (gate1.size_heuristic). + Trigger: , body length chars (< ). + No design review was performed. The implementer is responsible for surfacing + any design risks during implementation or in the PR description. + ``` +- Set `GATE1_KEY_SUGGESTIONS=[]` (no suggestions to surface) +- Log one line: `gate1.size_heuristic: small issue (labels=[...], body= chars); skipping gate1 cascade` + +Then **skip steps 8 through 11 entirely** and proceed to step 11a's HITL flow. The HITL gate still fires (when `green_continue_requires_confirm=true`) so the operator sees the size-heuristic decision and can override it by selecting "I have feedback" → ask to re-run with the cascade. + +When the issue is **not small**: set `SIZE_HEURISTIC_SKIPPED=false` and proceed to step 8. + +This sub-step is the only entry point for `GATE1_REVIEWER="size-heuristic"`. The value is written to the state file via the normal `gate1` block in step 14; no schema change is required. + ### 8. Build the gate1 prompt block +**Skip steps 8 through 11 entirely when `SIZE_HEURISTIC_SKIPPED=true`** (set in step 7a) — `GATE1_OUTPUT`, `GATE1_REVIEWER`, `GATE1_VERDICT`, and `GATE1_KEY_SUGGESTIONS` are already populated. Proceed directly to step 11a. + #### 8a. Sanitize external text Before interpolating the issue body into the reviewer prompt, run it through this sanitizer. This is the **canonical sanitizer definition** — `ship.md` step 14.d references the same algorithm for PR comment bodies.
@@ -598,10 +647,45 @@ This sub-step also runs when `DRY_RUN` is `true` — the operator still sees the ### 12. Verdict handling -- **green** → continue silently to step 13. (HITL was presented in step 11a if `gate1.green_continue_requires_confirm` is `true`.) +- **green** → continue silently to step 12a (or directly to step 13 if `WITH_PLAN=false`). (HITL was presented in step 11a if `gate1.green_continue_requires_confirm` is `true`.) - **yellow** → print the gate1 summary and ask the user via the AskUserQuestion tool: "Gate1 returned yellow. Continue with branch creation?" with options "Yes, continue" / "No, abort". On abort, exit cleanly with state `phase=started, gate1.verdict=yellow` (no branch created). - **red** → if `FORCE` is true, log a loud warning and continue. Otherwise abort with the reviewer's findings printed in full. Suggest "rerun with `force` flag once you have addressed the concerns". +### 12a. Generate the implementation plan (optional, `--with-plan`) + +Skip this step entirely when `WITH_PLAN=false`. Skip also when `DRY_RUN=true` — there is no branch to associate the plan with, and persisting a plan without a branch creates orphan state. + +When `WITH_PLAN=true` and not `DRY_RUN`: + +> **Invoke the `/superpowers:writing-plans` skill via the Skill tool**, passing this prompt block as input. Wait for the full markdown response before continuing. + +``` +# Implementation plan — issue # + +## Issue + +<url> + +## Gate1 outcome +Verdict: <GATE1_VERDICT> (via <GATE1_REVIEWER>[, escalated to <GATE1_ESCALATED_TO>]) + +## Gate1 key suggestions +<if GATE1_KEY_SUGGESTIONS non-empty, render as a bulleted list — translate each item from English back to the operator's lang if needed for the planning conversation> + +## Your task +Produce an implementation plan that addresses the issue's acceptance criteria +and incorporates the gate1 suggestions above. Follow the writing-plans skill +conventions verbatim. 
The plan you produce in this conversation will be +captured by the parent /start command and persisted to the start state file +in step 14 — your output IS the durable artifact. +``` + +**Execution model — this is non-obvious and easy to misread**: the Skill tool injects the `writing-plans` skill body into Claude's context. Claude (the same conversation that is executing `/start`) then produces the plan **in-line** as the writing-plans-instructed role. There is no autonomous sub-process. After the skill returns and the plan content appears in the conversation, **the parent `/start` command captures the plan content** and writes it to disk in step 14.5 (after branch creation and state file write, before the recap). This is the same in-line execution model that `/claude-c-suite:ask` and `/kagura-memory:session-start` use elsewhere in this command. + +The "capture" mechanism is straightforward: the writing-plans skill produces a markdown document as its final output. The parent reads that document from the conversation history and persists it verbatim. No structured-extraction step is required. + +Capture the produced plan content as `PLAN_OUTPUT` (the verbatim markdown). If the skill is not installed, log a loud warning `--with-plan requested but /superpowers:writing-plans not installed; continuing without a plan` and proceed to step 13 with `PLAN_OUTPUT=null`. Do not abort — the operator can re-run the planning skill later if needed. + ### 13. Create the feature branch Skip this step entirely if `DRY_RUN` is true. 
@@ -752,6 +836,7 @@ Write `~/.claude/cache/gh-issue-driven/<branch>.json` using a temp file + atomic "summary_path": "~/.claude/cache/gh-issue-driven/<branch-flat>.gate1.md", "ran_at": "<UTC ISO-8601>" }, + "plan": <plan block or null>, "gate2": null, "pr": null, "copilot": null, @@ -759,21 +844,43 @@ Write `~/.claude/cache/gh-issue-driven/<branch>.json` using a temp file + atomic } ``` +The `plan` block has this shape when `WITH_PLAN=true` and the plan was produced successfully: + +```json +"plan": { + "path": "~/.claude/cache/gh-issue-driven/<branch-flat>.plan.md", + "generator": "/superpowers:writing-plans", + "generated_at": "<UTC ISO-8601>" +} +``` + +Otherwise `plan` is JSON `null`. The plan markdown itself is written in step 14.5 (below) — this state field is just a pointer. + Schema notes: - **`issues` array**: contains all issues in input order. Each entry has `number`, `title`, `url`, and `labels`. - **`issue_number`, `issue_title`, `issue_url`**: v1-compatible aliases pointing to `issues[0]` (the primary issue). These fields ensure that `ship.md`, `status.md`, and any v1-era state readers continue to work without modification until they adopt the `issues` array. - **`is_batch`**: `true` when `len(issues) > 1`, `false` otherwise. Allows downstream commands to branch on batch mode without counting the array. - **`worktree_path`**: set from step 13b when `--worktree` was used (either the path superpowers returned, or the `.worktrees/<branch>` fallback). Serialized as a JSON string when populated, or the unquoted JSON literal `null` when `--worktree` was not used — matching the convention of other nullable fields like `gate1.escalated_to`. Readers can use this to render the `cd` hint in `/gh-issue-driven:status` or post-mortem output without re-deriving the path. -- **v1 state files** (created before this change) remain valid — they have `issue_number`/`issue_title` but no `issues` array and no `worktree_path`. 
Readers should check for `issues` first and fall back to the top-level aliases; absent `worktree_path` is equivalent to `null`. +- **`plan`**: set when `--with-plan` was used and the writing-plans skill produced output (step 14.5). Serialized as a JSON object `{path, generator, generated_at}` when populated, or the unquoted JSON literal `null` otherwise. Schema-additive — readers that do not yet know about `plan` ignore it without error. +- **`gate1.reviewer`** accepts a new sentinel value `"size-heuristic"` (added with `gate1.size_heuristic`) in addition to the existing `"ask"` / `"ceo"`. Readers should treat unknown reviewer values as opaque labels (display verbatim) rather than rejecting them — future heuristics or third-party gate1 implementations may introduce additional sentinels. +- **v1 state files** (created before this change) remain valid — they have `issue_number`/`issue_title` but no `issues` array and no `worktree_path` / `plan`. Readers should check for `issues` first and fall back to the top-level aliases; absent `worktree_path` is equivalent to `null`; absent `plan` is equivalent to `null`. `<branch-flat>` is the branch name with `/` replaced by `-` so it works as a filename. If `DRY_RUN`, do not write the state file (so `/gh-issue-driven:status` won't see a phantom entry). +### 14.5. Persist the implementation plan (only when `WITH_PLAN=true`) + +Skip when `WITH_PLAN=false`, when `DRY_RUN=true`, or when `PLAN_OUTPUT` is null (the writing-plans skill was unavailable or returned nothing usable). In those cases the state file's `plan` field is `null` and there is no markdown to write. + +Otherwise, write `PLAN_OUTPUT` verbatim to `~/.claude/cache/gh-issue-driven/<branch-flat>.plan.md` using a temp file + atomic mv. Use the same `chmod 0700`-protected cache directory established in step 14. The path matches the `plan.path` field in the state file written in step 14. + ### 15. 
Save the gate1 markdown Write `GATE1_OUTPUT` verbatim to `~/.claude/cache/gh-issue-driven/<branch-flat>.gate1.md` (this is plugin cache, allowed). Skip in `DRY_RUN`. +When `SIZE_HEURISTIC_SKIPPED=true`, the `GATE1_OUTPUT` value is the synthetic "skipped" markdown from step 7a — write it as-is so post-hoc readers (`/gh-issue-driven:status`, audit trails) see exactly what was recorded as the gate1 decision. + ### 16. Print the recap Output exactly one block in this format: @@ -789,6 +896,10 @@ Worktree <WORKTREE_PATH> → cd <WORKTREE_PATH> to work on this branch </if> Gate1 <verdict> (via /<reviewer>[, escalated to /ceo]) + — or — skipped (size-heuristic: <reason>) ← when SIZE_HEURISTIC_SKIPPED +<if WITH_PLAN=true AND PLAN_OUTPUT non-null:> +Plan <~/.claude/cache/gh-issue-driven/<branch-flat>.plan.md> (via /superpowers:writing-plans) +</if> Memory <k> related contexts found (top: "<top summary>" score <score>) — or — kagura-memory not installed; skipped — or — recall returned no results @@ -879,9 +990,10 @@ If `GATE1_VERDICT` is `unknown` (reviewer skills missing), step 18 still fires #### 18b. Pick the "continue" target -Determine what the **continue** option will actually do, based on skill detection from step 17b and on `IS_BATCH`: +Determine what the **continue** option will actually do, based on flags, skill detection from step 17b, and `IS_BATCH`: -- If `IS_BATCH` is true → continue target is **"draft a per-issue implementation plan in this conversation"**. Do **not** auto-launch `/feature-dev` for batch mode (it is single-feature oriented). +- If `PARALLEL=true` (the operator opted in via `--parallel`) AND `PLAN_OUTPUT` is non-null (the writing-plans step produced a usable plan) → continue target is **"invoke `/superpowers:subagent-driven-development` via the Skill tool, passing the plan from step 12a as its input"**. This branch is only reachable when `WITH_PLAN=true` as well (step 1's interaction rules guarantee it). 
If the `/superpowers:subagent-driven-development` skill is not installed OR `PLAN_OUTPUT` is null (writing-plans skill was unavailable or returned nothing usable), fall through to the next applicable branch and log a one-line warning naming the reason. +- Else if `IS_BATCH` is true → continue target is **"draft a per-issue implementation plan in this conversation"**. Do **not** auto-launch `/feature-dev` for batch mode (it is single-feature oriented). - Else if `/feature-dev:feature-dev` was detected in step 17b → continue target is **"invoke `/feature-dev:feature-dev` via the Skill tool"**. - Else → continue target is **"draft an implementation plan in this conversation, grounded in the issue body and gate1 suggestions"**. @@ -926,7 +1038,7 @@ Invoke the AskUserQuestion tool with this question and these three **fixed** opt #### 18e. Handle the response - **Stop here** → print a one-line acknowledgement (`OK — returning to prompt. Run /gh-issue-driven:ship when implementation is ready.`) and stop. Equivalent to the legacy behavior. -- **Continue (option 2)** → print a one-line acknowledgement naming the action, then immediately perform `CONTINUE_TARGET_ACTION`. For the `/feature-dev` case this means invoking the `/feature-dev:feature-dev` skill via the Skill tool. For the "draft a plan" case, begin a normal conversational turn that summarizes the issue, lists the gate1 key suggestions extracted in step 17a, and proposes a concrete implementation outline grounded in files you have read or will read. +- **Continue (option 2)** → print a one-line acknowledgement naming the action, then immediately perform `CONTINUE_TARGET_ACTION`. For the `--parallel` case, this means invoking the `/superpowers:subagent-driven-development` skill via the Skill tool with the plan content (from step 14.5's `plan.path`) as its working input. For the `/feature-dev` case this means invoking the `/feature-dev:feature-dev` skill via the Skill tool. 
For the "draft a plan" case, begin a normal conversational turn that summarizes the issue, lists the gate1 key suggestions extracted in step 17a, and proposes a concrete implementation outline grounded in files you have read or will read. - **Feedback / different direction (option 3)** → print a one-line acknowledgement that invites the operator to type their note, e.g. `Got it — what would you like to change or discuss?`. Then **stop and wait** for the operator's next message. When that next message arrives, treat it as the feedback and respond to it conversationally. Do **not** invoke any skill. Do not assume the feedback overrides gate1 — if it implies a design change large enough to invalidate gate1, say so explicitly and suggest re-running `/gh-issue-driven:start` once the new direction is settled. After step 18 completes (regardless of which branch), `/start` is done. The state file written in step 14 is the source of truth for `/ship` and `/status`; step 18's choice is **not** persisted (it only affects the in-conversation flow). diff --git a/tests/fixtures/typo-fix-issue.md b/tests/fixtures/typo-fix-issue.md new file mode 100644 index 0000000..94bd449 --- /dev/null +++ b/tests/fixtures/typo-fix-issue.md @@ -0,0 +1,37 @@ +# Fixture: typo-fix-issue + +This file is a fixture for reproducible token measurement of the +gh-issue-driven plugin's token-efficiency flags (`auto-size`, `auto-skip`, +`--with-plan`, `--parallel`). See [CONTRIBUTING.md](../../CONTRIBUTING.md#measuring-token-consumption-rtk-gain) +for the measurement procedure. + +The body below is what an operator would pass to `gh issue create --body-file +tests/fixtures/typo-fix-issue.md`, simulating a "small issue" that the +`gate1.size_heuristic` is designed to short-circuit. Use it for baseline-vs-PR +measurement of `auto-size` and (when paired with a docs-only diff) `auto-skip`. 
+ +The exact body length and label expectations are committed alongside the +markdown so a future change to the heuristic defaults can be reflected here +without losing reproducibility. + +--- + +## Expected classification + +- Body length: ~250 characters (well under the default `small_body_max_chars=500`). +- Suggested labels: `documentation` (matches default `small_labels`). +- Heuristic result (default config): **small** — `auto-size` should short-circuit gate1. + +## Issue body (paste this into `gh issue create`) + +```markdown +## Summary + +The README "60-second quickstart" section says "60-seconds" with a hyphen in +one place and "60 seconds" without elsewhere. Pick one and use it consistently. + +## Acceptance Criteria + +- [ ] README.md uses one consistent spelling. +- [ ] README.ja.md uses the same convention. +```
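The "Expected classification" figures above can be re-checked whenever the fixture or the heuristic defaults change. The following is a rough sketch in Python under stated assumptions: the helper names `fixture_issue_body` and `is_under_threshold` are hypothetical, the issue body is assumed to be the first fenced `markdown` block in the fixture file, and the strict "less than" comparison follows the step 7a wording.

```python
import re

# Build the fence marker programmatically so this example does not contain
# a literal triple backtick inside its own code block.
FENCE = "`" * 3
BODY_RE = re.compile(
    re.escape(FENCE) + r"markdown\n(.*?)" + re.escape(FENCE),
    re.DOTALL,
)


def fixture_issue_body(fixture_text):
    """Hypothetical helper: return the first fenced markdown block's content."""
    match = BODY_RE.search(fixture_text)
    if match is None:
        raise ValueError("no fenced markdown block found in fixture")
    return match.group(1)


def is_under_threshold(fixture_text, small_body_max=500):
    """True when the extracted body is shorter than small_body_max characters."""
    return len(fixture_issue_body(fixture_text)) < small_body_max
```

A caller would read `tests/fixtures/typo-fix-issue.md` from a working-tree checkout (not from this patch, where each line carries a `+` prefix) and pass its text to `is_under_threshold`, adjusting `small_body_max` if the configured `small_body_max_chars` differs from the default.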