From efb8bfd82a4f017b83309864da4302028a1edbc7 Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Tue, 19 May 2026 09:41:49 -0700 Subject: [PATCH 01/41] documentation update --- AGENTS.md | 2 +- README.md | 17 +++++++++++------ docs/release.md | 5 +++-- 3 files changed, 15 insertions(+), 9 deletions(-) diff --git a/AGENTS.md b/AGENTS.md index da02139..7b7489e 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -24,7 +24,7 @@ Treat `AGENTS.md` as part of the codebase's invariants, not documentation. A dri [--live-migrations-dir DIR] [--in-repo-taste [PATH]] [--force]` - Taste: `continuous-refactoring taste [--global] [--interview|--upgrade|--refine] - [--with codex|claude --model --effort ] + [--with codex|claude --model --effort EFFORT] [--force]` - Run once: `continuous-refactoring run-once --with codex|claude --model [common targeting/validation flags]` diff --git a/README.md b/README.md index 802c7c4..0148da3 100644 --- a/README.md +++ b/README.md @@ -46,6 +46,10 @@ uv pip install -e . That gives you the `continuous-refactoring` command. +The CLI itself can be installed without `uv`, but the default validation command +is `uv run pytest`. Pass `--validation-command pytest` or a project-specific +script when the target repo does not use `uv`. + Maintainers: see the [release checklist](https://github.com/bigH/continuous-refactoring/blob/main/docs/release.md). ## Fastest way to get one refactor @@ -151,8 +155,9 @@ These flags are not mutually exclusive, but only the highest-priority populated - `--scope-instruction "clean up the auth module"` — extra free-text scoping. If selected file patterns resolve nothing, this becomes the useful fallback context. If `--globs` or `--extensions` match no tracked files and there is no -`--scope-instruction`, `run` completes successfully with zero refactor actions. -`--paths` is literal input and is not filtered through `git ls-files`. 
+`--scope-instruction`, `run` completes successfully with zero refactor actions; +`run-once` falls back to a no-file `general refactoring` target. `--paths` is +literal input and is not filtered through `git ls-files`. If you provide none of `--targets`, `--globs`, `--extensions`, or `--paths`, then `run` and `run-once` require `--scope-instruction`; the driver still @@ -190,7 +195,7 @@ continuous-refactoring migration refine --file feedback.md --with - `--default-effort` — default effort for run calls. Defaults to `low`. Valid labels are `low`, `medium`, `high`, `xhigh`. - `--max-allowed-effort` — cap for target overrides and migration escalation. Defaults to `xhigh`. - `--repo-root PATH` — repository root; defaults to the current directory. -- `--validation-command` — defaults to `uv run pytest`. Swap it for whatever keeps your repo honest. +- `--validation-command` — defaults to `uv run pytest`. This is parsed with `shlex.split` and run without a shell, so simple commands like `pytest -q` work, but shell syntax such as `cmd && cmd`, pipes, redirects, or leading `VAR=value` assignments is not interpreted. Put compound validation in a script. - `--timeout` — per-agent-call timeout in seconds. - `--show-agent-logs` / `--show-command-logs` — mirror output to your terminal instead of just logging. - `--refactoring-prompt` — override the default refactoring prompt. 
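
Review note on the `--validation-command` wording above: the README states that the value is parsed with `shlex.split` and run without a shell. The following is a minimal sketch of what that implies, not the project's actual implementation. The helper name `run_validation` is hypothetical; only the use of `shlex.split` and the absence of a shell come from the patch text.

```python
import shlex
import subprocess
import sys


def run_validation(command: str) -> int:
    """Hypothetical helper: split the command like a POSIX shell would
    tokenize it, then execute the argv list directly (no shell involved)."""
    argv = shlex.split(command)
    # shell=False is the default: argv[0] is executed as a program and the
    # remaining items are passed through as literal arguments.
    completed = subprocess.run(argv, check=False)
    return completed.returncode


if __name__ == "__main__":
    # A simple command becomes a plain argv list.
    print(shlex.split("pytest -q"))
    # "&&" is just another argument here; nothing chains the two commands.
    print(shlex.split("ruff check && pytest"))
    # A leading assignment becomes argv[0], i.e. the program name to run.
    print(shlex.split("CI=1 pytest"))
    # Running the current interpreter with a no-op script returns 0.
    print(run_validation(f"{shlex.quote(sys.executable)} -c pass"))
```

Under these assumptions, `cmd && cmd` would hand `&&` to the first program as an argument, and `VAR=value cmd` would try to execute a program literally named `VAR=value`. That matches the README's advice to put compound validation in a script.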
@@ -219,9 +224,9 @@ continuous-refactoring migration refine --file feedback.md --with Each run writes to `$TMPDIR/continuous-refactoring//`: - `summary.json` — rolling status, counts, per-attempt stats -- `events.jsonl` — structured event log with call roles such as `classify`, - `planning.`, `phase.ready-check`, `phase.execute`, and - `phase.validation` +- `events.jsonl` — structured event log with call roles such as + `scope-expansion`, `classify`, `planning.`, `planning.publish`, + `phase.ready-check`, `phase.execute`, and `phase.validation` - `run.log` — human-readable log - `attempt-NNN/[retry-NN/]refactor/` — per-attempt agent + test stdout/stderr - `baseline/initial/` — baseline validation stdout/stderr before work starts diff --git a/docs/release.md b/docs/release.md index 6938ffc..9303066 100644 --- a/docs/release.md +++ b/docs/release.md @@ -54,8 +54,9 @@ bumps: YAML cannot enforce these repository settings: -1. In GitHub branch protection or rulesets, require the `validate` status check - from the `PR Title` workflow before merging to `main`. +1. In GitHub branch protection or rulesets, require the `pytest` status check + from the `Test` workflow and the `validate` status check from the `PR Title` + workflow before merging to `main`. 2. Add a repository secret named `RELEASE_PLEASE_TOKEN` for Release Please. A fine-grained token needs Contents read/write and Pull requests read/write for `bigH/continuous-refactoring`. 
Without this secret, the workflow falls back From 74883c94426c729d20320911f7fc62b07068b9bd Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Tue, 19 May 2026 22:50:23 -0700 Subject: [PATCH 02/41] remove taste effort --- AGENTS.md | 6 ++++-- README.md | 16 ++++++++++------ src/continuous_refactoring/cli.py | 17 ++++------------- tests/conftest.py | 3 +-- tests/test_taste_interview.py | 25 +++++++++++++++++++------ tests/test_taste_refine.py | 8 +++----- tests/test_taste_upgrade.py | 12 ++++++------ uv.lock | 2 +- 8 files changed, 48 insertions(+), 41 deletions(-) diff --git a/AGENTS.md b/AGENTS.md index 7b7489e..a326bbc 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -24,8 +24,10 @@ Treat `AGENTS.md` as part of the codebase's invariants, not documentation. A dri [--live-migrations-dir DIR] [--in-repo-taste [PATH]] [--force]` - Taste: `continuous-refactoring taste [--global] [--interview|--upgrade|--refine] - [--with codex|claude --model --effort EFFORT] + [--with codex|claude --model ] [--force]` + Active taste agent modes require `--with` and `--model`; the taste agent + always runs at fixed `medium` effort. - Run once: `continuous-refactoring run-once --with codex|claude --model [common targeting/validation flags]` - Run loop: `continuous-refactoring run --with codex|claude --model @@ -139,7 +141,7 @@ active phase explicitly names `loop.py` in scope. - **Migration terminology split** (`migrations.py`, `planning.py`, `prompts.py`) — manifest `precondition` gates phase start; phase markdown `## Definition of Done` governs completion. - **Run-level baseline validation** (`loop.py`) — `run-once`, `run`, and `--focus-on-live-migrations` run the configured validation command after the clean-worktree check and before routing/refactoring. A red baseline stops as `baseline_failed`, not migration human review. - **Phase execution validation gate** (`phases.py`, `prompts.py`, `loop.py`) — a migration phase is complete only after host-side full validation passes. 
`execute_phase()` retries validation-red attempts from `head_before` up to the effective `--max-attempts` budget, and the phase prompt must include the literal configured validation command plus the phase file's Definition of Done as the completion contract. -- **Effort budgeting** (`effort.py`, `loop.py`, `migration_tick.py`, `planning.py`) — `run` / `run-once` default to `--default-effort low` and `--max-allowed-effort xhigh`; there is no `--effort` alias on those commands. Target `effort-override` changes that target's default but is still capped. Migration `required_effort` above the cap defers the phase without failing the run. +- **Effort budgeting** (`effort.py`, `loop.py`, `migration_tick.py`, `planning.py`) — `run` / `run-once` default to `--default-effort low` and `--max-allowed-effort xhigh`; there is no `--effort` alias on those commands. Target `effort-override` changes that target's default but is still capped. Migration `required_effort` above the cap defers the phase without failing the run. Taste agent actions do not accept `--effort`; they always use fixed `medium` effort. - **Taste injection** — every prompt includes a `## Taste` section. `tests/test_prompts.py` enforces this via `_TASTE_INJECTED_PROMPTS`. Do not drop it. - **Taste read boundary** (`config.py`, `cli.py`, `loop.py`) — `load_taste()` translates unreadable project/global taste reads into `ContinuousRefactorError`; CLI stale-taste checks and loop taste loading must treat that boundary failure as non-fatal and skip/fall back instead of leaking raw `OSError`/`PermissionError`. - **Repo-local taste routing** (`config.py`, `cli.py`) — `ProjectEntry.repo_taste_path` is stored repo-relative in the XDG manifest and resolved through `resolve_project_taste_path()`. Keep `init`, `taste`, stale warnings, and run prompt loading on that helper so the active taste path does not drift. 
diff --git a/README.md b/README.md index 0148da3..3152d00 100644 --- a/README.md +++ b/README.md @@ -105,8 +105,8 @@ continuous-refactoring init --in-repo-taste # 2. (Optional) Write your refactoring taste — either edit the file, have an agent interview you, # or refine an existing draft collaboratively -continuous-refactoring taste --interview --with codex --model gpt-5 --effort high -continuous-refactoring taste --refine --with codex --model gpt-5 --effort high +continuous-refactoring taste --interview --with codex --model gpt-5 +continuous-refactoring taste --refine --with codex --model gpt-5 # 3. Do one pass continuous-refactoring run-once \ @@ -122,12 +122,15 @@ continuous-refactoring run \ --sleep 5 ``` +Active taste agent modes require `--with` and `--model`; taste agent sessions +always run at fixed `medium` effort. + ## Subcommands | Command | What it does | |---|---| | `init` | Registers this directory as a project, creates a default `taste.md`, and can store `--live-migrations-dir` or `--in-repo-taste`. | -| `taste` | Prints the active taste file path. Add `--interview` to have an agent author it, `--refine` to iteratively improve an existing taste doc, `--upgrade` to refresh stale taste dimensions, `--global` for the shared file, and `--force` to let `--interview` overwrite custom content after writing a `.bak`. | +| `taste` | Prints the active taste file path. Add `--interview` to have an agent author it, `--refine` to iteratively improve an existing taste doc, `--upgrade` to refresh stale taste dimensions, `--global` for the shared file, and `--force` to let `--interview` overwrite custom content after writing a `.bak`. Active agent modes require `--with` and `--model`, then run at fixed `medium` effort. | | `run-once` | Single pass on one resolved target. No retry. If there is a diff and validation passes, it commits locally and prints the diffstat. | | `run` | The loop. 
Iterates refactor actions, retries on failure, and commits successful changes locally. Add `--focus-on-live-migrations` to bypass targeting and work only on eligible live migrations. | | `upgrade` | Checks that the global config manifest is current, rewrites it idempotently, and warns if the global taste file is stale. | @@ -172,8 +175,9 @@ scope text as context for that target. - `migration doctor ` / `migration doctor --all` — read-only consistency checks. Doctor reports problems; it does not repair them. - `migration review --with ... --model ... --effort ...` — resolves an `awaiting_human_review` migration through a staged workspace. - `migration refine (--message |--file ) --with ... --model ... --effort ... [--show-agent-logs]` — adds user feedback to a planning or unexecuted ready migration and resumes planning through the `revise` step when reopening ready work. -- `taste --refine` — opens a collaborative editing session for the taste file. The agent keeps refining until you tell it to write, then the session ends automatically after the settled write. -- `taste --upgrade` — re-interviews for taste dimensions added since your last version. No-op when already current; use `taste --refine` if you want to rework the doc anyway. +- `taste --refine --with ... --model ...` — opens a collaborative editing session for the taste file. The agent keeps refining until you tell it to write, then the session ends automatically after the settled write. +- `taste --upgrade --with ... --model ...` — re-interviews for taste dimensions added since your last version. No-op when already current; use `taste --refine` if you want to rework the doc anyway. +- Taste agent sessions always use fixed `medium` effort. - `taste --force` — only applies to `--interview`; it allows a customized taste file to be overwritten after backing it up to `taste.md.bak`. Canonical migration commands: @@ -253,7 +257,7 @@ The taste file is a short bullet list of your refactoring preferences. 
It gets i - Project taste: `~/.local/share/continuous-refactoring/projects//taste.md`, or the repo-local path chosen with `init --in-repo-taste [PATH]` - Global taste: `~/.local/share/continuous-refactoring/global/taste.md` -Project taste wins over global. Use `taste` to print the active path, `taste --interview` to bootstrap one, `taste --refine` to rework it with an agent, or edit the file directly any time. +Project taste wins over global. Use `taste` to print the active path, `taste --interview --with ... --model ...` to bootstrap one, `taste --refine --with ... --model ...` to rework it with an agent, or edit the file directly any time. ## Larger refactorings diff --git a/src/continuous_refactoring/cli.py b/src/continuous_refactoring/cli.py index 290c380..2a357f0 100644 --- a/src/continuous_refactoring/cli.py +++ b/src/continuous_refactoring/cli.py @@ -40,6 +40,7 @@ "warning: global taste is out of date — " "run 'continuous-refactoring taste --upgrade' to update." ) +_TASTE_AGENT_EFFORT = "medium" def _version_banner() -> str: @@ -191,11 +192,6 @@ def _add_taste_parser(subparsers: argparse._SubParsersAction) -> None: default=None, help="Model name for --interview, --upgrade, or --refine.", ) - taste_parser.add_argument( - "--effort", - default=None, - help="Effort level for --interview, --upgrade, or --refine.", - ) taste_parser.add_argument( "--force", action="store_true", @@ -620,9 +616,7 @@ def _active_taste_mode(args: argparse.Namespace) -> str | None: def _taste_agent_flags_set(args: argparse.Namespace) -> bool: - return any( - getattr(args, name, None) is not None for name in ("agent", "model", "effort") - ) + return any(getattr(args, name, None) is not None for name in ("agent", "model")) def _require_taste_action_flags( @@ -630,14 +624,12 @@ def _require_taste_action_flags( action: str, agent: str | None, model: str | None, - effort: str | None, ) -> None: missing = [ flag for flag, value in ( ("--with", agent), ("--model", model), - ("--effort", 
effort), ) if not value ] @@ -661,7 +653,7 @@ def _run_taste_agent( returncode = run_agent_interactive_until_settled( args.agent, args.model, - args.effort, + _TASTE_AGENT_EFFORT, prompt, Path.cwd().resolve(), content_path=path, @@ -729,7 +721,7 @@ def _handle_taste(args: argparse.Namespace) -> None: if mode is None: if _taste_agent_flags_set(args): print( - "Error: --with/--model/--effort require --interview, --upgrade, or --refine.", + "Error: --with/--model require --interview, --upgrade, or --refine.", file=sys.stderr, ) raise SystemExit(2) @@ -741,7 +733,6 @@ def _handle_taste(args: argparse.Namespace) -> None: action=mode, agent=getattr(args, "agent", None), model=getattr(args, "model", None), - effort=getattr(args, "effort", None), ) return _TASTE_MODE_HANDLERS[mode](args) diff --git a/tests/conftest.py b/tests/conftest.py index 891004f..9023866 100644 --- a/tests/conftest.py +++ b/tests/conftest.py @@ -122,6 +122,7 @@ def fake( assert content_path == extract_taste_path(prompt) assert settle_path == extract_settle_path(prompt) if captured is not None: + captured["effort"] = _effort captured["prompt"] = prompt captured["content_path"] = str(content_path) captured["settle_path"] = str(settle_path) @@ -143,7 +144,6 @@ def make_taste_args( global_: bool = False, agent: str | None = None, model: str | None = None, - effort: str | None = None, force: bool = False, ) -> argparse.Namespace: return argparse.Namespace( @@ -153,7 +153,6 @@ def make_taste_args( refine=mode == "refine", agent=agent, model=model, - effort=effort, force=force, ) diff --git a/tests/test_taste_interview.py b/tests/test_taste_interview.py index 7a75d28..15c9006 100644 --- a/tests/test_taste_interview.py +++ b/tests/test_taste_interview.py @@ -32,7 +32,6 @@ def _interview_args( force: bool = False, agent: str | None = "codex", model: str | None = "m", - effort: str | None = "high", ) -> argparse.Namespace: return make_taste_args( "interview", @@ -40,7 +39,6 @@ def _interview_args( force=force, 
agent=agent, model=model, - effort=effort, ) # --------------------------------------------------------------------------- @@ -53,7 +51,7 @@ def test_interview_requires_agent_flags( capsys: pytest.CaptureFixture[str], ) -> None: monkeypatch.setenv("XDG_DATA_HOME", str(tmp_path / "xdg")) - args = _interview_args(agent=None, model=None, effort=None) + args = _interview_args(agent=None, model=None) with pytest.raises(SystemExit) as exc_info: _handle_taste(args) assert exc_info.value.code == 2 @@ -67,10 +65,11 @@ def test_interview_rejects_agent_flags_without_interview( ) -> None: monkeypatch.setenv("XDG_DATA_HOME", str(tmp_path / "xdg")) with pytest.raises(SystemExit) as exc_info: - _handle_taste(make_taste_args(agent="codex", model="m", effort="high")) + _handle_taste(make_taste_args(agent="codex", model="m")) assert exc_info.value.code == 2 err = capsys.readouterr().err assert "require --interview, --upgrade, or --refine" in err + assert "--effort" not in err def test_force_requires_interview( @@ -285,6 +284,7 @@ def test_interview_prompt_includes_existing_content( _handle_taste(_interview_args(force=True)) prompt = captured["prompt"] + assert captured["effort"] == "medium" assert "Existing taste content" in prompt assert "starting draft" in prompt assert existing.strip() in prompt @@ -307,11 +307,24 @@ def test_taste_subparser_accepts_interview_flags() -> None: args = parser.parse_args( [ "taste", "--interview", "--with", "codex", - "--model", "gpt-x", "--effort", "xhigh", "--force", + "--model", "gpt-x", "--force", ] ) assert args.interview is True assert args.agent == "codex" assert args.model == "gpt-x" - assert args.effort == "xhigh" assert args.force is True + + +@pytest.mark.parametrize("mode", [None, "--interview", "--upgrade", "--refine"]) +def test_taste_subparser_rejects_effort(mode: str | None) -> None: + parser = build_parser() + argv = ["taste"] + if mode is not None: + argv.append(mode) + argv.extend(["--with", "codex", "--model", "gpt-x", "--effort", 
"high"]) + + with pytest.raises(SystemExit) as exc_info: + parser.parse_args(argv) + + assert exc_info.value.code == 2 diff --git a/tests/test_taste_refine.py b/tests/test_taste_refine.py index 7b57e56..1fb964b 100644 --- a/tests/test_taste_refine.py +++ b/tests/test_taste_refine.py @@ -15,7 +15,6 @@ def _refine_args( global_: bool = False, agent: str | None = "codex", model: str | None = "m", - effort: str | None = "high", force: bool = False, ) -> argparse.Namespace: return make_taste_args( @@ -23,7 +22,6 @@ def _refine_args( global_=global_, agent=agent, model=model, - effort=effort, force=force, ) @@ -34,7 +32,7 @@ def test_refine_requires_agent_flags( capsys: pytest.CaptureFixture[str], ) -> None: monkeypatch.setenv("XDG_DATA_HOME", str(tmp_path / "xdg")) - args = _refine_args(agent=None, model=None, effort=None) + args = _refine_args(agent=None, model=None) with pytest.raises(SystemExit) as exc_info: _handle_taste(args) @@ -139,6 +137,7 @@ def test_refine_prompt_allows_open_ended_improvement_and_explicit_write_handoff( assert "do not modify either file again" in prompt assert "Do not add one unless the user explicitly asks for it" in prompt assert "- Keep helpers honest." 
in prompt + assert captured["effort"] == "medium" assert captured["content_path"] == str(taste_path) assert captured["settle_path"] == str(taste_path.with_name("taste.md.done")) @@ -146,7 +145,7 @@ def test_refine_prompt_allows_open_ended_improvement_and_explicit_write_handoff( def test_taste_subparser_accepts_refine_flags() -> None: parser = build_parser() args = parser.parse_args( - ["taste", "--refine", "--with", "codex", "--model", "gpt-x", "--effort", "xhigh"], + ["taste", "--refine", "--with", "codex", "--model", "gpt-x"], ) assert args.refine is True @@ -154,7 +153,6 @@ def test_taste_subparser_accepts_refine_flags() -> None: assert args.upgrade is False assert args.agent == "codex" assert args.model == "gpt-x" - assert args.effort == "xhigh" @pytest.mark.parametrize("other_mode", ["--interview", "--upgrade"]) diff --git a/tests/test_taste_upgrade.py b/tests/test_taste_upgrade.py index d1dbbcd..7bb505c 100644 --- a/tests/test_taste_upgrade.py +++ b/tests/test_taste_upgrade.py @@ -27,7 +27,6 @@ def _upgrade_args( global_: bool = False, agent: str | None = "codex", model: str | None = "m", - effort: str | None = "high", force: bool = False, ) -> argparse.Namespace: return make_taste_args( @@ -35,7 +34,6 @@ def _upgrade_args( global_=global_, agent=agent, model=model, - effort=effort, force=force, ) @@ -112,6 +110,7 @@ def test_upgrade_prompt_mentions_legacy_and_new_dimensions( _handle_taste(_upgrade_args()) prompt = captured["prompt"] + assert captured["effort"] == "medium" assert "no version header" in prompt assert "legacy" in prompt.lower() assert "large-scope decisions" in prompt @@ -135,7 +134,7 @@ def test_upgrade_requires_agent_flags_when_stale( taste_path.write_text("- Legacy.\n", encoding="utf-8") with pytest.raises(SystemExit) as exc_info: - _handle_taste(_upgrade_args(agent=None, model=None, effort=None)) + _handle_taste(_upgrade_args(agent=None, model=None)) assert exc_info.value.code == 2 err = capsys.readouterr().err @@ -154,7 +153,7 @@ def 
test_upgrade_requires_agent_flags_when_global_taste_is_stale( with pytest.raises(SystemExit) as exc_info: _handle_taste( - _upgrade_args(global_=True, agent=None, model=None, effort=None), + _upgrade_args(global_=True, agent=None, model=None), ) assert exc_info.value.code == 2 @@ -169,7 +168,7 @@ def test_upgrade_noop_skips_agent_flag_check( taste_path = init_taste_project(tmp_path, monkeypatch) taste_path.write_text(default_taste_text(), encoding="utf-8") - _handle_taste(_upgrade_args(agent=None, model=None, effort=None)) + _handle_taste(_upgrade_args(agent=None, model=None)) out = capsys.readouterr().out.strip() assert "taste already current" in out @@ -199,11 +198,12 @@ def test_upgrade_rejects_force( def test_taste_subparser_accepts_upgrade_flags() -> None: parser = build_parser() args = parser.parse_args( - ["taste", "--upgrade", "--with", "codex", "--model", "m", "--effort", "high"], + ["taste", "--upgrade", "--with", "codex", "--model", "m"], ) assert args.upgrade is True assert args.interview is False assert args.agent == "codex" + assert args.model == "m" def test_taste_interview_and_upgrade_mutually_exclusive() -> None: diff --git a/uv.lock b/uv.lock index 1b32d35..e6f83e8 100644 --- a/uv.lock +++ b/uv.lock @@ -13,7 +13,7 @@ wheels = [ [[package]] name = "continuous-refactoring" -version = "0.2.0" +version = "0.3.0" source = { editable = "." 
} [package.dev-dependencies] From a11fc19157ff2c14a09e5add1e9eef4b84a3cac0 Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Thu, 21 May 2026 10:29:57 -0700 Subject: [PATCH 03/41] remove effort from review and redundant command --- AGENTS.md | 13 +- README.md | 18 +- src/continuous_refactoring/cli.py | 17 -- src/continuous_refactoring/migration_cli.py | 8 +- src/continuous_refactoring/review_cli.py | 72 ++--- tests/test_cli_migrations.py | 168 +++++++++-- tests/test_cli_review.py | 316 +------------------- 7 files changed, 203 insertions(+), 409 deletions(-) diff --git a/AGENTS.md b/AGENTS.md index a326bbc..a62cf41 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -43,12 +43,11 @@ Treat `AGENTS.md` as part of the codebase's invariants, not documentation. A dri `continuous-refactoring migration doctor ` / `continuous-refactoring migration doctor --all` - Review migrations: `continuous-refactoring migration review - --with codex|claude --model --effort ` - (top-level `review list` / `review perform --with ... --model ... - --effort ...` remain compatibility wrappers) + --with codex|claude --model ` + (`review list` remains a compatibility shortcut for awaiting-review listing) - Refine migration planning: `continuous-refactoring migration refine (--message |--file ) --with codex|claude --model - --effort [--show-agent-logs]` + [--show-agent-logs]` No lint, no typecheck, no formatter, no pre-commit. GitHub Actions `Test` runs `uv run pytest`. **Pytest is the only code gate.** GitHub Actions @@ -135,13 +134,13 @@ active phase explicitly names `loop.py` in scope. - **One-step planning engine** (`planning.py`) — product planning entry points call `run_next_planning_step()` so one action runs exactly `PlanningState.next_step`, records accepted stdout/state in an off-live workspace, and publishes through `planning_publish.py`. Failed current-step output is never durable resume input. `run_planning` is intentionally not package-exported. 
- **Planning resume scheduling** (`migration_tick.py`, `loop.py`, `routing_pipeline.py`) — normal automation runs one eligible `status: planning` step before ready/in-progress phase ticks and before source-target routing. Missing or invalid `.planning/state.json` blocks automation with planning failure evidence; `status: planning` must never enter phase ready-check or phase execution. - **Focused planning reselection** (`loop.py`, `migration_tick.py`) — focused mode tracks planning migrations abandoned with `new-target` only in memory for the current run, skips them while another planning or phase candidate is eligible, and retries them only when no alternative remains. Do not persist this as `cooldown_until`; planning step failure is not a durable readiness deferral. -- **Review CLI boundary** (`cli.py`, `review_cli.py`) — `cli.py` owns parser wiring; staged migration review internals live in `review_cli.py`, publish only through `planning_publish.py`, and stay internal/out of package-root `_SUBMODULES`. Top-level `review perform` is only a compatibility wrapper around this path. +- **Review CLI boundary** (`cli.py`, `review_cli.py`) — `cli.py` owns parser wiring; staged migration review internals live in `review_cli.py`, publish only through `planning_publish.py`, and stay internal/out of package-root `_SUBMODULES`. Top-level `review list` is only a compatibility listing shortcut; canonical review mutation is `migration review`. - **Migration CLI boundary** (`cli.py`, `migration_cli.py`) — `cli.py` owns parser wiring only; `migration_cli.py` owns namespace dispatch, read-only list/doctor behavior, and the contained slug/path resolver used by mutation commands. Mutating subcommands delegate their internals to focused modules such as `review_cli.py` or the planning refine entry point. Resolver targets must stay direct visible children of the configured live migrations root and reject symlink, outside, parent-traversal, and ambiguous paths. 
-- **Human-review gating** (`planning.py`, `migration_tick.py`, `review_cli.py`) — migrations with `awaiting_human_review=true` must be invisible to automated migration ticks/ready-checks until canonical `migration review` clears the flag through staged publish; top-level `review perform` routes to the same compatibility path. `migration refine` may reopen an unexecuted ready migration to planning, but it is user feedback, not review approval. +- **Human-review gating** (`planning.py`, `migration_tick.py`, `review_cli.py`) — migrations with `awaiting_human_review=true` must be invisible to automated migration ticks/ready-checks until canonical `migration review` clears the flag through staged publish. `migration refine` may reopen an unexecuted ready migration to planning, but it is user feedback, not review approval. - **Migration terminology split** (`migrations.py`, `planning.py`, `prompts.py`) — manifest `precondition` gates phase start; phase markdown `## Definition of Done` governs completion. - **Run-level baseline validation** (`loop.py`) — `run-once`, `run`, and `--focus-on-live-migrations` run the configured validation command after the clean-worktree check and before routing/refactoring. A red baseline stops as `baseline_failed`, not migration human review. - **Phase execution validation gate** (`phases.py`, `prompts.py`, `loop.py`) — a migration phase is complete only after host-side full validation passes. `execute_phase()` retries validation-red attempts from `head_before` up to the effective `--max-attempts` budget, and the phase prompt must include the literal configured validation command plus the phase file's Definition of Done as the completion contract. -- **Effort budgeting** (`effort.py`, `loop.py`, `migration_tick.py`, `planning.py`) — `run` / `run-once` default to `--default-effort low` and `--max-allowed-effort xhigh`; there is no `--effort` alias on those commands. Target `effort-override` changes that target's default but is still capped. 
Migration `required_effort` above the cap defers the phase without failing the run. Taste agent actions do not accept `--effort`; they always use fixed `medium` effort. +- **Effort budgeting** (`effort.py`, `loop.py`, `migration_tick.py`, `planning.py`) — `run` / `run-once` default to `--default-effort low` and `--max-allowed-effort xhigh`; there is no `--effort` alias on those commands. Target `effort-override` changes that target's default but is still capped. Migration `required_effort` above the cap defers the phase without failing the run. Manual `migration review` and `migration refine` operations use fixed internal `high` effort. Taste agent actions do not accept `--effort`; they always use fixed `medium` effort. - **Taste injection** — every prompt includes a `## Taste` section. `tests/test_prompts.py` enforces this via `_TASTE_INJECTED_PROMPTS`. Do not drop it. - **Taste read boundary** (`config.py`, `cli.py`, `loop.py`) — `load_taste()` translates unreadable project/global taste reads into `ContinuousRefactorError`; CLI stale-taste checks and loop taste loading must treat that boundary failure as non-fatal and skip/fall back instead of leaking raw `OSError`/`PermissionError`. - **Repo-local taste routing** (`config.py`, `cli.py`) — `ProjectEntry.repo_taste_path` is stored repo-relative in the XDG manifest and resolved through `resolve_project_taste_path()`. Keep `init`, `taste`, stale warnings, and run prompt loading on that helper so the active taste path does not drift. diff --git a/README.md b/README.md index 3152d00..610fa21 100644 --- a/README.md +++ b/README.md @@ -137,10 +137,10 @@ always run at fixed `medium` effort. | `migration list` | Lists visible migrations. Add `--status ` or `--awaiting-review` to filter. | | `migration doctor ` | Validates one visible migration's consistency. | | `migration doctor --all` | Validates every visible migration plus internal transaction state. 
| -| `migration review ` | Starts staged review for a migration awaiting human review. Requires `--with`, `--model`, and `--effort`. | -| `migration refine ` | Records feedback for a planning or unexecuted ready migration and runs one staged planning revision. Requires `--message ` or `--file `, plus `--with`, `--model`, and `--effort`; add `--show-agent-logs` to mirror the planning agent. | +| `migration review ` | Starts staged review for a migration awaiting human review. Requires `--with` and `--model`; review runs at fixed internal `high` effort. | +| `migration refine ` | Records feedback for a planning or unexecuted ready migration and runs one staged planning revision. Requires `--message ` or `--file `, plus `--with` and `--model`; refine runs at fixed internal `high` effort. Add `--show-agent-logs` to mirror the planning agent. | -Legacy `review list` and `review perform ` remain compatibility aliases; prefer `migration list --awaiting-review` and `migration review`. +Legacy `review list` remains a compatibility shortcut for `migration list --awaiting-review`. ## Targeting / Useful flags @@ -173,8 +173,8 @@ scope text as context for that target. - `init --in-repo-taste [PATH]` — stores this project's taste file in the repo and remembers the repo-relative path. Defaults to `.continuous-refactoring/taste.md`; re-run `init --in-repo-taste ...` to choose a different path. - `migration list` — shows visible migrations; `--awaiting-review` narrows to human-review handoffs. - `migration doctor ` / `migration doctor --all` — read-only consistency checks. Doctor reports problems; it does not repair them. -- `migration review --with ... --model ... --effort ...` — resolves an `awaiting_human_review` migration through a staged workspace. -- `migration refine (--message |--file ) --with ... --model ... --effort ... 
[--show-agent-logs]` — adds user feedback to a planning or unexecuted ready migration and resumes planning through the `revise` step when reopening ready work. +- `migration review --with ... --model ...` — resolves an `awaiting_human_review` migration through a staged workspace at fixed internal `high` effort. +- `migration refine (--message |--file ) --with ... --model ... [--show-agent-logs]` — adds user feedback to a planning or unexecuted ready migration and resumes planning through the `revise` step when reopening ready work at fixed internal `high` effort. - `taste --refine --with ... --model ...` — opens a collaborative editing session for the taste file. The agent keeps refining until you tell it to write, then the session ends automatically after the settled write. - `taste --upgrade --with ... --model ...` — re-interviews for taste dimensions added since your last version. No-op when already current; use `taste --refine` if you want to rework the doc anyway. - Taste agent sessions always use fixed `medium` effort. 
@@ -188,9 +188,9 @@ continuous-refactoring migration list --status planning continuous-refactoring migration list --awaiting-review continuous-refactoring migration doctor continuous-refactoring migration doctor --all -continuous-refactoring migration review --with codex --model gpt-5 --effort high -continuous-refactoring migration refine --message "split the risky phase" --with codex --model gpt-5 --effort high -continuous-refactoring migration refine --file feedback.md --with codex --model gpt-5 --effort high +continuous-refactoring migration review --with codex --model gpt-5 +continuous-refactoring migration refine --message "split the risky phase" --with codex --model gpt-5 +continuous-refactoring migration refine --file feedback.md --with codex --model gpt-5 ``` ### Shared `run` / `run-once` flags @@ -332,7 +332,7 @@ Before executing a phase, a ready-check agent verifies that the current phase pr - **ready: yes** — phase executes; on green tests, the phase is marked done, any prior deferral markers are cleared, and the migration advances immediately to the next phase. - **ready: no** — manifest activity is bumped, a retry cooldown is started, and a future `wake_up_on` is recorded when needed; the tick moves on. -- **ready: unverifiable** — the migration is flagged `awaiting_human_review` and put on cooldown. Automated migration ticks skip flagged migrations until review clears the flag. Use `migration list --awaiting-review` to find it and `migration review --with ... --model ... --effort ...` to resolve it interactively. +- **ready: unverifiable** — the migration is flagged `awaiting_human_review` and put on cooldown. Automated migration ticks skip flagged migrations until review clears the flag. Use `migration list --awaiting-review` to find it and `migration review --with ... --model ...` to resolve it interactively. Human-facing migration references use the relative phase spec path, for example `phase-2-failure-report.md`. 
The manifest cursor stores the phase `name`, not a numeric index. diff --git a/src/continuous_refactoring/cli.py b/src/continuous_refactoring/cli.py index 2a357f0..5e744da 100644 --- a/src/continuous_refactoring/cli.py +++ b/src/continuous_refactoring/cli.py @@ -262,17 +262,6 @@ def _add_review_parser(subparsers: argparse._SubParsersAction) -> None: review_parser.set_defaults(handler=handle_review) review_sub = review_parser.add_subparsers(dest="review_command") review_sub.add_parser("list", help="List migrations flagged for review.") - perform_parser = review_sub.add_parser( - "perform", - help="Perform review on a flagged migration.", - ) - perform_parser.add_argument("migration", help="Migration name to review.") - perform_parser.add_argument( - "--with", dest="agent", choices=("codex", "claude"), required=True, - help="Agent backend.", - ) - perform_parser.add_argument("--model", required=True, help="Model name.") - perform_parser.add_argument("--effort", required=True, help="Effort level.") def _add_migration_parser(subparsers: argparse._SubParsersAction) -> None: @@ -324,9 +313,6 @@ def _add_migration_parser(subparsers: argparse._SubParsersAction) -> None: help="Agent backend.", ) review_parser.add_argument("--model", required=True, help="Model name.") - review_parser.add_argument( - "--effort", choices=EFFORT_TIERS, required=True, help="Effort level." - ) refine_parser = migration_sub.add_parser( "refine", @@ -345,9 +331,6 @@ def _add_migration_parser(subparsers: argparse._SubParsersAction) -> None: help="Agent backend.", ) refine_parser.add_argument("--model", required=True, help="Model name.") - refine_parser.add_argument( - "--effort", choices=EFFORT_TIERS, required=True, help="Effort level." 
- ) refine_parser.add_argument( "--show-agent-logs", action="store_true", diff --git a/src/continuous_refactoring/migration_cli.py b/src/continuous_refactoring/migration_cli.py index 5bf0f88..6e3f3a5 100644 --- a/src/continuous_refactoring/migration_cli.py +++ b/src/continuous_refactoring/migration_cli.py @@ -12,6 +12,7 @@ create_run_artifacts, ) from continuous_refactoring.config import resolve_live_migrations_dir, resolve_project +from continuous_refactoring.effort import EffortTier from continuous_refactoring.migration_consistency import ( MigrationConsistencyFinding, check_migration_consistency, @@ -43,6 +44,7 @@ ] _MIGRATION_USAGE = "Usage: continuous-refactoring migration {list,doctor,review,refine}" +_MIGRATION_MANUAL_AGENT_EFFORT: EffortTier = "high" _MISSING_TEXT = "(none)" @@ -151,7 +153,7 @@ def handle_migration_review(args: argparse.Namespace) -> None: project_state_dir=context.project_state_dir, agent=args.agent, model=args.model, - effort=args.effort, + effort=_MIGRATION_MANUAL_AGENT_EFFORT, taste=taste, ) ) @@ -187,7 +189,7 @@ def handle_migration_refine(args: argparse.Namespace) -> None: context.repo_root, agent=args.agent, model=args.model, - effort=args.effort, + effort=_MIGRATION_MANUAL_AGENT_EFFORT, test_command="migration refine", ) try: @@ -202,7 +204,7 @@ def handle_migration_refine(args: argparse.Namespace) -> None: artifacts=artifacts, agent=args.agent, model=args.model, - effort=args.effort, + effort=_MIGRATION_MANUAL_AGENT_EFFORT, log_mirroring=LogMirroring( agent=bool(getattr(args, "show_agent_logs", False)), ), diff --git a/src/continuous_refactoring/review_cli.py b/src/continuous_refactoring/review_cli.py index 4c63b07..1160a4d 100644 --- a/src/continuous_refactoring/review_cli.py +++ b/src/continuous_refactoring/review_cli.py @@ -10,17 +10,17 @@ from continuous_refactoring.agent import run_agent_interactive from continuous_refactoring.artifacts import ContinuousRefactorError from continuous_refactoring.config import ( - load_taste, 
resolve_live_migrations_dir, resolve_project, ) -from continuous_refactoring.migration_cli import MigrationTarget, resolve_migration_target +from continuous_refactoring.migration_cli import MigrationTarget from continuous_refactoring.migration_consistency import ( check_migration_consistency, has_blocking_consistency_findings, iter_visible_migration_dirs, ) from continuous_refactoring.migrations import ( + MigrationManifest, load_manifest as load_migration_manifest, phase_file_reference, resolve_current_phase, @@ -39,12 +39,11 @@ "StagedReviewRequest", "handle_review", "handle_review_list", - "handle_review_perform", "handle_staged_migration_review", "perform_staged_migration_review", ] -_REVIEW_USAGE = "Usage: continuous-refactoring review {list,perform}" +_REVIEW_USAGE = "Usage: continuous-refactoring review {list}" @dataclass(frozen=True) @@ -124,38 +123,6 @@ def handle_review_list() -> None: ) -def handle_review_perform(args: argparse.Namespace) -> None: - context = _resolve_review_context(error_code=2) - try: - target = resolve_migration_target( - live_dir=context.live_dir, - repo_root=context.repo_root, - value=args.migration, - ) - except ContinuousRefactorError as error: - print(f"Error: {error}", file=sys.stderr) - raise SystemExit(2) from error - - try: - taste = load_taste(resolve_project(context.repo_root)) - except ContinuousRefactorError as error: - print(f"Error: {error}", file=sys.stderr) - raise SystemExit(1) from error - - handle_staged_migration_review( - StagedReviewRequest( - repo_root=context.repo_root, - live_dir=context.live_dir, - target=target, - project_state_dir=context.project_state_dir, - agent=args.agent, - model=args.model, - effort=args.effort, - taste=taste, - ) - ) - - def handle_staged_migration_review( request: StagedReviewRequest, ) -> PlanningPublishResult: @@ -188,7 +155,7 @@ def perform_staged_migration_review( manifest = load_migration_manifest(manifest_path) if not manifest.awaiting_human_review: raise _ReviewCliError( - 
f"migration '{request.target.slug}' is not flagged for review.", + _not_awaiting_review_message(request.target.slug, manifest), 2, ) @@ -275,6 +242,35 @@ def _require_consistent_review_workspace(workspace_root: Path) -> None: ) +def _not_awaiting_review_message(slug: str, manifest: MigrationManifest) -> str: + message = ( + f"migration '{slug}' is not awaiting human review. " + "Reviewable migrations are listed by " + "`continuous-refactoring migration list --awaiting-review`." + ) + if _is_refine_eligible(manifest): + return ( + f"{message} To revise this migration instead, run " + f"`continuous-refactoring migration refine {slug} ...`." + ) + return ( + f"{message} `continuous-refactoring migration refine {slug} ...` " + "is only available for planning or unexecuted ready migrations." + ) + + +def _is_refine_eligible(manifest: MigrationManifest) -> bool: + if any(phase.done for phase in manifest.phases): + return False + if manifest.status == "planning": + return True + if manifest.status != "ready": + return False + if not manifest.phases: + return False + return manifest.current_phase == manifest.phases[0].name + + def _review_publish_error_message(error: PlanningPublishError, slug: str) -> str: message = str(error) if "stale base snapshot" not in error.result.reason: @@ -290,7 +286,5 @@ def _review_publish_error_message(error: PlanningPublishError, slug: str) -> str def handle_review(args: argparse.Namespace) -> None: if args.review_command == "list": return handle_review_list() - if args.review_command == "perform": - return handle_review_perform(args) print(_REVIEW_USAGE, file=sys.stderr) raise SystemExit(2) diff --git a/tests/test_cli_migrations.py b/tests/test_cli_migrations.py index 3702a7f..ec44201 100644 --- a/tests/test_cli_migrations.py +++ b/tests/test_cli_migrations.py @@ -75,15 +75,13 @@ def test_migration_parser_accepts_list_and_doctor() -> None: "codex", "--model", "test-model", - "--effort", - "low", ] ) assert review_args.migration_command == 
"review" assert review_args.target == "my-mig" assert review_args.agent == "codex" assert review_args.model == "test-model" - assert review_args.effort == "low" + assert not hasattr(review_args, "effort") def test_migration_parser_accepts_doctor_all() -> None: @@ -110,15 +108,15 @@ def test_documented_migration_commands_match_parser() -> None: "continuous-refactoring migration doctor --all", ( "continuous-refactoring migration review --with codex " - "--model gpt-5 --effort high" + "--model gpt-5" ), ( "continuous-refactoring migration refine --message " - "\"split the risky phase\" --with codex --model gpt-5 --effort high" + "\"split the risky phase\" --with codex --model gpt-5" ), ( "continuous-refactoring migration refine --file " - "feedback.md --with codex --model gpt-5 --effort high" + "feedback.md --with codex --model gpt-5" ), ) @@ -165,8 +163,6 @@ def test_migration_refine_requires_message_or_file() -> None: "codex", "--model", "test-model", - "--effort", - "low", ] ) assert missing_exit.value.code == 2 @@ -185,8 +181,6 @@ def test_migration_refine_requires_message_or_file() -> None: "codex", "--model", "test-model", - "--effort", - "low", ] ) assert both_exit.value.code == 2 @@ -202,8 +196,6 @@ def test_migration_refine_requires_message_or_file() -> None: "codex", "--model", "test-model", - "--effort", - "low", "--show-agent-logs", ] ) @@ -214,7 +206,7 @@ def test_migration_refine_requires_message_or_file() -> None: assert args.file is None assert args.agent == "codex" assert args.model == "test-model" - assert args.effort == "low" + assert not hasattr(args, "effort") assert args.show_agent_logs is True with pytest.raises(SystemExit) as command_logs_exit: @@ -229,14 +221,50 @@ def test_migration_refine_requires_message_or_file() -> None: "codex", "--model", "test-model", - "--effort", - "low", "--show-command-logs", ] ) assert command_logs_exit.value.code == 2 +@pytest.mark.parametrize( + "argv", + [ + [ + "migration", + "review", + "my-mig", + "--with", 
+ "codex", + "--model", + "test-model", + "--effort", + "high", + ], + [ + "migration", + "refine", + "my-mig", + "--message", + "tighten it", + "--with", + "codex", + "--model", + "test-model", + "--effort", + "high", + ], + ], +) +def test_migration_review_and_refine_reject_effort_flag(argv: list[str]) -> None: + parser = build_parser() + + with pytest.raises(SystemExit) as exc_info: + parser.parse_args(argv) + + assert exc_info.value.code == 2 + + def test_migration_list_includes_planning_ready_review_and_done_statuses( tmp_path: Path, monkeypatch: pytest.MonkeyPatch, @@ -440,10 +468,10 @@ def test_migration_review_accepts_slug_or_path_inside_live_root( "target", awaiting_human_review=True, ) - seen: list[Path] = [] + seen: list[tuple[Path, str]] = [] def fake_review(request: object) -> None: - seen.append(request.target.path) + seen.append((request.target.path, request.effort)) monkeypatch.setattr( "continuous_refactoring.review_cli.handle_staged_migration_review", @@ -453,7 +481,7 @@ def fake_review(request: object) -> None: handle_migration_review(_review_args("target")) handle_migration_review(_review_args("migrations/target")) - assert seen == [migration_dir, migration_dir] + assert seen == [(migration_dir, "high"), (migration_dir, "high")] def test_migration_review_rejects_outside_path_and_symlink_escape( @@ -484,7 +512,7 @@ def test_migration_review_rejects_outside_path_and_symlink_escape( assert "symlink" in capsys.readouterr().err -def test_migration_review_rejects_missing_or_not_flagged_migration( +def test_migration_review_rejects_missing_migration_without_refine_suggestion( tmp_path: Path, monkeypatch: pytest.MonkeyPatch, capsys: pytest.CaptureFixture[str], @@ -496,13 +524,50 @@ def test_migration_review_rejects_missing_or_not_flagged_migration( handle_migration_review(_review_args("missing")) assert missing_exit.value.code == 2 - assert "does not exist" in capsys.readouterr().err + err = capsys.readouterr().err + assert "does not exist" in err + 
assert "migration refine" not in err + + +def test_migration_review_rejects_refine_eligible_not_awaiting_review_with_refine_hint( + tmp_path: Path, + monkeypatch: pytest.MonkeyPatch, + capsys: pytest.CaptureFixture[str], +) -> None: + _repo, live_dir = _init_migration_project(tmp_path, monkeypatch) + _write_migration(live_dir, "not-flagged") with pytest.raises(SystemExit) as not_flagged_exit: handle_migration_review(_review_args("not-flagged")) assert not_flagged_exit.value.code == 2 - assert "not flagged" in capsys.readouterr().err + err = capsys.readouterr().err + assert "not awaiting human review" in err + assert "continuous-refactoring migration list --awaiting-review" in err + assert "continuous-refactoring migration refine not-flagged" in err + + +def test_migration_review_rejects_non_refine_eligible_not_awaiting_review_with_guarded_refine_hint( + tmp_path: Path, + monkeypatch: pytest.MonkeyPatch, + capsys: pytest.CaptureFixture[str], +) -> None: + _repo, live_dir = _init_migration_project(tmp_path, monkeypatch) + _write_migration( + live_dir, + "phase-done", + phases=(replace(_PHASE, done=True),), + ) + + with pytest.raises(SystemExit) as not_flagged_exit: + handle_migration_review(_review_args("phase-done")) + + assert not_flagged_exit.value.code == 2 + err = capsys.readouterr().err + assert "not awaiting human review" in err + assert "continuous-refactoring migration list --awaiting-review" in err + assert "continuous-refactoring migration refine phase-done" in err + assert "only available for planning or unexecuted ready migrations" in err def test_migration_review_runs_agent_against_work_dir( @@ -523,6 +588,7 @@ def fake_interactive( agent: str, model: str, effort: str, prompt: str, repo_root: Path, ) -> int: seen["agent"] = agent + seen["effort"] = effort seen["cwd"] = repo_root seen["prompt"] = prompt manifest = load_manifest(repo_root / "manifest.json") @@ -544,6 +610,7 @@ def fake_interactive( handle_migration_review(_review_args("target")) assert 
seen["agent"] == "codex" + assert seen["effort"] == "high" assert seen["cwd"] != migration_dir assert isinstance(seen["cwd"], Path) assert seen["cwd"].name == "target" @@ -555,6 +622,50 @@ def fake_interactive( assert reloaded.human_review_reason is None +def test_migration_review_prompt_handles_missing_current_phase( + tmp_path: Path, + monkeypatch: pytest.MonkeyPatch, +) -> None: + repo, live_dir = _init_migration_project(tmp_path, monkeypatch) + migration_dir = _write_migration( + live_dir, + "target", + awaiting_human_review=True, + current_phase="", + human_review_reason="phase cursor cleared", + ) + _commit_all(repo) + seen: dict[str, str] = {} + + def fake_interactive( + agent: str, model: str, effort: str, prompt: str, repo_root: Path, + ) -> int: + seen["prompt"] = prompt + manifest = load_manifest(repo_root / "manifest.json") + save_manifest( + replace( + manifest, + awaiting_human_review=False, + current_phase="setup", + human_review_reason=None, + ), + repo_root / "manifest.json", + ) + return 0 + + monkeypatch.setattr( + "continuous_refactoring.review_cli.run_agent_interactive", + fake_interactive, + ) + + handle_migration_review(_review_args("target")) + + assert "phase cursor cleared" in seen["prompt"] + assert "Current phase file: (none)" in seen["prompt"] + assert "Current phase name: (none)" in seen["prompt"] + assert load_manifest(migration_dir / "manifest.json").current_phase == "setup" + + def test_migration_review_failure_leaves_live_snapshot_unchanged( tmp_path: Path, monkeypatch: pytest.MonkeyPatch, @@ -747,6 +858,9 @@ def test_migration_refine_resumes_from_current_planning_state( tmp_path: Path, monkeypatch: pytest.MonkeyPatch, ) -> None: + artifact_tmp = tmp_path / "artifact-tmp" + artifact_tmp.mkdir() + monkeypatch.setenv("TMPDIR", str(artifact_tmp)) repo, live_dir = _init_migration_project(tmp_path, monkeypatch) migration_dir = _write_migration( live_dir, "target", status="planning", current_phase="", phases=(), @@ -783,10 +897,18 @@ def 
test_migration_refine_resumes_from_current_planning_state( state = load_planning_state(repo, planning_state_path(migration_dir)) assert fake.stage_labels == ["expand"] + assert fake.efforts == ["high"] assert fake.mirror_to_terminal == [True] assert state.next_step == "review" + assert state.completed_steps[-1].effort == "high" assert state.feedback[-1].source == "message" assert state.feedback[-1].text == "split phase one" + summaries = list((artifact_tmp / "continuous-refactoring").glob("*/summary.json")) + assert len(summaries) == 1 + summary = json.loads(summaries[0].read_text(encoding="utf-8")) + assert summary["effort"] == "high" + assert summary["default_effort"] == "high" + assert summary["max_allowed_effort"] == "high" assert (migration_dir / "plan.md").read_text(encoding="utf-8") == "# Refined Plan\n" @@ -1392,7 +1514,6 @@ def _review_args(target: str) -> argparse.Namespace: target=target, agent="codex", model="test-model", - effort="low", ) @@ -1409,7 +1530,6 @@ def _refine_args( file=file, agent="codex", model="test-model", - effort="low", show_agent_logs=show_agent_logs, ) @@ -1442,6 +1562,7 @@ def __init__( self._index = 0 self._on_call = on_call self.stage_labels: list[str] = [] + self.efforts: list[str] = [] self.prompts: list[str] = [] self.mirror_to_terminal: list[bool] = [] @@ -1456,6 +1577,7 @@ def __call__(self, **kwargs: object) -> CommandCapture: self.prompts.append(prompt) self.stage_labels.append(stdout_path.parent.name) + self.efforts.append(str(kwargs["effort"])) self.mirror_to_terminal.append(bool(kwargs["mirror_to_terminal"])) for rel_path, content in writes.items(): path = migration_dir / rel_path diff --git a/tests/test_cli_review.py b/tests/test_cli_review.py index 78af49d..d3e6660 100644 --- a/tests/test_cli_review.py +++ b/tests/test_cli_review.py @@ -11,13 +11,11 @@ from continuous_refactoring.migrations import ( MigrationManifest, PhaseSpec, - load_manifest as load_migration_manifest, save_manifest as save_migration, ) from 
continuous_refactoring.review_cli import ( handle_review, handle_review_list, - handle_review_perform, ) _PHASES = ( @@ -31,7 +29,7 @@ ) -def test_review_parser_accepts_list_and_perform_subcommands() -> None: +def test_review_parser_accepts_list_and_rejects_perform_subcommand() -> None: parser = build_parser() review_args = parser.parse_args(["review"]) @@ -42,25 +40,9 @@ def test_review_parser_accepts_list_and_perform_subcommands() -> None: assert list_args.command == "review" assert list_args.review_command == "list" - perform_args = parser.parse_args( - [ - "review", - "perform", - "my-mig", - "--with", - "codex", - "--model", - "test-model", - "--effort", - "low", - ], - ) - assert perform_args.command == "review" - assert perform_args.review_command == "perform" - assert perform_args.migration == "my-mig" - assert perform_args.agent == "codex" - assert perform_args.model == "test-model" - assert perform_args.effort == "low" + with pytest.raises(SystemExit) as perform_exit: + parser.parse_args(["review", "perform", "my-mig"]) + assert perform_exit.value.code == 2 def test_parser_binds_handlers_for_top_level_commands() -> None: @@ -108,20 +90,6 @@ def _init_repo(path: Path) -> None: ) -def _make_perform_args(migration: str) -> argparse.Namespace: - return argparse.Namespace( - migration=migration, - agent="codex", - model="test-model", - effort="low", - ) - - -def _commit_all(repo: Path, message: str = "test state") -> None: - subprocess.run(["git", "add", "-A"], cwd=repo, check=True, capture_output=True) - subprocess.run(["git", "commit", "-m", message], cwd=repo, check=True, capture_output=True) - - def _init_review_project( tmp_path: Path, monkeypatch: pytest.MonkeyPatch, ) -> tuple[Path, Path]: @@ -300,14 +268,6 @@ def test_review_list_ignores_hidden_and_transaction_dirs( ), "project not initialized", ), - ( - lambda: handle_review_perform(_make_perform_args("my-mig")), - 2, - lambda tmp_path, monkeypatch: _init_unconfigured_review_repo( - tmp_path, 
monkeypatch, - ), - "project not initialized", - ), ( handle_review_list, 1, @@ -316,14 +276,6 @@ def test_review_list_ignores_hidden_and_transaction_dirs( ), "live-migrations-dir", ), - ( - lambda: handle_review_perform(_make_perform_args("my-mig")), - 2, - lambda tmp_path, monkeypatch: register_project( - _init_unconfigured_review_repo(tmp_path, monkeypatch), - ), - "live-migrations-dir", - ), ( handle_review_list, 1, @@ -332,14 +284,6 @@ def test_review_list_ignores_hidden_and_transaction_dirs( ), "escapes repo", ), - ( - lambda: handle_review_perform(_make_perform_args("my-mig")), - 2, - lambda tmp_path, monkeypatch: _configure_escaped_live_dir( - _init_unconfigured_review_repo(tmp_path, monkeypatch), - ), - "escapes repo", - ), ], ) def test_review_commands_surface_shared_context_errors( @@ -366,238 +310,6 @@ def _configure_escaped_live_dir(repo: Path) -> None: set_live_migrations_dir(project.entry.uuid, "../elsewhere") -def _setup_review_project( - tmp_path: Path, - monkeypatch: pytest.MonkeyPatch, - *, - awaiting: bool = True, - current_phase: str = "review-target", - human_review_reason: str | None = None, -) -> tuple[Path, Path]: - repo, live_dir = _init_review_project(tmp_path, monkeypatch) - save_migration( - _make_manifest( - "my-mig", - awaiting_human_review=awaiting, - status="ready", - current_phase=current_phase, - human_review_reason=human_review_reason, - ), - live_dir / "my-mig" / "manifest.json", - ) - (live_dir / "my-mig" / "plan.md").write_text("# Plan\n", encoding="utf-8") - for phase in _PHASES: - (live_dir / "my-mig" / phase.file).write_text( - "# Phase\n\n" - "## Precondition\n\n" - "Ready.\n\n" - "## Definition of Done\n\n" - "Done.\n", - encoding="utf-8", - ) - return repo, live_dir - - -def test_review_perform_happy_path( - tmp_path: Path, - monkeypatch: pytest.MonkeyPatch, -) -> None: - repo, live_dir = _setup_review_project( - tmp_path, monkeypatch, - awaiting=True, - human_review_reason="needs security audit", - ) - _commit_all(repo) 
- manifest_path = live_dir / "my-mig" / "manifest.json" - captured_prompt: dict[str, str] = {} - - def fake_interactive( - agent: str, model: str, effort: str, prompt: str, repo_root: Path, - ) -> int: - captured_prompt["prompt"] = prompt - captured_prompt["repo_root"] = str(repo_root) - manifest = load_migration_manifest(repo_root / "manifest.json") - from dataclasses import replace - updated = replace( - manifest, - awaiting_human_review=False, - human_review_reason=None, - ) - save_migration(updated, repo_root / "manifest.json") - return 0 - - monkeypatch.setattr( - "continuous_refactoring.review_cli.run_agent_interactive", fake_interactive, - ) - - handle_review_perform(_make_perform_args("my-mig")) - - assert "needs security audit" in captured_prompt["prompt"] - assert "phase-2-review-target.md" in captured_prompt["prompt"] - assert "Name: review-target" in captured_prompt["prompt"] - assert captured_prompt["repo_root"] != str(Path.cwd().resolve()) - assert captured_prompt["repo_root"].endswith("/work/my-mig") - - reloaded = load_migration_manifest(manifest_path) - assert reloaded.awaiting_human_review is False - assert reloaded.human_review_reason is None - - -def test_review_perform_happy_path_without_current_phase( - tmp_path: Path, - monkeypatch: pytest.MonkeyPatch, -) -> None: - repo, live_dir = _setup_review_project( - tmp_path, monkeypatch, - awaiting=True, - current_phase="", - human_review_reason="phase cursor cleared", - ) - _commit_all(repo) - manifest_path = live_dir / "my-mig" / "manifest.json" - captured_prompt: dict[str, str] = {} - - def fake_interactive( - agent: str, model: str, effort: str, prompt: str, repo_root: Path, - ) -> int: - captured_prompt["prompt"] = prompt - manifest = load_migration_manifest(repo_root / "manifest.json") - from dataclasses import replace - updated = replace( - manifest, - awaiting_human_review=False, - current_phase="review-target", - human_review_reason=None, - ) - save_migration(updated, repo_root / 
"manifest.json") - return 0 - - monkeypatch.setattr( - "continuous_refactoring.review_cli.run_agent_interactive", fake_interactive, - ) - - handle_review_perform(_make_perform_args("my-mig")) - - assert "phase cursor cleared" in captured_prompt["prompt"] - assert "Current phase file: (none)" in captured_prompt["prompt"] - assert "Current phase name: (none)" in captured_prompt["prompt"] - - -def test_review_perform_exits_1_when_flag_not_cleared( - tmp_path: Path, - monkeypatch: pytest.MonkeyPatch, - capsys: pytest.CaptureFixture[str], -) -> None: - repo, live_dir = _setup_review_project(tmp_path, monkeypatch, awaiting=True) - _commit_all(repo) - - def fake_interactive( - agent: str, model: str, effort: str, prompt: str, repo_root: Path, - ) -> int: - return 0 - - monkeypatch.setattr( - "continuous_refactoring.review_cli.run_agent_interactive", fake_interactive, - ) - - with pytest.raises(SystemExit) as exc_info: - handle_review_perform(_make_perform_args("my-mig")) - - assert exc_info.value.code == 1 - err = capsys.readouterr().err - assert "not completed" in err - - -def test_review_perform_exits_with_agent_returncode( - tmp_path: Path, - monkeypatch: pytest.MonkeyPatch, - capsys: pytest.CaptureFixture[str], -) -> None: - repo, _live_dir = _setup_review_project(tmp_path, monkeypatch, awaiting=True) - _commit_all(repo) - - def fake_interactive( - agent: str, model: str, effort: str, prompt: str, repo_root: Path, - ) -> int: - return 7 - - monkeypatch.setattr( - "continuous_refactoring.review_cli.run_agent_interactive", fake_interactive, - ) - - with pytest.raises(SystemExit) as exc_info: - handle_review_perform(_make_perform_args("my-mig")) - - assert exc_info.value.code == 7 - err = capsys.readouterr().err - assert "review agent exited with code 7" in err - - -def test_review_perform_exits_2_when_migration_missing( - tmp_path: Path, - monkeypatch: pytest.MonkeyPatch, - capsys: pytest.CaptureFixture[str], -) -> None: - _init_review_project(tmp_path, monkeypatch) - - 
with pytest.raises(SystemExit) as exc_info: - handle_review_perform(_make_perform_args("nonexistent")) - - assert exc_info.value.code == 2 - err = capsys.readouterr().err - assert "does not exist" in err - - -def test_review_perform_exits_2_when_not_flagged_for_review( - tmp_path: Path, - monkeypatch: pytest.MonkeyPatch, - capsys: pytest.CaptureFixture[str], -) -> None: - repo, live_dir = _setup_review_project(tmp_path, monkeypatch, awaiting=False) - - with pytest.raises(SystemExit) as exc_info: - handle_review_perform(_make_perform_args("my-mig")) - - assert exc_info.value.code == 2 - err = capsys.readouterr().err - assert "not flagged" in err - - -def test_top_level_review_perform_routes_to_migration_review_compatibility_path( - tmp_path: Path, - monkeypatch: pytest.MonkeyPatch, -) -> None: - _repo, live_dir = _setup_review_project( - tmp_path, - monkeypatch, - awaiting=True, - human_review_reason="needs approval", - ) - seen: dict[str, object] = {} - - def fake_staged_review(request: object) -> None: - seen["slug"] = request.target.slug - seen["path"] = request.target.path - seen["agent"] = request.agent - seen["model"] = request.model - seen["effort"] = request.effort - - monkeypatch.setattr( - "continuous_refactoring.review_cli.handle_staged_migration_review", - fake_staged_review, - ) - - handle_review_perform(_make_perform_args("my-mig")) - - assert seen == { - "slug": "my-mig", - "path": live_dir / "my-mig", - "agent": "codex", - "model": "test-model", - "effort": "low", - } - - def test_review_dispatches_list_subcommand( monkeypatch: pytest.MonkeyPatch, ) -> None: @@ -613,24 +325,6 @@ def test_review_dispatches_list_subcommand( assert seen == ["list"] -def test_review_dispatches_perform_subcommand( - monkeypatch: pytest.MonkeyPatch, -) -> None: - seen: list[str] = [] - - def fake_perform(args: argparse.Namespace) -> None: - seen.append(args.migration) - - monkeypatch.setattr( - "continuous_refactoring.review_cli.handle_review_perform", - fake_perform, - ) 
- - handle_review(argparse.Namespace(review_command="perform", migration="my-mig")) - - assert seen == ["my-mig"] - - def test_review_exits_2_without_subcommand( capsys: pytest.CaptureFixture[str], ) -> None: @@ -639,4 +333,4 @@ def test_review_exits_2_without_subcommand( assert exc_info.value.code == 2 err = capsys.readouterr().err - assert "Usage: continuous-refactoring review {list,perform}" in err + assert "Usage: continuous-refactoring review {list}" in err From 1b1f124cb88cdc0215e483ed9295ae6874625ad8 Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Thu, 21 May 2026 13:56:50 -0700 Subject: [PATCH 04/41] helptext and cleanup --- README.md | 2 +- src/continuous_refactoring/cli.py | 39 ++++++++--- src/continuous_refactoring/prompts.py | 16 ++--- src/continuous_refactoring/review_cli.py | 9 ++- tests/test_cli_migrations.py | 85 +++++++++++++++++++++--- tests/test_cli_review.py | 35 +++++++++- tests/test_continuous_refactoring.py | 2 +- tests/test_prompts.py | 10 +-- 8 files changed, 159 insertions(+), 39 deletions(-) diff --git a/README.md b/README.md index 610fa21..0a7b461 100644 --- a/README.md +++ b/README.md @@ -332,7 +332,7 @@ Before executing a phase, a ready-check agent verifies that the current phase pr - **ready: yes** — phase executes; on green tests, the phase is marked done, any prior deferral markers are cleared, and the migration advances immediately to the next phase. - **ready: no** — manifest activity is bumped, a retry cooldown is started, and a future `wake_up_on` is recorded when needed; the tick moves on. -- **ready: unverifiable** — the migration is flagged `awaiting_human_review` and put on cooldown. Automated migration ticks skip flagged migrations until review clears the flag. Use `migration list --awaiting-review` to find it and `migration review --with ... --model ...` to resolve it interactively. +- **ready: unverifiable** — the migration is marked `awaiting_human_review` and put on cooldown. 
Automated migration ticks skip migrations awaiting human review until review clears the flag. Use `migration list --awaiting-review` to find it and `migration review --with ... --model ...` to resolve it interactively. Human-facing migration references use the relative phase spec path, for example `phase-2-failure-report.md`. The manifest cursor stores the phase `name`, not a numeric index. diff --git a/src/continuous_refactoring/cli.py b/src/continuous_refactoring/cli.py index 5e744da..b73c7ea 100644 --- a/src/continuous_refactoring/cli.py +++ b/src/continuous_refactoring/cli.py @@ -156,6 +156,10 @@ def _add_taste_parser(subparsers: argparse._SubParsersAction) -> None: taste_parser = subparsers.add_parser( "taste", help="Manage refactoring taste files.", + description=( + "Manage project or global taste files. Agent-backed modes require " + "--with and --model." + ), ) taste_parser.set_defaults(handler=_handle_taste) taste_parser.add_argument( @@ -202,7 +206,8 @@ def _add_taste_parser(subparsers: argparse._SubParsersAction) -> None: def _add_run_once_parser(subparsers: argparse._SubParsersAction) -> None: run_once_parser = subparsers.add_parser( "run-once", - help="Single refactoring attempt (one agent call, no fix retry).", + help="Run one routed refactoring action without fix retry.", + description="Run one routed refactoring action without fix retry.", ) run_once_parser.set_defaults(handler=_handle_run_once) _add_common_args(run_once_parser) @@ -211,7 +216,8 @@ def _add_run_once_parser(subparsers: argparse._SubParsersAction) -> None: def _add_run_parser(subparsers: argparse._SubParsersAction) -> None: run_parser = subparsers.add_parser( "run", - help="Continuous refactoring loop with fix-prompt retry.", + help="Run routed refactoring actions with fix-prompt retry.", + description="Run routed refactoring actions with fix-prompt retry.", ) run_parser.set_defaults(handler=_handle_run) _add_common_args(run_parser) @@ -225,13 +231,14 @@ def 
_add_run_parser(subparsers: argparse._SubParsersAction) -> None: "--max-refactors", type=int, default=None, - help="Refactor actions to run.", + help="Actions to run.", ) run_parser.add_argument( "--focus-on-live-migrations", action="store_true", help=( - "Iterate only on live migrations until every one is done or blocked. " + "Iterate only on eligible live migrations until done, deferred, " + "blocked, or failure budget trips. " "Bypasses targeting and --max-refactors requirements." ), ) @@ -257,17 +264,26 @@ def _add_run_parser(subparsers: argparse._SubParsersAction) -> None: def _add_review_parser(subparsers: argparse._SubParsersAction) -> None: review_parser = subparsers.add_parser( "review", - help="Review migrations awaiting human review.", + help="Compatibility shortcut for migrations awaiting human review.", + description=( + "Compatibility listing shortcut for migrations awaiting human " + "review. Use `migration review` for canonical mutation." + ), ) review_parser.set_defaults(handler=handle_review) review_sub = review_parser.add_subparsers(dest="review_command") - review_sub.add_parser("list", help="List migrations flagged for review.") + review_sub.add_parser( + "list", + help="List migrations awaiting human review.", + description="List migrations awaiting human review.", + ) def _add_migration_parser(subparsers: argparse._SubParsersAction) -> None: migration_parser = subparsers.add_parser( "migration", - help="Inspect live migrations.", + help="Inspect and manage live migrations.", + description="Inspect and manage live migrations.", ) migration_parser.set_defaults(handler=handle_migration) migration_sub = migration_parser.add_subparsers(dest="migration_command") @@ -305,7 +321,11 @@ def _add_migration_parser(subparsers: argparse._SubParsersAction) -> None: review_parser = migration_sub.add_parser( "review", - help="Perform staged review on a flagged migration.", + help="Resolve a migration awaiting human review in a staged workspace.", + 
description=( + "Resolve a migration awaiting human review in a staged workspace. " + "Requires --with and --model." + ), ) review_parser.add_argument("target", help="Migration slug or contained path.") review_parser.add_argument( @@ -316,7 +336,8 @@ def _add_migration_parser(subparsers: argparse._SubParsersAction) -> None: refine_parser = migration_sub.add_parser( "refine", - help="Refine a planning migration with user feedback.", + help="Apply feedback to a planning or unexecuted ready migration.", + description="Apply feedback to a planning or unexecuted ready migration.", ) refine_parser.add_argument("target", help="Migration slug or contained path.") feedback_group = refine_parser.add_mutually_exclusive_group(required=True) diff --git a/src/continuous_refactoring/prompts.py b/src/continuous_refactoring/prompts.py index ec5a9a8..8152064 100644 --- a/src/continuous_refactoring/prompts.py +++ b/src/continuous_refactoring/prompts.py @@ -30,7 +30,7 @@ "compose_phase_execution_prompt", "compose_phase_ready_prompt", "compose_planning_prompt", - "compose_review_perform_prompt", + "compose_migration_review_prompt", "compose_scope_selection_prompt", "compose_taste_refine_prompt", "compose_taste_upgrade_prompt", @@ -839,9 +839,9 @@ def compose_phase_execution_prompt( return _join_sections(*sections) -REVIEW_PERFORM_PROMPT = """\ -You are conducting a human review of a refactoring migration that was flagged -for human input during planning. +MIGRATION_REVIEW_PROMPT = """\ +You are conducting a human review of a refactoring migration that was marked +as awaiting human review by the driver. Project-specific taste is injected by the caller in the `## Taste` section. @@ -859,8 +859,8 @@ def compose_phase_execution_prompt( what the plan assumes. Note any drift you find — stale assumptions change what is worth asking the user. 3. Present the situation to the user: what the migration does, what phase it is - on, and why it was flagged for review. 
The manifest's "Human review reason" - field (shown below) is the exact reason the driver flagged this migration — + on, and why it is awaiting human review. The manifest's "Human review reason" + field (shown below) is the exact reason the driver marked this migration — surface it verbatim so the user can see what triggered the hand-off. Include any drift you found so the user sees the current shape, not the shape the plan was written against. @@ -889,7 +889,7 @@ def compose_phase_execution_prompt( """ -def compose_review_perform_prompt( +def compose_migration_review_prompt( migration_name: str, repo_root: Path, work_dir: Path, @@ -900,7 +900,7 @@ def compose_review_perform_prompt( ) -> str: reason = manifest.human_review_reason or "(no reason recorded)" sections: list[str] = [ - REVIEW_PERFORM_PROMPT, + MIGRATION_REVIEW_PROMPT, f"## Migration\nName: {migration_name}", ( "## Workspace\n" diff --git a/src/continuous_refactoring/review_cli.py b/src/continuous_refactoring/review_cli.py index 1160a4d..42b6418 100644 --- a/src/continuous_refactoring/review_cli.py +++ b/src/continuous_refactoring/review_cli.py @@ -33,14 +33,13 @@ prepare_planning_workspace, publish_planning_workspace, ) -from continuous_refactoring.prompts import compose_review_perform_prompt +from continuous_refactoring.prompts import compose_migration_review_prompt __all__ = [ "StagedReviewRequest", "handle_review", "handle_review_list", "handle_staged_migration_review", - "perform_staged_migration_review", ] _REVIEW_USAGE = "Usage: continuous-refactoring review {list}" @@ -127,7 +126,7 @@ def handle_staged_migration_review( request: StagedReviewRequest, ) -> PlanningPublishResult: try: - return perform_staged_migration_review(request) + return _run_staged_migration_review(request) except _ReviewCliError as error: print(f"Error: {error}", file=sys.stderr) raise SystemExit(error.exit_code) from error @@ -142,7 +141,7 @@ def handle_staged_migration_review( raise SystemExit(1) from error -def 
perform_staged_migration_review( +def _run_staged_migration_review( request: StagedReviewRequest, ) -> PlanningPublishResult: manifest_path = request.target.path / "manifest.json" @@ -177,7 +176,7 @@ def perform_staged_migration_review( ) from error phase = resolve_current_phase(manifest) if manifest.current_phase else None - prompt = compose_review_perform_prompt( + prompt = compose_migration_review_prompt( request.target.slug, request.repo_root, workspace.root, diff --git a/tests/test_cli_migrations.py b/tests/test_cli_migrations.py index ec44201..50c37bc 100644 --- a/tests/test_cli_migrations.py +++ b/tests/test_cli_migrations.py @@ -84,6 +84,64 @@ def test_migration_parser_accepts_list_and_doctor() -> None: assert not hasattr(review_args, "effort") +def test_run_help_describes_routed_actions( + capsys: pytest.CaptureFixture[str], +) -> None: + parser = build_parser() + + top_help = _help_text(parser, ["--help"], capsys) + run_once_help = _help_text(parser, ["run-once", "--help"], capsys) + run_help = _help_text(parser, ["run", "--help"], capsys) + + assert "Run one routed refactoring action without fix retry." in top_help + assert "Run one routed refactoring action without fix retry." in run_once_help + assert "Run routed refactoring actions with fix-prompt retry." in top_help + assert "Run routed refactoring actions with fix-prompt retry." in run_help + assert "Actions to run." in run_help + assert "eligible live migrations until done" in run_help + assert "deferred" in run_help + assert "blocked" in run_help + assert "one agent call" not in top_help + assert "Refactor actions to run." 
not in run_help + + +def test_migration_help_describes_review_and_refine( + capsys: pytest.CaptureFixture[str], +) -> None: + parser = build_parser() + + top_help = _help_text(parser, ["--help"], capsys) + migration_help = _help_text(parser, ["migration", "--help"], capsys) + review_help = _help_text(parser, ["migration", "review", "--help"], capsys) + refine_help = _help_text(parser, ["migration", "refine", "--help"], capsys) + + assert "Inspect and manage live migrations." in top_help + assert "Inspect and manage live migrations." in migration_help + assert "Resolve a migration awaiting human review" in migration_help + assert "staged workspace" in migration_help + assert "Resolve a migration awaiting human review" in review_help + assert "staged workspace" in review_help + assert "Requires --with and --model." in review_help + assert "Apply feedback to a planning or unexecuted ready" in migration_help + assert "Apply feedback to a planning or unexecuted ready" in refine_help + assert "--effort" not in review_help + assert "--effort" not in refine_help + assert "Perform staged " + "review" not in migration_help + assert "flagged " + "migration" not in migration_help + + +def test_taste_help_describes_scope_and_agent_backed_modes( + capsys: pytest.CaptureFixture[str], +) -> None: + parser = build_parser() + + taste_help = _help_text(parser, ["taste", "--help"], capsys) + + assert "Manage project or global taste files." in taste_help + assert "Agent-backed modes require --with and --model." 
in taste_help + assert "--effort" not in taste_help + + def test_migration_parser_accepts_doctor_all() -> None: parser = build_parser() @@ -140,6 +198,17 @@ def _canonical_migration_commands(readme: str) -> tuple[str, ...]: ) +def _help_text( + parser: argparse.ArgumentParser, + argv: list[str], + capsys: pytest.CaptureFixture[str], +) -> str: + with pytest.raises(SystemExit) as exit_info: + parser.parse_args(argv) + assert exit_info.value.code == 0 + return " ".join(capsys.readouterr().out.split()) + + def _argv_from_documented_command(command: str) -> list[str]: parts = shlex.split(command) if parts[0] != "continuous-refactoring": @@ -518,7 +587,7 @@ def test_migration_review_rejects_missing_migration_without_refine_suggestion( capsys: pytest.CaptureFixture[str], ) -> None: _repo, live_dir = _init_migration_project(tmp_path, monkeypatch) - _write_migration(live_dir, "not-flagged") + _write_migration(live_dir, "not-awaiting-review") with pytest.raises(SystemExit) as missing_exit: handle_migration_review(_review_args("missing")) @@ -535,16 +604,16 @@ def test_migration_review_rejects_refine_eligible_not_awaiting_review_with_refin capsys: pytest.CaptureFixture[str], ) -> None: _repo, live_dir = _init_migration_project(tmp_path, monkeypatch) - _write_migration(live_dir, "not-flagged") + _write_migration(live_dir, "not-awaiting-review") - with pytest.raises(SystemExit) as not_flagged_exit: - handle_migration_review(_review_args("not-flagged")) + with pytest.raises(SystemExit) as not_awaiting_review_exit: + handle_migration_review(_review_args("not-awaiting-review")) - assert not_flagged_exit.value.code == 2 + assert not_awaiting_review_exit.value.code == 2 err = capsys.readouterr().err assert "not awaiting human review" in err assert "continuous-refactoring migration list --awaiting-review" in err - assert "continuous-refactoring migration refine not-flagged" in err + assert "continuous-refactoring migration refine not-awaiting-review" in err def 
test_migration_review_rejects_non_refine_eligible_not_awaiting_review_with_guarded_refine_hint( @@ -559,10 +628,10 @@ def test_migration_review_rejects_non_refine_eligible_not_awaiting_review_with_g phases=(replace(_PHASE, done=True),), ) - with pytest.raises(SystemExit) as not_flagged_exit: + with pytest.raises(SystemExit) as not_awaiting_review_exit: handle_migration_review(_review_args("phase-done")) - assert not_flagged_exit.value.code == 2 + assert not_awaiting_review_exit.value.code == 2 err = capsys.readouterr().err assert "not awaiting human review" in err assert "continuous-refactoring migration list --awaiting-review" in err diff --git a/tests/test_cli_review.py b/tests/test_cli_review.py index d3e6660..36342cd 100644 --- a/tests/test_cli_review.py +++ b/tests/test_cli_review.py @@ -29,7 +29,7 @@ ) -def test_review_parser_accepts_list_and_rejects_perform_subcommand() -> None: +def test_review_parser_accepts_list_and_rejects_unknown_subcommands() -> None: parser = build_parser() review_args = parser.parse_args(["review"]) @@ -45,6 +45,26 @@ def test_review_parser_accepts_list_and_rejects_perform_subcommand() -> None: assert perform_exit.value.code == 2 +def test_review_help_describes_compatibility_listing( + capsys: pytest.CaptureFixture[str], +) -> None: + parser = build_parser() + + top_help = _help_text(parser, ["--help"], capsys) + review_help = _help_text(parser, ["review", "--help"], capsys) + review_list_help = _help_text(parser, ["review", "list", "--help"], capsys) + + assert "Compatibility shortcut" in top_help + assert "migrations awaiting human review" in top_help + assert "Compatibility listing shortcut" in review_help + assert "Use `migration review` for canonical mutation." in review_help + assert "List migrations awaiting human review." in review_help + assert "List migrations awaiting human review." 
in review_list_help + stale_review_phrase = "flagged " + "for review" + assert stale_review_phrase not in top_help + assert stale_review_phrase not in review_help + + def test_parser_binds_handlers_for_top_level_commands() -> None: parser = build_parser() @@ -90,6 +110,17 @@ def _init_repo(path: Path) -> None: ) +def _help_text( + parser: argparse.ArgumentParser, + argv: list[str], + capsys: pytest.CaptureFixture[str], +) -> str: + with pytest.raises(SystemExit) as exit_info: + parser.parse_args(argv) + assert exit_info.value.code == 0 + return " ".join(capsys.readouterr().out.split()) + + def _init_review_project( tmp_path: Path, monkeypatch: pytest.MonkeyPatch, ) -> tuple[Path, Path]: @@ -137,7 +168,7 @@ def _make_manifest( ) -def test_review_list_filters_flagged_migrations( +def test_review_list_filters_migrations_awaiting_human_review( tmp_path: Path, monkeypatch: pytest.MonkeyPatch, capsys: pytest.CaptureFixture[str], diff --git a/tests/test_continuous_refactoring.py b/tests/test_continuous_refactoring.py index 005e7d4..085bfde 100644 --- a/tests/test_continuous_refactoring.py +++ b/tests/test_continuous_refactoring.py @@ -123,7 +123,7 @@ "compose_phase_execution_prompt", "compose_phase_ready_prompt", "compose_planning_prompt", - "compose_review_perform_prompt", + "compose_migration_review_prompt", "compose_scope_selection_prompt", "compose_taste_refine_prompt", "compose_taste_upgrade_prompt", diff --git a/tests/test_prompts.py b/tests/test_prompts.py index 8e9dc05..0e8d32f 100644 --- a/tests/test_prompts.py +++ b/tests/test_prompts.py @@ -24,19 +24,19 @@ DEFAULT_REFACTORING_PROMPT, PHASE_EXECUTION_PROMPT, PHASE_READY_CHECK_PROMPT, + MIGRATION_REVIEW_PROMPT, PLANNING_APPROACHES_PROMPT, PLANNING_EXPAND_PROMPT, PLANNING_FINAL_REVIEW_PROMPT, PLANNING_PICK_BEST_PROMPT, PLANNING_REVIEW_PROMPT, - REVIEW_PERFORM_PROMPT, compose_full_prompt, compose_classifier_prompt, compose_interview_prompt, + compose_migration_review_prompt, compose_phase_execution_prompt, 
compose_phase_ready_prompt, compose_planning_prompt, - compose_review_perform_prompt, compose_taste_refine_prompt, compose_taste_upgrade_prompt, ) @@ -75,7 +75,7 @@ PLANNING_FINAL_REVIEW_PROMPT, PHASE_READY_CHECK_PROMPT, PHASE_EXECUTION_PROMPT, - REVIEW_PERFORM_PROMPT, + MIGRATION_REVIEW_PROMPT, ) @@ -262,7 +262,7 @@ def test_review_prompt_names_work_dir_and_forbids_live_dir_mutation() -> None: work_dir = Path("/xdg/projects/p/planning/auth-cleanup/review-1/work/auth-cleanup") live_dir = Path("/repo/migrations/auth-cleanup") - result = compose_review_perform_prompt( + result = compose_migration_review_prompt( "auth-cleanup", repo_root, work_dir, @@ -327,7 +327,7 @@ def test_review_and_refine_prompts_forbid_live_dir_mutation(tmp_path: Path) -> N repo_root = tmp_path / "repo" review_work_dir = tmp_path / "xdg" / "planning" / "auth-cleanup" / "review" / "work" live_mig_root = repo_root / "migrations" / "auth-cleanup" - review_prompt = compose_review_perform_prompt( + review_prompt = compose_migration_review_prompt( "auth-cleanup", repo_root, review_work_dir, From b57dee520ab4a43147a61325375b9035ad40790c Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Thu, 21 May 2026 14:07:29 -0700 Subject: [PATCH 05/41] remove the legacy command --- AGENTS.md | 5 +- README.md | 2 - src/continuous_refactoring/cli.py | 20 -- src/continuous_refactoring/review_cli.py | 77 ----- tests/test_cli_migrations.py | 10 + tests/test_cli_review.py | 367 ----------------------- tests/test_cli_taste_warning.py | 4 +- 7 files changed, 14 insertions(+), 471 deletions(-) delete mode 100644 tests/test_cli_review.py diff --git a/AGENTS.md b/AGENTS.md index a62cf41..efc255a 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -44,7 +44,6 @@ Treat `AGENTS.md` as part of the codebase's invariants, not documentation. 
A dri `continuous-refactoring migration doctor --all` - Review migrations: `continuous-refactoring migration review --with codex|claude --model ` - (`review list` remains a compatibility shortcut for awaiting-review listing) - Refine migration planning: `continuous-refactoring migration refine (--message |--file ) --with codex|claude --model [--show-agent-logs]` @@ -127,14 +126,14 @@ active phase explicitly names `loop.py` in scope. - **Driver owns commits** (`refactor_attempts.py:_finalize_commit()`, called from `loop.py`) — if an agent commits mid-attempt, driver does `git reset --soft head_before` and re-commits with its own message. - **Migration scheduling split** (`migrations.py`, `loop.py`, `phases.py`) — `last_touch` is activity bookkeeping, not the 6-hour retry gate. Deferred/blocked migrations set `cooldown_until`; successful phase completion clears deferral markers so the next ready phase can run immediately. - **Migration tick deferral writes** (`migration_tick.py`) — ready-check deferrals are queued while scanning candidates and saved only when the tick finds no executable phase or blocks for human review. Do not save a deferred manifest before checking later candidates; that dirties the worktree and can make ready-checks reject runnable phases. -- **Migration visibility + consistency gate** (`migration_consistency.py`, `migration_tick.py`, `loop.py`, `review_cli.py`) — candidate scans use `iter_visible_migration_dirs()` so hidden/dotted/internal/symlink dirs are invisible to tick/review list. Before ready-check, `execution-gate` consistency errors block phase execution; `info`/`warning` never block. +- **Migration visibility + consistency gate** (`migration_consistency.py`, `migration_tick.py`, `loop.py`, `review_cli.py`) — candidate scans use `iter_visible_migration_dirs()` so hidden/dotted/internal/symlink dirs are invisible to tick/list/review commands. 
Before ready-check, `execution-gate` consistency errors block phase execution; `info`/`warning` never block. - **Manifest codec boundary** (`migration_manifest_codec.py`, `migrations.py`) — codec owns legacy `ready_when`, legacy integer `current_phase`, duplicate phase-name rejection, and saved JSON formatting. `load_manifest()` / `save_manifest()` own filesystem and JSON boundary errors. - **Planning state codec boundary** (`planning_state.py`, `planning.py`) — `.planning/state.json` is valid only when completed steps replay through the branching planning graph to `next_step`; recorded outputs must be repo-relative files inside the migration directory. User refinement feedback is durable state, and append-only `revision_base_step_counts` anchors let unexecuted ready migrations reuse `revise` after terminal ready decisions; legacy `revision_base_step_count` decodes as one anchor. Persist accepted step stdout after the step is validated; do not add durable fields for failed current-step output. - **Planning publish transaction** (`planning_publish.py`) — publish copies the complete workspace snapshot to `__transactions__//staged`, validates it, checks same-device and `base_snapshot_id`, moves live to `rollback`, moves staged live, validates live, then deletes rollback. On post-rollback failure, move bad live to `failed` before restoring rollback. Transaction directories are invisible to scheduling/list candidates but visible to `migration doctor --all`. Do not bypass the lock or dirty-live check. - **One-step planning engine** (`planning.py`) — product planning entry points call `run_next_planning_step()` so one action runs exactly `PlanningState.next_step`, records accepted stdout/state in an off-live workspace, and publishes through `planning_publish.py`. Failed current-step output is never durable resume input. `run_planning` is intentionally not package-exported. 
- **Planning resume scheduling** (`migration_tick.py`, `loop.py`, `routing_pipeline.py`) — normal automation runs one eligible `status: planning` step before ready/in-progress phase ticks and before source-target routing. Missing or invalid `.planning/state.json` blocks automation with planning failure evidence; `status: planning` must never enter phase ready-check or phase execution. - **Focused planning reselection** (`loop.py`, `migration_tick.py`) — focused mode tracks planning migrations abandoned with `new-target` only in memory for the current run, skips them while another planning or phase candidate is eligible, and retries them only when no alternative remains. Do not persist this as `cooldown_until`; planning step failure is not a durable readiness deferral. -- **Review CLI boundary** (`cli.py`, `review_cli.py`) — `cli.py` owns parser wiring; staged migration review internals live in `review_cli.py`, publish only through `planning_publish.py`, and stay internal/out of package-root `_SUBMODULES`. Top-level `review list` is only a compatibility listing shortcut; canonical review mutation is `migration review`. +- **Review CLI boundary** (`cli.py`, `review_cli.py`) — `cli.py` owns parser wiring; staged migration review internals live in `review_cli.py`, publish only through `planning_publish.py`, and stay internal/out of package-root `_SUBMODULES`. Review mutation is only exposed through `migration review`. - **Migration CLI boundary** (`cli.py`, `migration_cli.py`) — `cli.py` owns parser wiring only; `migration_cli.py` owns namespace dispatch, read-only list/doctor behavior, and the contained slug/path resolver used by mutation commands. Mutating subcommands delegate their internals to focused modules such as `review_cli.py` or the planning refine entry point. Resolver targets must stay direct visible children of the configured live migrations root and reject symlink, outside, parent-traversal, and ambiguous paths. 
- **Human-review gating** (`planning.py`, `migration_tick.py`, `review_cli.py`) — migrations with `awaiting_human_review=true` must be invisible to automated migration ticks/ready-checks until canonical `migration review` clears the flag through staged publish. `migration refine` may reopen an unexecuted ready migration to planning, but it is user feedback, not review approval. - **Migration terminology split** (`migrations.py`, `planning.py`, `prompts.py`) — manifest `precondition` gates phase start; phase markdown `## Definition of Done` governs completion. diff --git a/README.md b/README.md index 0a7b461..a4b6e52 100644 --- a/README.md +++ b/README.md @@ -140,8 +140,6 @@ always run at fixed `medium` effort. | `migration review ` | Starts staged review for a migration awaiting human review. Requires `--with` and `--model`; review runs at fixed internal `high` effort. | | `migration refine ` | Records feedback for a planning or unexecuted ready migration and runs one staged planning revision. Requires `--message ` or `--file `, plus `--with` and `--model`; refine runs at fixed internal `high` effort. Add `--show-agent-logs` to mirror the planning agent. | -Legacy `review list` remains a compatibility shortcut for `migration list --awaiting-review`. 
- ## Targeting / Useful flags ### Target selection diff --git a/src/continuous_refactoring/cli.py b/src/continuous_refactoring/cli.py index b73c7ea..1f8a130 100644 --- a/src/continuous_refactoring/cli.py +++ b/src/continuous_refactoring/cli.py @@ -32,7 +32,6 @@ ) from continuous_refactoring.migration_cli import handle_migration from continuous_refactoring.migrations import MIGRATION_STATUSES -from continuous_refactoring.review_cli import handle_review _PACKAGE_DISTRIBUTION = "continuous-refactoring" _TASTE_WARNING = "warning: taste out of date — run `continuous-refactoring taste --upgrade`" @@ -261,24 +260,6 @@ def _add_run_parser(subparsers: argparse._SubParsersAction) -> None: ) -def _add_review_parser(subparsers: argparse._SubParsersAction) -> None: - review_parser = subparsers.add_parser( - "review", - help="Compatibility shortcut for migrations awaiting human review.", - description=( - "Compatibility listing shortcut for migrations awaiting human " - "review. Use `migration review` for canonical mutation." 
- ), - ) - review_parser.set_defaults(handler=handle_review) - review_sub = review_parser.add_subparsers(dest="review_command") - review_sub.add_parser( - "list", - help="List migrations awaiting human review.", - description="List migrations awaiting human review.", - ) - - def _add_migration_parser(subparsers: argparse._SubParsersAction) -> None: migration_parser = subparsers.add_parser( "migration", @@ -380,7 +361,6 @@ def build_parser() -> argparse.ArgumentParser: ) upgrade_parser.set_defaults(handler=_handle_upgrade) _add_migration_parser(subparsers) - _add_review_parser(subparsers) return parser diff --git a/src/continuous_refactoring/review_cli.py b/src/continuous_refactoring/review_cli.py index 42b6418..7ded76f 100644 --- a/src/continuous_refactoring/review_cli.py +++ b/src/continuous_refactoring/review_cli.py @@ -1,6 +1,5 @@ from __future__ import annotations -import argparse import shutil import sys import uuid @@ -9,20 +8,14 @@ from continuous_refactoring.agent import run_agent_interactive from continuous_refactoring.artifacts import ContinuousRefactorError -from continuous_refactoring.config import ( - resolve_live_migrations_dir, - resolve_project, -) from continuous_refactoring.migration_cli import MigrationTarget from continuous_refactoring.migration_consistency import ( check_migration_consistency, has_blocking_consistency_findings, - iter_visible_migration_dirs, ) from continuous_refactoring.migrations import ( MigrationManifest, load_manifest as load_migration_manifest, - phase_file_reference, resolve_current_phase, ) from continuous_refactoring.planning_publish import ( @@ -37,20 +30,9 @@ __all__ = [ "StagedReviewRequest", - "handle_review", - "handle_review_list", "handle_staged_migration_review", ] -_REVIEW_USAGE = "Usage: continuous-refactoring review {list}" - - -@dataclass(frozen=True) -class _ReviewCliContext: - repo_root: Path - live_dir: Path - project_state_dir: Path - @dataclass(frozen=True) class StagedReviewRequest: @@ -70,58 +52,6 @@ 
def __init__(self, message: str, exit_code: int) -> None: super().__init__(message) -def _resolve_review_context(*, error_code: int) -> _ReviewCliContext: - try: - project = resolve_project(Path.cwd().resolve()) - except ContinuousRefactorError: - print( - "Error: project not initialized; no live-migrations-dir available.", - file=sys.stderr, - ) - raise SystemExit(error_code) - try: - live_dir = resolve_live_migrations_dir(project) - except ContinuousRefactorError as error: - print(f"Error: {error}", file=sys.stderr) - raise SystemExit(error_code) - if live_dir is None: - print( - "Error: no live-migrations-dir configured for this project.", - file=sys.stderr, - ) - raise SystemExit(error_code) - - return _ReviewCliContext( - repo_root=Path(project.entry.path).resolve(), - live_dir=live_dir, - project_state_dir=project.project_dir, - ) - - -def handle_review_list() -> None: - context = _resolve_review_context(error_code=1) - live_dir = context.live_dir - - if not live_dir.is_dir(): - return - - for child in iter_visible_migration_dirs(live_dir): - manifest_file = child / "manifest.json" - if not manifest_file.exists(): - continue - manifest = load_migration_manifest(manifest_file) - if manifest.awaiting_human_review: - reason = manifest.human_review_reason or "(no reason recorded)" - phase = resolve_current_phase(manifest) if manifest.current_phase else None - phase_file = phase_file_reference(phase) if phase is not None else "(none)" - phase_name = phase.name if phase is not None else "(none)" - print( - f"{manifest.name}\t{manifest.status}\t" - f"{phase_file}\t{phase_name}\t{manifest.last_touch}\t" - f"{reason}" - ) - - def handle_staged_migration_review( request: StagedReviewRequest, ) -> PlanningPublishResult: @@ -280,10 +210,3 @@ def _review_publish_error_message(error: PlanningPublishError, slug: str) -> str f"Run `continuous-refactoring migration doctor {slug}` if unsure, then " f"rerun `continuous-refactoring migration review {slug} ...`." 
) - - -def handle_review(args: argparse.Namespace) -> None: - if args.review_command == "list": - return handle_review_list() - print(_REVIEW_USAGE, file=sys.stderr) - raise SystemExit(2) diff --git a/tests/test_cli_migrations.py b/tests/test_cli_migrations.py index 50c37bc..cc390be 100644 --- a/tests/test_cli_migrations.py +++ b/tests/test_cli_migrations.py @@ -84,6 +84,15 @@ def test_migration_parser_accepts_list_and_doctor() -> None: assert not hasattr(review_args, "effort") +def test_top_level_review_command_is_not_registered() -> None: + parser = build_parser() + + with pytest.raises(SystemExit) as exit_info: + parser.parse_args(["review"]) + + assert exit_info.value.code == 2 + + def test_run_help_describes_routed_actions( capsys: pytest.CaptureFixture[str], ) -> None: @@ -126,6 +135,7 @@ def test_migration_help_describes_review_and_refine( assert "Apply feedback to a planning or unexecuted ready" in refine_help assert "--effort" not in review_help assert "--effort" not in refine_help + assert "Compatibility shortcut" not in top_help assert "Perform staged " + "review" not in migration_help assert "flagged " + "migration" not in migration_help diff --git a/tests/test_cli_review.py b/tests/test_cli_review.py deleted file mode 100644 index 36342cd..0000000 --- a/tests/test_cli_review.py +++ /dev/null @@ -1,367 +0,0 @@ -from __future__ import annotations - -import argparse -import subprocess -from pathlib import Path - -import pytest - -from continuous_refactoring.cli import build_parser -from continuous_refactoring.config import register_project, set_live_migrations_dir -from continuous_refactoring.migrations import ( - MigrationManifest, - PhaseSpec, - save_manifest as save_migration, -) -from continuous_refactoring.review_cli import ( - handle_review, - handle_review_list, -) - -_PHASES = ( - PhaseSpec(name="setup", file="phase-1-setup.md", done=True, precondition="always"), - PhaseSpec( - name="review-target", - file="phase-2-review-target.md", - done=False, 
- precondition="setup complete", - ), -) - - -def test_review_parser_accepts_list_and_rejects_unknown_subcommands() -> None: - parser = build_parser() - - review_args = parser.parse_args(["review"]) - assert review_args.command == "review" - assert review_args.review_command is None - - list_args = parser.parse_args(["review", "list"]) - assert list_args.command == "review" - assert list_args.review_command == "list" - - with pytest.raises(SystemExit) as perform_exit: - parser.parse_args(["review", "perform", "my-mig"]) - assert perform_exit.value.code == 2 - - -def test_review_help_describes_compatibility_listing( - capsys: pytest.CaptureFixture[str], -) -> None: - parser = build_parser() - - top_help = _help_text(parser, ["--help"], capsys) - review_help = _help_text(parser, ["review", "--help"], capsys) - review_list_help = _help_text(parser, ["review", "list", "--help"], capsys) - - assert "Compatibility shortcut" in top_help - assert "migrations awaiting human review" in top_help - assert "Compatibility listing shortcut" in review_help - assert "Use `migration review` for canonical mutation." in review_help - assert "List migrations awaiting human review." in review_help - assert "List migrations awaiting human review." 
in review_list_help - stale_review_phrase = "flagged " + "for review" - assert stale_review_phrase not in top_help - assert stale_review_phrase not in review_help - - -def test_parser_binds_handlers_for_top_level_commands() -> None: - parser = build_parser() - - assert parser.parse_args(["init"]).handler.__name__ == "_handle_init" - assert parser.parse_args(["taste"]).handler.__name__ == "_handle_taste" - assert parser.parse_args(["upgrade"]).handler.__name__ == "_handle_upgrade" - assert parser.parse_args(["review"]).handler.__name__ == "handle_review" - assert parser.parse_args( - ["run-once", "--with", "codex", "--model", "m", "--scope-instruction", "s"] - ).handler.__name__ == "_handle_run_once" - assert parser.parse_args( - [ - "run", - "--with", - "codex", - "--model", - "m", - "--scope-instruction", - "s", - "--max-refactors", - "1", - ] - ).handler.__name__ == "_handle_run" - - -def _init_repo(path: Path) -> None: - path.mkdir(parents=True, exist_ok=True) - subprocess.run(["git", "init"], cwd=path, check=True, capture_output=True) - subprocess.run( - ["git", "config", "user.email", "test@example.com"], - cwd=path, check=True, capture_output=True, - ) - subprocess.run( - ["git", "config", "user.name", "Test User"], - cwd=path, check=True, capture_output=True, - ) - (path / "README.md").write_text("seed\n", encoding="utf-8") - subprocess.run( - ["git", "add", "README.md"], cwd=path, check=True, capture_output=True, - ) - subprocess.run( - ["git", "commit", "-m", "init"], cwd=path, check=True, capture_output=True, - ) - - -def _help_text( - parser: argparse.ArgumentParser, - argv: list[str], - capsys: pytest.CaptureFixture[str], -) -> str: - with pytest.raises(SystemExit) as exit_info: - parser.parse_args(argv) - assert exit_info.value.code == 0 - return " ".join(capsys.readouterr().out.split()) - - -def _init_review_project( - tmp_path: Path, monkeypatch: pytest.MonkeyPatch, -) -> tuple[Path, Path]: - monkeypatch.setenv("XDG_DATA_HOME", str(tmp_path / "xdg")) 
- repo = tmp_path / "project" - _init_repo(repo) - monkeypatch.chdir(repo) - - project = register_project(repo) - live_dir = repo / ".migrations" - live_dir.mkdir() - set_live_migrations_dir(project.entry.uuid, ".migrations") - return repo, live_dir - - -def _init_unconfigured_review_repo( - tmp_path: Path, monkeypatch: pytest.MonkeyPatch, -) -> Path: - monkeypatch.setenv("XDG_DATA_HOME", str(tmp_path / "xdg")) - repo = tmp_path / "project" - _init_repo(repo) - monkeypatch.chdir(repo) - return repo - - -def _make_manifest( - name: str, - *, - awaiting_human_review: bool = False, - status: str = "ready", - current_phase: str = "review-target", - last_touch: str = "2025-01-01T00:00:00+00:00", - human_review_reason: str | None = None, -) -> MigrationManifest: - return MigrationManifest( - name=name, - created_at="2025-01-01T00:00:00+00:00", - last_touch=last_touch, - wake_up_on=None, - awaiting_human_review=awaiting_human_review, - status=status, - current_phase=current_phase, - phases=_PHASES, - human_review_reason=human_review_reason, - ) - - -def test_review_list_filters_migrations_awaiting_human_review( - tmp_path: Path, - monkeypatch: pytest.MonkeyPatch, - capsys: pytest.CaptureFixture[str], -) -> None: - _, live_dir = _init_review_project(tmp_path, monkeypatch) - - save_migration( - _make_manifest( - "listed-a", - awaiting_human_review=True, - status="in-progress", - current_phase="setup", - last_touch="2025-03-02T14:00:00+00:00", - ), - live_dir / "mig-b" / "manifest.json", - ) - save_migration( - _make_manifest( - "listed-without-phase", - awaiting_human_review=True, - status="ready", - current_phase="", - last_touch="2025-03-03T16:00:00+00:00", - human_review_reason="phase cursor cleared", - ), - live_dir / "mig-no-phase" / "manifest.json", - ) - save_migration( - _make_manifest("mig-c", awaiting_human_review=False, status="done"), - live_dir / "mig-c" / "manifest.json", - ) - save_migration( - _make_manifest( - "listed-z", - awaiting_human_review=True, - 
status="ready", - current_phase="review-target", - last_touch="2025-03-01T12:00:00+00:00", - human_review_reason="needs security audit", - ), - live_dir / "mig-a" / "manifest.json", - ) - - handle_review_list() - - out = capsys.readouterr().out - lines = [line for line in out.strip().splitlines() if line] - assert len(lines) == 3 - - fields_a = lines[0].split("\t") - assert fields_a == [ - "listed-z", - "ready", - "phase-2-review-target.md", - "review-target", - "2025-03-01T12:00:00+00:00", - "needs security audit", - ] - - fields_b = lines[1].split("\t") - assert fields_b == [ - "listed-a", - "in-progress", - "phase-1-setup.md", - "setup", - "2025-03-02T14:00:00+00:00", - "(no reason recorded)", - ] - - fields_no_phase = lines[2].split("\t") - assert fields_no_phase == [ - "listed-without-phase", - "ready", - "(none)", - "(none)", - "2025-03-03T16:00:00+00:00", - "phase cursor cleared", - ] - - -def test_review_list_ignores_hidden_and_transaction_dirs( - tmp_path: Path, - monkeypatch: pytest.MonkeyPatch, - capsys: pytest.CaptureFixture[str], -) -> None: - _, live_dir = _init_review_project(tmp_path, monkeypatch) - save_migration( - _make_manifest( - "visible-review", - awaiting_human_review=True, - human_review_reason="visible", - ), - live_dir / "visible-review" / "manifest.json", - ) - save_migration( - _make_manifest( - "hidden-review", - awaiting_human_review=True, - human_review_reason="hidden", - ), - live_dir / ".hidden-review" / "manifest.json", - ) - save_migration( - _make_manifest( - "transaction-review", - awaiting_human_review=True, - human_review_reason="transaction", - ), - live_dir / "__transactions__" / "manifest.json", - ) - - handle_review_list() - - out = capsys.readouterr().out - assert "visible-review\tready" in out - assert "hidden-review" not in out - assert "transaction-review" not in out - - -@pytest.mark.parametrize( - ("handler", "error_code", "setup", "expected_message"), - [ - ( - handle_review_list, - 1, - lambda tmp_path, 
monkeypatch: _init_unconfigured_review_repo( - tmp_path, monkeypatch, - ), - "project not initialized", - ), - ( - handle_review_list, - 1, - lambda tmp_path, monkeypatch: register_project( - _init_unconfigured_review_repo(tmp_path, monkeypatch), - ), - "live-migrations-dir", - ), - ( - handle_review_list, - 1, - lambda tmp_path, monkeypatch: _configure_escaped_live_dir( - _init_unconfigured_review_repo(tmp_path, monkeypatch), - ), - "escapes repo", - ), - ], -) -def test_review_commands_surface_shared_context_errors( - tmp_path: Path, - monkeypatch: pytest.MonkeyPatch, - capsys: pytest.CaptureFixture[str], - handler: object, - error_code: int, - setup: object, - expected_message: str, -) -> None: - setup(tmp_path, monkeypatch) - - with pytest.raises(SystemExit) as exc_info: - handler() - - assert exc_info.value.code == error_code - err = capsys.readouterr().err - assert expected_message in err - - -def _configure_escaped_live_dir(repo: Path) -> None: - project = register_project(repo) - set_live_migrations_dir(project.entry.uuid, "../elsewhere") - - -def test_review_dispatches_list_subcommand( - monkeypatch: pytest.MonkeyPatch, -) -> None: - seen: list[str] = [] - - monkeypatch.setattr( - "continuous_refactoring.review_cli.handle_review_list", - lambda: seen.append("list"), - ) - - handle_review(argparse.Namespace(review_command="list")) - - assert seen == ["list"] - - -def test_review_exits_2_without_subcommand( - capsys: pytest.CaptureFixture[str], -) -> None: - with pytest.raises(SystemExit) as exc_info: - handle_review(argparse.Namespace(review_command=None)) - - assert exc_info.value.code == 2 - err = capsys.readouterr().err - assert "Usage: continuous-refactoring review {list}" in err diff --git a/tests/test_cli_taste_warning.py b/tests/test_cli_taste_warning.py index e14986a..458c14e 100644 --- a/tests/test_cli_taste_warning.py +++ b/tests/test_cli_taste_warning.py @@ -52,7 +52,7 @@ def _register_repo_with_taste( (["cr", "init"], "_handle_init"), (["cr", 
"taste", "--global"], "_handle_taste"), (["cr", "upgrade"], "_handle_upgrade"), - (["cr", "review", "list"], "handle_review"), + (["cr", "migration", "list"], "handle_migration"), ( [ "cr", "run-once", @@ -104,7 +104,7 @@ def xdg_root( @pytest.mark.parametrize( "argv,handler_name", _SUBCOMMANDS, - ids=["init", "taste", "upgrade", "review", "run-once", "run"], + ids=["init", "taste", "upgrade", "migration", "run-once", "run"], ) @pytest.mark.parametrize( "taste_writer,warns", From a08273a13cf86d4557b768f8402b396992301477 Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Thu, 21 May 2026 14:10:09 -0700 Subject: [PATCH 06/41] continuous refactor: plan random-files-20260521T140910 Why: planning.approaches accepted; next step: pick-best --- .../.planning/stages/approaches.stdout.md | 7 +++ .../.planning/state.json | 25 ++++++++++ .../approaches/effort-engine-consolidation.md | 48 ++++++++++++++++++ .../approaches/interface-first-hardening.md | 49 +++++++++++++++++++ .../approaches/test-first-boundary-pruning.md | 47 ++++++++++++++++++ .../manifest.json | 12 +++++ 6 files changed, 188 insertions(+) create mode 100644 migrations/random-files-20260521T140910/.planning/stages/approaches.stdout.md create mode 100644 migrations/random-files-20260521T140910/.planning/state.json create mode 100644 migrations/random-files-20260521T140910/approaches/effort-engine-consolidation.md create mode 100644 migrations/random-files-20260521T140910/approaches/interface-first-hardening.md create mode 100644 migrations/random-files-20260521T140910/approaches/test-first-boundary-pruning.md create mode 100644 migrations/random-files-20260521T140910/manifest.json diff --git a/migrations/random-files-20260521T140910/.planning/stages/approaches.stdout.md b/migrations/random-files-20260521T140910/.planning/stages/approaches.stdout.md new file mode 100644 index 0000000..7527bb8 --- /dev/null +++ b/migrations/random-files-20260521T140910/.planning/stages/approaches.stdout.md @@ -0,0 +1,7 @@ 
+Created 3 approach files in the staged migration workspace: + +- `/Users/hiren/.local/share/continuous-refactoring/projects/dfdec7e7-73c4-45c3-9786-693784acfd83/planning/random-files-20260521T140910/20260521T140750-956576-5c67127a69fe45e682e56b790053ca95/work/random-files-20260521T140910/approaches/interface-first-hardening.md` +- `/Users/hiren/.local/share/continuous-refactoring/projects/dfdec7e7-73c4-45c3-9786-693784acfd83/planning/random-files-20260521T140910/20260521T140750-956576-5c67127a69fe45e682e56b790053ca95/work/random-files-20260521T140910/approaches/effort-engine-consolidation.md` +- `/Users/hiren/.local/share/continuous-refactoring/projects/dfdec7e7-73c4-45c3-9786-693784acfd83/planning/random-files-20260521T140910/20260521T140750-956576-5c67127a69fe45e682e56b790053ca95/work/random-files-20260521T140910/approaches/test-first-boundary-pruning.md` + +Each includes strategy, tradeoffs, phased plan with `required_effort`, and risk profile, aligned to the injected taste and targeted files. 
diff --git a/migrations/random-files-20260521T140910/.planning/state.json b/migrations/random-files-20260521T140910/.planning/state.json new file mode 100644 index 0000000..26237b3 --- /dev/null +++ b/migrations/random-files-20260521T140910/.planning/state.json @@ -0,0 +1,25 @@ +{ + "completed_steps": [ + { + "agent": "codex", + "completed_at": "2026-05-21T14:10:09.546-07:00", + "effort": "low", + "model": "gpt-5.3-codex", + "name": "approaches", + "outcome": "completed", + "outputs": { + "stdout": "migrations/random-files-20260521T140910/.planning/stages/approaches.stdout.md" + } + } + ], + "feedback": [], + "final_decision": null, + "final_reason": null, + "next_step": "pick-best", + "review_findings": null, + "revision_base_step_counts": [], + "schema_version": 1, + "started_at": "2026-05-21T14:09:10.762-07:00", + "target": "random files", + "updated_at": "2026-05-21T14:10:09.546-07:00" +} diff --git a/migrations/random-files-20260521T140910/approaches/effort-engine-consolidation.md b/migrations/random-files-20260521T140910/approaches/effort-engine-consolidation.md new file mode 100644 index 0000000..73431de --- /dev/null +++ b/migrations/random-files-20260521T140910/approaches/effort-engine-consolidation.md @@ -0,0 +1,48 @@ +# Effort Engine Consolidation + +## Strategy +Make `effort.py` the clear single source of truth for effort resolution rules, then prune duplicate intent in callers/tests. Drive refactor from pure-function invariants. + +## Why this path +- Best when current pain is cognitive load around effort tiers/capping and phase-required behavior. +- Aligns with taste: small abstractions, strong boundaries, delete stale paths. + +## Tradeoffs +- Pros: Cleaner model, easier future migration/phase scheduling changes. +- Cons: Medium chance of subtle behavioral drift if invariants are incomplete. 
+ +## Estimated phases + +### Phase 1: Encode invariants as tests +- Scope: `tests/test_loop_migration_tick.py` (+ adjacent effort/migration tests if needed) +- Work: + - Add matrix-style assertions for default/requested/required/capped combinations. + - Verify deferred phase behavior when `required_effort` exceeds run cap. +- required_effort: `medium` + +### Phase 2: Consolidate effort resolution code paths +- Scope: `src/continuous_refactoring/effort.py` +- Work: + - Unify resolution construction paths around one internal normalization flow. + - Keep exported API stable (`EffortBudget`, `EffortResolution`, helpers). + - Preserve module-boundary error translation style. +- required_effort: `medium` + +### Phase 3: Prune callsite complexity and dead checks +- Scope: `tests/test_prompts.py`, `src/continuous_refactoring/__main__.py` +- Work: + - Remove stale assertions/workarounds that duplicate `effort.py` guarantees. + - Keep only load-bearing contract tests. +- required_effort: `low` + +## Risk profile +- Overall: **Medium** +- Main risks: + - Regressing cap semantics in edge combinations. + - Over-pruning tests that guard behavior indirectly. +- Mitigations: + - Build exhaustive tier-order checks first. + - Keep API and error messages stable unless explicitly reviewed. + +## Best fit conditions +Pick this if maintainability of effort logic is the dominant objective. diff --git a/migrations/random-files-20260521T140910/approaches/interface-first-hardening.md b/migrations/random-files-20260521T140910/approaches/interface-first-hardening.md new file mode 100644 index 0000000..2bc8579 --- /dev/null +++ b/migrations/random-files-20260521T140910/approaches/interface-first-hardening.md @@ -0,0 +1,49 @@ +# Interface-First Hardening + +## Strategy +Stabilize and clarify externally visible behavior first, then tighten internals with tests guarding contracts. Focus on behavior that users feel: CLI effort semantics, prompt contract strings, and PR-title policy. 
+ +## Why this path +- Best when regression risk at boundaries is the primary concern. +- Aligns with taste: preserve compatibility for shipped interfaces and surface behavior changes explicitly for human review. + +## Tradeoffs +- Pros: Lowest risk of accidental CLI/workflow breakage; strong confidence from boundary tests. +- Cons: Some internal cleanup is deferred; may keep minor internal duplication for now. + +## Estimated phases + +### Phase 1: Lock boundary behavior with targeted tests +- Scope: `tests/test_prompts.py`, `tests/test_loop_migration_tick.py`, `.github/workflows/pr-title.yml` +- Work: + - Add/adjust outcome-based tests around effort-capped migration ticking and planning gating. + - Add prompt-contract assertions only where behavior is load-bearing (taste injection, staged/live dir constraints). + - Validate PR title regex edge cases with fixture-like checks in workflow script block (no contract change yet). +- required_effort: `low` + +### Phase 2: Refactor internals behind unchanged contracts +- Scope: `src/continuous_refactoring/effort.py`, `src/continuous_refactoring/__main__.py` +- Work: + - Remove tiny internal repetition in effort resolution using small pure helpers. + - Keep CLI-visible semantics identical (`low` default, `xhigh` cap, cap behavior). + - Keep `__main__` minimal; only touch if clarity gain is concrete. +- required_effort: `medium` + +### Phase 3: Optional boundary behavior adjustment (human review) +- Scope: `.github/workflows/pr-title.yml`, related tests/docs if needed +- Work: + - If changing title policy, explicitly name user-facing impact in review prompt and migration notes. + - Update examples/messages to match exact accepted syntax. +- required_effort: `high` + +## Risk profile +- Overall: **Low** +- Main risks: + - Hidden coupling in prompt text expectations causing brittle tests. + - Workflow regex changes silently rejecting valid PRs. +- Mitigations: + - Keep regex changes isolated and example-backed. 
+ - Prefer additive tests before edits. + +## Best fit conditions +Pick this if the goal is reliability and safe incremental cleanup under active usage. diff --git a/migrations/random-files-20260521T140910/approaches/test-first-boundary-pruning.md b/migrations/random-files-20260521T140910/approaches/test-first-boundary-pruning.md new file mode 100644 index 0000000..063e45b --- /dev/null +++ b/migrations/random-files-20260521T140910/approaches/test-first-boundary-pruning.md @@ -0,0 +1,47 @@ +# Test-First Boundary Pruning + +## Strategy +Start from brittle/high-noise tests and workflow checks, reduce assertion noise to behavior-centric coverage, then simplify production code only where tests prove redundancy. + +## Why this path +- Best when suite maintenance cost is rising and prompt/workflow assertions are noisy. +- Aligns with taste: outcome-focused testing, minimal mocks, remove dead/fallback structure. + +## Tradeoffs +- Pros: Faster future iteration, clearer failures, less incidental coupling to wording. +- Cons: Requires discipline to avoid deleting guards that protect true interface contracts. + +## Estimated phases + +### Phase 1: Classify tests by contract vs incidental text +- Scope: `tests/test_prompts.py`, `tests/test_loop_migration_tick.py` +- Work: + - Tag assertions as interface-critical or implementation-detail. + - Rewrite detail-coupled checks into outcome-focused checks. +- required_effort: `low` + +### Phase 2: Prune and tighten boundary checks +- Scope: `.github/workflows/pr-title.yml`, tests above +- Work: + - Keep PR-title rule strict but simplify validation messaging/tests for clarity. + - Ensure prompt tests verify required clauses without overfitting exact prose. +- required_effort: `medium` + +### Phase 3: Opportunistic code cleanup proven by tests +- Scope: `src/continuous_refactoring/effort.py`, `src/continuous_refactoring/__main__.py` +- Work: + - Delete tiny dead branches/helpers shown redundant by updated tests. 
+ - Keep module boundaries and public behavior unchanged. +- required_effort: `medium` + +## Risk profile +- Overall: **Medium-Low** +- Main risks: + - False confidence if pruning removes high-signal assertions. + - Reviewer disagreement on what counts as contract text. +- Mitigations: + - Keep explicit list of must-preserve interface clauses. + - Route any contract relaxation through human review language. + +## Best fit conditions +Pick this if test signal-to-noise and maintenance speed are the biggest pain. diff --git a/migrations/random-files-20260521T140910/manifest.json b/migrations/random-files-20260521T140910/manifest.json new file mode 100644 index 0000000..241f553 --- /dev/null +++ b/migrations/random-files-20260521T140910/manifest.json @@ -0,0 +1,12 @@ +{ + "awaiting_human_review": false, + "cooldown_until": null, + "created_at": "2026-05-21T14:09:10.762-07:00", + "current_phase": "", + "human_review_reason": null, + "last_touch": "2026-05-21T14:10:09.545-07:00", + "name": "random-files-20260521T140910", + "phases": [], + "status": "planning", + "wake_up_on": null +} From e7dc33a343395abc88cec865cfe997a36d84bfbc Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Thu, 21 May 2026 14:10:25 -0700 Subject: [PATCH 07/41] continuous refactor: planning/random-files-20260521T140910/pick-best Why: planning.pick-best accepted; next step: expand --- .../.planning/stages/pick-best.stdout.md | 13 +++++++++++++ .../.planning/state.json | 15 +++++++++++++-- .../random-files-20260521T140910/manifest.json | 2 +- 3 files changed, 27 insertions(+), 3 deletions(-) create mode 100644 migrations/random-files-20260521T140910/.planning/stages/pick-best.stdout.md diff --git a/migrations/random-files-20260521T140910/.planning/stages/pick-best.stdout.md b/migrations/random-files-20260521T140910/.planning/stages/pick-best.stdout.md new file mode 100644 index 0000000..4369b4d --- /dev/null +++ b/migrations/random-files-20260521T140910/.planning/stages/pick-best.stdout.md @@ -0,0 
+1,13 @@ +**Choice: `interface-first-hardening`** + +It has the best risk/clarity/verifiability balance for this repo right now. + +- Lowest boundary risk: it explicitly protects CLI effort semantics, prompt contracts, and workflow behavior before internal cleanup. +- Best taste match: “safer compatibility paths for shipped systems” and “treat released package interfaces as human-review territory” are first-class in this approach. +- Incremental and testable: Phase 1 is low-effort, outcome-focused guardrails; Phase 2 refactors behind stable contracts; Phase 3 makes interface changes optional and review-gated. + +**Why not the others (short):** +- `effort-engine-consolidation`: cleaner internals, but higher drift risk early; it optimizes maintainability before boundary safety. +- `test-first-boundary-pruning`: good hygiene, but contract-vs-incidental pruning can create reviewer ambiguity and accidental guard loss. + +**Suggested tie-break framing:** choose safety-first contract locking when interfaces are active and shipped; optimize internals second. 
diff --git a/migrations/random-files-20260521T140910/.planning/state.json b/migrations/random-files-20260521T140910/.planning/state.json index 26237b3..11b813b 100644 --- a/migrations/random-files-20260521T140910/.planning/state.json +++ b/migrations/random-files-20260521T140910/.planning/state.json @@ -10,16 +10,27 @@ "outputs": { "stdout": "migrations/random-files-20260521T140910/.planning/stages/approaches.stdout.md" } + }, + { + "agent": "codex", + "completed_at": "2026-05-21T14:10:25.299-07:00", + "effort": "low", + "model": "gpt-5.3-codex", + "name": "pick-best", + "outcome": "completed", + "outputs": { + "stdout": "migrations/random-files-20260521T140910/.planning/stages/pick-best.stdout.md" + } } ], "feedback": [], "final_decision": null, "final_reason": null, - "next_step": "pick-best", + "next_step": "expand", "review_findings": null, "revision_base_step_counts": [], "schema_version": 1, "started_at": "2026-05-21T14:09:10.762-07:00", "target": "random files", - "updated_at": "2026-05-21T14:10:09.546-07:00" + "updated_at": "2026-05-21T14:10:25.299-07:00" } diff --git a/migrations/random-files-20260521T140910/manifest.json b/migrations/random-files-20260521T140910/manifest.json index 241f553..a3d090d 100644 --- a/migrations/random-files-20260521T140910/manifest.json +++ b/migrations/random-files-20260521T140910/manifest.json @@ -4,7 +4,7 @@ "created_at": "2026-05-21T14:09:10.762-07:00", "current_phase": "", "human_review_reason": null, - "last_touch": "2026-05-21T14:10:09.545-07:00", + "last_touch": "2026-05-21T14:10:25.298-07:00", "name": "random-files-20260521T140910", "phases": [], "status": "planning", From b4cafd74b68141f241da3b68042788047334df7c Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Thu, 21 May 2026 14:11:37 -0700 Subject: [PATCH 08/41] continuous refactor: planning/random-files-20260521T140910/expand Why: planning.expand accepted; next step: review --- .../.planning/stages/expand.stdout.md | 16 +++++++ .../.planning/state.json | 15 
++++++- .../manifest.json | 31 +++++++++++-- .../phase-1-boundary-contract-guardrails.md | 37 ++++++++++++++++ ...se-2-internal-effort-resolution-cleanup.md | 39 +++++++++++++++++ ...pr-title-policy-adjustment-review-gated.md | 36 ++++++++++++++++ .../random-files-20260521T140910/plan.md | 43 +++++++++++++++++++ 7 files changed, 212 insertions(+), 5 deletions(-) create mode 100644 migrations/random-files-20260521T140910/.planning/stages/expand.stdout.md create mode 100644 migrations/random-files-20260521T140910/phase-1-boundary-contract-guardrails.md create mode 100644 migrations/random-files-20260521T140910/phase-2-internal-effort-resolution-cleanup.md create mode 100644 migrations/random-files-20260521T140910/phase-3-pr-title-policy-adjustment-review-gated.md create mode 100644 migrations/random-files-20260521T140910/plan.md diff --git a/migrations/random-files-20260521T140910/.planning/stages/expand.stdout.md b/migrations/random-files-20260521T140910/.planning/stages/expand.stdout.md new file mode 100644 index 0000000..0fc425d --- /dev/null +++ b/migrations/random-files-20260521T140910/.planning/stages/expand.stdout.md @@ -0,0 +1,16 @@ +Created in the staged migration workspace: + +- `plan.md` +- `phase-1-boundary-contract-guardrails.md` +- `phase-2-internal-effort-resolution-cleanup.md` +- `phase-3-pr-title-policy-adjustment-review-gated.md` + +What’s included: +- Numbered phases with ordering rationale and dependency graph. +- Phase-local preconditions (no baseline-green restatements). +- Independent `## Definition of Done` and `## Validation Steps` per phase. +- `required_effort` + `effort_reason` on higher-risk phases only: + - Phase 2: `medium` + - Phase 3: `high` + +No files were written outside the staged work directory. 
diff --git a/migrations/random-files-20260521T140910/.planning/state.json b/migrations/random-files-20260521T140910/.planning/state.json index 11b813b..fbe9d58 100644 --- a/migrations/random-files-20260521T140910/.planning/state.json +++ b/migrations/random-files-20260521T140910/.planning/state.json @@ -21,16 +21,27 @@ "outputs": { "stdout": "migrations/random-files-20260521T140910/.planning/stages/pick-best.stdout.md" } + }, + { + "agent": "codex", + "completed_at": "2026-05-21T14:11:37.820-07:00", + "effort": "low", + "model": "gpt-5.3-codex", + "name": "expand", + "outcome": "completed", + "outputs": { + "stdout": "migrations/random-files-20260521T140910/.planning/stages/expand.stdout.md" + } } ], "feedback": [], "final_decision": null, "final_reason": null, - "next_step": "expand", + "next_step": "review", "review_findings": null, "revision_base_step_counts": [], "schema_version": 1, "started_at": "2026-05-21T14:09:10.762-07:00", "target": "random files", - "updated_at": "2026-05-21T14:10:25.299-07:00" + "updated_at": "2026-05-21T14:11:37.820-07:00" } diff --git a/migrations/random-files-20260521T140910/manifest.json b/migrations/random-files-20260521T140910/manifest.json index a3d090d..2dc8c90 100644 --- a/migrations/random-files-20260521T140910/manifest.json +++ b/migrations/random-files-20260521T140910/manifest.json @@ -2,11 +2,36 @@ "awaiting_human_review": false, "cooldown_until": null, "created_at": "2026-05-21T14:09:10.762-07:00", - "current_phase": "", + "current_phase": "boundary-contract-guardrails", "human_review_reason": null, - "last_touch": "2026-05-21T14:10:25.298-07:00", + "last_touch": "2026-05-21T14:11:37.820-07:00", "name": "random-files-20260521T140910", - "phases": [], + "phases": [ + { + "done": false, + "effort_reason": null, + "file": "phase-1-boundary-contract-guardrails.md", + "name": "boundary-contract-guardrails", + "precondition": "- No earlier phase in this migration is incomplete. 
- Files/symbols defining current effort routing, migration ticking/planning gating, prompt taste injection, and PR title validation still exist and are reachable from tests. - The worktree contains no unrelated in-flight edits to the same boundary files that would make observed behavior ambiguous.", + "required_effort": null + }, + { + "done": false, + "effort_reason": "Cross-module effort-resolution cleanup can silently drift CLI/migration behavior without careful contract-preserving verification.", + "file": "phase-2-internal-effort-resolution-cleanup.md", + "name": "internal-effort-resolution-cleanup", + "precondition": "- Phase 1 is complete. - Contract-guarding tests for boundary behavior are present and passing locally. - Current effort interfaces (default `low`, cap `xhigh`, target override cap behavior, migration defer-on-over-cap behavior) still exist and are encoded in tests.", + "required_effort": "medium" + }, + { + "done": false, + "effort_reason": "PR title policy changes are user-facing workflow contract changes and require careful compatibility framing and explicit review communication.", + "file": "phase-3-pr-title-policy-adjustment-review-gated.md", + "name": "pr-title-policy-adjustment-review-gated", + "precondition": "- Phase 1 and Phase 2 are complete. - There is a concrete, documented reason for policy change (not cleanup-only churn). 
- The current accepted/rejected title behavior is captured by tests so change impact is measurable.", + "required_effort": "high" + } + ], "status": "planning", "wake_up_on": null } diff --git a/migrations/random-files-20260521T140910/phase-1-boundary-contract-guardrails.md b/migrations/random-files-20260521T140910/phase-1-boundary-contract-guardrails.md new file mode 100644 index 0000000..0767dd5 --- /dev/null +++ b/migrations/random-files-20260521T140910/phase-1-boundary-contract-guardrails.md @@ -0,0 +1,37 @@ +# Phase 1: Boundary Contract Guardrails + +## Scope +- `tests/test_prompts.py` +- `tests/test_loop_migration_tick.py` +- `.github/workflows/pr-title.yml` (tests/fixtures/assertions only; no policy change in this phase) +- Any directly related existing tests that validate the same boundary contracts + +## Goals +- Lock current interface behavior with outcome-focused tests before internal refactors. +- Increase confidence around effort-cap behavior and planning gating behavior. +- Make PR title policy edge behavior explicit in testable checks without changing policy semantics. + +## Precondition +- No earlier phase in this migration is incomplete. +- Files/symbols defining current effort routing, migration ticking/planning gating, prompt taste injection, and PR title validation still exist and are reachable from tests. +- The worktree contains no unrelated in-flight edits to the same boundary files that would make observed behavior ambiguous. + +## Implementation Instructions +1. Strengthen/extend tests for migration ticking and planning gating behavior so they assert outcomes (eligibility/deferral/routing), not internal call shapes. +2. Strengthen/extend prompt contract tests only for load-bearing invariants (including Taste section and staged/live planning constraints where already contractual). +3. 
Add explicit PR-title edge-case checks in workflow-adjacent test coverage/fixtures or equivalent deterministic assertions, while preserving current acceptance behavior. +4. Keep changes small and focused on guardrails; do not refactor production logic in this phase unless needed to enable deterministic testing. + +## Validation Steps +- Run targeted checks first: + - `uv run pytest tests/test_prompts.py` + - `uv run pytest tests/test_loop_migration_tick.py` + - `uv run pytest -k "pr title or pr_title"` +- Run full validation command: + - `uv run pytest` + +## Definition of Done +- Boundary tests covering effort-capped migration/planning behavior and prompt contract invariants are present, deterministic, and passing. +- PR title policy behavior is more explicitly exercised without changing acceptance semantics. +- Full configured validation command passes. +- Repository remains shippable with unchanged external interface behavior. diff --git a/migrations/random-files-20260521T140910/phase-2-internal-effort-resolution-cleanup.md b/migrations/random-files-20260521T140910/phase-2-internal-effort-resolution-cleanup.md new file mode 100644 index 0000000..8cbb118 --- /dev/null +++ b/migrations/random-files-20260521T140910/phase-2-internal-effort-resolution-cleanup.md @@ -0,0 +1,39 @@ +# Phase 2: Internal Effort Resolution Cleanup + +required_effort: medium +effort_reason: Cross-module effort-resolution cleanup can silently drift CLI/migration behavior without careful contract-preserving verification. + +## Scope +- `src/continuous_refactoring/effort.py` +- Minimal adjacent call sites if required for coherence (for example `loop.py`, `migration_tick.py`, or CLI argument plumbing) +- `src/continuous_refactoring/__main__.py` only if there is a concrete readability gain with zero behavior drift +- Tests that validate effort defaults/caps and run semantics + +## Goals +- Reduce internal repetition and improve readability in effort resolution paths. 
+- Preserve all externally visible effort semantics exactly.
+
+## Precondition
+- Phase 1 is complete.
+- Contract-guarding tests for boundary behavior are present and passing locally.
+- Current effort interfaces (default `low`, cap `xhigh`, target override cap behavior, migration defer-on-over-cap behavior) still exist and are encoded in tests.
+
+## Implementation Instructions
+1. Introduce small pure helpers to centralize effort normalization/capping where duplication currently exists.
+2. Keep behavior identical at boundaries: CLI defaults, cap enforcement, and phase deferral semantics must not change.
+3. Keep abstraction depth shallow; prefer direct, readable flow over framework-like layering.
+4. Only touch `__main__.py` if it materially reduces ambiguity without changing invocation behavior.
+
+## Validation Steps
+- Run focused effort/CLI/run tests:
+  - `uv run pytest tests/test_effort.py`
+  - `uv run pytest tests/test_cli.py tests/test_run.py tests/test_run_once.py`
+  - `uv run pytest tests/test_loop_migration_tick.py`
+- Run full validation command:
+  - `uv run pytest`
+
+## Definition of Done
+- Internal effort-resolution logic is simpler (less duplication / clearer flow) with no interface drift.
+- Existing boundary tests from Phase 1 remain green without contract assertion changes that weaken coverage.
+- Full configured validation command passes.
+- Repository remains shippable with unchanged user-visible effort behavior.
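The "small pure helpers" that the Phase 2 instructions call for could look roughly like the sketch below. It is an illustration only: the function names are hypothetical and are not taken from `src/continuous_refactoring/effort.py`, and it assumes that a target override above the cap is clamped to the cap, which the plan does not spell out. The label order (`low`, `medium`, `high`, `xhigh`) and the `xhigh` default cap come from the README.

```python
from typing import Optional

# Effort labels in ascending order, as listed in the README.
EFFORT_ORDER = ("low", "medium", "high", "xhigh")


def normalize_effort(label: str) -> str:
    """Trim and lower-case a label, rejecting anything outside EFFORT_ORDER."""
    cleaned = label.strip().lower()
    if cleaned not in EFFORT_ORDER:
        raise ValueError(f"unknown effort label: {label!r}")
    return cleaned


def effort_rank(label: str) -> int:
    """Position of a label in the ascending effort order."""
    return EFFORT_ORDER.index(normalize_effort(label))


def cap_effort(requested: str, max_allowed: str = "xhigh") -> str:
    """Clamp a requested effort to the cap (assumed behavior for target overrides)."""
    if effort_rank(requested) > effort_rank(max_allowed):
        return normalize_effort(max_allowed)
    return normalize_effort(requested)


def should_defer_phase(required_effort: Optional[str], max_allowed: str = "xhigh") -> bool:
    """A phase whose required effort exceeds the cap is deferred instead of run."""
    if required_effort is None:
        return False
    return effort_rank(required_effort) > effort_rank(max_allowed)
```

Keeping the helpers pure (no I/O, no CLI parsing) means the boundary tests from Phase 1 can exercise them directly, while call sites such as the loop or the migration tick only decide what to do with the result.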
diff --git a/migrations/random-files-20260521T140910/phase-3-pr-title-policy-adjustment-review-gated.md b/migrations/random-files-20260521T140910/phase-3-pr-title-policy-adjustment-review-gated.md
new file mode 100644
index 0000000..bae7667
--- /dev/null
+++ b/migrations/random-files-20260521T140910/phase-3-pr-title-policy-adjustment-review-gated.md
@@ -0,0 +1,36 @@
+# Phase 3: PR Title Policy Adjustment (Review-Gated)
+
+required_effort: high
+effort_reason: PR title policy changes are user-facing workflow contract changes and require careful compatibility framing and explicit review communication.
+
+## Scope
+- `.github/workflows/pr-title.yml`
+- Any directly related tests/fixtures/docs that define accepted PR title patterns
+- Migration notes/review prompt content that explains interface impact
+
+## Goals
+- Apply a deliberate PR title policy behavior adjustment only if needed.
+- Make user-facing impact explicit, concrete, and review-friendly.
+
+## Precondition
+- Phase 1 and Phase 2 are complete.
+- There is a concrete, documented reason for policy change (not cleanup-only churn).
+- The current accepted/rejected title behavior is captured by tests so change impact is measurable.
+
+## Implementation Instructions
+1. Change PR-title matching behavior only for the explicitly intended cases.
+2. Update/add tests to show before/after expectations for affected title examples.
+3. Update any user-facing examples/messages so accepted syntax is unambiguous.
+4. In review-facing notes, explicitly name the interface behavior change (what titles now pass/fail) and why.
+
+## Validation Steps
+- Run PR-title focused checks:
+  - `uv run pytest -k "pr title or pr_title or workflow"`
+- Run full validation command:
+  - `uv run pytest`
+
+## Definition of Done
+- PR title policy change is intentional, narrowly scoped, and fully covered by deterministic tests.
+- Review notes explicitly describe the interface behavior change and expected user impact.
+- Full configured validation command passes.
+- Repository remains shippable with workflow behavior updated in a clearly documented, review-gated way.
diff --git a/migrations/random-files-20260521T140910/plan.md b/migrations/random-files-20260521T140910/plan.md
new file mode 100644
index 0000000..ae8ac10
--- /dev/null
+++ b/migrations/random-files-20260521T140910/plan.md
@@ -0,0 +1,43 @@
+# Migration Plan: Interface-First Hardening
+
+## Objective
+Harden external contracts first (CLI effort semantics, prompt contract invariants, PR title policy behavior), then clean internals behind those contracts, with any user-visible policy shift isolated and human-review gated.
+
+## Phase Overview
+1. **Phase 1 — Boundary Contract Guardrails** (`phase-1-boundary-contract-guardrails.md`)
+2. **Phase 2 — Internal Effort Resolution Cleanup** (`phase-2-internal-effort-resolution-cleanup.md`)
+3. **Phase 3 — PR Title Policy Adjustment (Review-Gated)** (`phase-3-pr-title-policy-adjustment-review-gated.md`)
+
+## Dependency Graph
+```mermaid
+graph TD
+    P1["Phase 1: Boundary Contract Guardrails"] --> P2["Phase 2: Internal Effort Resolution Cleanup"]
+    P1 --> P3["Phase 3: PR Title Policy Adjustment (Review-Gated)"]
+    P2 --> P3
+```
+
+## Why this ordering
+- Phase 1 reduces regression risk by making interface expectations executable before any internal movement.
+- Phase 2 then refactors internals under locked behavior so cleanup is low-risk and easy to verify.
+- Phase 3 is explicitly optional and last because it can alter user-facing PR workflow semantics and should only proceed with intentional review context.
+
+## Validation Strategy
+- Baseline contract is enforced by the harness before refactoring and after each completed phase.
+- Each phase also includes targeted, independently runnable checks for its own scope.
+- Every phase requires the configured full validation command to pass before completion.
+ +Validation commands used by phases: +- `uv run pytest` +- `uv run pytest tests/test_prompts.py` +- `uv run pytest tests/test_loop_migration_tick.py` +- `uv run pytest tests/test_effort.py tests/test_cli.py tests/test_run.py tests/test_run_once.py` (or nearest equivalent files if names differ) + +## Interface Risk Management +- Treat CLI behavior, prompt contracts, migration-planning constraints, and PR title policy as interface surfaces. +- Any behavior change to these surfaces must be called out explicitly in phase notes and human review prompts. +- Keep compatibility-first defaults unless the phase explicitly targets a behavior change. + +## Shippability bar per phase +- Repository remains releasable after each phase. +- No partial contract rewrites without matching tests. +- No silent behavior changes at interface boundaries. From ad9b8d1b2c9c70a1f6a07697f7738352a728a0d2 Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Thu, 21 May 2026 14:12:14 -0700 Subject: [PATCH 09/41] continuous refactor: planning/random-files-20260521T140910/review Why: planning.review accepted; next step: revise --- .../.planning/stages/review.stdout.md | 5 +++++ .../.planning/state.json | 17 ++++++++++++++--- .../random-files-20260521T140910/manifest.json | 2 +- 3 files changed, 20 insertions(+), 4 deletions(-) create mode 100644 migrations/random-files-20260521T140910/.planning/stages/review.stdout.md diff --git a/migrations/random-files-20260521T140910/.planning/stages/review.stdout.md b/migrations/random-files-20260521T140910/.planning/stages/review.stdout.md new file mode 100644 index 0000000..f82d1cb --- /dev/null +++ b/migrations/random-files-20260521T140910/.planning/stages/review.stdout.md @@ -0,0 +1,5 @@ +1. 
**Phase 2 precondition improperly includes harness-owned validation state.** +`phase-2-internal-effort-resolution-cleanup.md` says: “Contract-guarding tests … are present and passing locally.” That’s effectively a baseline-green/fresh-validation precondition and should not be start-gating. Keep it in validation/DoD, not preconditions. + +2. **No other plan-quality violations found.** +Phase ordering and dependencies are coherent and risk-minimizing; each phase is independently shippable/verifiable; preconditions vs Definition of Done are mostly separated correctly; effort labels are valid (`medium`, `high`) with useful reasons; scope stays within migration-planned files; and the plan aligns with injected taste (compatibility-first interface handling, explicit review for interface changes). diff --git a/migrations/random-files-20260521T140910/.planning/state.json b/migrations/random-files-20260521T140910/.planning/state.json index fbe9d58..9cb85ac 100644 --- a/migrations/random-files-20260521T140910/.planning/state.json +++ b/migrations/random-files-20260521T140910/.planning/state.json @@ -32,16 +32,27 @@ "outputs": { "stdout": "migrations/random-files-20260521T140910/.planning/stages/expand.stdout.md" } + }, + { + "agent": "codex", + "completed_at": "2026-05-21T14:12:14.638-07:00", + "effort": "low", + "model": "gpt-5.3-codex", + "name": "review", + "outcome": "findings", + "outputs": { + "stdout": "migrations/random-files-20260521T140910/.planning/stages/review.stdout.md" + } } ], "feedback": [], "final_decision": null, "final_reason": null, - "next_step": "review", - "review_findings": null, + "next_step": "revise", + "review_findings": "migrations/random-files-20260521T140910/.planning/stages/review.stdout.md", "revision_base_step_counts": [], "schema_version": 1, "started_at": "2026-05-21T14:09:10.762-07:00", "target": "random files", - "updated_at": "2026-05-21T14:11:37.820-07:00" + "updated_at": "2026-05-21T14:12:14.638-07:00" } diff --git 
a/migrations/random-files-20260521T140910/manifest.json b/migrations/random-files-20260521T140910/manifest.json index 2dc8c90..21c4046 100644 --- a/migrations/random-files-20260521T140910/manifest.json +++ b/migrations/random-files-20260521T140910/manifest.json @@ -4,7 +4,7 @@ "created_at": "2026-05-21T14:09:10.762-07:00", "current_phase": "boundary-contract-guardrails", "human_review_reason": null, - "last_touch": "2026-05-21T14:11:37.820-07:00", + "last_touch": "2026-05-21T14:12:14.637-07:00", "name": "random-files-20260521T140910", "phases": [ { From cdf6e3b98071bb98f6aa163fa906d77753d44c15 Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Thu, 21 May 2026 14:12:46 -0700 Subject: [PATCH 10/41] continuous refactor: planning/random-files-20260521T140910/revise Why: planning.revise accepted; next step: review-2 --- .../.planning/stages/revise.stdout.md | 11 +++++++++++ .../.planning/state.json | 15 +++++++++++++-- .../random-files-20260521T140910/manifest.json | 4 ++-- .../phase-2-internal-effort-resolution-cleanup.md | 2 +- 4 files changed, 27 insertions(+), 5 deletions(-) create mode 100644 migrations/random-files-20260521T140910/.planning/stages/revise.stdout.md diff --git a/migrations/random-files-20260521T140910/.planning/stages/revise.stdout.md b/migrations/random-files-20260521T140910/.planning/stages/revise.stdout.md new file mode 100644 index 0000000..f2af52e --- /dev/null +++ b/migrations/random-files-20260521T140910/.planning/stages/revise.stdout.md @@ -0,0 +1,11 @@ +Updated the migration planning artifacts in the staged work dir, using the chosen approach (`interface-first-hardening`) and addressing the review finding. 
+ +Created/updated: +- `plan.md` +- `phase-1-boundary-contract-guardrails.md` +- `phase-2-internal-effort-resolution-cleanup.md` +- `phase-3-pr-title-policy-adjustment-review-gated.md` + +Key fix applied: +- In `phase-2-internal-effort-resolution-cleanup.md`, the precondition no longer requires tests to be “passing locally” (harness-owned baseline validation). +- It now correctly uses a phase-local precondition: Phase 1 guardrail tests must still exist, while pass/fail remains in validation/Definition of Done. diff --git a/migrations/random-files-20260521T140910/.planning/state.json b/migrations/random-files-20260521T140910/.planning/state.json index 9cb85ac..413a66a 100644 --- a/migrations/random-files-20260521T140910/.planning/state.json +++ b/migrations/random-files-20260521T140910/.planning/state.json @@ -43,16 +43,27 @@ "outputs": { "stdout": "migrations/random-files-20260521T140910/.planning/stages/review.stdout.md" } + }, + { + "agent": "codex", + "completed_at": "2026-05-21T14:12:46.368-07:00", + "effort": "low", + "model": "gpt-5.3-codex", + "name": "revise", + "outcome": "completed", + "outputs": { + "stdout": "migrations/random-files-20260521T140910/.planning/stages/revise.stdout.md" + } } ], "feedback": [], "final_decision": null, "final_reason": null, - "next_step": "revise", + "next_step": "review-2", "review_findings": "migrations/random-files-20260521T140910/.planning/stages/review.stdout.md", "revision_base_step_counts": [], "schema_version": 1, "started_at": "2026-05-21T14:09:10.762-07:00", "target": "random files", - "updated_at": "2026-05-21T14:12:14.638-07:00" + "updated_at": "2026-05-21T14:12:46.368-07:00" } diff --git a/migrations/random-files-20260521T140910/manifest.json b/migrations/random-files-20260521T140910/manifest.json index 21c4046..35e47ee 100644 --- a/migrations/random-files-20260521T140910/manifest.json +++ b/migrations/random-files-20260521T140910/manifest.json @@ -4,7 +4,7 @@ "created_at": "2026-05-21T14:09:10.762-07:00", 
"current_phase": "boundary-contract-guardrails", "human_review_reason": null, - "last_touch": "2026-05-21T14:12:14.637-07:00", + "last_touch": "2026-05-21T14:12:46.367-07:00", "name": "random-files-20260521T140910", "phases": [ { @@ -20,7 +20,7 @@ "effort_reason": "Cross-module effort-resolution cleanup can silently drift CLI/migration behavior without careful contract-preserving verification.", "file": "phase-2-internal-effort-resolution-cleanup.md", "name": "internal-effort-resolution-cleanup", - "precondition": "- Phase 1 is complete. - Contract-guarding tests for boundary behavior are present and passing locally. - Current effort interfaces (default `low`, cap `xhigh`, target override cap behavior, migration defer-on-over-cap behavior) still exist and are encoded in tests.", + "precondition": "- Phase 1 is complete. - Contract-guarding tests for boundary behavior added in Phase 1 are still present. - Current effort interfaces (default `low`, cap `xhigh`, target override cap behavior, migration defer-on-over-cap behavior) still exist and are encoded in tests.", "required_effort": "medium" }, { diff --git a/migrations/random-files-20260521T140910/phase-2-internal-effort-resolution-cleanup.md b/migrations/random-files-20260521T140910/phase-2-internal-effort-resolution-cleanup.md index 8cbb118..336c4c6 100644 --- a/migrations/random-files-20260521T140910/phase-2-internal-effort-resolution-cleanup.md +++ b/migrations/random-files-20260521T140910/phase-2-internal-effort-resolution-cleanup.md @@ -15,7 +15,7 @@ effort_reason: Cross-module effort-resolution cleanup can silently drift CLI/mig ## Precondition - Phase 1 is complete. -- Contract-guarding tests for boundary behavior are present and passing locally. +- Contract-guarding tests for boundary behavior added in Phase 1 are still present. - Current effort interfaces (default `low`, cap `xhigh`, target override cap behavior, migration defer-on-over-cap behavior) still exist and are encoded in tests. 
## Implementation Instructions From ed2c693d0c79628d13890506392b840660699625 Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Thu, 21 May 2026 14:13:33 -0700 Subject: [PATCH 11/41] continuous refactor: planning/random-files-20260521T140910/review-2 Why: planning.review-2 accepted; next step: final-review --- .../.planning/stages/review-2.stdout.md | 1 + .../.planning/state.json | 15 +++++++++++++-- .../random-files-20260521T140910/manifest.json | 2 +- 3 files changed, 15 insertions(+), 3 deletions(-) create mode 100644 migrations/random-files-20260521T140910/.planning/stages/review-2.stdout.md diff --git a/migrations/random-files-20260521T140910/.planning/stages/review-2.stdout.md b/migrations/random-files-20260521T140910/.planning/stages/review-2.stdout.md new file mode 100644 index 0000000..9e06085 --- /dev/null +++ b/migrations/random-files-20260521T140910/.planning/stages/review-2.stdout.md @@ -0,0 +1 @@ +1. no findings. diff --git a/migrations/random-files-20260521T140910/.planning/state.json b/migrations/random-files-20260521T140910/.planning/state.json index 413a66a..b05a538 100644 --- a/migrations/random-files-20260521T140910/.planning/state.json +++ b/migrations/random-files-20260521T140910/.planning/state.json @@ -54,16 +54,27 @@ "outputs": { "stdout": "migrations/random-files-20260521T140910/.planning/stages/revise.stdout.md" } + }, + { + "agent": "codex", + "completed_at": "2026-05-21T14:13:33.754-07:00", + "effort": "low", + "model": "gpt-5.3-codex", + "name": "review-2", + "outcome": "clear", + "outputs": { + "stdout": "migrations/random-files-20260521T140910/.planning/stages/review-2.stdout.md" + } } ], "feedback": [], "final_decision": null, "final_reason": null, - "next_step": "review-2", + "next_step": "final-review", "review_findings": "migrations/random-files-20260521T140910/.planning/stages/review.stdout.md", "revision_base_step_counts": [], "schema_version": 1, "started_at": "2026-05-21T14:09:10.762-07:00", "target": "random files", - 
"updated_at": "2026-05-21T14:12:46.368-07:00" + "updated_at": "2026-05-21T14:13:33.754-07:00" } diff --git a/migrations/random-files-20260521T140910/manifest.json b/migrations/random-files-20260521T140910/manifest.json index 35e47ee..1850c6c 100644 --- a/migrations/random-files-20260521T140910/manifest.json +++ b/migrations/random-files-20260521T140910/manifest.json @@ -4,7 +4,7 @@ "created_at": "2026-05-21T14:09:10.762-07:00", "current_phase": "boundary-contract-guardrails", "human_review_reason": null, - "last_touch": "2026-05-21T14:12:46.367-07:00", + "last_touch": "2026-05-21T14:13:33.751-07:00", "name": "random-files-20260521T140910", "phases": [ { From ea52e46390011e350019ea3471de0dc32518e872 Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Thu, 21 May 2026 14:14:11 -0700 Subject: [PATCH 12/41] continuous refactor: planning/random-files-20260521T140910/final-review Why: phases 1-2 are automation-safe, but phase 3 changes PR-title interface semantics and must stay review-gated --- .../.planning/stages/final-review.stdout.md | 5 +++++ .../.planning/state.json | 19 +++++++++++++++---- .../manifest.json | 8 ++++---- 3 files changed, 24 insertions(+), 8 deletions(-) create mode 100644 migrations/random-files-20260521T140910/.planning/stages/final-review.stdout.md diff --git a/migrations/random-files-20260521T140910/.planning/stages/final-review.stdout.md b/migrations/random-files-20260521T140910/.planning/stages/final-review.stdout.md new file mode 100644 index 0000000..ecd10a3 --- /dev/null +++ b/migrations/random-files-20260521T140910/.planning/stages/final-review.stdout.md @@ -0,0 +1,5 @@ +Safety: mostly safe, but not fully auto-safe end-to-end because Phase 3 is an intentional user-facing contract change and correctly demands explicit review. +Human decision points: yes, Phase 3 has a concrete human gate and clear impact framing requirements. 
+Flaw check: no fundamental flaw; effort tiers are lowest safe (`low`/`medium`/`high`), and preconditions avoid baseline/fresh-validation requirements (validation is only in DoD, which is correct). + +final-decision: approve-needs-human — phases 1-2 are automation-safe, but phase 3 changes PR-title interface semantics and must stay review-gated diff --git a/migrations/random-files-20260521T140910/.planning/state.json b/migrations/random-files-20260521T140910/.planning/state.json index b05a538..c961d4f 100644 --- a/migrations/random-files-20260521T140910/.planning/state.json +++ b/migrations/random-files-20260521T140910/.planning/state.json @@ -65,16 +65,27 @@ "outputs": { "stdout": "migrations/random-files-20260521T140910/.planning/stages/review-2.stdout.md" } + }, + { + "agent": "codex", + "completed_at": "2026-05-21T14:14:11.163-07:00", + "effort": "low", + "model": "gpt-5.3-codex", + "name": "final-review", + "outcome": "approve-needs-human", + "outputs": { + "stdout": "migrations/random-files-20260521T140910/.planning/stages/final-review.stdout.md" + } } ], "feedback": [], - "final_decision": null, - "final_reason": null, - "next_step": "final-review", + "final_decision": "approve-needs-human", + "final_reason": "phases 1-2 are automation-safe, but phase 3 changes PR-title interface semantics and must stay review-gated", + "next_step": "terminal-ready-awaiting-human", "review_findings": "migrations/random-files-20260521T140910/.planning/stages/review.stdout.md", "revision_base_step_counts": [], "schema_version": 1, "started_at": "2026-05-21T14:09:10.762-07:00", "target": "random files", - "updated_at": "2026-05-21T14:13:33.754-07:00" + "updated_at": "2026-05-21T14:14:11.163-07:00" } diff --git a/migrations/random-files-20260521T140910/manifest.json b/migrations/random-files-20260521T140910/manifest.json index 1850c6c..f7e939a 100644 --- a/migrations/random-files-20260521T140910/manifest.json +++ b/migrations/random-files-20260521T140910/manifest.json @@ -1,10 
+1,10 @@ { - "awaiting_human_review": false, + "awaiting_human_review": true, "cooldown_until": null, "created_at": "2026-05-21T14:09:10.762-07:00", "current_phase": "boundary-contract-guardrails", - "human_review_reason": null, - "last_touch": "2026-05-21T14:13:33.751-07:00", + "human_review_reason": "phases 1-2 are automation-safe, but phase 3 changes PR-title interface semantics and must stay review-gated", + "last_touch": "2026-05-21T14:14:11.165-07:00", "name": "random-files-20260521T140910", "phases": [ { @@ -32,6 +32,6 @@ "required_effort": "high" } ], - "status": "planning", + "status": "ready", "wake_up_on": null } From 9205b2c42c3ba9d11f52fe580dd5448315a981bd Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Thu, 21 May 2026 14:15:21 -0700 Subject: [PATCH 13/41] continuous refactor: plan random-files-20260521T141425 Why: planning.approaches accepted; next step: pick-best --- .../.planning/stages/approaches.stdout.md | 7 +++ .../.planning/state.json | 25 ++++++++++ ...ehavior-first-random-file-stabilization.md | 47 +++++++++++++++++++ .../delete-dead-paths-then-reprove.md | 47 +++++++++++++++++++ ...nimal-entrypoint-and-license-guardrails.md | 45 ++++++++++++++++++ .../manifest.json | 12 +++++ 6 files changed, 183 insertions(+) create mode 100644 migrations/random-files-20260521T141425/.planning/stages/approaches.stdout.md create mode 100644 migrations/random-files-20260521T141425/.planning/state.json create mode 100644 migrations/random-files-20260521T141425/approaches/behavior-first-random-file-stabilization.md create mode 100644 migrations/random-files-20260521T141425/approaches/delete-dead-paths-then-reprove.md create mode 100644 migrations/random-files-20260521T141425/approaches/minimal-entrypoint-and-license-guardrails.md create mode 100644 migrations/random-files-20260521T141425/manifest.json diff --git a/migrations/random-files-20260521T141425/.planning/stages/approaches.stdout.md 
b/migrations/random-files-20260521T141425/.planning/stages/approaches.stdout.md new file mode 100644 index 0000000..bfc0b86 --- /dev/null +++ b/migrations/random-files-20260521T141425/.planning/stages/approaches.stdout.md @@ -0,0 +1,7 @@ +Created 3 approach files in the staged planning workspace: + +- `approaches/minimal-entrypoint-and-license-guardrails.md` +- `approaches/behavior-first-random-file-stabilization.md` +- `approaches/delete-dead-paths-then-reprove.md` + +Each includes strategy, tradeoffs, phased plan, per-phase `required_effort`, and risk profile, aligned to the provided taste and effort constraints. diff --git a/migrations/random-files-20260521T141425/.planning/state.json b/migrations/random-files-20260521T141425/.planning/state.json new file mode 100644 index 0000000..9423fcb --- /dev/null +++ b/migrations/random-files-20260521T141425/.planning/state.json @@ -0,0 +1,25 @@ +{ + "completed_steps": [ + { + "agent": "codex", + "completed_at": "2026-05-21T14:15:21.290-07:00", + "effort": "low", + "model": "gpt-5.3-codex", + "name": "approaches", + "outcome": "completed", + "outputs": { + "stdout": "migrations/random-files-20260521T141425/.planning/stages/approaches.stdout.md" + } + } + ], + "feedback": [], + "final_decision": null, + "final_reason": null, + "next_step": "pick-best", + "review_findings": null, + "revision_base_step_counts": [], + "schema_version": 1, + "started_at": "2026-05-21T14:14:26.016-07:00", + "target": "random files", + "updated_at": "2026-05-21T14:15:21.290-07:00" +} diff --git a/migrations/random-files-20260521T141425/approaches/behavior-first-random-file-stabilization.md b/migrations/random-files-20260521T141425/approaches/behavior-first-random-file-stabilization.md new file mode 100644 index 0000000..dad1982 --- /dev/null +++ b/migrations/random-files-20260521T141425/approaches/behavior-first-random-file-stabilization.md @@ -0,0 +1,47 @@ +# Behavior-First Random File Stabilization + +## Strategy +Use the random-file set to 
strengthen externally visible behavior first (CLI entry, workflow contracts, migration planning artifacts), then prune only internal duplication proven redundant by tests. + +## Why this path +- Aligns with shipped-interface caution in taste. +- Works well when selected files are heterogeneous and don’t justify one deep module refactor. + +## Tradeoffs +- Pros: High confidence on user-visible behavior; straightforward review story. +- Cons: Internal elegance may remain uneven after this migration. + +## Estimated phases + +### Phase 1: Snapshot current behavior with focused regression tests +- Scope: tests touching selected random-file surfaces +- Work: + - Add targeted assertions for entrypoint behavior and any touched migration/planning contract paths. + - Keep tests based on outcomes, not implementation calls. +- required_effort: `low` + +### Phase 2: Internal cleanup behind stable interfaces +- Scope: only random-targeted source files selected for this migration +- Work: + - Remove dead branches/helpers made unnecessary by current contracts. + - Keep error translation only at module boundaries. +- required_effort: `medium` + +### Phase 3: Human-review checkpoint for interface shifts (only if needed) +- Scope: any CLI, repo-written-file, or workflow contract change discovered in Phase 2 +- Work: + - Explicitly document the behavioral delta and rollout impact. + - Gate publish on review acknowledgment. +- required_effort: `high` + +## Risk profile +- Overall: **Low-Medium** +- Main risks: + - Hidden interface drift during cleanup. + - Random-file coupling surfacing late. +- Mitigations: + - Keep phase 1 tests narrow and contract-oriented. + - Escalate to review gate at first interface delta. + +## Best fit conditions +Pick this when reliability and reviewability matter more than maximal code reduction. 
diff --git a/migrations/random-files-20260521T141425/approaches/delete-dead-paths-then-reprove.md b/migrations/random-files-20260521T141425/approaches/delete-dead-paths-then-reprove.md new file mode 100644 index 0000000..bec0b95 --- /dev/null +++ b/migrations/random-files-20260521T141425/approaches/delete-dead-paths-then-reprove.md @@ -0,0 +1,47 @@ +# Delete Dead Paths Then Re-prove + +## Strategy +Aggressively remove fallback/legacy code in random-targeted files, then re-prove required behavior with concise tests and boundary checks. + +## Why this path +- Matches taste preference for deleting unused paths in non-shipped internals. +- Delivers the biggest readability gain per line changed when dead code exists. + +## Tradeoffs +- Pros: Strong simplification; future maintenance gets easier quickly. +- Cons: Highest chance of exposing implicit dependencies that looked unused. + +## Estimated phases + +### Phase 1: Dead-path inventory and dependency check +- Scope: random-targeted files plus direct callers/tests +- Work: + - Identify branches/helpers/tables with no live call path. + - Confirm no external contract depends on them. +- required_effort: `medium` + +### Phase 2: Removal pass with boundary-preserving errors +- Scope: selected source files +- Work: + - Delete dead flags/shims/branches outright. + - Preserve or improve exception nesting only at module boundaries. +- required_effort: `high` + +### Phase 3: Re-proof via focused regression + integration checks +- Scope: relevant `tests/test_*.py` +- Work: + - Add/update tests for post-deletion behavior and side effects. + - Confirm full pytest gate remains green. +- required_effort: `medium` + +## Risk profile +- Overall: **Medium-High** +- Main risks: + - Removing code that encodes undocumented edge behavior. + - Larger diff in mixed-scope random files. +- Mitigations: + - Require explicit evidence of deadness before deletion. + - Keep deletions and proof tests in the same phase boundary. 
+ +## Best fit conditions +Pick this when maintainability pain is from stale fallback logic and the team accepts moderate refactor risk. diff --git a/migrations/random-files-20260521T141425/approaches/minimal-entrypoint-and-license-guardrails.md b/migrations/random-files-20260521T141425/approaches/minimal-entrypoint-and-license-guardrails.md new file mode 100644 index 0000000..fe9d2d0 --- /dev/null +++ b/migrations/random-files-20260521T141425/approaches/minimal-entrypoint-and-license-guardrails.md @@ -0,0 +1,45 @@ +# Minimal Entrypoint and License Guardrails + +## Strategy +Treat this as a low-blast-radius hardening pass: lock current behavior around `__main__.py` and repository license presence, then make only clarity-level cleanup that keeps interfaces unchanged. + +## Why this path +- Best when the target set is small and boundary-facing (`__main__`, `LICENSE`). +- Maximizes safety while still producing measurable cleanup. + +## Tradeoffs +- Pros: Very low regression risk; quick to validate with focused tests. +- Cons: Limited structural payoff; does not unlock larger refactors. + +## Estimated phases + +### Phase 1: Contract tests for entry invocation and package execution +- Scope: `tests/test_main_entrypoint.py` (and adjacent entrypoint tests if needed) +- Work: + - Assert `python -m continuous_refactoring` still routes through `cli.cli_main()`. + - Assert no accidental side effects at import time. +- required_effort: `low` + +### Phase 2: Entrypoint micro-cleanup with unchanged behavior +- Scope: `src/continuous_refactoring/__main__.py` +- Work: + - Keep file minimal and explicit; remove any future drift-prone boilerplate if present. + - Preserve module boundary behavior exactly. +- required_effort: `low` + +### Phase 3: Repository metadata guardrails +- Scope: `LICENSE` and tests/docs only if needed +- Work: + - Add a lightweight test/check that required license text file remains present and non-empty. 
+ - Avoid policy changes or content rewrites unless explicitly intended. +- required_effort: `low` + +## Risk profile +- Overall: **Low** +- Main risks: + - Over-testing trivial behavior and creating brittle tests. +- Mitigations: + - Keep assertions outcome-focused and minimal. + +## Best fit conditions +Pick this when the migration goal is safe hygiene and confidence, not deeper architecture change. diff --git a/migrations/random-files-20260521T141425/manifest.json b/migrations/random-files-20260521T141425/manifest.json new file mode 100644 index 0000000..79c6527 --- /dev/null +++ b/migrations/random-files-20260521T141425/manifest.json @@ -0,0 +1,12 @@ +{ + "awaiting_human_review": false, + "cooldown_until": null, + "created_at": "2026-05-21T14:14:26.016-07:00", + "current_phase": "", + "human_review_reason": null, + "last_touch": "2026-05-21T14:15:21.290-07:00", + "name": "random-files-20260521T141425", + "phases": [], + "status": "planning", + "wake_up_on": null +} From ca44265b539a73f5311db0a078a83b3419dfa6ba Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Thu, 21 May 2026 14:15:36 -0700 Subject: [PATCH 14/41] continuous refactor: planning/random-files-20260521T141425/pick-best Why: planning.pick-best accepted; next step: expand --- .../.planning/stages/pick-best.stdout.md | 17 +++++++++++++++++ .../.planning/state.json | 15 +++++++++++++-- .../random-files-20260521T141425/manifest.json | 2 +- 3 files changed, 31 insertions(+), 3 deletions(-) create mode 100644 migrations/random-files-20260521T141425/.planning/stages/pick-best.stdout.md diff --git a/migrations/random-files-20260521T141425/.planning/stages/pick-best.stdout.md b/migrations/random-files-20260521T141425/.planning/stages/pick-best.stdout.md new file mode 100644 index 0000000..40739c9 --- /dev/null +++ b/migrations/random-files-20260521T141425/.planning/stages/pick-best.stdout.md @@ -0,0 +1,17 @@ +Chosen approach: **`behavior-first-random-file-stabilization`** + +Why this is the best balance: 
+ +- **Risk:** lower than dead-path-first, because it stabilizes externally visible behavior before cleanup and explicitly adds a human-review gate if interface shifts appear. +- **Clarity:** the phase story is clean and reviewable: prove contracts, refactor behind them, escalate only if boundaries move. +- **Incremental verifiability:** each phase has a crisp verification target and natural rollback point; Phase 1 gives immediate confidence for later deletions. + +Why not the others: + +- `delete-dead-paths-then-reprove`: strong cleanup upside, but too much upfront blast radius for a random-file migration; higher chance of uncovering hidden coupling late. +- `minimal-entrypoint-and-license-guardrails`: safest, but likely under-shoots the migration’s refactoring value unless the random set is truly only boundary trivia. + +Taste tie-break application: + +- Taste asks for **aggressive dead-code deletion** in non-shipped internals, but also **compatibility-first handling** and **human review for released interfaces**. +- This approach best reconciles both by sequencing: lock behavior first, then delete safely, then require explicit interface review if needed. 
diff --git a/migrations/random-files-20260521T141425/.planning/state.json b/migrations/random-files-20260521T141425/.planning/state.json index 9423fcb..5ff2b4a 100644 --- a/migrations/random-files-20260521T141425/.planning/state.json +++ b/migrations/random-files-20260521T141425/.planning/state.json @@ -10,16 +10,27 @@ "outputs": { "stdout": "migrations/random-files-20260521T141425/.planning/stages/approaches.stdout.md" } + }, + { + "agent": "codex", + "completed_at": "2026-05-21T14:15:36.621-07:00", + "effort": "low", + "model": "gpt-5.3-codex", + "name": "pick-best", + "outcome": "completed", + "outputs": { + "stdout": "migrations/random-files-20260521T141425/.planning/stages/pick-best.stdout.md" + } } ], "feedback": [], "final_decision": null, "final_reason": null, - "next_step": "pick-best", + "next_step": "expand", "review_findings": null, "revision_base_step_counts": [], "schema_version": 1, "started_at": "2026-05-21T14:14:26.016-07:00", "target": "random files", - "updated_at": "2026-05-21T14:15:21.290-07:00" + "updated_at": "2026-05-21T14:15:36.621-07:00" } diff --git a/migrations/random-files-20260521T141425/manifest.json b/migrations/random-files-20260521T141425/manifest.json index 79c6527..700b8f8 100644 --- a/migrations/random-files-20260521T141425/manifest.json +++ b/migrations/random-files-20260521T141425/manifest.json @@ -4,7 +4,7 @@ "created_at": "2026-05-21T14:14:26.016-07:00", "current_phase": "", "human_review_reason": null, - "last_touch": "2026-05-21T14:15:21.290-07:00", + "last_touch": "2026-05-21T14:15:36.620-07:00", "name": "random-files-20260521T141425", "phases": [], "status": "planning", From d0be6f38ba928c0cfe7af4f979009ed76a035aaf Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Thu, 21 May 2026 14:16:32 -0700 Subject: [PATCH 15/41] continuous refactor: planning/random-files-20260521T141425/expand Why: planning.expand accepted; next step: review --- .../.planning/stages/expand.stdout.md | 8 ++++ .../.planning/state.json | 15 
++++++- .../manifest.json | 31 +++++++++++-- .../phase-1-contract-regression-net.md | 32 ++++++++++++++ ...ase-2-internal-cleanup-behind-contracts.md | 34 ++++++++++++++ .../phase-3-interface-shift-review-gate.md | 34 ++++++++++++++ .../random-files-20260521T141425/plan.md | 44 +++++++++++++++++++ 7 files changed, 193 insertions(+), 5 deletions(-) create mode 100644 migrations/random-files-20260521T141425/.planning/stages/expand.stdout.md create mode 100644 migrations/random-files-20260521T141425/phase-1-contract-regression-net.md create mode 100644 migrations/random-files-20260521T141425/phase-2-internal-cleanup-behind-contracts.md create mode 100644 migrations/random-files-20260521T141425/phase-3-interface-shift-review-gate.md create mode 100644 migrations/random-files-20260521T141425/plan.md diff --git a/migrations/random-files-20260521T141425/.planning/stages/expand.stdout.md b/migrations/random-files-20260521T141425/.planning/stages/expand.stdout.md new file mode 100644 index 0000000..d2b9bea --- /dev/null +++ b/migrations/random-files-20260521T141425/.planning/stages/expand.stdout.md @@ -0,0 +1,8 @@ +Created the planning artifacts in the staged migration workspace: + +- `plan.md` +- `phase-1-contract-regression-net.md` +- `phase-2-internal-cleanup-behind-contracts.md` +- `phase-3-interface-shift-review-gate.md` + +They include numbered phases, dependency ordering, a validation strategy, strict `## Precondition` vs `## Definition of Done` separation, and effort tags/reasons using valid labels (`low`, `medium`, `high`). 
diff --git a/migrations/random-files-20260521T141425/.planning/state.json b/migrations/random-files-20260521T141425/.planning/state.json index 5ff2b4a..5221ca8 100644 --- a/migrations/random-files-20260521T141425/.planning/state.json +++ b/migrations/random-files-20260521T141425/.planning/state.json @@ -21,16 +21,27 @@ "outputs": { "stdout": "migrations/random-files-20260521T141425/.planning/stages/pick-best.stdout.md" } + }, + { + "agent": "codex", + "completed_at": "2026-05-21T14:16:32.756-07:00", + "effort": "low", + "model": "gpt-5.3-codex", + "name": "expand", + "outcome": "completed", + "outputs": { + "stdout": "migrations/random-files-20260521T141425/.planning/stages/expand.stdout.md" + } } ], "feedback": [], "final_decision": null, "final_reason": null, - "next_step": "expand", + "next_step": "review", "review_findings": null, "revision_base_step_counts": [], "schema_version": 1, "started_at": "2026-05-21T14:14:26.016-07:00", "target": "random files", - "updated_at": "2026-05-21T14:15:36.621-07:00" + "updated_at": "2026-05-21T14:16:32.756-07:00" } diff --git a/migrations/random-files-20260521T141425/manifest.json b/migrations/random-files-20260521T141425/manifest.json index 700b8f8..3640145 100644 --- a/migrations/random-files-20260521T141425/manifest.json +++ b/migrations/random-files-20260521T141425/manifest.json @@ -2,11 +2,36 @@ "awaiting_human_review": false, "cooldown_until": null, "created_at": "2026-05-21T14:14:26.016-07:00", - "current_phase": "", + "current_phase": "contract-regression-net", "human_review_reason": null, - "last_touch": "2026-05-21T14:15:36.620-07:00", + "last_touch": "2026-05-21T14:16:32.755-07:00", "name": "random-files-20260521T141425", - "phases": [], + "phases": [ + { + "done": false, + "effort_reason": "Adding focused regression coverage is bounded and low-risk.", + "file": "phase-1-contract-regression-net.md", + "name": "contract-regression-net", + "precondition": "- Migration status is `in-progress` and this phase is the 
manifest `current_phase`. - No earlier migration phase is incomplete. - The random-target file set for this migration is still the intended scope, and the target contracts to lock are identifiable in current code/tests.", + "required_effort": "low" + }, + { + "done": false, + "effort_reason": "Safe deletion/refactor across heterogeneous random files requires careful reasoning against contract tests.", + "file": "phase-2-internal-cleanup-behind-contracts.md", + "name": "internal-cleanup-behind-contracts", + "precondition": "- Phase 1 is complete and its regression tests are present. - The source files planned for cleanup remain within the random-targeted migration scope. - Any helper/symbol slated for deletion has at least one surviving behavior-level test path covering its externally observable effects.", + "required_effort": "medium" + }, + { + "done": false, + "effort_reason": "Interface-change triage and review gating is high-stakes and must be exact.", + "file": "phase-3-interface-shift-review-gate.md", + "name": "interface-shift-review-gate", + "precondition": "- Phase 2 is complete. - A concrete interface behavior delta is identified and reproducible. - The migration includes clear artifacts describing the before/after behavior and rollout impact.", + "required_effort": "high" + } + ], "status": "planning", "wake_up_on": null } diff --git a/migrations/random-files-20260521T141425/phase-1-contract-regression-net.md b/migrations/random-files-20260521T141425/phase-1-contract-regression-net.md new file mode 100644 index 0000000..0e4d538 --- /dev/null +++ b/migrations/random-files-20260521T141425/phase-1-contract-regression-net.md @@ -0,0 +1,32 @@ +# Phase 1: Contract Regression Net + +## Goal +Capture and lock current externally visible behavior in focused regression tests for the random-file surfaces touched by this migration. + +## Scope +- Tests only. 
+- Files under `tests/` that exercise selected random-file contracts (CLI behavior, migration/planning artifact behavior, and other user-observable outcomes touched by this migration target). +- No production behavior changes in this phase. + +## Precondition +- Migration status is `in-progress` and this phase is the manifest `current_phase`. +- No earlier migration phase is incomplete. +- The random-target file set for this migration is still the intended scope, and the target contracts to lock are identifiable in current code/tests. + +## Implementation Instructions +1. Identify externally visible behaviors in the random-targeted surfaces that later cleanup could accidentally change. +2. Add or tighten outcome-based regression tests for those behaviors. +3. Prefer real collaborators and filesystem/git fixtures already used by the suite; avoid interaction-call assertions and unnecessary mocks. +4. Keep assertions precise enough to catch interface drift, especially around CLI outputs/errors and planning/migration artifact semantics. + +## Validation Steps +1. Run the focused tests added/updated for this phase. +2. Run the configured full validation command for the repository. + +## Definition of Done +- A focused regression net exists for the random-targeted externally visible behaviors that this migration may affect. +- Added/updated tests fail when the protected behavior is intentionally broken and pass in the intended implementation. +- The full configured validation command passes. + +required_effort: low +effort_reason: Adding focused regression coverage is bounded and low-risk. 
diff --git a/migrations/random-files-20260521T141425/phase-2-internal-cleanup-behind-contracts.md b/migrations/random-files-20260521T141425/phase-2-internal-cleanup-behind-contracts.md new file mode 100644 index 0000000..c5b93b9 --- /dev/null +++ b/migrations/random-files-20260521T141425/phase-2-internal-cleanup-behind-contracts.md @@ -0,0 +1,34 @@ +# Phase 2: Internal Cleanup Behind Contracts + +## Goal +Refactor random-targeted internals and delete dead/duplicative paths while preserving behavior proven by Phase 1. + +## Scope +- Only random-targeted source files selected for this migration. +- Internal simplification, dead-path deletion, and readability improvements. +- No intentional changes to released interfaces (CLI behavior, XDG/project state layout, repo-written artifact contracts, migration manifest shape, or other system interaction contracts). + +## Precondition +- Phase 1 is complete and its regression tests are present. +- The source files planned for cleanup remain within the random-targeted migration scope. +- Any helper/symbol slated for deletion has at least one surviving behavior-level test path covering its externally observable effects. + +## Implementation Instructions +1. Remove dead branches/helpers and redundant fallback code now covered by the Phase 1 regression net. +2. Keep module-boundary error translation with nested exceptions; do not introduce intra-module translation churn that hides signal. +3. Prefer small, readability-first abstractions and straightforward control flow; avoid speculative interfaces for single implementations. +4. If cleanup reveals a required interface behavior change, stop interface mutation work and route that delta to Phase 3. + +## Validation Steps +1. Run targeted tests covering the cleaned paths, including Phase 1 regression tests. +2. Run the configured full validation command. +3. Verify no unintended interface behavior delta remains undocumented. 
+ +## Definition of Done +- Dead/duplicative internal paths in scoped random-targeted files are removed or simplified without regressing locked behavior. +- Externally visible behavior covered in Phase 1 remains unchanged. +- Any discovered intentional interface change is isolated and explicitly deferred to Phase 3 review gating. +- The full configured validation command passes. + +required_effort: medium +effort_reason: Safe deletion/refactor across heterogeneous random files requires careful reasoning against contract tests. diff --git a/migrations/random-files-20260521T141425/phase-3-interface-shift-review-gate.md b/migrations/random-files-20260521T141425/phase-3-interface-shift-review-gate.md new file mode 100644 index 0000000..f0308c0 --- /dev/null +++ b/migrations/random-files-20260521T141425/phase-3-interface-shift-review-gate.md @@ -0,0 +1,34 @@ +# Phase 3: Interface-Shift Review Gate + +## Goal +Apply explicit human review gating for any interface behavior change introduced or required by the migration. + +## Scope +- Runs only when Phase 2 identifies an interface delta. +- Documentation and manifest/review-state updates needed to gate automation until human approval. +- Interface deltas include CLI behavior, repo-written files, XDG/project state behavior, migration manifest structure, and other user/install-visible contract changes. + +## Precondition +- Phase 2 is complete. +- A concrete interface behavior delta is identified and reproducible. +- The migration includes clear artifacts describing the before/after behavior and rollout impact. + +## Implementation Instructions +1. Document the exact interface change in migration artifacts with concrete before/after behavior. +2. Record explicit human-review-needed messaging naming the specific interface contract shift and user impact. +3. Ensure automation remains gated (`awaiting_human_review`) until canonical migration review clears it. +4. 
Keep technical changes minimal in this phase: this is a review gate, not a broad additional refactor. + +## Validation Steps +1. Verify review-facing artifacts clearly describe the interface delta and impact. +2. Verify migration state correctly reflects human-review gating. +3. Run the configured full validation command after any code/artifact updates. + +## Definition of Done +- Every interface behavior change introduced by this migration is explicitly documented with concrete impact. +- Human-review gating is active and unambiguous until review approval. +- No generic "needs review" text is used where interface-specific messaging is required. +- The full configured validation command passes. + +required_effort: high +effort_reason: Interface-change triage and review gating is high-stakes and must be exact. diff --git a/migrations/random-files-20260521T141425/plan.md b/migrations/random-files-20260521T141425/plan.md new file mode 100644 index 0000000..c24847b --- /dev/null +++ b/migrations/random-files-20260521T141425/plan.md @@ -0,0 +1,44 @@ +# Migration Plan: behavior-first-random-file-stabilization + +## Objective +Stabilize externally visible behavior for this random-file migration before internal cleanup, then prune internal dead paths behind those proven contracts, and require explicit human review if shipped interfaces change. + +## Phase Sequence +1. Phase 1 — Contract Regression Net +2. Phase 2 — Internal Cleanup Behind Contracts +3. Phase 3 — Interface-Shift Review Gate (conditional) + +## Dependencies +- Phase 1 has no migration-phase dependency. +- Phase 2 depends on Phase 1 completion. +- Phase 3 depends on Phase 2 and runs only if Phase 2 introduces interface behavior changes. 
+ +## Dependency Graph +```mermaid +graph TD + P1[Phase 1: Contract Regression Net] --> P2[Phase 2: Internal Cleanup Behind Contracts] + P2 --> P3["Phase 3: Interface-Shift Review Gate (Conditional)"] +``` + +## Phase Details Index +- [phase-1-contract-regression-net.md](phase-1-contract-regression-net.md) +- [phase-2-internal-cleanup-behind-contracts.md](phase-2-internal-cleanup-behind-contracts.md) +- [phase-3-interface-shift-review-gate.md](phase-3-interface-shift-review-gate.md) + +## Validation Strategy +- Baseline green is enforced by the harness before refactoring and after each completed phase. +- Each phase adds phase-local verification: + - Phase 1 proves current behavior with focused outcome-based regression tests for selected random-file interfaces. + - Phase 2 validates cleanup safety by passing the new regression net and full configured validation command. + - Phase 3 validates that any interface change is explicitly documented and routed for human review before further automation. +- Every phase ends in a shippable state: no partial interface contract changes without either preserving behavior (Phase 2) or gating through explicit review (Phase 3). + +## Risk Controls +- Front-load contract evidence so later deletions are constrained. +- Keep cleanup scoped to random-targeted internal files and remove only paths made redundant by validated contracts. +- Escalate interface deltas to a clear human-review gate with explicit behavior-change text (no generic review messages). + +## Out of Scope +- Broad structural refactors outside random-targeted files. +- Speculative interface redesigns not required by discovered defects. +- Release/version workflow changes. 
From 1402a115f6d4f3186790a02da3a3e7c89b524109 Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Thu, 21 May 2026 14:17:14 -0700 Subject: [PATCH 16/41] continuous refactor: planning/random-files-20260521T141425/review Why: planning.review accepted; next step: revise --- .../.planning/stages/review.stdout.md | 5 +++++ .../.planning/state.json | 17 ++++++++++++++--- .../random-files-20260521T141425/manifest.json | 2 +- 3 files changed, 20 insertions(+), 4 deletions(-) create mode 100644 migrations/random-files-20260521T141425/.planning/stages/review.stdout.md diff --git a/migrations/random-files-20260521T141425/.planning/stages/review.stdout.md b/migrations/random-files-20260521T141425/.planning/stages/review.stdout.md new file mode 100644 index 0000000..13a494a --- /dev/null +++ b/migrations/random-files-20260521T141425/.planning/stages/review.stdout.md @@ -0,0 +1,5 @@ +1. `phase-3-interface-shift-review-gate.md`: `required_effort: high` looks over-tiered for the scoped work described (documentation + gating-state correctness + validation). This phase explicitly says “keep technical changes minimal,” so `medium` is likely the lowest safe tier; keep `high` only if you expect nontrivial code-path changes during gating. + +2. `phase-1-contract-regression-net.md` precondition is partially soft: “target contracts to lock are identifiable in current code/tests.” That’s directionally right, but it’s not a concrete gate. Tighten it to an observable artifact check (for example, an explicit list of target contracts captured in phase notes/plan artifact) so start/no-start is binary. + +3. `phase-2-internal-cleanup-behind-contracts.md` precondition “helper/symbol slated for deletion has at least one surviving behavior-level test path…” mixes start-gating with implementation-time proof work. That evidence is usually established during execution, not before start. Move this to Definition of Done or validation criteria, and keep precondition to dependency/scope readiness only. 
diff --git a/migrations/random-files-20260521T141425/.planning/state.json b/migrations/random-files-20260521T141425/.planning/state.json index 5221ca8..443b28f 100644 --- a/migrations/random-files-20260521T141425/.planning/state.json +++ b/migrations/random-files-20260521T141425/.planning/state.json @@ -32,16 +32,27 @@ "outputs": { "stdout": "migrations/random-files-20260521T141425/.planning/stages/expand.stdout.md" } + }, + { + "agent": "codex", + "completed_at": "2026-05-21T14:17:14.331-07:00", + "effort": "low", + "model": "gpt-5.3-codex", + "name": "review", + "outcome": "findings", + "outputs": { + "stdout": "migrations/random-files-20260521T141425/.planning/stages/review.stdout.md" + } } ], "feedback": [], "final_decision": null, "final_reason": null, - "next_step": "review", - "review_findings": null, + "next_step": "revise", + "review_findings": "migrations/random-files-20260521T141425/.planning/stages/review.stdout.md", "revision_base_step_counts": [], "schema_version": 1, "started_at": "2026-05-21T14:14:26.016-07:00", "target": "random files", - "updated_at": "2026-05-21T14:16:32.756-07:00" + "updated_at": "2026-05-21T14:17:14.331-07:00" } diff --git a/migrations/random-files-20260521T141425/manifest.json b/migrations/random-files-20260521T141425/manifest.json index 3640145..07c2cdb 100644 --- a/migrations/random-files-20260521T141425/manifest.json +++ b/migrations/random-files-20260521T141425/manifest.json @@ -4,7 +4,7 @@ "created_at": "2026-05-21T14:14:26.016-07:00", "current_phase": "contract-regression-net", "human_review_reason": null, - "last_touch": "2026-05-21T14:16:32.755-07:00", + "last_touch": "2026-05-21T14:17:14.330-07:00", "name": "random-files-20260521T141425", "phases": [ { From e775fb44f1f7c90d5d8f79eee35ef326f3abe716 Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Thu, 21 May 2026 14:18:25 -0700 Subject: [PATCH 17/41] continuous refactor: planning/random-files-20260521T141425/revise Why: planning.revise accepted; next step: 
review-2 --- .../.planning/stages/revise.stdout.md | 14 +++++++ .../.planning/state.json | 15 ++++++- .../manifest.json | 16 ++++---- .../phase-1-contract-regression-net.md | 30 +++++++------- ...ase-2-internal-cleanup-behind-contracts.md | 34 ++++++++-------- .../phase-3-interface-shift-review-gate.md | 36 ++++++++--------- .../random-files-20260521T141425/plan.md | 40 +++++++++---------- 7 files changed, 106 insertions(+), 79 deletions(-) create mode 100644 migrations/random-files-20260521T141425/.planning/stages/revise.stdout.md diff --git a/migrations/random-files-20260521T141425/.planning/stages/revise.stdout.md b/migrations/random-files-20260521T141425/.planning/stages/revise.stdout.md new file mode 100644 index 0000000..2aadb33 --- /dev/null +++ b/migrations/random-files-20260521T141425/.planning/stages/revise.stdout.md @@ -0,0 +1,14 @@ +Updated in the staged migration workspace: + +- `plan.md` +- `phase-1-contract-regression-net.md` +- `phase-2-internal-cleanup-behind-contracts.md` +- `phase-3-interface-shift-review-gate.md` + +Review findings addressed: + +1. Phase 3 effort lowered from `high` to `medium` with rationale aligned to minimal technical scope. +2. Phase 1 precondition tightened to a binary artifact gate: required `phase-1-contract-inventory.md`. +3. Phase 2 moved deletion-proof requirement out of precondition into validation/DoD; precondition is now dependency/scope readiness only. + +The phases remain ordered for risk reduction, independently verifiable, and shippable at each step. 
diff --git a/migrations/random-files-20260521T141425/.planning/state.json b/migrations/random-files-20260521T141425/.planning/state.json index 443b28f..ef9d66c 100644 --- a/migrations/random-files-20260521T141425/.planning/state.json +++ b/migrations/random-files-20260521T141425/.planning/state.json @@ -43,16 +43,27 @@ "outputs": { "stdout": "migrations/random-files-20260521T141425/.planning/stages/review.stdout.md" } + }, + { + "agent": "codex", + "completed_at": "2026-05-21T14:18:25.372-07:00", + "effort": "low", + "model": "gpt-5.3-codex", + "name": "revise", + "outcome": "completed", + "outputs": { + "stdout": "migrations/random-files-20260521T141425/.planning/stages/revise.stdout.md" + } } ], "feedback": [], "final_decision": null, "final_reason": null, - "next_step": "revise", + "next_step": "review-2", "review_findings": "migrations/random-files-20260521T141425/.planning/stages/review.stdout.md", "revision_base_step_counts": [], "schema_version": 1, "started_at": "2026-05-21T14:14:26.016-07:00", "target": "random files", - "updated_at": "2026-05-21T14:17:14.331-07:00" + "updated_at": "2026-05-21T14:18:25.372-07:00" } diff --git a/migrations/random-files-20260521T141425/manifest.json b/migrations/random-files-20260521T141425/manifest.json index 07c2cdb..e75109d 100644 --- a/migrations/random-files-20260521T141425/manifest.json +++ b/migrations/random-files-20260521T141425/manifest.json @@ -4,32 +4,32 @@ "created_at": "2026-05-21T14:14:26.016-07:00", "current_phase": "contract-regression-net", "human_review_reason": null, - "last_touch": "2026-05-21T14:17:14.330-07:00", + "last_touch": "2026-05-21T14:18:25.372-07:00", "name": "random-files-20260521T141425", "phases": [ { "done": false, - "effort_reason": "Adding focused regression coverage is bounded and low-risk.", + "effort_reason": "Focused test and artifact work with bounded code movement.", "file": "phase-1-contract-regression-net.md", "name": "contract-regression-net", - "precondition": "- Migration 
status is `in-progress` and this phase is the manifest `current_phase`. - No earlier migration phase is incomplete. - The random-target file set for this migration is still the intended scope, and the target contracts to lock are identifiable in current code/tests.", + "precondition": "- Migration status is `in-progress` and this phase is the manifest `current_phase`. - No earlier migration phase is incomplete. - A contract inventory artifact exists at `phase-1-contract-inventory.md` and lists the concrete behaviors this phase will lock.", "required_effort": "low" }, { "done": false, - "effort_reason": "Safe deletion/refactor across heterogeneous random files requires careful reasoning against contract tests.", + "effort_reason": "Internal deletions across heterogeneous random files need careful contract-preserving reasoning.", "file": "phase-2-internal-cleanup-behind-contracts.md", "name": "internal-cleanup-behind-contracts", - "precondition": "- Phase 1 is complete and its regression tests are present. - The source files planned for cleanup remain within the random-targeted migration scope. - Any helper/symbol slated for deletion has at least one surviving behavior-level test path covering its externally observable effects.", + "precondition": "- Phase 1 is complete. - `phase-1-contract-inventory.md` exists and Phase 1 regression coverage is present in the repository. - Candidate source edits remain inside the random-targeted migration scope.", "required_effort": "medium" }, { "done": false, - "effort_reason": "Interface-change triage and review gating is high-stakes and must be exact.", + "effort_reason": "Primarily documentation plus gating-state correctness with limited code-path change.", "file": "phase-3-interface-shift-review-gate.md", "name": "interface-shift-review-gate", - "precondition": "- Phase 2 is complete. - A concrete interface behavior delta is identified and reproducible. 
- The migration includes clear artifacts describing the before/after behavior and rollout impact.", - "required_effort": "high" + "precondition": "- Phase 2 is complete. - At least one concrete interface behavior delta is documented with reproducible before/after behavior. - The migration still has `awaiting_human_review` unset at phase start, so this phase can set and verify the gate.", + "required_effort": "medium" } ], "status": "planning", diff --git a/migrations/random-files-20260521T141425/phase-1-contract-regression-net.md b/migrations/random-files-20260521T141425/phase-1-contract-regression-net.md index 0e4d538..e3a227a 100644 --- a/migrations/random-files-20260521T141425/phase-1-contract-regression-net.md +++ b/migrations/random-files-20260521T141425/phase-1-contract-regression-net.md @@ -1,32 +1,34 @@ # Phase 1: Contract Regression Net ## Goal -Capture and lock current externally visible behavior in focused regression tests for the random-file surfaces touched by this migration. +Capture and lock currently expected externally visible behavior for random-targeted surfaces before cleanup begins. ## Scope -- Tests only. -- Files under `tests/` that exercise selected random-file contracts (CLI behavior, migration/planning artifact behavior, and other user-observable outcomes touched by this migration target). -- No production behavior changes in this phase. +- Test files under `tests/` that exercise random-targeted user-visible behavior. +- Planning artifact update that records exactly which contracts are locked in this phase. +- No production behavior changes. ## Precondition - Migration status is `in-progress` and this phase is the manifest `current_phase`. - No earlier migration phase is incomplete. -- The random-target file set for this migration is still the intended scope, and the target contracts to lock are identifiable in current code/tests. 
+- A contract inventory artifact exists at `phase-1-contract-inventory.md` and lists the concrete behaviors this phase will lock. ## Implementation Instructions -1. Identify externally visible behaviors in the random-targeted surfaces that later cleanup could accidentally change. -2. Add or tighten outcome-based regression tests for those behaviors. -3. Prefer real collaborators and filesystem/git fixtures already used by the suite; avoid interaction-call assertions and unnecessary mocks. -4. Keep assertions precise enough to catch interface drift, especially around CLI outputs/errors and planning/migration artifact semantics. +1. Build/update `phase-1-contract-inventory.md` with explicit contract bullets (surface, expected behavior, and where it is asserted). +2. Add or tighten outcome-based regression tests for each listed contract. +3. Prefer existing fixtures and real collaborators; avoid mock-heavy interaction assertions. +4. Keep assertions strict enough to detect interface drift in CLI behavior, planning/migration artifact behavior, and other scoped observable outcomes. ## Validation Steps -1. Run the focused tests added/updated for this phase. -2. Run the configured full validation command for the repository. +1. Run focused tests updated for the listed contracts. +2. Demonstrate each new/updated contract test fails when its protected behavior is intentionally broken. +3. Run the configured full validation command. ## Definition of Done -- A focused regression net exists for the random-targeted externally visible behaviors that this migration may affect. -- Added/updated tests fail when the protected behavior is intentionally broken and pass in the intended implementation. +- `phase-1-contract-inventory.md` exists and maps each scoped contract to specific regression coverage. +- Regression tests for listed contracts pass in the intended implementation. 
+- Evidence was collected during execution that intentionally breaking each protected behavior causes the corresponding test to fail. - The full configured validation command passes. required_effort: low -effort_reason: Adding focused regression coverage is bounded and low-risk. +effort_reason: Focused test and artifact work with bounded code movement. diff --git a/migrations/random-files-20260521T141425/phase-2-internal-cleanup-behind-contracts.md b/migrations/random-files-20260521T141425/phase-2-internal-cleanup-behind-contracts.md index c5b93b9..78a3fc8 100644 --- a/migrations/random-files-20260521T141425/phase-2-internal-cleanup-behind-contracts.md +++ b/migrations/random-files-20260521T141425/phase-2-internal-cleanup-behind-contracts.md @@ -1,34 +1,34 @@ # Phase 2: Internal Cleanup Behind Contracts ## Goal -Refactor random-targeted internals and delete dead/duplicative paths while preserving behavior proven by Phase 1. +Simplify random-targeted internals and delete dead/redundant paths while preserving Phase 1 locked behavior. ## Scope - Only random-targeted source files selected for this migration. -- Internal simplification, dead-path deletion, and readability improvements. -- No intentional changes to released interfaces (CLI behavior, XDG/project state layout, repo-written artifact contracts, migration manifest shape, or other system interaction contracts). +- Internal readability improvements, dead-path deletion, and control-flow simplification. +- No intentional change to released interfaces (CLI behavior, repo-written files, XDG/project state, migration manifest structure, or other install-visible contracts). ## Precondition -- Phase 1 is complete and its regression tests are present. -- The source files planned for cleanup remain within the random-targeted migration scope. -- Any helper/symbol slated for deletion has at least one surviving behavior-level test path covering its externally observable effects. +- Phase 1 is complete. 
+- `phase-1-contract-inventory.md` exists and Phase 1 regression coverage is present in the repository. +- Candidate source edits remain inside the random-targeted migration scope. ## Implementation Instructions -1. Remove dead branches/helpers and redundant fallback code now covered by the Phase 1 regression net. -2. Keep module-boundary error translation with nested exceptions; do not introduce intra-module translation churn that hides signal. -3. Prefer small, readability-first abstractions and straightforward control flow; avoid speculative interfaces for single implementations. -4. If cleanup reveals a required interface behavior change, stop interface mutation work and route that delta to Phase 3. +1. Remove dead branches/helpers/fallback paths made unnecessary by current behavior contracts. +2. Keep boundary error translation with exception nesting only at module boundaries. +3. Use small readability-first abstractions only when they reduce repetition or branch complexity. +4. If an interface behavior must change, isolate and document that delta for Phase 3 instead of blending it into broad cleanup. ## Validation Steps -1. Run targeted tests covering the cleaned paths, including Phase 1 regression tests. -2. Run the configured full validation command. -3. Verify no unintended interface behavior delta remains undocumented. +1. Run targeted tests that cover cleaned paths, including all Phase 1 contract tests. +2. Confirm each deleted helper/symbol still has its externally observable behavior protected by surviving behavior-level test paths. +3. Run the configured full validation command. ## Definition of Done -- Dead/duplicative internal paths in scoped random-targeted files are removed or simplified without regressing locked behavior. -- Externally visible behavior covered in Phase 1 remains unchanged. -- Any discovered intentional interface change is isolated and explicitly deferred to Phase 3 review gating. 
+- Scoped internal dead/redundant paths are removed or simplified without regressing locked behavior. +- Surviving behavior-level tests cover externally observable effects previously provided by removed helpers/symbols. +- Any intentional interface delta is explicitly documented and handed off to Phase 3. - The full configured validation command passes. required_effort: medium -effort_reason: Safe deletion/refactor across heterogeneous random files requires careful reasoning against contract tests. +effort_reason: Internal deletions across heterogeneous random files need careful contract-preserving reasoning. diff --git a/migrations/random-files-20260521T141425/phase-3-interface-shift-review-gate.md b/migrations/random-files-20260521T141425/phase-3-interface-shift-review-gate.md index f0308c0..635797a 100644 --- a/migrations/random-files-20260521T141425/phase-3-interface-shift-review-gate.md +++ b/migrations/random-files-20260521T141425/phase-3-interface-shift-review-gate.md @@ -1,34 +1,34 @@ # Phase 3: Interface-Shift Review Gate ## Goal -Apply explicit human review gating for any interface behavior change introduced or required by the migration. +Gate any interface behavior change behind explicit, interface-specific human review before automation continues. ## Scope -- Runs only when Phase 2 identifies an interface delta. -- Documentation and manifest/review-state updates needed to gate automation until human approval. -- Interface deltas include CLI behavior, repo-written files, XDG/project state behavior, migration manifest structure, and other user/install-visible contract changes. +- Executes only when Phase 2 identifies an interface behavior delta. +- Documentation and migration state updates required to communicate and enforce review gating. +- Technical code changes are minimal and limited to what is necessary for correct gating-state behavior. ## Precondition - Phase 2 is complete. -- A concrete interface behavior delta is identified and reproducible. 
-- The migration includes clear artifacts describing the before/after behavior and rollout impact. +- At least one concrete interface behavior delta is documented with reproducible before/after behavior. +- The migration still has `awaiting_human_review` unset at phase start, so this phase can set and verify the gate. ## Implementation Instructions -1. Document the exact interface change in migration artifacts with concrete before/after behavior. -2. Record explicit human-review-needed messaging naming the specific interface contract shift and user impact. -3. Ensure automation remains gated (`awaiting_human_review`) until canonical migration review clears it. -4. Keep technical changes minimal in this phase: this is a review gate, not a broad additional refactor. +1. Document each interface shift with concrete before/after behavior and user/install impact. +2. Add explicit review messaging that names the exact interface contract change; avoid generic "needs review" wording. +3. Set and verify `awaiting_human_review` gating so automation remains paused until canonical migration review approval. +4. Keep non-gating technical churn out of this phase. ## Validation Steps -1. Verify review-facing artifacts clearly describe the interface delta and impact. -2. Verify migration state correctly reflects human-review gating. -3. Run the configured full validation command after any code/artifact updates. +1. Verify review artifacts clearly describe each interface delta and impact. +2. Verify migration gating state correctly reflects pending human review. +3. Run the configured full validation command after artifact/state updates. ## Definition of Done -- Every interface behavior change introduced by this migration is explicitly documented with concrete impact. -- Human-review gating is active and unambiguous until review approval. -- No generic "needs review" text is used where interface-specific messaging is required. 
+- All interface behavior changes discovered in this migration are documented with concrete impact statements. +- Human-review gating is active, explicit, and tied to the named interface deltas. +- Review text is interface-specific and non-generic. - The full configured validation command passes. -required_effort: high -effort_reason: Interface-change triage and review gating is high-stakes and must be exact. +required_effort: medium +effort_reason: Primarily documentation plus gating-state correctness with limited code-path change. diff --git a/migrations/random-files-20260521T141425/plan.md b/migrations/random-files-20260521T141425/plan.md index c24847b..9fd8170 100644 --- a/migrations/random-files-20260521T141425/plan.md +++ b/migrations/random-files-20260521T141425/plan.md @@ -1,17 +1,17 @@ # Migration Plan: behavior-first-random-file-stabilization ## Objective -Stabilize externally visible behavior for this random-file migration before internal cleanup, then prune internal dead paths behind those proven contracts, and require explicit human review if shipped interfaces change. +Stabilize externally visible behavior for the random-file target first, perform internal cleanup behind that locked behavior, and gate any interface shift behind explicit human review. ## Phase Sequence -1. Phase 1 — Contract Regression Net -2. Phase 2 — Internal Cleanup Behind Contracts -3. Phase 3 — Interface-Shift Review Gate (conditional) +1. Phase 1 - Contract Regression Net +2. Phase 2 - Internal Cleanup Behind Contracts +3. Phase 3 - Interface-Shift Review Gate (conditional) ## Dependencies -- Phase 1 has no migration-phase dependency. +- Phase 1 has no phase dependency. - Phase 2 depends on Phase 1 completion. -- Phase 3 depends on Phase 2 and runs only if Phase 2 introduces interface behavior changes. +- Phase 3 depends on Phase 2 completion and executes only when an interface behavior change exists. 
## Dependency Graph ```mermaid @@ -20,25 +20,25 @@ graph TD P2 --> P3[Phase 3: Interface-Shift Review Gate (Conditional)] ``` -## Phase Details Index +## Phase Artifacts - [phase-1-contract-regression-net.md](phase-1-contract-regression-net.md) - [phase-2-internal-cleanup-behind-contracts.md](phase-2-internal-cleanup-behind-contracts.md) - [phase-3-interface-shift-review-gate.md](phase-3-interface-shift-review-gate.md) ## Validation Strategy -- Baseline green is enforced by the harness before refactoring and after each completed phase. -- Each phase adds phase-local verification: - - Phase 1 proves current behavior with focused outcome-based regression tests for selected random-file interfaces. - - Phase 2 validates cleanup safety by passing the new regression net and full configured validation command. - - Phase 3 validates that any interface change is explicitly documented and routed for human review before further automation. -- Every phase ends in a shippable state: no partial interface contract changes without either preserving behavior (Phase 2) or gating through explicit review (Phase 3). - -## Risk Controls -- Front-load contract evidence so later deletions are constrained. -- Keep cleanup scoped to random-targeted internal files and remove only paths made redundant by validated contracts. -- Escalate interface deltas to a clear human-review gate with explicit behavior-change text (no generic review messages). +- Harness baseline guarantees configured validation is green before refactoring and after each completed phase. +- Each phase adds independent, phase-local checks: + - Phase 1: records a concrete contract inventory and adds outcome-focused regression coverage for those contracts. + - Phase 2: proves internal deletion/simplification keeps locked behavior stable and documents any discovered interface delta. + - Phase 3: verifies interface-delta documentation quality plus correct human-review gating state. 
+- A phase is complete only when its Definition of Done is met and the configured validation command passes. + +## Risk Reduction Order +- Front-load behavior locking so later cleanup has a hard safety rail. +- Restrict cleanup to scoped random-target internals and remove stale paths only when protected by behavior checks. +- Isolate interface-risk work into a dedicated review-gated phase so repository state remains shippable. ## Out of Scope -- Broad structural refactors outside random-targeted files. -- Speculative interface redesigns not required by discovered defects. +- Structural refactors outside the random-target file set. +- Speculative interface redesign. - Release/version workflow changes. From d743a635e6e464b289f10fd8eb68a26d8cc7211d Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Fri, 22 May 2026 16:11:08 -0700 Subject: [PATCH 18/41] fix planning bug with refine --- AGENTS.md | 2 +- src/continuous_refactoring/planning.py | 4 +- src/continuous_refactoring/planning_state.py | 13 +++-- tests/test_planning.py | 53 ++++++++++++++++++++ tests/test_planning_state.py | 12 ++--- 5 files changed, 72 insertions(+), 12 deletions(-) diff --git a/AGENTS.md b/AGENTS.md index efc255a..f9b5b8e 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -128,7 +128,7 @@ active phase explicitly names `loop.py` in scope. - **Migration tick deferral writes** (`migration_tick.py`) — ready-check deferrals are queued while scanning candidates and saved only when the tick finds no executable phase or blocks for human review. Do not save a deferred manifest before checking later candidates; that dirties the worktree and can make ready-checks reject runnable phases. - **Migration visibility + consistency gate** (`migration_consistency.py`, `migration_tick.py`, `loop.py`, `review_cli.py`) — candidate scans use `iter_visible_migration_dirs()` so hidden/dotted/internal/symlink dirs are invisible to tick/list/review commands. 
Before ready-check, `execution-gate` consistency errors block phase execution; `info`/`warning` never block. - **Manifest codec boundary** (`migration_manifest_codec.py`, `migrations.py`) — codec owns legacy `ready_when`, legacy integer `current_phase`, duplicate phase-name rejection, and saved JSON formatting. `load_manifest()` / `save_manifest()` own filesystem and JSON boundary errors. -- **Planning state codec boundary** (`planning_state.py`, `planning.py`) — `.planning/state.json` is valid only when completed steps replay through the branching planning graph to `next_step`; recorded outputs must be repo-relative files inside the migration directory. User refinement feedback is durable state, and append-only `revision_base_step_counts` anchors let unexecuted ready migrations reuse `revise` after terminal ready decisions; legacy `revision_base_step_count` decodes as one anchor. Persist accepted step stdout after the step is validated; do not add durable fields for failed current-step output. +- **Planning state codec boundary** (`planning_state.py`, `planning.py`) — `.planning/state.json` is valid only when completed steps replay through the branching planning graph to `next_step`; recorded outputs must be repo-relative files inside the migration directory. User refinement feedback is durable state, and append-only `revision_base_step_counts` anchors let refinement reuse `revise` from review cursors or unexecuted ready terminal decisions; legacy `revision_base_step_count` decodes as one anchor. Persist accepted step stdout after the step is validated; do not add durable fields for failed current-step output. - **Planning publish transaction** (`planning_publish.py`) — publish copies the complete workspace snapshot to `__transactions__//staged`, validates it, checks same-device and `base_snapshot_id`, moves live to `rollback`, moves staged live, validates live, then deletes rollback. On post-rollback failure, move bad live to `failed` before restoring rollback. 
Transaction directories are invisible to scheduling/list candidates but visible to `migration doctor --all`. Do not bypass the lock or dirty-live check. - **One-step planning engine** (`planning.py`) — product planning entry points call `run_next_planning_step()` so one action runs exactly `PlanningState.next_step`, records accepted stdout/state in an off-live workspace, and publishes through `planning_publish.py`. Failed current-step output is never durable resume input. `run_planning` is intentionally not package-exported. - **Planning resume scheduling** (`migration_tick.py`, `loop.py`, `routing_pipeline.py`) — normal automation runs one eligible `status: planning` step before ready/in-progress phase ticks and before source-target routing. Missing or invalid `.planning/state.json` blocks automation with planning failure evidence; `status: planning` must never enter phase ready-check or phase execution. diff --git a/src/continuous_refactoring/planning.py b/src/continuous_refactoring/planning.py index 88b3676..5abfe4c 100644 --- a/src/continuous_refactoring/planning.py +++ b/src/continuous_refactoring/planning.py @@ -74,6 +74,7 @@ _EFFORT_REASON_LINE_RE = re.compile( r"^effort_reason:\s*(.+)$", re.IGNORECASE | re.MULTILINE, ) +_REFINE_REOPEN_STEPS = frozenset(("review", "review-2", "final-review")) @dataclass(frozen=True) @@ -1010,8 +1011,9 @@ def _prepare_refine_state( ) -> tuple[MigrationManifest, PlanningState]: _require_refine_eligible(manifest) state = append_planning_feedback(state, feedback_text, feedback_source) - if manifest.status == "ready": + if manifest.status == "ready" or state.next_step in _REFINE_REOPEN_STEPS: state = reopen_planning_for_revise(state) + if manifest.status == "ready": manifest = _refresh_manifest( manifest, workspace_root / "manifest.json", diff --git a/src/continuous_refactoring/planning_state.py b/src/continuous_refactoring/planning_state.py index 6f8ac5e..1aea233 100644 --- a/src/continuous_refactoring/planning_state.py +++ 
b/src/continuous_refactoring/planning_state.py @@ -72,6 +72,13 @@ _STEP_OUTCOMES: tuple[str, ...] = cast(tuple[str, ...], get_args(PlanningStepOutcome)) _COMPLETED_OUTCOME = "completed" +_REOPENABLE_REVISE_CURSORS = ( + "review", + "review-2", + "final-review", + "terminal-ready", + "terminal-ready-awaiting-human", +) _TERMINAL_BY_DECISION: dict[str, TerminalPlanningCursor] = { "approve-auto": "terminal-ready", "approve-needs-human": "terminal-ready-awaiting-human", @@ -265,7 +272,7 @@ def reopen_planning_for_revise( now: str | None = None, ) -> PlanningState: replay = _replay_details(state) - if replay.next_step not in ("terminal-ready", "terminal-ready-awaiting-human"): + if replay.next_step not in _REOPENABLE_REVISE_CURSORS: raise ContinuousRefactorError( f"Cannot reopen planning state at {replay.next_step!r} for revise" ) @@ -544,10 +551,10 @@ def _validate_revision_base_step_counts(state: PlanningState) -> None: def _reopen_cursor( cursor: PlanningCursor, ) -> tuple[PlanningCursor, str | None, FinalPlanningDecision | None]: - if cursor not in ("terminal-ready", "terminal-ready-awaiting-human"): + if cursor not in _REOPENABLE_REVISE_CURSORS: raise ContinuousRefactorError( "Planning state revision_base_step_counts must point at a " - f"terminal ready cursor, got {cursor!r}" + f"reopenable review cursor, got {cursor!r}" ) return "revise", None, None diff --git a/tests/test_planning.py b/tests/test_planning.py index b6da35b..285704f 100644 --- a/tests/test_planning.py +++ b/tests/test_planning.py @@ -795,6 +795,59 @@ def test_refine_ready_reopen_runs_one_revise_step( assert state.revision_base_step_counts == (5,) +def test_refine_review_two_reopens_to_revise( + tmp_path: Path, + monkeypatch: pytest.MonkeyPatch, +) -> None: + repo_root, live_dir, mig_root = _planning_repo_context(tmp_path, monkeypatch) + _seed_planning_snapshot( + repo_root, + live_dir, + [ + ("approaches", "completed", "Generated approaches.\n"), + ("pick-best", "completed", "Chose 
incremental.\n"), + ("expand", "completed", "Expanded.\n"), + ("review", "findings", "Phase 1 over-gates.\n"), + ("revise", "completed", "Revised once.\n"), + ], + plan_text="# Plan v1\n", + phase_text=_phase_doc("inventory already exists", "Inventory is current."), + ) + + result, mock = _run_refine_step( + repo_root, + live_dir, + [ + _workspace_response( + "Revised with feedback.\n", + { + "plan.md": "# Plan v2\n", + "phase-0-setup.md": _phase_doc( + "source files are present", + "Inventory is created or updated.", + ), + }, + ) + ], + monkeypatch, + feedback="Move inventory existence from precondition to DoD.", + ) + + state = load_planning_state(repo_root, planning_state_path(mig_root)) + assert result.status == "published" + assert result.step == "revise" + assert mock.stage_labels == ["revise"] + assert "User refinement feedback" in mock.prompts[0] + assert state.next_step == "review-2" + assert state.revision_base_step_counts == (5,) + assert state.feedback[-1].text == ( + "Move inventory existence from precondition to DoD." 
+ ) + assert ( + mig_root / ".planning" / "stages" / "revise-2.stdout.md" + ).read_text(encoding="utf-8") == "Revised with feedback.\n" + + def test_refine_repeated_steps_keep_original_stdout_history( tmp_path: Path, monkeypatch: pytest.MonkeyPatch, diff --git a/tests/test_planning_state.py b/tests/test_planning_state.py index c0f2abf..56c107b 100644 --- a/tests/test_planning_state.py +++ b/tests/test_planning_state.py @@ -415,7 +415,7 @@ def test_planning_state_allows_null_legacy_anchor_with_current_revision_anchors( assert loaded.revision_base_step_counts == (5,) -def test_planning_state_rejects_revision_anchor_at_non_terminal_cursor( +def test_planning_state_allows_revision_anchor_at_review_cursor( tmp_path: Path, ) -> None: repo_root, mig_root = _migration_root(tmp_path) @@ -433,11 +433,9 @@ def test_planning_state_rejects_revision_anchor_at_non_terminal_cursor( payload["revision_base_step_counts"] = [3] _write_state_payload(path, payload) - with pytest.raises( - ContinuousRefactorError, - match="must point at a terminal ready cursor, got 'review'", - ): - load_planning_state(repo_root, path) + loaded = load_planning_state(repo_root, path) + assert loaded.next_step == "revise" + assert loaded.revision_base_step_counts == (3,) def test_planning_state_rejects_revision_anchor_at_skipped_terminal_cursor( @@ -462,7 +460,7 @@ def test_planning_state_rejects_revision_anchor_at_skipped_terminal_cursor( with pytest.raises( ContinuousRefactorError, - match="must point at a terminal ready cursor, got 'terminal-skipped'", + match="must point at a reopenable review cursor, got 'terminal-skipped'", ): load_planning_state(repo_root, path) From 23e283020b9d66c338d1c7c6ade014e65df7e2ed Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Fri, 22 May 2026 16:14:31 -0700 Subject: [PATCH 19/41] fix up migrations --- .../.planning/stages/revise-2.stdout.md | 11 ++++++ .../.planning/state.json | 27 +++++++++++--- .../manifest.json | 16 ++++----- 
.../phase-1-contract-regression-net.md | 30 ++++++++-------- ...ase-2-internal-cleanup-behind-contracts.md | 22 ++++++------ .../phase-3-interface-shift-review-gate.md | 26 +++++++------- .../random-files-20260521T141425/plan.md | 36 +++++++++---------- 7 files changed, 99 insertions(+), 69 deletions(-) create mode 100644 migrations/random-files-20260521T141425/.planning/stages/revise-2.stdout.md diff --git a/migrations/random-files-20260521T141425/.planning/stages/revise-2.stdout.md b/migrations/random-files-20260521T141425/.planning/stages/revise-2.stdout.md new file mode 100644 index 0000000..6dd07a1 --- /dev/null +++ b/migrations/random-files-20260521T141425/.planning/stages/revise-2.stdout.md @@ -0,0 +1,11 @@ +Updated the migration planning artifacts in the staged work dir with the requested refinements: + +- [plan.md](/Users/hiren/.local/share/continuous-refactoring/projects/dfdec7e7-73c4-45c3-9786-693784acfd83/planning/random-files-20260521T141425/20260522T161153-731704-48db80aa86304b11a578594fd35fc409/work/random-files-20260521T141425/plan.md) +- [phase-1-contract-regression-net.md](/Users/hiren/.local/share/continuous-refactoring/projects/dfdec7e7-73c4-45c3-9786-693784acfd83/planning/random-files-20260521T141425/20260522T161153-731704-48db80aa86304b11a578594fd35fc409/work/random-files-20260521T141425/phase-1-contract-regression-net.md) +- [phase-2-internal-cleanup-behind-contracts.md](/Users/hiren/.local/share/continuous-refactoring/projects/dfdec7e7-73c4-45c3-9786-693784acfd83/planning/random-files-20260521T141425/20260522T161153-731704-48db80aa86304b11a578594fd35fc409/work/random-files-20260521T141425/phase-2-internal-cleanup-behind-contracts.md) +- [phase-3-interface-shift-review-gate.md](/Users/hiren/.local/share/continuous-refactoring/projects/dfdec7e7-73c4-45c3-9786-693784acfd83/planning/random-files-20260521T141425/20260522T161153-731704-48db80aa86304b11a578594fd35fc409/work/random-files-20260521T141425/phase-3-interface-shift-review-gate.md) + 
+What changed per your feedback: +- Removed `phase-1-contract-inventory.md` from Phase 1 preconditions and kept it in Phase 1 Definition of Done. +- Tightened Phase 1 scope to explicitly bound it to behavior tests plus the single inventory artifact, with minimal production edits only when strictly needed for test observability. +- Raised Phase 3 `required_effort` to `high` and added a concrete `effort_reason` tied to release-facing interface-risk and gating correctness. diff --git a/migrations/random-files-20260521T141425/.planning/state.json b/migrations/random-files-20260521T141425/.planning/state.json index ef9d66c..70160f4 100644 --- a/migrations/random-files-20260521T141425/.planning/state.json +++ b/migrations/random-files-20260521T141425/.planning/state.json @@ -54,16 +54,35 @@ "outputs": { "stdout": "migrations/random-files-20260521T141425/.planning/stages/revise.stdout.md" } + }, + { + "agent": "codex", + "completed_at": "2026-05-22T16:13:31.328-07:00", + "effort": "high", + "model": "gpt-5.3-codex", + "name": "revise", + "outcome": "completed", + "outputs": { + "stdout": "migrations/random-files-20260521T141425/.planning/stages/revise-2.stdout.md" + } + } + ], + "feedback": [ + { + "received_at": "2026-05-22T16:11:53.753-07:00", + "source": "message", + "text": "Fix review-2 findings: move phase-1-contract-inventory.md out of Phase 1 precondition\n and into Definition of Done; tighten Phase 1 scope wording; make Phase 3 required_effort match the\n actual gating risk with a concrete effort_reason." 
} ], - "feedback": [], "final_decision": null, "final_reason": null, "next_step": "review-2", - "review_findings": "migrations/random-files-20260521T141425/.planning/stages/review.stdout.md", - "revision_base_step_counts": [], + "review_findings": null, + "revision_base_step_counts": [ + 5 + ], "schema_version": 1, "started_at": "2026-05-21T14:14:26.016-07:00", "target": "random files", - "updated_at": "2026-05-21T14:18:25.372-07:00" + "updated_at": "2026-05-22T16:13:31.328-07:00" } diff --git a/migrations/random-files-20260521T141425/manifest.json b/migrations/random-files-20260521T141425/manifest.json index e75109d..17f8106 100644 --- a/migrations/random-files-20260521T141425/manifest.json +++ b/migrations/random-files-20260521T141425/manifest.json @@ -4,32 +4,32 @@ "created_at": "2026-05-21T14:14:26.016-07:00", "current_phase": "contract-regression-net", "human_review_reason": null, - "last_touch": "2026-05-21T14:18:25.372-07:00", + "last_touch": "2026-05-22T16:13:31.327-07:00", "name": "random-files-20260521T141425", "phases": [ { "done": false, - "effort_reason": "Focused test and artifact work with bounded code movement.", + "effort_reason": "Bounded, test-first contract capture with minimal production churn.", "file": "phase-1-contract-regression-net.md", "name": "contract-regression-net", - "precondition": "- Migration status is `in-progress` and this phase is the manifest `current_phase`. - No earlier migration phase is incomplete. - A contract inventory artifact exists at `phase-1-contract-inventory.md` and lists the concrete behaviors this phase will lock.", + "precondition": "- Migration status is `in-progress` and this phase is the manifest `current_phase`. - No earlier migration phase is incomplete. 
- Random-targeted files and their relevant behavior surfaces are still present and identifiable.", "required_effort": "low" }, { "done": false, - "effort_reason": "Internal deletions across heterogeneous random files need careful contract-preserving reasoning.", + "effort_reason": "Cross-file internal cleanup with contract-preserving constraints needs careful sequencing.", "file": "phase-2-internal-cleanup-behind-contracts.md", "name": "internal-cleanup-behind-contracts", - "precondition": "- Phase 1 is complete. - `phase-1-contract-inventory.md` exists and Phase 1 regression coverage is present in the repository. - Candidate source edits remain inside the random-targeted migration scope.", + "precondition": "- Phase 1 is complete. - `phase-1-contract-inventory.md` and its Phase 1 regression coverage are present. - Candidate edits remain inside random-targeted migration scope.", "required_effort": "medium" }, { "done": false, - "effort_reason": "Primarily documentation plus gating-state correctness with limited code-path change.", + "effort_reason": "This phase controls release-facing interface risk and human-review gating correctness, so mistakes can unblock unsafe automation.", "file": "phase-3-interface-shift-review-gate.md", "name": "interface-shift-review-gate", - "precondition": "- Phase 2 is complete. - At least one concrete interface behavior delta is documented with reproducible before/after behavior. - The migration still has `awaiting_human_review` unset at phase start, so this phase can set and verify the gate.", - "required_effort": "medium" + "precondition": "- Phase 2 is complete. - At least one concrete interface behavior delta is documented with reproducible before/after behavior. 
- The migration has not already been marked `awaiting_human_review` for the same deltas.", + "required_effort": "high" } ], "status": "planning", diff --git a/migrations/random-files-20260521T141425/phase-1-contract-regression-net.md b/migrations/random-files-20260521T141425/phase-1-contract-regression-net.md index e3a227a..509c9a0 100644 --- a/migrations/random-files-20260521T141425/phase-1-contract-regression-net.md +++ b/migrations/random-files-20260521T141425/phase-1-contract-regression-net.md @@ -1,34 +1,34 @@ # Phase 1: Contract Regression Net ## Goal -Capture and lock currently expected externally visible behavior for random-targeted surfaces before cleanup begins. +Lock currently expected externally visible behavior for random-targeted surfaces before internal cleanup starts. ## Scope -- Test files under `tests/` that exercise random-targeted user-visible behavior. -- Planning artifact update that records exactly which contracts are locked in this phase. -- No production behavior changes. +- Only tests under `tests/` that assert behavior of random-targeted user-facing surfaces (CLI behavior, repo-written artifacts, workflow outputs). +- Create/update one inventory artifact: `phase-1-contract-inventory.md`. +- No production source edits beyond minimal changes strictly required to make missing behavior observable in tests. ## Precondition - Migration status is `in-progress` and this phase is the manifest `current_phase`. - No earlier migration phase is incomplete. -- A contract inventory artifact exists at `phase-1-contract-inventory.md` and lists the concrete behaviors this phase will lock. +- Random-targeted files and their relevant behavior surfaces are still present and identifiable. ## Implementation Instructions -1. Build/update `phase-1-contract-inventory.md` with explicit contract bullets (surface, expected behavior, and where it is asserted). -2. Add or tighten outcome-based regression tests for each listed contract. -3. 
Prefer existing fixtures and real collaborators; avoid mock-heavy interaction assertions. -4. Keep assertions strict enough to detect interface drift in CLI behavior, planning/migration artifact behavior, and other scoped observable outcomes. +1. Create or update `phase-1-contract-inventory.md` with explicit contract bullets: surface, expected behavior, and asserting test location. +2. Add or tighten outcome-based regression tests for every listed contract. +3. Prefer real collaborators and existing fixtures; avoid interaction-level mocks unless boundary isolation is necessary. +4. Keep assertions strict enough to detect interface drift in scoped observable outcomes. ## Validation Steps -1. Run focused tests updated for the listed contracts. -2. Demonstrate each new/updated contract test fails when its protected behavior is intentionally broken. +1. Run focused tests that cover the listed contracts. +2. For each listed contract, intentionally break the behavior and verify the corresponding test fails. 3. Run the configured full validation command. ## Definition of Done -- `phase-1-contract-inventory.md` exists and maps each scoped contract to specific regression coverage. -- Regression tests for listed contracts pass in the intended implementation. -- Evidence was collected during execution that intentionally breaking each protected behavior causes the corresponding test to fail. +- `phase-1-contract-inventory.md` exists and maps each scoped contract to concrete regression coverage. +- Every contract listed in that inventory has passing outcome-based regression coverage. +- Execution evidence shows each listed contract test fails when its protected behavior is intentionally broken. - The full configured validation command passes. required_effort: low -effort_reason: Focused test and artifact work with bounded code movement. +effort_reason: Bounded, test-first contract capture with minimal production churn. 
diff --git a/migrations/random-files-20260521T141425/phase-2-internal-cleanup-behind-contracts.md b/migrations/random-files-20260521T141425/phase-2-internal-cleanup-behind-contracts.md index 78a3fc8..8e9848d 100644 --- a/migrations/random-files-20260521T141425/phase-2-internal-cleanup-behind-contracts.md +++ b/migrations/random-files-20260521T141425/phase-2-internal-cleanup-behind-contracts.md @@ -1,7 +1,7 @@ # Phase 2: Internal Cleanup Behind Contracts ## Goal -Simplify random-targeted internals and delete dead/redundant paths while preserving Phase 1 locked behavior. +Simplify random-targeted internals and remove dead/redundant paths while preserving Phase 1 locked behavior. ## Scope - Only random-targeted source files selected for this migration. @@ -10,25 +10,25 @@ Simplify random-targeted internals and delete dead/redundant paths while preserv ## Precondition - Phase 1 is complete. -- `phase-1-contract-inventory.md` exists and Phase 1 regression coverage is present in the repository. -- Candidate source edits remain inside the random-targeted migration scope. +- `phase-1-contract-inventory.md` and its Phase 1 regression coverage are present. +- Candidate edits remain inside random-targeted migration scope. ## Implementation Instructions -1. Remove dead branches/helpers/fallback paths made unnecessary by current behavior contracts. -2. Keep boundary error translation with exception nesting only at module boundaries. -3. Use small readability-first abstractions only when they reduce repetition or branch complexity. -4. If an interface behavior must change, isolate and document that delta for Phase 3 instead of blending it into broad cleanup. +1. Remove dead branches/helpers/fallback paths that are unnecessary under current contracts. +2. Translate/wrap errors only at module boundaries; preserve signal when bubbling within a module. +3. Introduce small abstractions only when they reduce repetition or branch complexity. +4. 
If an interface behavior change is required, isolate and document that delta for Phase 3 instead of blending it into broad cleanup. ## Validation Steps -1. Run targeted tests that cover cleaned paths, including all Phase 1 contract tests. -2. Confirm each deleted helper/symbol still has its externally observable behavior protected by surviving behavior-level test paths. +1. Run targeted tests covering cleaned paths, including all Phase 1 contract tests. +2. Confirm surviving behavior-level tests still cover externally observable effects previously provided by removed helpers/symbols. 3. Run the configured full validation command. ## Definition of Done - Scoped internal dead/redundant paths are removed or simplified without regressing locked behavior. - Surviving behavior-level tests cover externally observable effects previously provided by removed helpers/symbols. -- Any intentional interface delta is explicitly documented and handed off to Phase 3. +- Any intentional interface delta is explicitly documented for Phase 3. - The full configured validation command passes. required_effort: medium -effort_reason: Internal deletions across heterogeneous random files need careful contract-preserving reasoning. +effort_reason: Cross-file internal cleanup with contract-preserving constraints needs careful sequencing. diff --git a/migrations/random-files-20260521T141425/phase-3-interface-shift-review-gate.md b/migrations/random-files-20260521T141425/phase-3-interface-shift-review-gate.md index 635797a..036a9e3 100644 --- a/migrations/random-files-20260521T141425/phase-3-interface-shift-review-gate.md +++ b/migrations/random-files-20260521T141425/phase-3-interface-shift-review-gate.md @@ -1,34 +1,34 @@ # Phase 3: Interface-Shift Review Gate ## Goal -Gate any interface behavior change behind explicit, interface-specific human review before automation continues. 
+Gate any discovered interface behavior change behind explicit, interface-specific human review before automation proceeds. ## Scope -- Executes only when Phase 2 identifies an interface behavior delta. -- Documentation and migration state updates required to communicate and enforce review gating. -- Technical code changes are minimal and limited to what is necessary for correct gating-state behavior. +- Runs only if Phase 2 identifies at least one interface behavior delta. +- Documentation and migration state updates needed to communicate and enforce review gating. +- No unrelated cleanup or broad production refactor work. ## Precondition - Phase 2 is complete. - At least one concrete interface behavior delta is documented with reproducible before/after behavior. -- The migration still has `awaiting_human_review` unset at phase start, so this phase can set and verify the gate. +- The migration has not already been marked `awaiting_human_review` for the same deltas. ## Implementation Instructions 1. Document each interface shift with concrete before/after behavior and user/install impact. -2. Add explicit review messaging that names the exact interface contract change; avoid generic "needs review" wording. -3. Set and verify `awaiting_human_review` gating so automation remains paused until canonical migration review approval. -4. Keep non-gating technical churn out of this phase. +2. Ensure review messaging names the exact interface contract change; avoid generic "needs review" language. +3. Set and verify `awaiting_human_review` so automation remains paused until canonical migration review approval. +4. Keep this phase focused on gating correctness and communication quality. ## Validation Steps -1. Verify review artifacts clearly describe each interface delta and impact. -2. Verify migration gating state correctly reflects pending human review. +1. Verify review artifacts clearly and specifically describe each interface delta and impact. +2. 
Verify migration gating state is active and accurately tied to the documented deltas. 3. Run the configured full validation command after artifact/state updates. ## Definition of Done - All interface behavior changes discovered in this migration are documented with concrete impact statements. - Human-review gating is active, explicit, and tied to the named interface deltas. -- Review text is interface-specific and non-generic. +- Review messaging is interface-specific and non-generic. - The full configured validation command passes. -required_effort: medium -effort_reason: Primarily documentation plus gating-state correctness with limited code-path change. +required_effort: high +effort_reason: This phase controls release-facing interface risk and human-review gating correctness, so mistakes can unblock unsafe automation. diff --git a/migrations/random-files-20260521T141425/plan.md b/migrations/random-files-20260521T141425/plan.md index 9fd8170..33382b5 100644 --- a/migrations/random-files-20260521T141425/plan.md +++ b/migrations/random-files-20260521T141425/plan.md @@ -1,9 +1,9 @@ # Migration Plan: behavior-first-random-file-stabilization ## Objective -Stabilize externally visible behavior for the random-file target first, perform internal cleanup behind that locked behavior, and gate any interface shift behind explicit human review. +Lock externally visible behavior for the random-file target first, then refactor internals behind that contract, and gate any interface shift behind explicit human review. -## Phase Sequence +## Phases 1. Phase 1 - Contract Regression Net 2. Phase 2 - Internal Cleanup Behind Contracts 3. Phase 3 - Interface-Shift Review Gate (conditional) @@ -11,7 +11,7 @@ Stabilize externally visible behavior for the random-file target first, perform ## Dependencies - Phase 1 has no phase dependency. - Phase 2 depends on Phase 1 completion. -- Phase 3 depends on Phase 2 completion and executes only when an interface behavior change exists. 
+- Phase 3 depends on Phase 2 completion and runs only if Phase 2 surfaces an interface behavior delta. ## Dependency Graph ```mermaid @@ -20,25 +20,25 @@ graph TD P2 --> P3[Phase 3: Interface-Shift Review Gate (Conditional)] ``` -## Phase Artifacts +## Phase Documents - [phase-1-contract-regression-net.md](phase-1-contract-regression-net.md) - [phase-2-internal-cleanup-behind-contracts.md](phase-2-internal-cleanup-behind-contracts.md) - [phase-3-interface-shift-review-gate.md](phase-3-interface-shift-review-gate.md) ## Validation Strategy -- Harness baseline guarantees configured validation is green before refactoring and after each completed phase. -- Each phase adds independent, phase-local checks: - - Phase 1: records a concrete contract inventory and adds outcome-focused regression coverage for those contracts. - - Phase 2: proves internal deletion/simplification keeps locked behavior stable and documents any discovered interface delta. - - Phase 3: verifies interface-delta documentation quality plus correct human-review gating state. -- A phase is complete only when its Definition of Done is met and the configured validation command passes. - -## Risk Reduction Order -- Front-load behavior locking so later cleanup has a hard safety rail. -- Restrict cleanup to scoped random-target internals and remove stale paths only when protected by behavior checks. -- Isolate interface-risk work into a dedicated review-gated phase so repository state remains shippable. +- The harness enforces configured validation before refactoring and after each completed phase. +- Each phase adds independent checks: + - Phase 1: defines a concrete contract inventory and proves each listed contract has outcome-based regression coverage. + - Phase 2: proves internal deletions/simplifications preserve the locked contracts while staying within random-target scope. + - Phase 3: proves interface-delta documentation and `awaiting_human_review` gating are explicit, correct, and actionable. 
+- A phase counts as complete only when its Definition of Done is satisfied and the configured validation command passes. + +## Risk-Reduction Ordering +- Phase 1 reduces ambiguity by locking behavior before any cleanup. +- Phase 2 captures most refactor value while constrained by Phase 1 protections. +- Phase 3 isolates high-risk interface changes into explicit review gating instead of mixing that risk into general cleanup. ## Out of Scope -- Structural refactors outside the random-target file set. -- Speculative interface redesign. -- Release/version workflow changes. +- Structural refactors outside the random-targeted file set. +- Speculative interface redesign unrelated to discovered deltas. +- Release/version process changes. From c228c154600ad0967f48ad1ac4c27c7592072aa6 Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Fri, 22 May 2026 16:16:15 -0700 Subject: [PATCH 20/41] continuous refactor: planning/random-files-20260521T141425/review-2 Why: planning.review-2 accepted; next step: final-review --- .../.planning/stages/review-2.stdout.md | 1 + .../.planning/state.json | 15 +++++++++++++-- .../random-files-20260521T141425/manifest.json | 2 +- 3 files changed, 15 insertions(+), 3 deletions(-) create mode 100644 migrations/random-files-20260521T141425/.planning/stages/review-2.stdout.md diff --git a/migrations/random-files-20260521T141425/.planning/stages/review-2.stdout.md b/migrations/random-files-20260521T141425/.planning/stages/review-2.stdout.md new file mode 100644 index 0000000..9e06085 --- /dev/null +++ b/migrations/random-files-20260521T141425/.planning/stages/review-2.stdout.md @@ -0,0 +1 @@ +1. no findings. 
diff --git a/migrations/random-files-20260521T141425/.planning/state.json b/migrations/random-files-20260521T141425/.planning/state.json index 70160f4..4a8c3fa 100644 --- a/migrations/random-files-20260521T141425/.planning/state.json +++ b/migrations/random-files-20260521T141425/.planning/state.json @@ -65,6 +65,17 @@ "outputs": { "stdout": "migrations/random-files-20260521T141425/.planning/stages/revise-2.stdout.md" } + }, + { + "agent": "codex", + "completed_at": "2026-05-22T16:16:15.516-07:00", + "effort": "low", + "model": "gpt-5.3-codex", + "name": "review-2", + "outcome": "clear", + "outputs": { + "stdout": "migrations/random-files-20260521T141425/.planning/stages/review-2.stdout.md" + } } ], "feedback": [ @@ -76,7 +87,7 @@ ], "final_decision": null, "final_reason": null, - "next_step": "review-2", + "next_step": "final-review", "review_findings": null, "revision_base_step_counts": [ 5 @@ -84,5 +95,5 @@ "schema_version": 1, "started_at": "2026-05-21T14:14:26.016-07:00", "target": "random files", - "updated_at": "2026-05-22T16:13:31.328-07:00" + "updated_at": "2026-05-22T16:16:15.516-07:00" } diff --git a/migrations/random-files-20260521T141425/manifest.json b/migrations/random-files-20260521T141425/manifest.json index 17f8106..0c077f2 100644 --- a/migrations/random-files-20260521T141425/manifest.json +++ b/migrations/random-files-20260521T141425/manifest.json @@ -4,7 +4,7 @@ "created_at": "2026-05-21T14:14:26.016-07:00", "current_phase": "contract-regression-net", "human_review_reason": null, - "last_touch": "2026-05-22T16:13:31.327-07:00", + "last_touch": "2026-05-22T16:16:15.515-07:00", "name": "random-files-20260521T141425", "phases": [ { From a837da86389ff285a6fa625214f934c264e9a94c Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Fri, 22 May 2026 16:16:47 -0700 Subject: [PATCH 21/41] continuous refactor: planning/random-files-20260521T141425/final-review Why: safe automated execution until any interface delta, which is correctly gated for explicit 
human review --- .../.planning/stages/final-review.stdout.md | 11 +++++++++++ .../.planning/state.json | 19 +++++++++++++++---- .../manifest.json | 8 ++++---- 3 files changed, 30 insertions(+), 8 deletions(-) create mode 100644 migrations/random-files-20260521T141425/.planning/stages/final-review.stdout.md diff --git a/migrations/random-files-20260521T141425/.planning/stages/final-review.stdout.md b/migrations/random-files-20260521T141425/.planning/stages/final-review.stdout.md new file mode 100644 index 0000000..c077f0e --- /dev/null +++ b/migrations/random-files-20260521T141425/.planning/stages/final-review.stdout.md @@ -0,0 +1,11 @@ +Assessment: +- Safe for full auto execution: `No` end-to-end, because it intentionally includes a conditional interface-change gate that must stop for human judgment. +- Human review required: `Yes`, but only at the explicit conditional point in Phase 3 when an interface delta is found. +- Fundamentally flawed: `No`. The sequencing is sane, risk-first, and aligned with harness invariants. + +Checks against your criteria: +- Effort tiers are lowest safe: `low` (Phase 1), `medium` (Phase 2), `high` (Phase 3 gate-critical) all look justified and not inflated. +- Preconditions avoid baseline/fresh-validation coupling: correct. Preconditions are phase-state/scope/delta based; validation is in DoD where it belongs. +- Automation safety: good guardrails via contract inventory first, then constrained cleanup, then explicit human gate for interface shifts. 
+ +final-decision: approve-needs-human — safe automated execution until any interface delta, which is correctly gated for explicit human review diff --git a/migrations/random-files-20260521T141425/.planning/state.json b/migrations/random-files-20260521T141425/.planning/state.json index 4a8c3fa..d2e8fbb 100644 --- a/migrations/random-files-20260521T141425/.planning/state.json +++ b/migrations/random-files-20260521T141425/.planning/state.json @@ -76,6 +76,17 @@ "outputs": { "stdout": "migrations/random-files-20260521T141425/.planning/stages/review-2.stdout.md" } + }, + { + "agent": "codex", + "completed_at": "2026-05-22T16:16:46.988-07:00", + "effort": "low", + "model": "gpt-5.3-codex", + "name": "final-review", + "outcome": "approve-needs-human", + "outputs": { + "stdout": "migrations/random-files-20260521T141425/.planning/stages/final-review.stdout.md" + } } ], "feedback": [ @@ -85,9 +96,9 @@ "text": "Fix review-2 findings: move phase-1-contract-inventory.md out of Phase 1 precondition\n and into Definition of Done; tighten Phase 1 scope wording; make Phase 3 required_effort match the\n actual gating risk with a concrete effort_reason." 
} ], - "final_decision": null, - "final_reason": null, - "next_step": "final-review", + "final_decision": "approve-needs-human", + "final_reason": "safe automated execution until any interface delta, which is correctly gated for explicit human review", + "next_step": "terminal-ready-awaiting-human", "review_findings": null, "revision_base_step_counts": [ 5 @@ -95,5 +106,5 @@ "schema_version": 1, "started_at": "2026-05-21T14:14:26.016-07:00", "target": "random files", - "updated_at": "2026-05-22T16:16:15.516-07:00" + "updated_at": "2026-05-22T16:16:46.988-07:00" } diff --git a/migrations/random-files-20260521T141425/manifest.json b/migrations/random-files-20260521T141425/manifest.json index 0c077f2..91b2e63 100644 --- a/migrations/random-files-20260521T141425/manifest.json +++ b/migrations/random-files-20260521T141425/manifest.json @@ -1,10 +1,10 @@ { - "awaiting_human_review": false, + "awaiting_human_review": true, "cooldown_until": null, "created_at": "2026-05-21T14:14:26.016-07:00", "current_phase": "contract-regression-net", - "human_review_reason": null, - "last_touch": "2026-05-22T16:16:15.515-07:00", + "human_review_reason": "safe automated execution until any interface delta, which is correctly gated for explicit human review", + "last_touch": "2026-05-22T16:16:46.990-07:00", "name": "random-files-20260521T141425", "phases": [ { @@ -32,6 +32,6 @@ "required_effort": "high" } ], - "status": "planning", + "status": "ready", "wake_up_on": null } From 04014e8423fb07354a76e833c1e05728411c8695 Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Sat, 23 May 2026 22:12:47 -0700 Subject: [PATCH 22/41] fix up one migration --- migrations/random-files-20260521T141425/manifest.json | 6 +++--- .../phase-1-contract-regression-net.md | 8 ++++++-- .../phase-2-internal-cleanup-behind-contracts.md | 4 +++- migrations/random-files-20260521T141425/plan.md | 5 +++++ 4 files changed, 17 insertions(+), 6 deletions(-) diff --git 
a/migrations/random-files-20260521T141425/manifest.json b/migrations/random-files-20260521T141425/manifest.json index 91b2e63..0361193 100644 --- a/migrations/random-files-20260521T141425/manifest.json +++ b/migrations/random-files-20260521T141425/manifest.json @@ -1,9 +1,9 @@ { - "awaiting_human_review": true, + "awaiting_human_review": false, "cooldown_until": null, "created_at": "2026-05-21T14:14:26.016-07:00", "current_phase": "contract-regression-net", - "human_review_reason": "safe automated execution until any interface delta, which is correctly gated for explicit human review", + "human_review_reason": null, "last_touch": "2026-05-22T16:16:46.990-07:00", "name": "random-files-20260521T141425", "phases": [ @@ -12,7 +12,7 @@ "effort_reason": "Bounded, test-first contract capture with minimal production churn.", "file": "phase-1-contract-regression-net.md", "name": "contract-regression-net", - "precondition": "- Migration status is `in-progress` and this phase is the manifest `current_phase`. - No earlier migration phase is incomplete. - Random-targeted files and their relevant behavior surfaces are still present and identifiable.", + "precondition": "- Migration status is `ready` or `in-progress`, and this phase is the manifest `current_phase`. - No earlier migration phase is incomplete. 
- The random-targeted files for this migration still exist at the anchored paths listed in Scope.", "required_effort": "low" }, { diff --git a/migrations/random-files-20260521T141425/phase-1-contract-regression-net.md b/migrations/random-files-20260521T141425/phase-1-contract-regression-net.md index 509c9a0..01bb2cb 100644 --- a/migrations/random-files-20260521T141425/phase-1-contract-regression-net.md +++ b/migrations/random-files-20260521T141425/phase-1-contract-regression-net.md @@ -6,12 +6,16 @@ Lock currently expected externally visible behavior for random-targeted surfaces ## Scope - Only tests under `tests/` that assert behavior of random-targeted user-facing surfaces (CLI behavior, repo-written artifacts, workflow outputs). - Create/update one inventory artifact: `phase-1-contract-inventory.md`. +- Random-targeted surfaces in this migration are anchored to: + - `src/continuous_refactoring/__main__.py` + - `tests/test_main_entrypoint.py` + - `LICENSE` - No production source edits beyond minimal changes strictly required to make missing behavior observable in tests. ## Precondition -- Migration status is `in-progress` and this phase is the manifest `current_phase`. +- Migration status is `ready` or `in-progress`, and this phase is the manifest `current_phase`. - No earlier migration phase is incomplete. -- Random-targeted files and their relevant behavior surfaces are still present and identifiable. +- The random-targeted files for this migration still exist at the anchored paths listed in Scope. ## Implementation Instructions 1. Create or update `phase-1-contract-inventory.md` with explicit contract bullets: surface, expected behavior, and asserting test location. 
diff --git a/migrations/random-files-20260521T141425/phase-2-internal-cleanup-behind-contracts.md b/migrations/random-files-20260521T141425/phase-2-internal-cleanup-behind-contracts.md index 8e9848d..364c162 100644 --- a/migrations/random-files-20260521T141425/phase-2-internal-cleanup-behind-contracts.md +++ b/migrations/random-files-20260521T141425/phase-2-internal-cleanup-behind-contracts.md @@ -5,13 +5,15 @@ Simplify random-targeted internals and remove dead/redundant paths while preserv ## Scope - Only random-targeted source files selected for this migration. +- Source cleanup scope is anchored to `src/continuous_refactoring/__main__.py`. +- `tests/test_main_entrypoint.py` and `LICENSE` are contract-validation surfaces, not internal cleanup targets. - Internal readability improvements, dead-path deletion, and control-flow simplification. - No intentional change to released interfaces (CLI behavior, repo-written files, XDG/project state, migration manifest structure, or other install-visible contracts). ## Precondition - Phase 1 is complete. - `phase-1-contract-inventory.md` and its Phase 1 regression coverage are present. -- Candidate edits remain inside random-targeted migration scope. +- Candidate edits remain inside the anchored migration scope defined in Phase 1. ## Implementation Instructions 1. Remove dead branches/helpers/fallback paths that are unnecessary under current contracts. diff --git a/migrations/random-files-20260521T141425/plan.md b/migrations/random-files-20260521T141425/plan.md index 33382b5..44cbaff 100644 --- a/migrations/random-files-20260521T141425/plan.md +++ b/migrations/random-files-20260521T141425/plan.md @@ -8,6 +8,11 @@ Lock externally visible behavior for the random-file target first, then refactor 2. Phase 2 - Internal Cleanup Behind Contracts 3. 
Phase 3 - Interface-Shift Review Gate (conditional) +## Random-Target File Set (Verified) +- `src/continuous_refactoring/__main__.py` +- `tests/test_main_entrypoint.py` +- `LICENSE` + ## Dependencies - Phase 1 has no phase dependency. - Phase 2 depends on Phase 1 completion. From 9503db2e98da5b5455c4ca27de1f36a93a568d5f Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Sat, 23 May 2026 22:21:52 -0700 Subject: [PATCH 23/41] add headers to migration list --- AGENTS.md | 2 +- README.md | 5 +- src/continuous_refactoring/cli.py | 5 ++ src/continuous_refactoring/migration_cli.py | 13 ++++ tests/test_cli_migrations.py | 66 +++++++++++++++++++-- 5 files changed, 84 insertions(+), 7 deletions(-) diff --git a/AGENTS.md b/AGENTS.md index f9b5b8e..f90b9a7 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -39,7 +39,7 @@ Treat `AGENTS.md` as part of the codebase's invariants, not documentation. A dri - Upgrade config: `continuous-refactoring upgrade` - Inspect migrations: `continuous-refactoring migration list [--status planning|ready|in-progress|skipped|done] - [--awaiting-review]` / + [--awaiting-review] [--no-headers]` / `continuous-refactoring migration doctor ` / `continuous-refactoring migration doctor --all` - Review migrations: `continuous-refactoring migration review diff --git a/README.md b/README.md index a4b6e52..cb88520 100644 --- a/README.md +++ b/README.md @@ -134,7 +134,7 @@ always run at fixed `medium` effort. | `run-once` | Single pass on one resolved target. No retry. If there is a diff and validation passes, it commits locally and prints the diffstat. | | `run` | The loop. Iterates refactor actions, retries on failure, and commits successful changes locally. Add `--focus-on-live-migrations` to bypass targeting and work only on eligible live migrations. | | `upgrade` | Checks that the global config manifest is current, rewrites it idempotently, and warns if the global taste file is stale. | -| `migration list` | Lists visible migrations. 
Add `--status ` or `--awaiting-review` to filter. | +| `migration list` | Lists visible migrations as TSV with headers. Add `--status ` or `--awaiting-review` to filter, or `--no-headers` for parsing. | | `migration doctor ` | Validates one visible migration's consistency. | | `migration doctor --all` | Validates every visible migration plus internal transaction state. | | `migration review ` | Starts staged review for a migration awaiting human review. Requires `--with` and `--model`; review runs at fixed internal `high` effort. | @@ -169,7 +169,7 @@ scope text as context for that target. - `init --live-migrations-dir PATH` — enables the larger-refactoring workflow for this project. The path is stored repo-relative in the project registry and created if missing. - `init --in-repo-taste [PATH]` — stores this project's taste file in the repo and remembers the repo-relative path. Defaults to `.continuous-refactoring/taste.md`; re-run `init --in-repo-taste ...` to choose a different path. -- `migration list` — shows visible migrations; `--awaiting-review` narrows to human-review handoffs. +- `migration list` — shows visible migrations as TSV with headers; `--awaiting-review` narrows to human-review handoffs and `--no-headers` keeps parser-friendly rows only. - `migration doctor ` / `migration doctor --all` — read-only consistency checks. Doctor reports problems; it does not repair them. - `migration review --with ... --model ...` — resolves an `awaiting_human_review` migration through a staged workspace at fixed internal `high` effort. - `migration refine (--message |--file ) --with ... --model ... [--show-agent-logs]` — adds user feedback to a planning or unexecuted ready migration and resumes planning through the `revise` step when reopening ready work at fixed internal `high` effort. 
@@ -184,6 +184,7 @@ Canonical migration commands: continuous-refactoring migration list continuous-refactoring migration list --status planning continuous-refactoring migration list --awaiting-review +continuous-refactoring migration list --no-headers continuous-refactoring migration doctor continuous-refactoring migration doctor --all continuous-refactoring migration review --with codex --model gpt-5 diff --git a/src/continuous_refactoring/cli.py b/src/continuous_refactoring/cli.py index 1f8a130..ec347c1 100644 --- a/src/continuous_refactoring/cli.py +++ b/src/continuous_refactoring/cli.py @@ -284,6 +284,11 @@ def _add_migration_parser(subparsers: argparse._SubParsersAction) -> None: action="store_true", help="Only show migrations awaiting human review.", ) + list_parser.add_argument( + "--no-headers", + action="store_true", + help="Omit the TSV header row.", + ) doctor_parser = migration_sub.add_parser( "doctor", diff --git a/src/continuous_refactoring/migration_cli.py b/src/continuous_refactoring/migration_cli.py index 6e3f3a5..000cb79 100644 --- a/src/continuous_refactoring/migration_cli.py +++ b/src/continuous_refactoring/migration_cli.py @@ -46,6 +46,17 @@ _MIGRATION_USAGE = "Usage: continuous-refactoring migration {list,doctor,review,refine}" _MIGRATION_MANUAL_AGENT_EFFORT: EffortTier = "high" _MISSING_TEXT = "(none)" +_LIST_HEADER = "\t".join( + ( + "slug", + "status", + "cursor", + "awaiting_review", + "last_touch", + "cooldown", + "reason", + ) +) @dataclass(frozen=True) @@ -76,6 +87,8 @@ def handle_migration(args: argparse.Namespace) -> None: def handle_migration_list(args: argparse.Namespace) -> None: context = _resolve_context(error_code=1) + if not bool(getattr(args, "no_headers", False)): + print(_LIST_HEADER) if not context.live_dir.is_dir(): return diff --git a/tests/test_cli_migrations.py b/tests/test_cli_migrations.py index cc390be..4e01ac3 100644 --- a/tests/test_cli_migrations.py +++ b/tests/test_cli_migrations.py @@ -39,6 +39,7 @@ ) _CREATED = 
"2025-01-01T00:00:00+00:00" +_LIST_HEADER = "slug\tstatus\tcursor\tawaiting_review\tlast_touch\tcooldown\treason" _PHASE = PhaseSpec( name="setup", file="phase-1-setup.md", @@ -54,12 +55,21 @@ def test_migration_parser_accepts_list_and_doctor() -> None: assert list_args.command == "migration" assert list_args.migration_command == "list" assert list_args.handler.__name__ == "handle_migration" + assert list_args.no_headers is False filtered = parser.parse_args( - ["migration", "list", "--status", "planning", "--awaiting-review"] + [ + "migration", + "list", + "--status", + "planning", + "--awaiting-review", + "--no-headers", + ] ) assert filtered.status == "planning" assert filtered.awaiting_review is True + assert filtered.no_headers is True doctor_args = parser.parse_args(["migration", "doctor", "my-mig"]) assert doctor_args.migration_command == "doctor" @@ -172,6 +182,7 @@ def test_documented_migration_commands_match_parser() -> None: "continuous-refactoring migration list", "continuous-refactoring migration list --status planning", "continuous-refactoring migration list --awaiting-review", + "continuous-refactoring migration list --no-headers", "continuous-refactoring migration doctor ", "continuous-refactoring migration doctor --all", ( @@ -366,6 +377,15 @@ def test_migration_list_includes_planning_ready_review_and_done_statuses( lines = [line.split("\t") for line in capsys.readouterr().out.splitlines()] assert lines == [ + [ + "slug", + "status", + "cursor", + "awaiting_review", + "last_touch", + "cooldown", + "reason", + ], [ "done-mig", "done", @@ -411,11 +431,44 @@ def test_migration_list_filters_by_status_and_awaiting_review( handle_migration_list(_list_args(status="ready", awaiting_review=True)) assert capsys.readouterr().out.splitlines() == [ + _LIST_HEADER, "ready-review\tready\tphase-1-setup.md\tyes\t" f"{_CREATED}\t(none)\t(none)" ] +def test_migration_list_no_headers_preserves_parseable_rows( + tmp_path: Path, + monkeypatch: pytest.MonkeyPatch, + 
capsys: pytest.CaptureFixture[str], +) -> None: + _repo, live_dir = _init_migration_project(tmp_path, monkeypatch) + _write_migration(live_dir, "ready-normal") + + handle_migration_list(_list_args(no_headers=True)) + + assert capsys.readouterr().out.splitlines() == [ + "ready-normal\tready\tphase-1-setup.md\tno\t" + f"{_CREATED}\t(none)\t(none)" + ] + + +def test_migration_list_headers_for_empty_results( + tmp_path: Path, + monkeypatch: pytest.MonkeyPatch, + capsys: pytest.CaptureFixture[str], +) -> None: + _repo, _live_dir = _init_migration_project(tmp_path, monkeypatch) + + handle_migration_list(_list_args()) + + assert capsys.readouterr().out == f"{_LIST_HEADER}\n" + + handle_migration_list(_list_args(no_headers=True)) + + assert capsys.readouterr().out == "" + + def test_migration_list_marks_invalid_planning_state_as_blocked( tmp_path: Path, monkeypatch: pytest.MonkeyPatch, @@ -429,7 +482,7 @@ def test_migration_list_marks_invalid_planning_state_as_blocked( state_path.parent.mkdir(parents=True) state_path.write_text("{not json\n", encoding="utf-8") - handle_migration_list(_list_args()) + handle_migration_list(_list_args(no_headers=True)) fields = capsys.readouterr().out.strip().split("\t") assert fields[0:3] == ["planning-mig", "planning", "planning:blocked"] @@ -452,7 +505,7 @@ def fail_resolve(_manifest: MigrationManifest) -> PhaseSpec: fail_resolve, ) - handle_migration_list(_list_args()) + handle_migration_list(_list_args(no_headers=True)) fields = capsys.readouterr().out.strip().split("\t") assert fields[0:3] == ["ready-mig", "ready", "blocked"] @@ -1576,8 +1629,13 @@ def _list_args( *, status: str | None = None, awaiting_review: bool = False, + no_headers: bool = False, ) -> argparse.Namespace: - return argparse.Namespace(status=status, awaiting_review=awaiting_review) + return argparse.Namespace( + status=status, + awaiting_review=awaiting_review, + no_headers=no_headers, + ) def _doctor_args( From 11401bff66d3451add69a452089b288b22f9376b Mon Sep 17 
00:00:00 2001 From: Hiren Hiranandani Date: Sat, 23 May 2026 23:41:52 -0700 Subject: [PATCH 24/41] remove weird migration --- .../.planning/stages/approaches.stdout.md | 7 -- .../.planning/stages/expand.stdout.md | 16 ---- .../.planning/stages/final-review.stdout.md | 5 - .../.planning/stages/pick-best.stdout.md | 13 --- .../.planning/stages/review-2.stdout.md | 1 - .../.planning/stages/review.stdout.md | 5 - .../.planning/stages/revise.stdout.md | 11 --- .../.planning/state.json | 91 ------------------- .../approaches/effort-engine-consolidation.md | 48 ---------- .../approaches/interface-first-hardening.md | 49 ---------- .../approaches/test-first-boundary-pruning.md | 47 ---------- .../manifest.json | 37 -------- .../phase-1-boundary-contract-guardrails.md | 37 -------- ...se-2-internal-effort-resolution-cleanup.md | 39 -------- ...pr-title-policy-adjustment-review-gated.md | 36 -------- .../random-files-20260521T140910/plan.md | 43 --------- 16 files changed, 485 deletions(-) delete mode 100644 migrations/random-files-20260521T140910/.planning/stages/approaches.stdout.md delete mode 100644 migrations/random-files-20260521T140910/.planning/stages/expand.stdout.md delete mode 100644 migrations/random-files-20260521T140910/.planning/stages/final-review.stdout.md delete mode 100644 migrations/random-files-20260521T140910/.planning/stages/pick-best.stdout.md delete mode 100644 migrations/random-files-20260521T140910/.planning/stages/review-2.stdout.md delete mode 100644 migrations/random-files-20260521T140910/.planning/stages/review.stdout.md delete mode 100644 migrations/random-files-20260521T140910/.planning/stages/revise.stdout.md delete mode 100644 migrations/random-files-20260521T140910/.planning/state.json delete mode 100644 migrations/random-files-20260521T140910/approaches/effort-engine-consolidation.md delete mode 100644 migrations/random-files-20260521T140910/approaches/interface-first-hardening.md delete mode 100644 
migrations/random-files-20260521T140910/approaches/test-first-boundary-pruning.md delete mode 100644 migrations/random-files-20260521T140910/manifest.json delete mode 100644 migrations/random-files-20260521T140910/phase-1-boundary-contract-guardrails.md delete mode 100644 migrations/random-files-20260521T140910/phase-2-internal-effort-resolution-cleanup.md delete mode 100644 migrations/random-files-20260521T140910/phase-3-pr-title-policy-adjustment-review-gated.md delete mode 100644 migrations/random-files-20260521T140910/plan.md diff --git a/migrations/random-files-20260521T140910/.planning/stages/approaches.stdout.md b/migrations/random-files-20260521T140910/.planning/stages/approaches.stdout.md deleted file mode 100644 index 7527bb8..0000000 --- a/migrations/random-files-20260521T140910/.planning/stages/approaches.stdout.md +++ /dev/null @@ -1,7 +0,0 @@ -Created 3 approach files in the staged migration workspace: - -- `/Users/hiren/.local/share/continuous-refactoring/projects/dfdec7e7-73c4-45c3-9786-693784acfd83/planning/random-files-20260521T140910/20260521T140750-956576-5c67127a69fe45e682e56b790053ca95/work/random-files-20260521T140910/approaches/interface-first-hardening.md` -- `/Users/hiren/.local/share/continuous-refactoring/projects/dfdec7e7-73c4-45c3-9786-693784acfd83/planning/random-files-20260521T140910/20260521T140750-956576-5c67127a69fe45e682e56b790053ca95/work/random-files-20260521T140910/approaches/effort-engine-consolidation.md` -- `/Users/hiren/.local/share/continuous-refactoring/projects/dfdec7e7-73c4-45c3-9786-693784acfd83/planning/random-files-20260521T140910/20260521T140750-956576-5c67127a69fe45e682e56b790053ca95/work/random-files-20260521T140910/approaches/test-first-boundary-pruning.md` - -Each includes strategy, tradeoffs, phased plan with `required_effort`, and risk profile, aligned to the injected taste and targeted files. 
diff --git a/migrations/random-files-20260521T140910/.planning/stages/expand.stdout.md b/migrations/random-files-20260521T140910/.planning/stages/expand.stdout.md deleted file mode 100644 index 0fc425d..0000000 --- a/migrations/random-files-20260521T140910/.planning/stages/expand.stdout.md +++ /dev/null @@ -1,16 +0,0 @@ -Created in the staged migration workspace: - -- `plan.md` -- `phase-1-boundary-contract-guardrails.md` -- `phase-2-internal-effort-resolution-cleanup.md` -- `phase-3-pr-title-policy-adjustment-review-gated.md` - -What’s included: -- Numbered phases with ordering rationale and dependency graph. -- Phase-local preconditions (no baseline-green restatements). -- Independent `## Definition of Done` and `## Validation Steps` per phase. -- `required_effort` + `effort_reason` on higher-risk phases only: - - Phase 2: `medium` - - Phase 3: `high` - -No files were written outside the staged work directory. diff --git a/migrations/random-files-20260521T140910/.planning/stages/final-review.stdout.md b/migrations/random-files-20260521T140910/.planning/stages/final-review.stdout.md deleted file mode 100644 index ecd10a3..0000000 --- a/migrations/random-files-20260521T140910/.planning/stages/final-review.stdout.md +++ /dev/null @@ -1,5 +0,0 @@ -Safety: mostly safe, but not fully auto-safe end-to-end because Phase 3 is an intentional user-facing contract change and correctly demands explicit review. -Human decision points: yes, Phase 3 has a concrete human gate and clear impact framing requirements. -Flaw check: no fundamental flaw; effort tiers are lowest safe (`low`/`medium`/`high`), and preconditions avoid baseline/fresh-validation requirements (validation is only in DoD, which is correct). 
- -final-decision: approve-needs-human — phases 1-2 are automation-safe, but phase 3 changes PR-title interface semantics and must stay review-gated diff --git a/migrations/random-files-20260521T140910/.planning/stages/pick-best.stdout.md b/migrations/random-files-20260521T140910/.planning/stages/pick-best.stdout.md deleted file mode 100644 index 4369b4d..0000000 --- a/migrations/random-files-20260521T140910/.planning/stages/pick-best.stdout.md +++ /dev/null @@ -1,13 +0,0 @@ -**Choice: `interface-first-hardening`** - -It has the best risk/clarity/verifiability balance for this repo right now. - -- Lowest boundary risk: it explicitly protects CLI effort semantics, prompt contracts, and workflow behavior before internal cleanup. -- Best taste match: “safer compatibility paths for shipped systems” and “treat released package interfaces as human-review territory” are first-class in this approach. -- Incremental and testable: Phase 1 is low-effort, outcome-focused guardrails; Phase 2 refactors behind stable contracts; Phase 3 makes interface changes optional and review-gated. - -**Why not the others (short):** -- `effort-engine-consolidation`: cleaner internals, but higher drift risk early; it optimizes maintainability before boundary safety. -- `test-first-boundary-pruning`: good hygiene, but contract-vs-incidental pruning can create reviewer ambiguity and accidental guard loss. - -**Suggested tie-break framing:** choose safety-first contract locking when interfaces are active and shipped; optimize internals second. diff --git a/migrations/random-files-20260521T140910/.planning/stages/review-2.stdout.md b/migrations/random-files-20260521T140910/.planning/stages/review-2.stdout.md deleted file mode 100644 index 9e06085..0000000 --- a/migrations/random-files-20260521T140910/.planning/stages/review-2.stdout.md +++ /dev/null @@ -1 +0,0 @@ -1. no findings. 
diff --git a/migrations/random-files-20260521T140910/.planning/stages/review.stdout.md b/migrations/random-files-20260521T140910/.planning/stages/review.stdout.md deleted file mode 100644 index f82d1cb..0000000 --- a/migrations/random-files-20260521T140910/.planning/stages/review.stdout.md +++ /dev/null @@ -1,5 +0,0 @@ -1. **Phase 2 precondition improperly includes harness-owned validation state.** -`phase-2-internal-effort-resolution-cleanup.md` says: “Contract-guarding tests … are present and passing locally.” That’s effectively a baseline-green/fresh-validation precondition and should not be start-gating. Keep it in validation/DoD, not preconditions. - -2. **No other plan-quality violations found.** -Phase ordering and dependencies are coherent and risk-minimizing; each phase is independently shippable/verifiable; preconditions vs Definition of Done are mostly separated correctly; effort labels are valid (`medium`, `high`) with useful reasons; scope stays within migration-planned files; and the plan aligns with injected taste (compatibility-first interface handling, explicit review for interface changes). diff --git a/migrations/random-files-20260521T140910/.planning/stages/revise.stdout.md b/migrations/random-files-20260521T140910/.planning/stages/revise.stdout.md deleted file mode 100644 index f2af52e..0000000 --- a/migrations/random-files-20260521T140910/.planning/stages/revise.stdout.md +++ /dev/null @@ -1,11 +0,0 @@ -Updated the migration planning artifacts in the staged work dir, using the chosen approach (`interface-first-hardening`) and addressing the review finding. - -Created/updated: -- `plan.md` -- `phase-1-boundary-contract-guardrails.md` -- `phase-2-internal-effort-resolution-cleanup.md` -- `phase-3-pr-title-policy-adjustment-review-gated.md` - -Key fix applied: -- In `phase-2-internal-effort-resolution-cleanup.md`, the precondition no longer requires tests to be “passing locally” (harness-owned baseline validation). 
-- It now correctly uses a phase-local precondition: Phase 1 guardrail tests must still exist, while pass/fail remains in validation/Definition of Done. diff --git a/migrations/random-files-20260521T140910/.planning/state.json b/migrations/random-files-20260521T140910/.planning/state.json deleted file mode 100644 index c961d4f..0000000 --- a/migrations/random-files-20260521T140910/.planning/state.json +++ /dev/null @@ -1,91 +0,0 @@ -{ - "completed_steps": [ - { - "agent": "codex", - "completed_at": "2026-05-21T14:10:09.546-07:00", - "effort": "low", - "model": "gpt-5.3-codex", - "name": "approaches", - "outcome": "completed", - "outputs": { - "stdout": "migrations/random-files-20260521T140910/.planning/stages/approaches.stdout.md" - } - }, - { - "agent": "codex", - "completed_at": "2026-05-21T14:10:25.299-07:00", - "effort": "low", - "model": "gpt-5.3-codex", - "name": "pick-best", - "outcome": "completed", - "outputs": { - "stdout": "migrations/random-files-20260521T140910/.planning/stages/pick-best.stdout.md" - } - }, - { - "agent": "codex", - "completed_at": "2026-05-21T14:11:37.820-07:00", - "effort": "low", - "model": "gpt-5.3-codex", - "name": "expand", - "outcome": "completed", - "outputs": { - "stdout": "migrations/random-files-20260521T140910/.planning/stages/expand.stdout.md" - } - }, - { - "agent": "codex", - "completed_at": "2026-05-21T14:12:14.638-07:00", - "effort": "low", - "model": "gpt-5.3-codex", - "name": "review", - "outcome": "findings", - "outputs": { - "stdout": "migrations/random-files-20260521T140910/.planning/stages/review.stdout.md" - } - }, - { - "agent": "codex", - "completed_at": "2026-05-21T14:12:46.368-07:00", - "effort": "low", - "model": "gpt-5.3-codex", - "name": "revise", - "outcome": "completed", - "outputs": { - "stdout": "migrations/random-files-20260521T140910/.planning/stages/revise.stdout.md" - } - }, - { - "agent": "codex", - "completed_at": "2026-05-21T14:13:33.754-07:00", - "effort": "low", - "model": "gpt-5.3-codex", - 
"name": "review-2", - "outcome": "clear", - "outputs": { - "stdout": "migrations/random-files-20260521T140910/.planning/stages/review-2.stdout.md" - } - }, - { - "agent": "codex", - "completed_at": "2026-05-21T14:14:11.163-07:00", - "effort": "low", - "model": "gpt-5.3-codex", - "name": "final-review", - "outcome": "approve-needs-human", - "outputs": { - "stdout": "migrations/random-files-20260521T140910/.planning/stages/final-review.stdout.md" - } - } - ], - "feedback": [], - "final_decision": "approve-needs-human", - "final_reason": "phases 1-2 are automation-safe, but phase 3 changes PR-title interface semantics and must stay review-gated", - "next_step": "terminal-ready-awaiting-human", - "review_findings": "migrations/random-files-20260521T140910/.planning/stages/review.stdout.md", - "revision_base_step_counts": [], - "schema_version": 1, - "started_at": "2026-05-21T14:09:10.762-07:00", - "target": "random files", - "updated_at": "2026-05-21T14:14:11.163-07:00" -} diff --git a/migrations/random-files-20260521T140910/approaches/effort-engine-consolidation.md b/migrations/random-files-20260521T140910/approaches/effort-engine-consolidation.md deleted file mode 100644 index 73431de..0000000 --- a/migrations/random-files-20260521T140910/approaches/effort-engine-consolidation.md +++ /dev/null @@ -1,48 +0,0 @@ -# Effort Engine Consolidation - -## Strategy -Make `effort.py` the clear single source of truth for effort resolution rules, then prune duplicate intent in callers/tests. Drive refactor from pure-function invariants. - -## Why this path -- Best when current pain is cognitive load around effort tiers/capping and phase-required behavior. -- Aligns with taste: small abstractions, strong boundaries, delete stale paths. - -## Tradeoffs -- Pros: Cleaner model, easier future migration/phase scheduling changes. -- Cons: Medium chance of subtle behavioral drift if invariants are incomplete. 
- -## Estimated phases - -### Phase 1: Encode invariants as tests -- Scope: `tests/test_loop_migration_tick.py` (+ adjacent effort/migration tests if needed) -- Work: - - Add matrix-style assertions for default/requested/required/capped combinations. - - Verify deferred phase behavior when `required_effort` exceeds run cap. -- required_effort: `medium` - -### Phase 2: Consolidate effort resolution code paths -- Scope: `src/continuous_refactoring/effort.py` -- Work: - - Unify resolution construction paths around one internal normalization flow. - - Keep exported API stable (`EffortBudget`, `EffortResolution`, helpers). - - Preserve module-boundary error translation style. -- required_effort: `medium` - -### Phase 3: Prune callsite complexity and dead checks -- Scope: `tests/test_prompts.py`, `src/continuous_refactoring/__main__.py` -- Work: - - Remove stale assertions/workarounds that duplicate `effort.py` guarantees. - - Keep only load-bearing contract tests. -- required_effort: `low` - -## Risk profile -- Overall: **Medium** -- Main risks: - - Regressing cap semantics in edge combinations. - - Over-pruning tests that guard behavior indirectly. -- Mitigations: - - Build exhaustive tier-order checks first. - - Keep API and error messages stable unless explicitly reviewed. - -## Best fit conditions -Pick this if maintainability of effort logic is the dominant objective. diff --git a/migrations/random-files-20260521T140910/approaches/interface-first-hardening.md b/migrations/random-files-20260521T140910/approaches/interface-first-hardening.md deleted file mode 100644 index 2bc8579..0000000 --- a/migrations/random-files-20260521T140910/approaches/interface-first-hardening.md +++ /dev/null @@ -1,49 +0,0 @@ -# Interface-First Hardening - -## Strategy -Stabilize and clarify externally visible behavior first, then tighten internals with tests guarding contracts. Focus on behavior that users feel: CLI effort semantics, prompt contract strings, and PR-title policy. 
- -## Why this path -- Best when regression risk at boundaries is the primary concern. -- Aligns with taste: preserve compatibility for shipped interfaces and surface behavior changes explicitly for human review. - -## Tradeoffs -- Pros: Lowest risk of accidental CLI/workflow breakage; strong confidence from boundary tests. -- Cons: Some internal cleanup is deferred; may keep minor internal duplication for now. - -## Estimated phases - -### Phase 1: Lock boundary behavior with targeted tests -- Scope: `tests/test_prompts.py`, `tests/test_loop_migration_tick.py`, `.github/workflows/pr-title.yml` -- Work: - - Add/adjust outcome-based tests around effort-capped migration ticking and planning gating. - - Add prompt-contract assertions only where behavior is load-bearing (taste injection, staged/live dir constraints). - - Validate PR title regex edge cases with fixture-like checks in workflow script block (no contract change yet). -- required_effort: `low` - -### Phase 2: Refactor internals behind unchanged contracts -- Scope: `src/continuous_refactoring/effort.py`, `src/continuous_refactoring/__main__.py` -- Work: - - Remove tiny internal repetition in effort resolution using small pure helpers. - - Keep CLI-visible semantics identical (`low` default, `xhigh` cap, cap behavior). - - Keep `__main__` minimal; only touch if clarity gain is concrete. -- required_effort: `medium` - -### Phase 3: Optional boundary behavior adjustment (human review) -- Scope: `.github/workflows/pr-title.yml`, related tests/docs if needed -- Work: - - If changing title policy, explicitly name user-facing impact in review prompt and migration notes. - - Update examples/messages to match exact accepted syntax. -- required_effort: `high` - -## Risk profile -- Overall: **Low** -- Main risks: - - Hidden coupling in prompt text expectations causing brittle tests. - - Workflow regex changes silently rejecting valid PRs. -- Mitigations: - - Keep regex changes isolated and example-backed. 
- - Prefer additive tests before edits. - -## Best fit conditions -Pick this if the goal is reliability and safe incremental cleanup under active usage. diff --git a/migrations/random-files-20260521T140910/approaches/test-first-boundary-pruning.md b/migrations/random-files-20260521T140910/approaches/test-first-boundary-pruning.md deleted file mode 100644 index 063e45b..0000000 --- a/migrations/random-files-20260521T140910/approaches/test-first-boundary-pruning.md +++ /dev/null @@ -1,47 +0,0 @@ -# Test-First Boundary Pruning - -## Strategy -Start from brittle/high-noise tests and workflow checks, reduce assertion noise to behavior-centric coverage, then simplify production code only where tests prove redundancy. - -## Why this path -- Best when suite maintenance cost is rising and prompt/workflow assertions are noisy. -- Aligns with taste: outcome-focused testing, minimal mocks, remove dead/fallback structure. - -## Tradeoffs -- Pros: Faster future iteration, clearer failures, less incidental coupling to wording. -- Cons: Requires discipline to avoid deleting guards that protect true interface contracts. - -## Estimated phases - -### Phase 1: Classify tests by contract vs incidental text -- Scope: `tests/test_prompts.py`, `tests/test_loop_migration_tick.py` -- Work: - - Tag assertions as interface-critical or implementation-detail. - - Rewrite detail-coupled checks into outcome-focused checks. -- required_effort: `low` - -### Phase 2: Prune and tighten boundary checks -- Scope: `.github/workflows/pr-title.yml`, tests above -- Work: - - Keep PR-title rule strict but simplify validation messaging/tests for clarity. - - Ensure prompt tests verify required clauses without overfitting exact prose. -- required_effort: `medium` - -### Phase 3: Opportunistic code cleanup proven by tests -- Scope: `src/continuous_refactoring/effort.py`, `src/continuous_refactoring/__main__.py` -- Work: - - Delete tiny dead branches/helpers shown redundant by updated tests. 
- - Keep module boundaries and public behavior unchanged. -- required_effort: `medium` - -## Risk profile -- Overall: **Medium-Low** -- Main risks: - - False confidence if pruning removes high-signal assertions. - - Reviewer disagreement on what counts as contract text. -- Mitigations: - - Keep explicit list of must-preserve interface clauses. - - Route any contract relaxation through human review language. - -## Best fit conditions -Pick this if test signal-to-noise and maintenance speed are the biggest pain. diff --git a/migrations/random-files-20260521T140910/manifest.json b/migrations/random-files-20260521T140910/manifest.json deleted file mode 100644 index f7e939a..0000000 --- a/migrations/random-files-20260521T140910/manifest.json +++ /dev/null @@ -1,37 +0,0 @@ -{ - "awaiting_human_review": true, - "cooldown_until": null, - "created_at": "2026-05-21T14:09:10.762-07:00", - "current_phase": "boundary-contract-guardrails", - "human_review_reason": "phases 1-2 are automation-safe, but phase 3 changes PR-title interface semantics and must stay review-gated", - "last_touch": "2026-05-21T14:14:11.165-07:00", - "name": "random-files-20260521T140910", - "phases": [ - { - "done": false, - "effort_reason": null, - "file": "phase-1-boundary-contract-guardrails.md", - "name": "boundary-contract-guardrails", - "precondition": "- No earlier phase in this migration is incomplete. - Files/symbols defining current effort routing, migration ticking/planning gating, prompt taste injection, and PR title validation still exist and are reachable from tests. 
- The worktree contains no unrelated in-flight edits to the same boundary files that would make observed behavior ambiguous.", - "required_effort": null - }, - { - "done": false, - "effort_reason": "Cross-module effort-resolution cleanup can silently drift CLI/migration behavior without careful contract-preserving verification.", - "file": "phase-2-internal-effort-resolution-cleanup.md", - "name": "internal-effort-resolution-cleanup", - "precondition": "- Phase 1 is complete. - Contract-guarding tests for boundary behavior added in Phase 1 are still present. - Current effort interfaces (default `low`, cap `xhigh`, target override cap behavior, migration defer-on-over-cap behavior) still exist and are encoded in tests.", - "required_effort": "medium" - }, - { - "done": false, - "effort_reason": "PR title policy changes are user-facing workflow contract changes and require careful compatibility framing and explicit review communication.", - "file": "phase-3-pr-title-policy-adjustment-review-gated.md", - "name": "pr-title-policy-adjustment-review-gated", - "precondition": "- Phase 1 and Phase 2 are complete. - There is a concrete, documented reason for policy change (not cleanup-only churn). 
- The current accepted/rejected title behavior is captured by tests so change impact is measurable.", - "required_effort": "high" - } - ], - "status": "ready", - "wake_up_on": null -} diff --git a/migrations/random-files-20260521T140910/phase-1-boundary-contract-guardrails.md b/migrations/random-files-20260521T140910/phase-1-boundary-contract-guardrails.md deleted file mode 100644 index 0767dd5..0000000 --- a/migrations/random-files-20260521T140910/phase-1-boundary-contract-guardrails.md +++ /dev/null @@ -1,37 +0,0 @@ -# Phase 1: Boundary Contract Guardrails - -## Scope -- `tests/test_prompts.py` -- `tests/test_loop_migration_tick.py` -- `.github/workflows/pr-title.yml` (tests/fixtures/assertions only; no policy change in this phase) -- Any directly related existing tests that validate the same boundary contracts - -## Goals -- Lock current interface behavior with outcome-focused tests before internal refactors. -- Increase confidence around effort-cap behavior and planning gating behavior. -- Make PR title policy edge behavior explicit in testable checks without changing policy semantics. - -## Precondition -- No earlier phase in this migration is incomplete. -- Files/symbols defining current effort routing, migration ticking/planning gating, prompt taste injection, and PR title validation still exist and are reachable from tests. -- The worktree contains no unrelated in-flight edits to the same boundary files that would make observed behavior ambiguous. - -## Implementation Instructions -1. Strengthen/extend tests for migration ticking and planning gating behavior so they assert outcomes (eligibility/deferral/routing), not internal call shapes. -2. Strengthen/extend prompt contract tests only for load-bearing invariants (including Taste section and staged/live planning constraints where already contractual). -3. 
Add explicit PR-title edge-case checks in workflow-adjacent test coverage/fixtures or equivalent deterministic assertions, while preserving current acceptance behavior. -4. Keep changes small and focused on guardrails; do not refactor production logic in this phase unless needed to enable deterministic testing. - -## Validation Steps -- Run targeted checks first: - - `uv run pytest tests/test_prompts.py` - - `uv run pytest tests/test_loop_migration_tick.py` - - `uv run pytest -k "pr title or pr_title"` -- Run full validation command: - - `uv run pytest` - -## Definition of Done -- Boundary tests covering effort-capped migration/planning behavior and prompt contract invariants are present, deterministic, and passing. -- PR title policy behavior is more explicitly exercised without changing acceptance semantics. -- Full configured validation command passes. -- Repository remains shippable with unchanged external interface behavior. diff --git a/migrations/random-files-20260521T140910/phase-2-internal-effort-resolution-cleanup.md b/migrations/random-files-20260521T140910/phase-2-internal-effort-resolution-cleanup.md deleted file mode 100644 index 336c4c6..0000000 --- a/migrations/random-files-20260521T140910/phase-2-internal-effort-resolution-cleanup.md +++ /dev/null @@ -1,39 +0,0 @@ -# Phase 2: Internal Effort Resolution Cleanup - -required_effort: medium -effort_reason: Cross-module effort-resolution cleanup can silently drift CLI/migration behavior without careful contract-preserving verification. - -## Scope -- `src/continuous_refactoring/effort.py` -- Minimal adjacent call sites if required for coherence (for example `loop.py`, `migration_tick.py`, or CLI argument plumbing) -- `src/continuous_refactoring/__main__.py` only if there is a concrete readability gain with zero behavior drift -- Tests that validate effort defaults/caps and run semantics - -## Goals -- Reduce internal repetition and improve readability in effort resolution paths. 
-- Preserve all externally visible effort semantics exactly. - -## Precondition -- Phase 1 is complete. -- Contract-guarding tests for boundary behavior added in Phase 1 are still present. -- Current effort interfaces (default `low`, cap `xhigh`, target override cap behavior, migration defer-on-over-cap behavior) still exist and are encoded in tests. - -## Implementation Instructions -1. Introduce small pure helpers to centralize effort normalization/capping where duplication currently exists. -2. Keep behavior identical at boundaries: CLI defaults, cap enforcement, and phase deferral semantics must not change. -3. Keep abstraction depth shallow; prefer direct, readable flow over framework-like layering. -4. Only touch `__main__.py` if it materially reduces ambiguity without changing invocation behavior. - -## Validation Steps -- Run focused effort/CLI/run tests: - - `uv run pytest tests/test_effort.py` - - `uv run pytest tests/test_cli.py tests/test_run.py tests/test_run_once.py` - - `uv run pytest tests/test_loop_migration_tick.py` -- Run full validation command: - - `uv run pytest` - -## Definition of Done -- Internal effort-resolution logic is simpler (less duplication / clearer flow) with no interface drift. -- Existing boundary tests from Phase 1 remain green without contract assertion changes that weaken coverage. -- Full configured validation command passes. -- Repository remains shippable with unchanged user-visible effort behavior. 
diff --git a/migrations/random-files-20260521T140910/phase-3-pr-title-policy-adjustment-review-gated.md b/migrations/random-files-20260521T140910/phase-3-pr-title-policy-adjustment-review-gated.md deleted file mode 100644 index bae7667..0000000 --- a/migrations/random-files-20260521T140910/phase-3-pr-title-policy-adjustment-review-gated.md +++ /dev/null @@ -1,36 +0,0 @@ -# Phase 3: PR Title Policy Adjustment (Review-Gated) - -required_effort: high -effort_reason: PR title policy changes are user-facing workflow contract changes and require careful compatibility framing and explicit review communication. - -## Scope -- `.github/workflows/pr-title.yml` -- Any directly related tests/fixtures/docs that define accepted PR title patterns -- Migration notes/review prompt content that explains interface impact - -## Goals -- Apply a deliberate PR title policy behavior adjustment only if needed. -- Make user-facing impact explicit, concrete, and review-friendly. - -## Precondition -- Phase 1 and Phase 2 are complete. -- There is a concrete, documented reason for policy change (not cleanup-only churn). -- The current accepted/rejected title behavior is captured by tests so change impact is measurable. - -## Implementation Instructions -1. Change PR-title matching behavior only for the explicitly intended cases. -2. Update/add tests to show before/after expectations for affected title examples. -3. Update any user-facing examples/messages so accepted syntax is unambiguous. -4. In review-facing notes, explicitly name the interface behavior change (what titles now pass/fail) and why. - -## Validation Steps -- Run PR-title focused checks: - - `uv run pytest -k "pr title or pr_title or workflow"` -- Run full validation command: - - `uv run pytest` - -## Definition of Done -- PR title policy change is intentional, narrowly scoped, and fully covered by deterministic tests. -- Review notes explicitly describe the interface behavior change and expected user impact. 
-- Full configured validation command passes. -- Repository remains shippable with workflow behavior updated in a clearly documented, review-gated way. diff --git a/migrations/random-files-20260521T140910/plan.md b/migrations/random-files-20260521T140910/plan.md deleted file mode 100644 index ae8ac10..0000000 --- a/migrations/random-files-20260521T140910/plan.md +++ /dev/null @@ -1,43 +0,0 @@ -# Migration Plan: Interface-First Hardening - -## Objective -Harden external contracts first (CLI effort semantics, prompt contract invariants, PR title policy behavior), then clean internals behind those contracts, with any user-visible policy shift isolated and human-review gated. - -## Phase Overview -1. **Phase 1 — Boundary Contract Guardrails** (`phase-1-boundary-contract-guardrails.md`) -2. **Phase 2 — Internal Effort Resolution Cleanup** (`phase-2-internal-effort-resolution-cleanup.md`) -3. **Phase 3 — PR Title Policy Adjustment (Review-Gated)** (`phase-3-pr-title-policy-adjustment-review-gated.md`) - -## Dependency Graph -```mermaid -graph TD - P1[Phase 1: Boundary Contract Guardrails] --> P2[Phase 2: Internal Effort Resolution Cleanup] - P1 --> P3[Phase 3: PR Title Policy Adjustment (Review-Gated)] - P2 --> P3 -``` - -## Why this ordering -- Phase 1 reduces regression risk by making interface expectations executable before any internal movement. -- Phase 2 then refactors internals under locked behavior so cleanup is low-risk and easy to verify. -- Phase 3 is explicitly optional and last because it can alter user-facing PR workflow semantics and should only proceed with intentional review context. - -## Validation Strategy -- Baseline contract is enforced by the harness before refactoring and after each completed phase. -- Each phase also includes targeted, independently runnable checks for its own scope. -- Every phase requires the configured full validation command to pass before completion. 
- -Validation commands used by phases: -- `uv run pytest` -- `uv run pytest tests/test_prompts.py` -- `uv run pytest tests/test_loop_migration_tick.py` -- `uv run pytest tests/test_effort.py tests/test_cli.py tests/test_run.py tests/test_run_once.py` (or nearest equivalent files if names differ) - -## Interface Risk Management -- Treat CLI behavior, prompt contracts, migration-planning constraints, and PR title policy as interface surfaces. -- Any behavior change to these surfaces must be called out explicitly in phase notes and human review prompts. -- Keep compatibility-first defaults unless the phase explicitly targets a behavior change. - -## Shippability bar per phase -- Repository remains releasable after each phase. -- No partial contract rewrites without matching tests. -- No silent behavior changes at interface boundaries. From 8f18c9c613043f874972b81f1a5ff44d61c1c762 Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Sun, 24 May 2026 00:16:12 -0700 Subject: [PATCH 25/41] remove other migration --- .../.planning/stages/approaches.stdout.md | 7 -- .../.planning/stages/expand.stdout.md | 8 -- .../.planning/stages/final-review.stdout.md | 11 -- .../.planning/stages/pick-best.stdout.md | 17 --- .../.planning/stages/review-2.stdout.md | 1 - .../.planning/stages/review.stdout.md | 5 - .../.planning/stages/revise-2.stdout.md | 11 -- .../.planning/stages/revise.stdout.md | 14 --- .../.planning/state.json | 110 ------------------ ...ehavior-first-random-file-stabilization.md | 47 -------- .../delete-dead-paths-then-reprove.md | 47 -------- ...nimal-entrypoint-and-license-guardrails.md | 45 ------- .../manifest.json | 37 ------ .../phase-1-contract-regression-net.md | 38 ------ ...ase-2-internal-cleanup-behind-contracts.md | 36 ------ .../phase-3-interface-shift-review-gate.md | 34 ------ .../random-files-20260521T141425/plan.md | 49 -------- tests/test_main_entrypoint.py | 9 +- 18 files changed, 5 insertions(+), 521 deletions(-) delete mode 100644 
migrations/random-files-20260521T141425/.planning/stages/approaches.stdout.md delete mode 100644 migrations/random-files-20260521T141425/.planning/stages/expand.stdout.md delete mode 100644 migrations/random-files-20260521T141425/.planning/stages/final-review.stdout.md delete mode 100644 migrations/random-files-20260521T141425/.planning/stages/pick-best.stdout.md delete mode 100644 migrations/random-files-20260521T141425/.planning/stages/review-2.stdout.md delete mode 100644 migrations/random-files-20260521T141425/.planning/stages/review.stdout.md delete mode 100644 migrations/random-files-20260521T141425/.planning/stages/revise-2.stdout.md delete mode 100644 migrations/random-files-20260521T141425/.planning/stages/revise.stdout.md delete mode 100644 migrations/random-files-20260521T141425/.planning/state.json delete mode 100644 migrations/random-files-20260521T141425/approaches/behavior-first-random-file-stabilization.md delete mode 100644 migrations/random-files-20260521T141425/approaches/delete-dead-paths-then-reprove.md delete mode 100644 migrations/random-files-20260521T141425/approaches/minimal-entrypoint-and-license-guardrails.md delete mode 100644 migrations/random-files-20260521T141425/manifest.json delete mode 100644 migrations/random-files-20260521T141425/phase-1-contract-regression-net.md delete mode 100644 migrations/random-files-20260521T141425/phase-2-internal-cleanup-behind-contracts.md delete mode 100644 migrations/random-files-20260521T141425/phase-3-interface-shift-review-gate.md delete mode 100644 migrations/random-files-20260521T141425/plan.md diff --git a/migrations/random-files-20260521T141425/.planning/stages/approaches.stdout.md b/migrations/random-files-20260521T141425/.planning/stages/approaches.stdout.md deleted file mode 100644 index bfc0b86..0000000 --- a/migrations/random-files-20260521T141425/.planning/stages/approaches.stdout.md +++ /dev/null @@ -1,7 +0,0 @@ -Created 3 approach files in the staged planning workspace: - -- 
`approaches/minimal-entrypoint-and-license-guardrails.md` -- `approaches/behavior-first-random-file-stabilization.md` -- `approaches/delete-dead-paths-then-reprove.md` - -Each includes strategy, tradeoffs, phased plan, per-phase `required_effort`, and risk profile, aligned to the provided taste and effort constraints. diff --git a/migrations/random-files-20260521T141425/.planning/stages/expand.stdout.md b/migrations/random-files-20260521T141425/.planning/stages/expand.stdout.md deleted file mode 100644 index d2b9bea..0000000 --- a/migrations/random-files-20260521T141425/.planning/stages/expand.stdout.md +++ /dev/null @@ -1,8 +0,0 @@ -Created the planning artifacts in the staged migration workspace: - -- `plan.md` -- `phase-1-contract-regression-net.md` -- `phase-2-internal-cleanup-behind-contracts.md` -- `phase-3-interface-shift-review-gate.md` - -They include numbered phases, dependency ordering, a validation strategy, strict `## Precondition` vs `## Definition of Done` separation, and effort tags/reasons using valid labels (`low`, `medium`, `high`). diff --git a/migrations/random-files-20260521T141425/.planning/stages/final-review.stdout.md b/migrations/random-files-20260521T141425/.planning/stages/final-review.stdout.md deleted file mode 100644 index c077f0e..0000000 --- a/migrations/random-files-20260521T141425/.planning/stages/final-review.stdout.md +++ /dev/null @@ -1,11 +0,0 @@ -Assessment: -- Safe for full auto execution: `No` end-to-end, because it intentionally includes a conditional interface-change gate that must stop for human judgment. -- Human review required: `Yes`, but only at the explicit conditional point in Phase 3 when an interface delta is found. -- Fundamentally flawed: `No`. The sequencing is sane, risk-first, and aligned with harness invariants. - -Checks against your criteria: -- Effort tiers are lowest safe: `low` (Phase 1), `medium` (Phase 2), `high` (Phase 3 gate-critical) all look justified and not inflated. 
-- Preconditions avoid baseline/fresh-validation coupling: correct. Preconditions are phase-state/scope/delta based; validation is in DoD where it belongs. -- Automation safety: good guardrails via contract inventory first, then constrained cleanup, then explicit human gate for interface shifts. - -final-decision: approve-needs-human — safe automated execution until any interface delta, which is correctly gated for explicit human review diff --git a/migrations/random-files-20260521T141425/.planning/stages/pick-best.stdout.md b/migrations/random-files-20260521T141425/.planning/stages/pick-best.stdout.md deleted file mode 100644 index 40739c9..0000000 --- a/migrations/random-files-20260521T141425/.planning/stages/pick-best.stdout.md +++ /dev/null @@ -1,17 +0,0 @@ -Chosen approach: **`behavior-first-random-file-stabilization`** - -Why this is the best balance: - -- **Risk:** lower than dead-path-first, because it stabilizes externally visible behavior before cleanup and explicitly adds a human-review gate if interface shifts appear. -- **Clarity:** the phase story is clean and reviewable: prove contracts, refactor behind them, escalate only if boundaries move. -- **Incremental verifiability:** each phase has a crisp verification target and natural rollback point; Phase 1 gives immediate confidence for later deletions. - -Why not the others: - -- `delete-dead-paths-then-reprove`: strong cleanup upside, but too much upfront blast radius for a random-file migration; higher chance of uncovering hidden coupling late. -- `minimal-entrypoint-and-license-guardrails`: safest, but likely under-shoots the migration’s refactoring value unless the random set is truly only boundary trivia. - -Taste tie-break application: - -- Taste asks for **aggressive dead-code deletion** in non-shipped internals, but also **compatibility-first handling** and **human review for released interfaces**. 
-- This approach best reconciles both by sequencing: lock behavior first, then delete safely, then require explicit interface review if needed. diff --git a/migrations/random-files-20260521T141425/.planning/stages/review-2.stdout.md b/migrations/random-files-20260521T141425/.planning/stages/review-2.stdout.md deleted file mode 100644 index 9e06085..0000000 --- a/migrations/random-files-20260521T141425/.planning/stages/review-2.stdout.md +++ /dev/null @@ -1 +0,0 @@ -1. no findings. diff --git a/migrations/random-files-20260521T141425/.planning/stages/review.stdout.md b/migrations/random-files-20260521T141425/.planning/stages/review.stdout.md deleted file mode 100644 index 13a494a..0000000 --- a/migrations/random-files-20260521T141425/.planning/stages/review.stdout.md +++ /dev/null @@ -1,5 +0,0 @@ -1. `phase-3-interface-shift-review-gate.md`: `required_effort: high` looks over-tiered for the scoped work described (documentation + gating-state correctness + validation). This phase explicitly says “keep technical changes minimal,” so `medium` is likely the lowest safe tier; keep `high` only if you expect nontrivial code-path changes during gating. - -2. `phase-1-contract-regression-net.md` precondition is partially soft: “target contracts to lock are identifiable in current code/tests.” That’s directionally right, but it’s not a concrete gate. Tighten it to an observable artifact check (for example, an explicit list of target contracts captured in phase notes/plan artifact) so start/no-start is binary. - -3. `phase-2-internal-cleanup-behind-contracts.md` precondition “helper/symbol slated for deletion has at least one surviving behavior-level test path…” mixes start-gating with implementation-time proof work. That evidence is usually established during execution, not before start. Move this to Definition of Done or validation criteria, and keep precondition to dependency/scope readiness only. 
diff --git a/migrations/random-files-20260521T141425/.planning/stages/revise-2.stdout.md b/migrations/random-files-20260521T141425/.planning/stages/revise-2.stdout.md deleted file mode 100644 index 6dd07a1..0000000 --- a/migrations/random-files-20260521T141425/.planning/stages/revise-2.stdout.md +++ /dev/null @@ -1,11 +0,0 @@ -Updated the migration planning artifacts in the staged work dir with the requested refinements: - -- [plan.md](/Users/hiren/.local/share/continuous-refactoring/projects/dfdec7e7-73c4-45c3-9786-693784acfd83/planning/random-files-20260521T141425/20260522T161153-731704-48db80aa86304b11a578594fd35fc409/work/random-files-20260521T141425/plan.md) -- [phase-1-contract-regression-net.md](/Users/hiren/.local/share/continuous-refactoring/projects/dfdec7e7-73c4-45c3-9786-693784acfd83/planning/random-files-20260521T141425/20260522T161153-731704-48db80aa86304b11a578594fd35fc409/work/random-files-20260521T141425/phase-1-contract-regression-net.md) -- [phase-2-internal-cleanup-behind-contracts.md](/Users/hiren/.local/share/continuous-refactoring/projects/dfdec7e7-73c4-45c3-9786-693784acfd83/planning/random-files-20260521T141425/20260522T161153-731704-48db80aa86304b11a578594fd35fc409/work/random-files-20260521T141425/phase-2-internal-cleanup-behind-contracts.md) -- [phase-3-interface-shift-review-gate.md](/Users/hiren/.local/share/continuous-refactoring/projects/dfdec7e7-73c4-45c3-9786-693784acfd83/planning/random-files-20260521T141425/20260522T161153-731704-48db80aa86304b11a578594fd35fc409/work/random-files-20260521T141425/phase-3-interface-shift-review-gate.md) - -What changed per your feedback: -- Removed `phase-1-contract-inventory.md` from Phase 1 preconditions and kept it in Phase 1 Definition of Done. -- Tightened Phase 1 scope to explicitly bound it to behavior tests plus the single inventory artifact, with minimal production edits only when strictly needed for test observability. 
-- Raised Phase 3 `required_effort` to `high` and added a concrete `effort_reason` tied to release-facing interface-risk and gating correctness. diff --git a/migrations/random-files-20260521T141425/.planning/stages/revise.stdout.md b/migrations/random-files-20260521T141425/.planning/stages/revise.stdout.md deleted file mode 100644 index 2aadb33..0000000 --- a/migrations/random-files-20260521T141425/.planning/stages/revise.stdout.md +++ /dev/null @@ -1,14 +0,0 @@ -Updated in the staged migration workspace: - -- `plan.md` -- `phase-1-contract-regression-net.md` -- `phase-2-internal-cleanup-behind-contracts.md` -- `phase-3-interface-shift-review-gate.md` - -Review findings addressed: - -1. Phase 3 effort lowered from `high` to `medium` with rationale aligned to minimal technical scope. -2. Phase 1 precondition tightened to a binary artifact gate: required `phase-1-contract-inventory.md`. -3. Phase 2 moved deletion-proof requirement out of precondition into validation/DoD; precondition is now dependency/scope readiness only. - -The phases remain ordered for risk reduction, independently verifiable, and shippable at each step. 
diff --git a/migrations/random-files-20260521T141425/.planning/state.json b/migrations/random-files-20260521T141425/.planning/state.json deleted file mode 100644 index d2e8fbb..0000000 --- a/migrations/random-files-20260521T141425/.planning/state.json +++ /dev/null @@ -1,110 +0,0 @@ -{ - "completed_steps": [ - { - "agent": "codex", - "completed_at": "2026-05-21T14:15:21.290-07:00", - "effort": "low", - "model": "gpt-5.3-codex", - "name": "approaches", - "outcome": "completed", - "outputs": { - "stdout": "migrations/random-files-20260521T141425/.planning/stages/approaches.stdout.md" - } - }, - { - "agent": "codex", - "completed_at": "2026-05-21T14:15:36.621-07:00", - "effort": "low", - "model": "gpt-5.3-codex", - "name": "pick-best", - "outcome": "completed", - "outputs": { - "stdout": "migrations/random-files-20260521T141425/.planning/stages/pick-best.stdout.md" - } - }, - { - "agent": "codex", - "completed_at": "2026-05-21T14:16:32.756-07:00", - "effort": "low", - "model": "gpt-5.3-codex", - "name": "expand", - "outcome": "completed", - "outputs": { - "stdout": "migrations/random-files-20260521T141425/.planning/stages/expand.stdout.md" - } - }, - { - "agent": "codex", - "completed_at": "2026-05-21T14:17:14.331-07:00", - "effort": "low", - "model": "gpt-5.3-codex", - "name": "review", - "outcome": "findings", - "outputs": { - "stdout": "migrations/random-files-20260521T141425/.planning/stages/review.stdout.md" - } - }, - { - "agent": "codex", - "completed_at": "2026-05-21T14:18:25.372-07:00", - "effort": "low", - "model": "gpt-5.3-codex", - "name": "revise", - "outcome": "completed", - "outputs": { - "stdout": "migrations/random-files-20260521T141425/.planning/stages/revise.stdout.md" - } - }, - { - "agent": "codex", - "completed_at": "2026-05-22T16:13:31.328-07:00", - "effort": "high", - "model": "gpt-5.3-codex", - "name": "revise", - "outcome": "completed", - "outputs": { - "stdout": "migrations/random-files-20260521T141425/.planning/stages/revise-2.stdout.md" - 
} - }, - { - "agent": "codex", - "completed_at": "2026-05-22T16:16:15.516-07:00", - "effort": "low", - "model": "gpt-5.3-codex", - "name": "review-2", - "outcome": "clear", - "outputs": { - "stdout": "migrations/random-files-20260521T141425/.planning/stages/review-2.stdout.md" - } - }, - { - "agent": "codex", - "completed_at": "2026-05-22T16:16:46.988-07:00", - "effort": "low", - "model": "gpt-5.3-codex", - "name": "final-review", - "outcome": "approve-needs-human", - "outputs": { - "stdout": "migrations/random-files-20260521T141425/.planning/stages/final-review.stdout.md" - } - } - ], - "feedback": [ - { - "received_at": "2026-05-22T16:11:53.753-07:00", - "source": "message", - "text": "Fix review-2 findings: move phase-1-contract-inventory.md out of Phase 1 precondition\n and into Definition of Done; tighten Phase 1 scope wording; make Phase 3 required_effort match the\n actual gating risk with a concrete effort_reason." - } - ], - "final_decision": "approve-needs-human", - "final_reason": "safe automated execution until any interface delta, which is correctly gated for explicit human review", - "next_step": "terminal-ready-awaiting-human", - "review_findings": null, - "revision_base_step_counts": [ - 5 - ], - "schema_version": 1, - "started_at": "2026-05-21T14:14:26.016-07:00", - "target": "random files", - "updated_at": "2026-05-22T16:16:46.988-07:00" -} diff --git a/migrations/random-files-20260521T141425/approaches/behavior-first-random-file-stabilization.md b/migrations/random-files-20260521T141425/approaches/behavior-first-random-file-stabilization.md deleted file mode 100644 index dad1982..0000000 --- a/migrations/random-files-20260521T141425/approaches/behavior-first-random-file-stabilization.md +++ /dev/null @@ -1,47 +0,0 @@ -# Behavior-First Random File Stabilization - -## Strategy -Use the random-file set to strengthen externally visible behavior first (CLI entry, workflow contracts, migration planning artifacts), then prune only internal duplication 
proven redundant by tests. - -## Why this path -- Aligns with shipped-interface caution in taste. -- Works well when selected files are heterogeneous and don’t justify one deep module refactor. - -## Tradeoffs -- Pros: High confidence on user-visible behavior; straightforward review story. -- Cons: Internal elegance may remain uneven after this migration. - -## Estimated phases - -### Phase 1: Snapshot current behavior with focused regression tests -- Scope: tests touching selected random-file surfaces -- Work: - - Add targeted assertions for entrypoint behavior and any touched migration/planning contract paths. - - Keep tests based on outcomes, not implementation calls. -- required_effort: `low` - -### Phase 2: Internal cleanup behind stable interfaces -- Scope: only random-targeted source files selected for this migration -- Work: - - Remove dead branches/helpers made unnecessary by current contracts. - - Keep error translation only at module boundaries. -- required_effort: `medium` - -### Phase 3: Human-review checkpoint for interface shifts (only if needed) -- Scope: any CLI, repo-written-file, or workflow contract change discovered in Phase 2 -- Work: - - Explicitly document the behavioral delta and rollout impact. - - Gate publish on review acknowledgment. -- required_effort: `high` - -## Risk profile -- Overall: **Low-Medium** -- Main risks: - - Hidden interface drift during cleanup. - - Random-file coupling surfacing late. -- Mitigations: - - Keep phase 1 tests narrow and contract-oriented. - - Escalate to review gate at first interface delta. - -## Best fit conditions -Pick this when reliability and reviewability matter more than maximal code reduction. 
diff --git a/migrations/random-files-20260521T141425/approaches/delete-dead-paths-then-reprove.md b/migrations/random-files-20260521T141425/approaches/delete-dead-paths-then-reprove.md deleted file mode 100644 index bec0b95..0000000 --- a/migrations/random-files-20260521T141425/approaches/delete-dead-paths-then-reprove.md +++ /dev/null @@ -1,47 +0,0 @@ -# Delete Dead Paths Then Re-prove - -## Strategy -Aggressively remove fallback/legacy code in random-targeted files, then re-prove required behavior with concise tests and boundary checks. - -## Why this path -- Matches taste preference for deleting unused paths in non-shipped internals. -- Delivers the biggest readability gain per line changed when dead code exists. - -## Tradeoffs -- Pros: Strong simplification; future maintenance gets easier quickly. -- Cons: Highest chance of exposing implicit dependencies that looked unused. - -## Estimated phases - -### Phase 1: Dead-path inventory and dependency check -- Scope: random-targeted files plus direct callers/tests -- Work: - - Identify branches/helpers/tables with no live call path. - - Confirm no external contract depends on them. -- required_effort: `medium` - -### Phase 2: Removal pass with boundary-preserving errors -- Scope: selected source files -- Work: - - Delete dead flags/shims/branches outright. - - Preserve or improve exception nesting only at module boundaries. -- required_effort: `high` - -### Phase 3: Re-proof via focused regression + integration checks -- Scope: relevant `tests/test_*.py` -- Work: - - Add/update tests for post-deletion behavior and side effects. - - Confirm full pytest gate remains green. -- required_effort: `medium` - -## Risk profile -- Overall: **Medium-High** -- Main risks: - - Removing code that encodes undocumented edge behavior. - - Larger diff in mixed-scope random files. -- Mitigations: - - Require explicit evidence of deadness before deletion. - - Keep deletions and proof tests in the same phase boundary. 
- -## Best fit conditions -Pick this when maintainability pain is from stale fallback logic and the team accepts moderate refactor risk. diff --git a/migrations/random-files-20260521T141425/approaches/minimal-entrypoint-and-license-guardrails.md b/migrations/random-files-20260521T141425/approaches/minimal-entrypoint-and-license-guardrails.md deleted file mode 100644 index fe9d2d0..0000000 --- a/migrations/random-files-20260521T141425/approaches/minimal-entrypoint-and-license-guardrails.md +++ /dev/null @@ -1,45 +0,0 @@ -# Minimal Entrypoint and License Guardrails - -## Strategy -Treat this as a low-blast-radius hardening pass: lock current behavior around `__main__.py` and repository license presence, then make only clarity-level cleanup that keeps interfaces unchanged. - -## Why this path -- Best when the target set is small and boundary-facing (`__main__`, `LICENSE`). -- Maximizes safety while still producing measurable cleanup. - -## Tradeoffs -- Pros: Very low regression risk; quick to validate with focused tests. -- Cons: Limited structural payoff; does not unlock larger refactors. - -## Estimated phases - -### Phase 1: Contract tests for entry invocation and package execution -- Scope: `tests/test_main_entrypoint.py` (and adjacent entrypoint tests if needed) -- Work: - - Assert `python -m continuous_refactoring` still routes through `cli.cli_main()`. - - Assert no accidental side effects at import time. -- required_effort: `low` - -### Phase 2: Entrypoint micro-cleanup with unchanged behavior -- Scope: `src/continuous_refactoring/__main__.py` -- Work: - - Keep file minimal and explicit; remove any future drift-prone boilerplate if present. - - Preserve module boundary behavior exactly. -- required_effort: `low` - -### Phase 3: Repository metadata guardrails -- Scope: `LICENSE` and tests/docs only if needed -- Work: - - Add a lightweight test/check that required license text file remains present and non-empty. 
- - Avoid policy changes or content rewrites unless explicitly intended. -- required_effort: `low` - -## Risk profile -- Overall: **Low** -- Main risks: - - Over-testing trivial behavior and creating brittle tests. -- Mitigations: - - Keep assertions outcome-focused and minimal. - -## Best fit conditions -Pick this when the migration goal is safe hygiene and confidence, not deeper architecture change. diff --git a/migrations/random-files-20260521T141425/manifest.json b/migrations/random-files-20260521T141425/manifest.json deleted file mode 100644 index 0361193..0000000 --- a/migrations/random-files-20260521T141425/manifest.json +++ /dev/null @@ -1,37 +0,0 @@ -{ - "awaiting_human_review": false, - "cooldown_until": null, - "created_at": "2026-05-21T14:14:26.016-07:00", - "current_phase": "contract-regression-net", - "human_review_reason": null, - "last_touch": "2026-05-22T16:16:46.990-07:00", - "name": "random-files-20260521T141425", - "phases": [ - { - "done": false, - "effort_reason": "Bounded, test-first contract capture with minimal production churn.", - "file": "phase-1-contract-regression-net.md", - "name": "contract-regression-net", - "precondition": "- Migration status is `ready` or `in-progress`, and this phase is the manifest `current_phase`. - No earlier migration phase is incomplete. - The random-targeted files for this migration still exist at the anchored paths listed in Scope.", - "required_effort": "low" - }, - { - "done": false, - "effort_reason": "Cross-file internal cleanup with contract-preserving constraints needs careful sequencing.", - "file": "phase-2-internal-cleanup-behind-contracts.md", - "name": "internal-cleanup-behind-contracts", - "precondition": "- Phase 1 is complete. - `phase-1-contract-inventory.md` and its Phase 1 regression coverage are present. 
- Candidate edits remain inside random-targeted migration scope.", - "required_effort": "medium" - }, - { - "done": false, - "effort_reason": "This phase controls release-facing interface risk and human-review gating correctness, so mistakes can unblock unsafe automation.", - "file": "phase-3-interface-shift-review-gate.md", - "name": "interface-shift-review-gate", - "precondition": "- Phase 2 is complete. - At least one concrete interface behavior delta is documented with reproducible before/after behavior. - The migration has not already been marked `awaiting_human_review` for the same deltas.", - "required_effort": "high" - } - ], - "status": "ready", - "wake_up_on": null -} diff --git a/migrations/random-files-20260521T141425/phase-1-contract-regression-net.md b/migrations/random-files-20260521T141425/phase-1-contract-regression-net.md deleted file mode 100644 index 01bb2cb..0000000 --- a/migrations/random-files-20260521T141425/phase-1-contract-regression-net.md +++ /dev/null @@ -1,38 +0,0 @@ -# Phase 1: Contract Regression Net - -## Goal -Lock currently expected externally visible behavior for random-targeted surfaces before internal cleanup starts. - -## Scope -- Only tests under `tests/` that assert behavior of random-targeted user-facing surfaces (CLI behavior, repo-written artifacts, workflow outputs). -- Create/update one inventory artifact: `phase-1-contract-inventory.md`. -- Random-targeted surfaces in this migration are anchored to: - - `src/continuous_refactoring/__main__.py` - - `tests/test_main_entrypoint.py` - - `LICENSE` -- No production source edits beyond minimal changes strictly required to make missing behavior observable in tests. - -## Precondition -- Migration status is `ready` or `in-progress`, and this phase is the manifest `current_phase`. -- No earlier migration phase is incomplete. -- The random-targeted files for this migration still exist at the anchored paths listed in Scope. - -## Implementation Instructions -1. 
Create or update `phase-1-contract-inventory.md` with explicit contract bullets: surface, expected behavior, and asserting test location. -2. Add or tighten outcome-based regression tests for every listed contract. -3. Prefer real collaborators and existing fixtures; avoid interaction-level mocks unless boundary isolation is necessary. -4. Keep assertions strict enough to detect interface drift in scoped observable outcomes. - -## Validation Steps -1. Run focused tests that cover the listed contracts. -2. For each listed contract, intentionally break the behavior and verify the corresponding test fails. -3. Run the configured full validation command. - -## Definition of Done -- `phase-1-contract-inventory.md` exists and maps each scoped contract to concrete regression coverage. -- Every contract listed in that inventory has passing outcome-based regression coverage. -- Execution evidence shows each listed contract test fails when its protected behavior is intentionally broken. -- The full configured validation command passes. - -required_effort: low -effort_reason: Bounded, test-first contract capture with minimal production churn. diff --git a/migrations/random-files-20260521T141425/phase-2-internal-cleanup-behind-contracts.md b/migrations/random-files-20260521T141425/phase-2-internal-cleanup-behind-contracts.md deleted file mode 100644 index 364c162..0000000 --- a/migrations/random-files-20260521T141425/phase-2-internal-cleanup-behind-contracts.md +++ /dev/null @@ -1,36 +0,0 @@ -# Phase 2: Internal Cleanup Behind Contracts - -## Goal -Simplify random-targeted internals and remove dead/redundant paths while preserving Phase 1 locked behavior. - -## Scope -- Only random-targeted source files selected for this migration. -- Source cleanup scope is anchored to `src/continuous_refactoring/__main__.py`. -- `tests/test_main_entrypoint.py` and `LICENSE` are contract-validation surfaces, not internal cleanup targets. 
-- Internal readability improvements, dead-path deletion, and control-flow simplification. -- No intentional change to released interfaces (CLI behavior, repo-written files, XDG/project state, migration manifest structure, or other install-visible contracts). - -## Precondition -- Phase 1 is complete. -- `phase-1-contract-inventory.md` and its Phase 1 regression coverage are present. -- Candidate edits remain inside the anchored migration scope defined in Phase 1. - -## Implementation Instructions -1. Remove dead branches/helpers/fallback paths that are unnecessary under current contracts. -2. Translate/wrap errors only at module boundaries; preserve signal when bubbling within a module. -3. Introduce small abstractions only when they reduce repetition or branch complexity. -4. If an interface behavior change is required, isolate and document that delta for Phase 3 instead of blending it into broad cleanup. - -## Validation Steps -1. Run targeted tests covering cleaned paths, including all Phase 1 contract tests. -2. Confirm surviving behavior-level tests still cover externally observable effects previously provided by removed helpers/symbols. -3. Run the configured full validation command. - -## Definition of Done -- Scoped internal dead/redundant paths are removed or simplified without regressing locked behavior. -- Surviving behavior-level tests cover externally observable effects previously provided by removed helpers/symbols. -- Any intentional interface delta is explicitly documented for Phase 3. -- The full configured validation command passes. - -required_effort: medium -effort_reason: Cross-file internal cleanup with contract-preserving constraints needs careful sequencing. 
diff --git a/migrations/random-files-20260521T141425/phase-3-interface-shift-review-gate.md b/migrations/random-files-20260521T141425/phase-3-interface-shift-review-gate.md deleted file mode 100644 index 036a9e3..0000000 --- a/migrations/random-files-20260521T141425/phase-3-interface-shift-review-gate.md +++ /dev/null @@ -1,34 +0,0 @@ -# Phase 3: Interface-Shift Review Gate - -## Goal -Gate any discovered interface behavior change behind explicit, interface-specific human review before automation proceeds. - -## Scope -- Runs only if Phase 2 identifies at least one interface behavior delta. -- Documentation and migration state updates needed to communicate and enforce review gating. -- No unrelated cleanup or broad production refactor work. - -## Precondition -- Phase 2 is complete. -- At least one concrete interface behavior delta is documented with reproducible before/after behavior. -- The migration has not already been marked `awaiting_human_review` for the same deltas. - -## Implementation Instructions -1. Document each interface shift with concrete before/after behavior and user/install impact. -2. Ensure review messaging names the exact interface contract change; avoid generic "needs review" language. -3. Set and verify `awaiting_human_review` so automation remains paused until canonical migration review approval. -4. Keep this phase focused on gating correctness and communication quality. - -## Validation Steps -1. Verify review artifacts clearly and specifically describe each interface delta and impact. -2. Verify migration gating state is active and accurately tied to the documented deltas. -3. Run the configured full validation command after artifact/state updates. - -## Definition of Done -- All interface behavior changes discovered in this migration are documented with concrete impact statements. -- Human-review gating is active, explicit, and tied to the named interface deltas. -- Review messaging is interface-specific and non-generic. 
-- The full configured validation command passes. - -required_effort: high -effort_reason: This phase controls release-facing interface risk and human-review gating correctness, so mistakes can unblock unsafe automation. diff --git a/migrations/random-files-20260521T141425/plan.md b/migrations/random-files-20260521T141425/plan.md deleted file mode 100644 index 44cbaff..0000000 --- a/migrations/random-files-20260521T141425/plan.md +++ /dev/null @@ -1,49 +0,0 @@ -# Migration Plan: behavior-first-random-file-stabilization - -## Objective -Lock externally visible behavior for the random-file target first, then refactor internals behind that contract, and gate any interface shift behind explicit human review. - -## Phases -1. Phase 1 - Contract Regression Net -2. Phase 2 - Internal Cleanup Behind Contracts -3. Phase 3 - Interface-Shift Review Gate (conditional) - -## Random-Target File Set (Verified) -- `src/continuous_refactoring/__main__.py` -- `tests/test_main_entrypoint.py` -- `LICENSE` - -## Dependencies -- Phase 1 has no phase dependency. -- Phase 2 depends on Phase 1 completion. -- Phase 3 depends on Phase 2 completion and runs only if Phase 2 surfaces an interface behavior delta. - -## Dependency Graph -```mermaid -graph TD - P1[Phase 1: Contract Regression Net] --> P2[Phase 2: Internal Cleanup Behind Contracts] - P2 --> P3[Phase 3: Interface-Shift Review Gate (Conditional)] -``` - -## Phase Documents -- [phase-1-contract-regression-net.md](phase-1-contract-regression-net.md) -- [phase-2-internal-cleanup-behind-contracts.md](phase-2-internal-cleanup-behind-contracts.md) -- [phase-3-interface-shift-review-gate.md](phase-3-interface-shift-review-gate.md) - -## Validation Strategy -- The harness enforces configured validation before refactoring and after each completed phase. -- Each phase adds independent checks: - - Phase 1: defines a concrete contract inventory and proves each listed contract has outcome-based regression coverage. 
- - Phase 2: proves internal deletions/simplifications preserve the locked contracts while staying within random-target scope. - - Phase 3: proves interface-delta documentation and `awaiting_human_review` gating are explicit, correct, and actionable. -- A phase counts as complete only when its Definition of Done is satisfied and the configured validation command passes. - -## Risk-Reduction Ordering -- Phase 1 reduces ambiguity by locking behavior before any cleanup. -- Phase 2 captures most refactor value while constrained by Phase 1 protections. -- Phase 3 isolates high-risk interface changes into explicit review gating instead of mixing that risk into general cleanup. - -## Out of Scope -- Structural refactors outside the random-targeted file set. -- Speculative interface redesign unrelated to discovered deltas. -- Release/version process changes. diff --git a/tests/test_main_entrypoint.py b/tests/test_main_entrypoint.py index e19c998..f6cc76b 100644 --- a/tests/test_main_entrypoint.py +++ b/tests/test_main_entrypoint.py @@ -6,16 +6,17 @@ import continuous_refactoring.__main__ as main_module -def test_main_module_invokes_cli_main(monkeypatch: object) -> None: - seen: list[str] = [] +def test_main_module_invokes_cli_main_once(monkeypatch: object) -> None: + calls: list[str] = [] def fake_cli_main() -> None: - seen.append("called") + calls.append("called") monkeypatch.setattr("continuous_refactoring.cli.cli_main", fake_cli_main) monkeypatch.delitem(sys.modules, "continuous_refactoring.__main__", raising=False) runpy.run_module("continuous_refactoring", run_name="__main__") - assert seen == ["called"] + + assert calls == ["called"] def test_main_module_has_no_public_api() -> None: From 5c8c8390c4598ddeb33240d0fd7c7dc148ab7b99 Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Sun, 24 May 2026 00:21:14 -0700 Subject: [PATCH 26/41] continuous refactor: src/continuous_refactoring/migration_cli.py Why: Removes repeated CLI boundary logic in review/refine paths, 
reducing maintenance risk while preserving exact error and exit semantics. Validation: uv run pytest --- src/continuous_refactoring/migration_cli.py | 37 +++++++++++---------- 1 file changed, 19 insertions(+), 18 deletions(-) diff --git a/src/continuous_refactoring/migration_cli.py b/src/continuous_refactoring/migration_cli.py index 000cb79..51870b0 100644 --- a/src/continuous_refactoring/migration_cli.py +++ b/src/continuous_refactoring/migration_cli.py @@ -137,15 +137,7 @@ def handle_migration_doctor(args: argparse.Namespace) -> None: def handle_migration_review(args: argparse.Namespace) -> None: context = _resolve_context(error_code=2) - try: - target = resolve_migration_target( - live_dir=context.live_dir, - repo_root=context.repo_root, - value=args.target, - ) - except ContinuousRefactorError as error: - print(f"Error: {error}", file=sys.stderr) - raise SystemExit(2) from error + target = _resolve_target_or_exit(context=context, value=args.target, error_code=2) from continuous_refactoring.config import load_taste from continuous_refactoring.review_cli import ( @@ -175,15 +167,7 @@ def handle_migration_review(args: argparse.Namespace) -> None: def handle_migration_refine(args: argparse.Namespace) -> None: context = _resolve_context(error_code=2) feedback_text, feedback_source = _read_refine_feedback(args) - try: - target = resolve_migration_target( - live_dir=context.live_dir, - repo_root=context.repo_root, - value=args.target, - ) - except ContinuousRefactorError as error: - print(f"Error: {error}", file=sys.stderr) - raise SystemExit(2) from error + target = _resolve_target_or_exit(context=context, value=args.target, error_code=2) from continuous_refactoring.config import load_taste from continuous_refactoring.log_mirroring import LogMirroring @@ -290,6 +274,23 @@ def _read_refine_feedback(args: argparse.Namespace) -> tuple[str, FeedbackSource return text, source +def _resolve_target_or_exit( + *, + context: MigrationCliContext, + value: str, + error_code: 
int, +) -> MigrationTarget: + try: + return resolve_migration_target( + live_dir=context.live_dir, + repo_root=context.repo_root, + value=value, + ) + except ContinuousRefactorError as error: + print(f"Error: {error}", file=sys.stderr) + raise SystemExit(error_code) from error + + def _refine_publish_error_message(reason: str, slug: str) -> str: if "stale base snapshot" not in reason: return reason From 1638d06c273fd312423d7c5c389bde06d86c0aa6 Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Sun, 24 May 2026 00:24:13 -0700 Subject: [PATCH 27/41] continuous refactor: tests/test_run_once.py Why: Reduces duplicated control-flow in a high-signal test module, making baseline-validation invariants easier to maintain and less error-prone. Validation: uv run pytest --- tests/test_run_once.py | 63 +++++++++++++++++++----------------------- 1 file changed, 28 insertions(+), 35 deletions(-) diff --git a/tests/test_run_once.py b/tests/test_run_once.py index 9968c39..ab2fb45 100644 --- a/tests/test_run_once.py +++ b/tests/test_run_once.py @@ -39,6 +39,30 @@ def _planning_record(decision: str) -> DecisionRecord: ) +def _baseline_passes_then_refactor_fails( + test_command: str, + repo_root: Path, + stdout_path: Path, + stderr_path: Path, + **kwargs: object, +) -> CommandCapture: + if _is_baseline_validation(stdout_path): + return noop_tests( + test_command, + repo_root, + stdout_path, + stderr_path, + **kwargs, + ) + return failing_tests( + test_command, + repo_root, + stdout_path, + stderr_path, + **kwargs, + ) + + @pytest.mark.parametrize( ("kwargs", "needles"), [ @@ -193,7 +217,7 @@ def committing_agent(**kwargs: object) -> CommandCapture: validation_calls = 0 - def baseline_passes_then_fails( + def count_and_fail_after_baseline( test_command: str, repo_root: Path, stdout_path: Path, @@ -202,15 +226,7 @@ def baseline_passes_then_fails( ) -> CommandCapture: nonlocal validation_calls validation_calls += 1 - if _is_baseline_validation(stdout_path): - return noop_tests( - 
test_command, - repo_root, - stdout_path, - stderr_path, - **kwargs, - ) - return failing_tests( + return _baseline_passes_then_refactor_fails( test_command, repo_root, stdout_path, @@ -220,7 +236,7 @@ def baseline_passes_then_fails( monkeypatch.setattr( "continuous_refactoring.loop.run_tests", - baseline_passes_then_fails, + count_and_fail_after_baseline, ) args = make_run_once_args(run_once_env) @@ -295,32 +311,9 @@ def counting_agent(**kwargs: object) -> CommandCapture: monkeypatch.setattr("continuous_refactoring.loop.maybe_run_agent", counting_agent) - def baseline_passes_then_fails( - test_command: str, - repo_root: Path, - stdout_path: Path, - stderr_path: Path, - **kwargs: object, - ) -> CommandCapture: - if _is_baseline_validation(stdout_path): - return noop_tests( - test_command, - repo_root, - stdout_path, - stderr_path, - **kwargs, - ) - return failing_tests( - test_command, - repo_root, - stdout_path, - stderr_path, - **kwargs, - ) - monkeypatch.setattr( "continuous_refactoring.loop.run_tests", - baseline_passes_then_fails, + _baseline_passes_then_refactor_fails, ) args = make_run_once_args(run_once_env) From 8d4033b3236cb797adc40ccfb12bb4067e3942f1 Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Sun, 24 May 2026 00:27:01 -0700 Subject: [PATCH 28/41] continuous refactor: src/continuous_refactoring/cli.py Why: This reduces duplication in load-bearing path checks, making move invariants clearer and less error-prone with zero behavior drift. 
Validation: uv run pytest --- src/continuous_refactoring/cli.py | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/src/continuous_refactoring/cli.py b/src/continuous_refactoring/cli.py index ec347c1..1d50caa 100644 --- a/src/continuous_refactoring/cli.py +++ b/src/continuous_refactoring/cli.py @@ -484,7 +484,9 @@ def _configure_repo_taste( if not current.exists(): ensure_taste_file(destination) return - if current.resolve() == destination.resolve(): + current_resolved = current.resolve() + destination_resolved = destination.resolve() + if current_resolved == destination_resolved: return if not current.is_file(): raise ContinuousRefactorError( @@ -514,15 +516,17 @@ def _configure_live_migrations_dir( if current is None or not current.exists(): destination.mkdir(parents=True, exist_ok=True) return + current_resolved = current.resolve() + destination_resolved = destination.resolve() if not current.is_dir(): raise ContinuousRefactorError( f"Configured live migrations path is not a directory: {current}" ) - if current.resolve() == destination.resolve(): + if current_resolved == destination_resolved: return if ( - destination.resolve().is_relative_to(current.resolve()) - or current.resolve().is_relative_to(destination.resolve()) + destination_resolved.is_relative_to(current_resolved) + or current_resolved.is_relative_to(destination_resolved) ): raise ContinuousRefactorError( "Live migrations directory cannot be moved into itself or one of " From c6923c080789bd6c8809587b650c315c9d4b8c0d Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Sun, 24 May 2026 00:29:56 -0700 Subject: [PATCH 29/41] continuous refactor: tests/test_scope_candidates.py Why: Makes candidate assembly truthful and less fragile by enforcing file-set uniqueness in one place, reducing future maintenance risk without behavior drift. 
Validation: uv run pytest --- .../scope_candidates.py | 23 ++++++++++++++----- tests/test_scope_candidates.py | 14 +++++++++++ 2 files changed, 31 insertions(+), 6 deletions(-) diff --git a/src/continuous_refactoring/scope_candidates.py b/src/continuous_refactoring/scope_candidates.py index b766c7b..dca2552 100644 --- a/src/continuous_refactoring/scope_candidates.py +++ b/src/continuous_refactoring/scope_candidates.py @@ -259,6 +259,15 @@ def _candidate_from_files( ) +def _append_candidate_if_unique( + candidates: list[ScopeCandidate], + candidate: ScopeCandidate, +) -> None: + if any(existing.files == candidate.files for existing in candidates): + return + candidates.append(candidate) + + def _record_support( support_lines: dict[str, list[str]], support_kinds: dict[str, list[_SupportKind]], @@ -406,10 +415,11 @@ def include_cross(same_dir: bool, support_kinds: tuple[_SupportKind, ...]) -> bo local_extras = tuple(local_ranked[: max_files - 1]) if local_extras: local_files = (seed_file, *local_extras) - candidates.append( + _append_candidate_if_unique( + candidates, _candidate_from_files( "local-cluster", seed_file, local_files, support.evidence, - ) + ), ) cross_ranked = _rank_paths( @@ -421,10 +431,11 @@ def include_cross(same_dir: bool, support_kinds: tuple[_SupportKind, ...]) -> bo cross_extras = tuple(cross_ranked[: max_files - 1]) if cross_extras: cross_files = (seed_file, *cross_extras) - cross_candidate = _candidate_from_files( - "cross-cluster", seed_file, cross_files, support.evidence, + _append_candidate_if_unique( + candidates, + _candidate_from_files( + "cross-cluster", seed_file, cross_files, support.evidence, + ), ) - if cross_candidate.files != candidates[-1].files: - candidates.append(cross_candidate) return tuple(candidates[:max_candidates]) diff --git a/tests/test_scope_candidates.py b/tests/test_scope_candidates.py index a531004..87ecd0b 100644 --- a/tests/test_scope_candidates.py +++ b/tests/test_scope_candidates.py @@ -210,3 +210,17 @@ def 
test_max_candidates_prunes_without_dropping_seed_candidate(tmp_path: Path) - "seed", "local-cluster", ] + + +def test_duplicate_candidate_file_sets_are_not_emitted(tmp_path: Path) -> None: + init_repo(tmp_path) + _write(tmp_path, "src/foo.py", "from .helpers import normalize\n") + _write(tmp_path, "src/helpers.py", "def normalize(value: str) -> str:\n return value\n") + _commit_all(tmp_path, "seed files") + + candidates = build_scope_candidates(_seed_target("src/foo.py"), tmp_path) + + assert [candidate.kind for candidate in candidates] == [ + "seed", + "local-cluster", + ] From 8a86bb27acccd8b6a5442e1a7aa40d57508262d3 Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Sun, 24 May 2026 00:32:56 -0700 Subject: [PATCH 30/41] continuous refactor: src/continuous_refactoring/refactor_attempts.py Why: Reduces indirection in attempt outcome construction, making retry/commit paths easier to read and maintain while keeping test-verified behavior intact. Validation: uv run pytest --- .../refactor_attempts.py | 39 ++----------------- 1 file changed, 3 insertions(+), 36 deletions(-) diff --git a/src/continuous_refactoring/refactor_attempts.py b/src/continuous_refactoring/refactor_attempts.py index fb4e428..8aa116a 100644 --- a/src/continuous_refactoring/refactor_attempts.py +++ b/src/continuous_refactoring/refactor_attempts.py @@ -103,39 +103,6 @@ def _retry_context(record: DecisionRecord) -> str: return "\n".join(lines) -def _decision_record( - *, - decision: str, - retry_recommendation: str, - target: str, - call_role: str, - phase_reached: str, - failure_kind: str, - summary: str, - next_retry_focus: str | None = None, - agent_last_message_path: Path | None = None, - agent_stdout_path: Path | None = None, - agent_stderr_path: Path | None = None, - tests_stdout_path: Path | None = None, - tests_stderr_path: Path | None = None, -) -> DecisionRecord: - return DecisionRecord( - decision=decision, - retry_recommendation=retry_recommendation, - target=target, - 
call_role=call_role, - phase_reached=phase_reached, - failure_kind=failure_kind, - summary=summary, - next_retry_focus=next_retry_focus, - agent_last_message_path=agent_last_message_path, - agent_stdout_path=agent_stdout_path, - agent_stderr_path=agent_stderr_path, - tests_stdout_path=tests_stdout_path, - tests_stderr_path=tests_stderr_path, - ) - - def _restore_and_retry( *, repo_root: Path, @@ -154,7 +121,7 @@ def _restore_and_retry( tests_stderr_path: Path | None = None, ) -> DecisionRecord: _reset_to_source_baseline(repo_root, head_before, preserved_workspace) - return _decision_record( + return DecisionRecord( decision="retry", retry_recommendation="same-target", target=target, @@ -439,7 +406,7 @@ def _run_refactor_attempt( agent_status.retry_recommendation or default_retry_recommendation(decision) ) - return _decision_record( + return DecisionRecord( decision=decision, retry_recommendation=retry_recommendation, target=target.description, @@ -476,7 +443,7 @@ def _run_refactor_attempt( phase="refactor", ) - return _decision_record( + return DecisionRecord( decision="commit", retry_recommendation="none", target=target.description, From 8cb13730a18ee269f03cadd6092b4f3506ed7592 Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Sun, 24 May 2026 00:34:20 -0700 Subject: [PATCH 31/41] continuous refactor: plan src-continuous-refactoring-prompts-py-20260524T003318 Why: planning.approaches accepted; next step: pick-best --- .../.planning/stages/approaches.stdout.md | 7 ++++ .../.planning/state.json | 25 +++++++++++++ .../approaches/composable-prompt-builders.md | 37 +++++++++++++++++++ .../approaches/prompt-surface-hardening.md | 34 +++++++++++++++++ .../approaches/split-prompts-by-domain.md | 37 +++++++++++++++++++ .../manifest.json | 12 ++++++ 6 files changed, 152 insertions(+) create mode 100644 migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/approaches.stdout.md create mode 100644 
migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/state.json create mode 100644 migrations/src-continuous-refactoring-prompts-py-20260524T003318/approaches/composable-prompt-builders.md create mode 100644 migrations/src-continuous-refactoring-prompts-py-20260524T003318/approaches/prompt-surface-hardening.md create mode 100644 migrations/src-continuous-refactoring-prompts-py-20260524T003318/approaches/split-prompts-by-domain.md create mode 100644 migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/approaches.stdout.md b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/approaches.stdout.md new file mode 100644 index 0000000..81b8404 --- /dev/null +++ b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/approaches.stdout.md @@ -0,0 +1,7 @@ +Created 3 approach files in the planning workspace: + +- `approaches/prompt-surface-hardening.md` +- `approaches/composable-prompt-builders.md` +- `approaches/split-prompts-by-domain.md` + +Each includes strategy, tradeoffs, phased plan with `required_effort`, and risk profile, with interface-change human-review triggers called out explicitly. 
diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/state.json b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/state.json new file mode 100644 index 0000000..fbbd66d --- /dev/null +++ b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/state.json @@ -0,0 +1,25 @@ +{ + "completed_steps": [ + { + "agent": "codex", + "completed_at": "2026-05-24T00:34:20.118-07:00", + "effort": "low", + "model": "gpt-5.3-codex", + "name": "approaches", + "outcome": "completed", + "outputs": { + "stdout": "migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/approaches.stdout.md" + } + } + ], + "feedback": [], + "final_decision": null, + "final_reason": null, + "next_step": "pick-best", + "review_findings": null, + "revision_base_step_counts": [], + "schema_version": 1, + "started_at": "2026-05-24T00:33:18.149-07:00", + "target": "src/continuous_refactoring/prompts.py", + "updated_at": "2026-05-24T00:34:20.118-07:00" +} diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/approaches/composable-prompt-builders.md b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/approaches/composable-prompt-builders.md new file mode 100644 index 0000000..40ed306 --- /dev/null +++ b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/approaches/composable-prompt-builders.md @@ -0,0 +1,37 @@ +# Approach: Composable Prompt Builders + +## Strategy +Refactor `prompts.py` into clearer domain-focused builder sections (refactor/run prompts, planning prompts, migration review prompts, taste prompts), using typed intermediate builders that compose sections deterministically. Keep public function signatures and exported constants stable. + +## What Changes +- Introduce explicit section-builder helpers per prompt family. 
+- Normalize shared context rendering (taste block, work-dir/live-dir constraints, retry context) through one composition path. +- Reduce ad-hoc string concatenation in top-level composition functions. +- Update tests to validate both contract text and cross-family consistency. + +## Tradeoffs +- Pros: Better readability and change locality; easier future prompt edits with less accidental divergence. +- Cons: Moderate churn in a high-touch module; requires careful parity checks across many callers/tests. + +## Estimated Phases +1. Prompt family map + invariants spec + - Scope: identify families, shared clauses, and must-not-change boundary behavior. + - required_effort: `medium` +2. Introduce composable builders behind existing APIs + - Scope: refactor internals while preserving `__all__` exports and call contracts. + - required_effort: `high` +3. Consistency unification + - Scope: route duplicated context sections through shared builder utilities; remove dead local formatting paths. + - required_effort: `medium` +4. Test adaptation + parity checks + - Scope: keep existing assertions green; add parity checks for critical prompts and status block structure. + - required_effort: `medium` +5. Full validation + migration notes + - Scope: run full suite and surface interface-impact statement for human review if wording changed. + - required_effort: `low` + +## Risk Profile +- Overall risk: Medium. +- Primary risks: subtle text-order changes impacting tests or downstream parsing behavior. +- Mitigations: golden/parity assertions for high-risk prompts; incremental refactor by family; keep public exports unchanged. +- Human-review triggers: any changed clause that could alter operator expectations in planning/review flows. 
diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/approaches/prompt-surface-hardening.md b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/approaches/prompt-surface-hardening.md new file mode 100644 index 0000000..a77d166 --- /dev/null +++ b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/approaches/prompt-surface-hardening.md @@ -0,0 +1,34 @@ +# Approach: Prompt Surface Hardening + +## Strategy +Keep `src/continuous_refactoring/prompts.py` as one module, but tighten its public contract and reduce regression risk by converting brittle literal checks into explicit, reusable output-contract anchors. Preserve all current CLI/runtime behavior and prompt semantics. + +## What Changes +- Add/normalize named prompt contract anchors (small constants/functions) for repeated required clauses. +- Replace duplicated inline phrase assembly with small local helpers where repetition is currently high. +- Keep all exported names stable unless dead exports are proven unused by repo-wide evidence. +- Expand `tests/test_prompts.py` around invariants that currently rely on scattered literal string checks. + +## Tradeoffs +- Pros: Lowest blast radius; fastest path to cleaner maintenance; strongest behavior preservation. +- Cons: Does not materially reduce module size or conceptual coupling; still string-template heavy. + +## Estimated Phases +1. Baseline + contract inventory + - Scope: map required clauses used by tests/callers; identify repetition and dead branches. + - required_effort: `low` +2. Internal prompt-contract consolidation + - Scope: extract repeated clauses to local helpers/constants without changing final rendered text contracts. + - required_effort: `medium` +3. Test hardening for prompt invariants + - Scope: add/adjust tests to validate stable contract points and taste injection expectations. + - required_effort: `medium` +4. 
Final verification + review note + - Scope: run targeted prompt tests and full pytest; document any intentional text-shape change for human review if present. + - required_effort: `low` + +## Risk Profile +- Overall risk: Low. +- Primary risks: accidental wording drift in prompts that breaks downstream parsing or brittle tests. +- Mitigations: keep contract strings centralized, preserve status-block delimiters verbatim, run full `uv run pytest` before completion. +- Human-review triggers: any intentional user-visible CLI prompt wording change beyond internal deduplication. diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/approaches/split-prompts-by-domain.md b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/approaches/split-prompts-by-domain.md new file mode 100644 index 0000000..d1f7db0 --- /dev/null +++ b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/approaches/split-prompts-by-domain.md @@ -0,0 +1,37 @@ +# Approach: Split Prompts by Domain Modules + +## Strategy +Split `src/continuous_refactoring/prompts.py` into domain-focused modules (for example: `prompts_refactor.py`, `prompts_planning.py`, `prompts_phase.py`, `prompts_taste.py`) and keep a thin package-level `prompts.py` facade that preserves the current import surface. + +## What Changes +- Move prompt constants/composers into domain modules with explicit `__all__`. +- Keep `continuous_refactoring.prompts` as compatibility boundary for current imports. +- Remove dead helpers discovered during split. +- Expand tests to cover both package-level exports and representative domain-level behaviors. + +## Tradeoffs +- Pros: Stronger module boundaries; easier ownership and focused edits; lower per-file cognitive load. +- Cons: Highest churn; touches many imports/tests; increases coordination risk with package export uniqueness and boundary rules. + +## Estimated Phases +1. 
Boundary design + export contract plan + - Scope: define module split and exact preserved public API from `continuous_refactoring.prompts`. + - required_effort: `high` +2. Mechanical extraction with compatibility facade + - Scope: move code into domain modules and re-export through existing boundary. + - required_effort: `xhigh` +3. Caller/test update pass + - Scope: update internals only where beneficial; keep external import behavior stable. + - required_effort: `high` +4. Cleanup + dead path deletion + - Scope: remove obsolete helpers and duplication revealed by split. + - required_effort: `medium` +5. Full verification + interface review checkpoint + - Scope: run full suite; explicitly call out any import/behavior change requiring human decision. + - required_effort: `medium` + +## Risk Profile +- Overall risk: High. +- Primary risks: accidental API/export drift, import cycle regressions, broad test fallout. +- Mitigations: strict facade compatibility, staged extraction, explicit export tests, package import uniqueness checks. +- Human-review triggers: any change to public import paths, prompt text contracts, or CLI-facing behavior. 
diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json new file mode 100644 index 0000000..ff88b2d --- /dev/null +++ b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json @@ -0,0 +1,12 @@ +{ + "awaiting_human_review": false, + "cooldown_until": null, + "created_at": "2026-05-24T00:33:18.149-07:00", + "current_phase": "", + "human_review_reason": null, + "last_touch": "2026-05-24T00:34:20.118-07:00", + "name": "src-continuous-refactoring-prompts-py-20260524T003318", + "phases": [], + "status": "planning", + "wake_up_on": null +} From 5590b208e041b3b4388b6921c270a915bd5d53a8 Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Sun, 24 May 2026 00:34:33 -0700 Subject: [PATCH 32/41] continuous refactor: planning/src-continuous-refactoring-prompts-py-20260524T003318/pick-best Why: planning.pick-best accepted; next step: expand --- .../.planning/stages/pick-best.stdout.md | 13 +++++++++++++ .../.planning/state.json | 15 +++++++++++++-- .../manifest.json | 2 +- 3 files changed, 27 insertions(+), 3 deletions(-) create mode 100644 migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/pick-best.stdout.md diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/pick-best.stdout.md b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/pick-best.stdout.md new file mode 100644 index 0000000..a19b1b8 --- /dev/null +++ b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/pick-best.stdout.md @@ -0,0 +1,13 @@ +**Chosen approach: `prompt-surface-hardening`** + +Best balance is clear: it’s the lowest-risk path that still improves maintainability and verifiability in a load-bearing prompt module. + +Why this wins: +1. **Risk control:** `prompts.py` is contract-sensitive (tests, parsing, CLI/operator expectations). 
This approach minimizes blast radius and keeps behavior stable. +2. **Incremental verifiability:** each phase has tight checkpoints (inventory → consolidate internals → harden invariants → full suite), so regressions show up early. +3. **Clarity without over-architecture:** it reduces repetition via small helpers/anchors, matching taste guidance to prefer small abstractions and avoid speculative restructuring. +4. **Interface safety:** it explicitly preserves exports and semantics unless intentionally changed and surfaced for review, aligned with your “human-review for released interfaces” rule. + +Runner-ups (short): +- `composable-prompt-builders`: good design direction, but more churn and parity risk than needed for this migration. +- `split-prompts-by-domain`: strongest structural cleanup, but high churn/high coordination risk; wrong move unless we’re intentionally paying that cost now. diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/state.json b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/state.json index fbbd66d..b558f9e 100644 --- a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/state.json +++ b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/state.json @@ -10,16 +10,27 @@ "outputs": { "stdout": "migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/approaches.stdout.md" } + }, + { + "agent": "codex", + "completed_at": "2026-05-24T00:34:33.019-07:00", + "effort": "low", + "model": "gpt-5.3-codex", + "name": "pick-best", + "outcome": "completed", + "outputs": { + "stdout": "migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/pick-best.stdout.md" + } } ], "feedback": [], "final_decision": null, "final_reason": null, - "next_step": "pick-best", + "next_step": "expand", "review_findings": null, "revision_base_step_counts": [], "schema_version": 1, "started_at": 
"2026-05-24T00:33:18.149-07:00", "target": "src/continuous_refactoring/prompts.py", - "updated_at": "2026-05-24T00:34:20.118-07:00" + "updated_at": "2026-05-24T00:34:33.019-07:00" } diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json index ff88b2d..f19915d 100644 --- a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json +++ b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json @@ -4,7 +4,7 @@ "created_at": "2026-05-24T00:33:18.149-07:00", "current_phase": "", "human_review_reason": null, - "last_touch": "2026-05-24T00:34:20.118-07:00", + "last_touch": "2026-05-24T00:34:33.018-07:00", "name": "src-continuous-refactoring-prompts-py-20260524T003318", "phases": [], "status": "planning", From 192c8c6603248b6a246b3f0980f8340f93ac20fd Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Sun, 24 May 2026 00:35:45 -0700 Subject: [PATCH 33/41] continuous refactor: planning/src-continuous-refactoring-prompts-py-20260524T003318/expand Why: planning.expand accepted; next step: review --- .../.planning/stages/expand.stdout.md | 14 +++++ .../.planning/state.json | 15 +++++- .../manifest.json | 39 ++++++++++++-- .../phase-1-contract-inventory.md | 35 +++++++++++++ .../phase-2-contract-consolidation.md | 35 +++++++++++++ .../phase-3-test-hardening.md | 35 +++++++++++++ ...se-4-final-verification-and-review-note.md | 32 ++++++++++++ .../plan.md | 51 +++++++++++++++++++ 8 files changed, 251 insertions(+), 5 deletions(-) create mode 100644 migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/expand.stdout.md create mode 100644 migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-1-contract-inventory.md create mode 100644 migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-2-contract-consolidation.md create mode 100644 
migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-3-test-hardening.md create mode 100644 migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-4-final-verification-and-review-note.md create mode 100644 migrations/src-continuous-refactoring-prompts-py-20260524T003318/plan.md diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/expand.stdout.md b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/expand.stdout.md new file mode 100644 index 0000000..96eaa43 --- /dev/null +++ b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/expand.stdout.md @@ -0,0 +1,14 @@ +Created the migration planning artifacts in the staged workspace: + +- `plan.md` +- `phase-1-contract-inventory.md` +- `phase-2-contract-consolidation.md` +- `phase-3-test-hardening.md` +- `phase-4-final-verification-and-review-note.md` + +What’s included: +- Numbered phases with strict ordering and risk-reduction flow. +- Explicit dependency graph and per-phase dependencies in `plan.md`. +- Each phase has clear scope, `## Precondition`, `## Definition of Done`, and validation steps. +- `required_effort`/`effort_reason` only on phases that need it (Phases 2 and 3, both `medium`). +- Preconditions are phase-local and do not restate baseline-green invariants. 
diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/state.json b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/state.json index b558f9e..ff494d7 100644 --- a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/state.json +++ b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/state.json @@ -21,16 +21,27 @@ "outputs": { "stdout": "migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/pick-best.stdout.md" } + }, + { + "agent": "codex", + "completed_at": "2026-05-24T00:35:45.784-07:00", + "effort": "low", + "model": "gpt-5.3-codex", + "name": "expand", + "outcome": "completed", + "outputs": { + "stdout": "migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/expand.stdout.md" + } } ], "feedback": [], "final_decision": null, "final_reason": null, - "next_step": "expand", + "next_step": "review", "review_findings": null, "revision_base_step_counts": [], "schema_version": 1, "started_at": "2026-05-24T00:33:18.149-07:00", "target": "src/continuous_refactoring/prompts.py", - "updated_at": "2026-05-24T00:34:33.019-07:00" + "updated_at": "2026-05-24T00:35:45.784-07:00" } diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json index f19915d..ef9132a 100644 --- a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json +++ b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json @@ -2,11 +2,44 @@ "awaiting_human_review": false, "cooldown_until": null, "created_at": "2026-05-24T00:33:18.149-07:00", - "current_phase": "", + "current_phase": "contract-inventory", "human_review_reason": null, - "last_touch": "2026-05-24T00:34:33.018-07:00", + "last_touch": "2026-05-24T00:35:45.783-07:00", "name": "src-continuous-refactoring-prompts-py-20260524T003318", - 
"phases": [], + "phases": [ + { + "done": false, + "effort_reason": null, + "file": "phase-1-contract-inventory.md", + "name": "contract-inventory", + "precondition": "- Migration status is active for this phase and no earlier phase is incomplete. - `src/continuous_refactoring/prompts.py` and `tests/test_prompts.py` are present and readable. - No concurrent edit is in progress on the same migration workspace.", + "required_effort": null + }, + { + "done": false, + "effort_reason": "Consolidation changes many string-assembly sites where subtle wording drift could break parsing and tests.", + "file": "phase-2-contract-consolidation.md", + "name": "contract-consolidation", + "precondition": "- Phase 1 contract inventory is complete and available. - The consolidation target clauses/helpers are identified with explicit must-preserve constraints. - Current public symbols and call sites that depend on prompt text shape are identified.", + "required_effort": "medium" + }, + { + "done": false, + "effort_reason": "Requires careful test-shape redesign to reduce brittleness without weakening contract checks.", + "file": "phase-3-test-hardening.md", + "name": "test-hardening", + "precondition": "- Phase 2 consolidation is complete and prompt outputs are stable. - Invariant anchors to assert are finalized from the contract inventory. - No unresolved intentional interface-change decisions remain for this migration.", + "required_effort": "medium" + }, + { + "done": false, + "effort_reason": null, + "file": "phase-4-final-verification-and-review-note.md", + "name": "final-verification-and-review-note", + "precondition": "- Phases 1\u20133 are complete. - Consolidated prompt logic and hardened tests are merged in this migration workspace. 
- Any potential interface-sensitive wording deltas are enumerated.", + "required_effort": null + } + ], "status": "planning", "wake_up_on": null } diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-1-contract-inventory.md b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-1-contract-inventory.md new file mode 100644 index 0000000..7e905be --- /dev/null +++ b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-1-contract-inventory.md @@ -0,0 +1,35 @@ +# Phase 1: Contract Inventory + +## Objective +Create a precise inventory of prompt contracts in `src/continuous_refactoring/prompts.py` so later consolidation work is constrained by explicit invariants. + +## Scope +- Inspect prompt builders/templates in `src/continuous_refactoring/prompts.py`. +- Map repeated required clauses and existing literal contract points. +- Identify test coverage gaps in `tests/test_prompts.py` relative to required invariants. +- No behavior changes. + +## Precondition +- Migration status is active for this phase and no earlier phase is incomplete. +- `src/continuous_refactoring/prompts.py` and `tests/test_prompts.py` are present and readable. +- No concurrent edit is in progress on the same migration workspace. + +## Implementation Notes +- Produce a short inventory artifact inside this phase document (or a linked note) listing: + - Required stable prompt sections (including `## Taste` requirements). + - Contract-sensitive delimiters/phrases used for parsing or downstream decisions. + - Repetition candidates safe for consolidation without semantic drift. + - Known risky areas where wording drift could break behavior/tests. +- Keep the inventory grounded in current code and tests, not speculative future shape. + +## Validation +- Confirm no code or behavior changed. 
+- Run targeted prompt tests if needed to confirm baseline understanding: + - `uv run pytest tests/test_prompts.py` + +## Definition of Done +- A concrete inventory of prompt contracts exists and is specific enough to drive Phase 2 edits. +- Repetition/consolidation candidates are identified with explicit “must-preserve” constraints. +- Any ambiguous contract points are called out for conservative handling in later phases. +- The configured validation command passes: + - `uv run pytest` diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-2-contract-consolidation.md b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-2-contract-consolidation.md new file mode 100644 index 0000000..9f75708 --- /dev/null +++ b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-2-contract-consolidation.md @@ -0,0 +1,35 @@ +# Phase 2: Contract Consolidation + +## Objective +Refactor internal prompt construction in `src/continuous_refactoring/prompts.py` to reduce duplication by introducing small local anchors/helpers while preserving rendered contract semantics. + +required_effort: medium +effort_reason: Consolidation changes many string-assembly sites where subtle wording drift could break parsing and tests. + +## Scope +- Add/normalize small local constants/helpers for repeated required clauses. +- Replace high-repetition inline phrase assembly with those helpers. +- Keep exports stable unless dead exports are proven unused and explicitly surfaced. +- Avoid module-split or architecture-level refactors. + +## Precondition +- Phase 1 contract inventory is complete and available. +- The consolidation target clauses/helpers are identified with explicit must-preserve constraints. +- Current public symbols and call sites that depend on prompt text shape are identified. + +## Implementation Notes +- Apply minimal, localized edits in `src/continuous_refactoring/prompts.py`. 
+- Preserve load-bearing delimiters/headers and required sections verbatim unless an intentional change is documented. +- Favor readability and short helper abstractions; avoid abstraction layers that obscure prompt outputs. + +## Validation +- Run prompt-focused tests: + - `uv run pytest tests/test_prompts.py` +- If any related tests exist for prompt parsing/decisions, run them as targeted smoke checks. + +## Definition of Done +- Repeated contract clauses are consolidated into clear local helpers/constants. +- Rendered prompt outputs preserve required semantics from Phase 1 inventory. +- Public/exported behavior remains unchanged unless explicitly documented for review. +- The configured validation command passes: + - `uv run pytest` diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-3-test-hardening.md b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-3-test-hardening.md new file mode 100644 index 0000000..e3d0f46 --- /dev/null +++ b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-3-test-hardening.md @@ -0,0 +1,35 @@ +# Phase 3: Test Hardening + +## Objective +Strengthen `tests/test_prompts.py` so core prompt invariants are explicit and resilient to safe refactors while still detecting contract drift. + +required_effort: medium +effort_reason: Requires careful test-shape redesign to reduce brittleness without weakening contract checks. + +## Scope +- Add/adjust tests for required invariant anchors identified in Phase 1. +- Ensure `## Taste` injection coverage remains explicit across required prompt templates. +- Replace scattered brittle literal checks where appropriate with focused invariant assertions. +- Keep tests outcome-focused; avoid mock-heavy tests unless boundary isolation is necessary. + +## Precondition +- Phase 2 consolidation is complete and prompt outputs are stable. +- Invariant anchors to assert are finalized from the contract inventory. 
+- No unresolved intentional interface-change decisions remain for this migration. + +## Implementation Notes +- Prefer assertions on stable contract points (section headers, delimiters, mandatory clauses). +- Keep tests readable and maintainable; avoid overfitting to incidental formatting. +- Preserve or improve failure clarity so regressions identify which contract was broken. + +## Validation +- Run prompt tests first: + - `uv run pytest tests/test_prompts.py` +- Run any additional directly impacted tests. + +## Definition of Done +- Tests clearly encode the required prompt invariants and catch meaningful contract regressions. +- Brittle/duplicative literal checks are reduced where they do not represent true contracts. +- Taste injection requirements remain enforced by tests. +- The configured validation command passes: + - `uv run pytest` diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-4-final-verification-and-review-note.md b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-4-final-verification-and-review-note.md new file mode 100644 index 0000000..bae8739 --- /dev/null +++ b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-4-final-verification-and-review-note.md @@ -0,0 +1,32 @@ +# Phase 4: Final Verification and Review Note + +## Objective +Perform final verification and produce a concise review note confirming contract preservation or clearly surfacing any intentional interface-level wording change. + +## Scope +- Final pass on changed prompt/test files for readability and contract consistency. +- Execute full validation. +- Record review note for human review if any user-visible/interface-sensitive text shape changed intentionally. + +## Precondition +- Phases 1–3 are complete. +- Consolidated prompt logic and hardened tests are merged in this migration workspace. +- Any potential interface-sensitive wording deltas are enumerated. 
+ +## Implementation Notes +- Validate against Phase 1 inventory and Phase 3 assertions. +- If no interface change occurred, explicitly state that in the review note. +- If interface change occurred, name exact behavior/text-shape deltas and rationale. + +## Validation +- Run full suite: + - `uv run pytest` + +## Definition of Done +- Final verification confirms the repository remains shippable after this migration. +- A review note exists that either: + - confirms no intentional interface-level prompt contract change, or + - explicitly documents each intentional interface-level change for human review. +- No open migration-phase blockers remain. +- The configured validation command passes: + - `uv run pytest` diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/plan.md b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/plan.md new file mode 100644 index 0000000..e7df86a --- /dev/null +++ b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/plan.md @@ -0,0 +1,51 @@ +# Migration Plan: Prompt Surface Hardening (`src/continuous_refactoring/prompts.py`) + +## Goal +Harden the prompt surface in `src/continuous_refactoring/prompts.py` by consolidating repeated contract clauses and strengthening invariant tests, while preserving runtime/CLI behavior and package interface expectations. + +## Scope +- In scope: + - `src/continuous_refactoring/prompts.py` + - `tests/test_prompts.py` + - Related tests only when needed to prove unchanged behavior +- Out of scope: + - Splitting `prompts.py` into multiple modules + - New runtime dependencies + - Behavioral changes to CLI/system interfaces unless intentionally surfaced for review + +## Phase Plan +1. [phase-1-contract-inventory.md](phase-1-contract-inventory.md) +2. [phase-2-contract-consolidation.md](phase-2-contract-consolidation.md) +3. [phase-3-test-hardening.md](phase-3-test-hardening.md) +4. 
[phase-4-final-verification-and-review-note.md](phase-4-final-verification-and-review-note.md) + +## Dependency Graph +```mermaid +graph TD + P1[Phase 1: Contract Inventory] --> P2[Phase 2: Contract Consolidation] + P2 --> P3[Phase 3: Test Hardening] + P3 --> P4[Phase 4: Final Verification + Review Note] +``` + +## Phase Dependencies +- Phase 1 has no migration-phase blockers. +- Phase 2 depends on completed Phase 1 inventory outputs. +- Phase 3 depends on consolidated prompt anchors/helpers from Phase 2. +- Phase 4 depends on completed Phase 3 tests and finalized text-shape verification. + +## Validation Strategy +- Per-phase validation (independent checks): + - Each phase specifies its own targeted validation commands and evidence. + - Each phase must leave the repo in a shippable state. +- Global validation gates: + - The configured validation command (`uv run pytest`) must pass for every completed phase. +- Contract-safety checks: + - Preserve required prompt sections and status-block formatting semantics. + - Preserve `## Taste` injection behavior across all prompt templates covered by `tests/test_prompts.py`. + - Preserve package/API expectations (no unreviewed export/behavior changes). + +## Risk Controls +- Keep changes incremental and small per phase. +- Prefer local helpers/constants over structural rewrites. +- Treat any intentional user-visible prompt wording change as human-review material and document it in Phase 4. +- If any contract cannot be preserved exactly, stop and surface the specific interface delta before proceeding. 
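The linear phase dependencies described above imply that the next runnable phase is the first one not yet marked done. The following is a minimal sketch of that rule, assuming the manifest is loaded as a plain dict with the `phases`, `name`, `done`, and `current_phase` fields that appear in `manifest.json`. `next_pending_phase` is a hypothetical helper for illustration, not part of the driver.

```python
from __future__ import annotations

from typing import Any


def next_pending_phase(manifest: dict[str, Any]) -> str | None:
    """Return the name of the first phase not marked done, or None if all are done.

    Hypothetical helper: assumes phases are listed in dependency order,
    as in the linear P1 -> P2 -> P3 -> P4 chain of this plan.
    """
    for phase in manifest.get("phases", []):
        if not phase.get("done", False):
            return phase["name"]
    return None


# Illustrative manifest shaped like the one in this migration (fields trimmed).
manifest = {
    "current_phase": "contract-consolidation",
    "phases": [
        {"name": "contract-inventory", "done": True},
        {"name": "contract-consolidation", "done": False},
        {"name": "test-hardening", "done": False},
        {"name": "final-verification-and-review-note", "done": False},
    ],
}

# In a consistent manifest, current_phase matches the first unfinished phase.
assert next_pending_phase(manifest) == manifest["current_phase"]
```

A check like this only holds for a strictly linear plan; a branching dependency graph would need an explicit ordering rather than list position.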
From 3a055191a0e3758caaf7bc2c555906b289afdabf Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Sun, 24 May 2026 00:36:25 -0700 Subject: [PATCH 34/41] continuous refactor: planning/src-continuous-refactoring-prompts-py-20260524T003318/review Why: planning.review accepted; next step: final-review --- .../.planning/stages/review.stdout.md | 1 + .../.planning/state.json | 15 +++++++++++++-- .../manifest.json | 2 +- 3 files changed, 15 insertions(+), 3 deletions(-) create mode 100644 migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/review.stdout.md diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/review.stdout.md b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/review.stdout.md new file mode 100644 index 0000000..9e06085 --- /dev/null +++ b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/review.stdout.md @@ -0,0 +1 @@ +1. no findings. diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/state.json b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/state.json index ff494d7..06c73f3 100644 --- a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/state.json +++ b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/state.json @@ -32,16 +32,27 @@ "outputs": { "stdout": "migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/expand.stdout.md" } + }, + { + "agent": "codex", + "completed_at": "2026-05-24T00:36:25.825-07:00", + "effort": "low", + "model": "gpt-5.3-codex", + "name": "review", + "outcome": "clear", + "outputs": { + "stdout": "migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/review.stdout.md" + } } ], "feedback": [], "final_decision": null, "final_reason": null, - "next_step": "review", + "next_step": "final-review", "review_findings": null, 
"revision_base_step_counts": [], "schema_version": 1, "started_at": "2026-05-24T00:33:18.149-07:00", "target": "src/continuous_refactoring/prompts.py", - "updated_at": "2026-05-24T00:35:45.784-07:00" + "updated_at": "2026-05-24T00:36:25.825-07:00" } diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json index ef9132a..b91def1 100644 --- a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json +++ b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json @@ -4,7 +4,7 @@ "created_at": "2026-05-24T00:33:18.149-07:00", "current_phase": "contract-inventory", "human_review_reason": null, - "last_touch": "2026-05-24T00:35:45.783-07:00", + "last_touch": "2026-05-24T00:36:25.824-07:00", "name": "src-continuous-refactoring-prompts-py-20260524T003318", "phases": [ { From 1b3dff5d2dfda68962dbeddddc54881be5f6d214 Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Sun, 24 May 2026 00:37:03 -0700 Subject: [PATCH 35/41] continuous refactor: planning/src-continuous-refactoring-prompts-py-20260524T003318/final-review Why: phased, low-blast-radius plan with correct effort tiers and no invalid precondition coupling to baseline/validation evidence --- .../.planning/stages/final-review.stdout.md | 11 +++++++++++ .../.planning/state.json | 19 +++++++++++++++---- .../manifest.json | 4 ++-- 3 files changed, 28 insertions(+), 6 deletions(-) create mode 100644 migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/final-review.stdout.md diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/final-review.stdout.md b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/final-review.stdout.md new file mode 100644 index 0000000..66d02d0 --- /dev/null +++ 
b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/final-review.stdout.md @@ -0,0 +1,11 @@ +Assessment: +- Safe for automatic execution: Yes, with guardrails already encoded. The phases are incremental, dependencies are linear, and risky wording drift is explicitly constrained and tested. +- Human judgment checkpoints: Conditional only. The plan correctly routes intentional interface-sensitive wording changes to human review in Phase 4; otherwise it can run unattended. +- Fundamental flaws: None found. + +Gate checks: +- Effort labels are lowest safe tiers: `low` (inventory/final verify), `medium` (consolidation/test hardening) are appropriate; no phase is over-labeled. +- Preconditions are valid: none require baseline-green or fresh validation evidence as a precondition; validation is in Definition of Done where it belongs. +- Automation safety: strong, because contract inventory precedes edits, tests are hardened before final verification, and full `uv run pytest` is required per phase completion. 
+ +final-decision: approve-auto — phased, low-blast-radius plan with correct effort tiers and no invalid precondition coupling to baseline/validation evidence diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/state.json b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/state.json index 06c73f3..5293eb5 100644 --- a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/state.json +++ b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/state.json @@ -43,16 +43,27 @@ "outputs": { "stdout": "migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/review.stdout.md" } + }, + { + "agent": "codex", + "completed_at": "2026-05-24T00:37:03.286-07:00", + "effort": "low", + "model": "gpt-5.3-codex", + "name": "final-review", + "outcome": "approve-auto", + "outputs": { + "stdout": "migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/final-review.stdout.md" + } } ], "feedback": [], - "final_decision": null, - "final_reason": null, - "next_step": "final-review", + "final_decision": "approve-auto", + "final_reason": "phased, low-blast-radius plan with correct effort tiers and no invalid precondition coupling to baseline/validation evidence", + "next_step": "terminal-ready", "review_findings": null, "revision_base_step_counts": [], "schema_version": 1, "started_at": "2026-05-24T00:33:18.149-07:00", "target": "src/continuous_refactoring/prompts.py", - "updated_at": "2026-05-24T00:36:25.825-07:00" + "updated_at": "2026-05-24T00:37:03.286-07:00" } diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json index b91def1..cb31227 100644 --- a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json +++ b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json @@ -4,7 
+4,7 @@ "created_at": "2026-05-24T00:33:18.149-07:00", "current_phase": "contract-inventory", "human_review_reason": null, - "last_touch": "2026-05-24T00:36:25.824-07:00", + "last_touch": "2026-05-24T00:37:03.288-07:00", "name": "src-continuous-refactoring-prompts-py-20260524T003318", "phases": [ { @@ -40,6 +40,6 @@ "required_effort": null } ], - "status": "planning", + "status": "ready", "wake_up_on": null } From 5c2f06dd68cf901eeecd65e76b77c4dbc99445c7 Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Sun, 24 May 2026 00:39:55 -0700 Subject: [PATCH 36/41] continuous refactor: migration/src-continuous-refactoring-prompts-py-20260524T003318/phase-1-contract-inventory.md Why: Captures load-bearing prompt/test contracts up front so Phase 2 consolidation can reduce duplication without semantic drift or parser breakage. Validation: uv run pytest --- .../manifest.json | 6 +-- .../phase-1-contract-inventory.md | 54 +++++++++++++++++++ 2 files changed, 57 insertions(+), 3 deletions(-) diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json index cb31227..58c0b9f 100644 --- a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json +++ b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json @@ -2,13 +2,13 @@ "awaiting_human_review": false, "cooldown_until": null, "created_at": "2026-05-24T00:33:18.149-07:00", - "current_phase": "contract-inventory", + "current_phase": "contract-consolidation", "human_review_reason": null, - "last_touch": "2026-05-24T00:37:03.288-07:00", + "last_touch": "2026-05-24T00:39:55.355-07:00", "name": "src-continuous-refactoring-prompts-py-20260524T003318", "phases": [ { - "done": false, + "done": true, "effort_reason": null, "file": "phase-1-contract-inventory.md", "name": "contract-inventory", diff --git 
a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-1-contract-inventory.md b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-1-contract-inventory.md index 7e905be..49474c7 100644 --- a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-1-contract-inventory.md +++ b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-1-contract-inventory.md @@ -33,3 +33,57 @@ Create a precise inventory of prompt contracts in `src/continuous_refactoring/pr - Any ambiguous contract points are called out for conservative handling in later phases. - The configured validation command passes: - `uv run pytest` + +## Contract Inventory (Phase 1 Artifact) + +### Stable sections that must be preserved +- `compose_full_prompt()` output section shape and order: + - `Attempt {n}` + - base prompt body + - `REQUIRED_PREAMBLE` literal + - `## Refactoring Taste` + - optional `## Target Files` + - optional `## Scope` + - `## Validation` + - optional retry-context triple (`## Retry Context`, retry body, warning line) + - optional fix amendment appended at end +- Taste-injected prompt constants must include both terms checked by tests: + - contains `"taste"` (case-insensitive) + - contains `"injected by the caller"` +- Planning expand/review prompts must preserve `## Precondition` vs `## Definition of Done` terminology split and explicit anti-conflation language. +- `DEFAULT_REFACTORING_PROMPT` and `PHASE_EXECUTION_PROMPT` must embed the status block delimiters. 
+ +### Contract-sensitive delimiters and phrases +- Status block delimiters are strict literals: + - `BEGIN_CONTINUOUS_REFACTORING_STATUS` + - `END_CONTINUOUS_REFACTORING_STATUS` +- Output-contract literals used by downstream parsing/tests: + - classifier: `decision: cohesive-cleanup`, `decision: needs-plan` + - final review: `final-decision: approve-auto|approve-needs-human|reject` + - ready check: `ready: yes|no|unverifiable` +- Driver-ownership clause in default refactor prompt: + - contains `Do not create git commits yourself.` +- Phase-ready wording guardrails (fresh evidence is not a human-review blocker) are assertion-backed and must remain semantically intact. + +### Repetition candidates safe for consolidation (with must-preserve constraints) +- Repeated `"Refactoring taste is injected by the caller..."` lines across classifier/planning/phase/review prompts. + - Must preserve exact `"injected by the caller"` phrase where currently tested. +- Repeated work-dir/live-dir mutation guard lines in planning/review composed prompts. + - Must preserve: + - `Writable target: work dir only.` + - `Do not mutate the live migration directory.` +- Repeated status block contract text in refactoring/phase-execution prompts. + - Must preserve all field names and begin/end markers. +- Repeated planning artifact path references (`.planning/state.json`, `.planning/stages/`, approaches path forms). + - Safe to centralize as constants/helpers if resulting rendered text is unchanged. + +### Known risky areas for wording drift +- Contract lines with finite allowed tokens (`decision:*`, `ready:*`, `final-decision:*`) are parser/test sensitive. +- Taste mention tests are substring-based; removing `"injected by the caller"` from any `_TASTE_INJECTED_PROMPTS` member will fail tests. +- `compose_full_prompt()` section ordering is not fully snapshot-tested but is behavior-significant for readability and downstream operator expectations; treat as must-preserve in consolidation. 
+- `_CONTINUOUS_REFACTORING_STATUS_BLOCK` embedded guidance fields are partially assertion-backed (`commit_rationale`, `why the refactor`) and operationally load-bearing. + +### Coverage gaps to handle conservatively in Phase 2 +- No test asserts `compose_full_prompt()` section ordering or exact spacing/newline layout. +- No test asserts exact text for many untested prompt constants (e.g., interview/refine/upgrade templates) beyond indirect behavior; avoid semantic rewrites there during consolidation. +- No test directly locks helper-level behavior (`_join_sections`, `_first_scope`, `_retry_context_sections`); preserve behavior when extracting shared builders. From 50f2b209aeee6ff061b9aba82a459e62ed39c0f7 Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Sun, 24 May 2026 00:43:13 -0700 Subject: [PATCH 37/41] continuous refactor: migration/src-continuous-refactoring-prompts-py-20260524T003318/phase-2-contract-consolidation.md Why: Reduces duplication in prompt assembly while preserving contract-critical wording and keeping future edits safer. 
Validation: uv run pytest --- .../manifest.json | 6 +-- src/continuous_refactoring/prompts.py | 52 ++++++++++++------- 2 files changed, 36 insertions(+), 22 deletions(-) diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json index 58c0b9f..a8cc3bd 100644 --- a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json +++ b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json @@ -2,9 +2,9 @@ "awaiting_human_review": false, "cooldown_until": null, "created_at": "2026-05-24T00:33:18.149-07:00", - "current_phase": "contract-consolidation", + "current_phase": "test-hardening", "human_review_reason": null, - "last_touch": "2026-05-24T00:39:55.355-07:00", + "last_touch": "2026-05-24T00:43:13.180-07:00", "name": "src-continuous-refactoring-prompts-py-20260524T003318", "phases": [ { @@ -16,7 +16,7 @@ "required_effort": null }, { - "done": false, + "done": true, "effort_reason": "Consolidation changes many string-assembly sites where subtle wording drift could break parsing and tests.", "file": "phase-2-contract-consolidation.md", "name": "contract-consolidation", diff --git a/src/continuous_refactoring/prompts.py b/src/continuous_refactoring/prompts.py index 8152064..5cd4287 100644 --- a/src/continuous_refactoring/prompts.py +++ b/src/continuous_refactoring/prompts.py @@ -57,6 +57,18 @@ def _join_sections(*sections: str | None) -> str: return "\n\n".join(section for section in sections if section) +def _taste_section(taste: str) -> str: + return f"## Taste\n{taste}" + + +def _validation_section(validation_command: str) -> str: + return f"## Validation\nRun: `{validation_command}`" + + +def _with_retry_context(sections: list[str], retry_context: str | None) -> list[str]: + return [*sections, *_retry_context_sections(retry_context)] + + def _format_target_files(files: tuple[str, ...]) -> str | None: file_lines: 
list[str] = [] for file_path in files: @@ -446,16 +458,15 @@ def compose_full_prompt( retry_context: str | None = None, fix_amendment: str | None = None, ) -> str: - sections: list[str] = [ + sections: list[str] = _with_retry_context([ f"Attempt {attempt}", base_prompt, REQUIRED_PREAMBLE, - f"## Refactoring Taste\n{taste}", + "## Refactoring Taste\n" + taste, _format_target_files(target.files), _first_scope(target.scoping, scope_instruction), - f"## Validation\nRun: `{validation_command}`", - *_retry_context_sections(retry_context), - ] + _validation_section(validation_command), + ], retry_context) if fix_amendment: sections.append(fix_amendment) return _join_sections(*sections) @@ -774,7 +785,7 @@ def compose_classifier_prompt(target: Target, taste: str) -> str: f"## Target\n{target.description}", _format_target_files(target.files), _first_scope(target.scoping), - f"## Taste\n{taste}", + _taste_section(taste), ) @@ -789,7 +800,7 @@ def compose_scope_selection_prompt( _format_target_files(target.files), _heading_section("Candidates", _format_scope_candidates(candidates)), _first_scope(target.scoping), - f"## Taste\n{taste}", + _taste_section(taste), ) @@ -806,7 +817,7 @@ def compose_planning_prompt( _format_effort_budget(effort_budget), f"## Migration\n{migration_name}", f"## Context\n{context}" if context else None, - f"## Taste\n{taste}", + _taste_section(taste), ) @@ -817,7 +828,7 @@ def compose_phase_ready_prompt( PHASE_READY_CHECK_PROMPT, f"## Phase\n{_format_phase_summary(phase)}", f"## Manifest\n{_format_manifest_summary(manifest)}", - f"## Taste\n{taste}", + _taste_section(taste), ) @@ -828,14 +839,13 @@ def compose_phase_execution_prompt( validation_command: str, retry_context: str | None = None, ) -> str: - sections: list[str] = [ + sections: list[str] = _with_retry_context([ PHASE_EXECUTION_PROMPT, f"## Phase\n{_format_phase_summary(phase)}", f"## Manifest\n{_format_manifest_summary(manifest)}", - f"## Taste\n{taste}", - f"## Validation\nRun: 
`{validation_command}`", - *_retry_context_sections(retry_context), - ] + _taste_section(taste), + _validation_section(validation_command), + ], retry_context) return _join_sections(*sections) @@ -888,6 +898,13 @@ def compose_phase_execution_prompt( - Any plan or phase file updates MUST be written before exiting\ """ +_WORKSPACE_MUTATION_CONTRACT_LINES = ( + "Writable target: staged work dir only.\n" + "Writable target: work dir only.\n" + "The live migration directory is read-only reference material.\n" + "Do not mutate the live migration directory." +) + def compose_migration_review_prompt( migration_name: str, @@ -908,10 +925,7 @@ def compose_migration_review_prompt( f"Staged work dir: {work_dir}\n" f"Work dir: {work_dir}\n" f"Live migration dir: {live_dir}\n" - "Writable target: staged work dir only.\n" - "Writable target: work dir only.\n" - "The live migration directory is read-only reference material.\n" - "Do not mutate the live migration directory." + + _WORKSPACE_MUTATION_CONTRACT_LINES ), f"## Human Review\n{reason}", ( @@ -933,5 +947,5 @@ def compose_migration_review_prompt( "Current phase file: (none)\n" "Current phase name: (none)" ) - sections.append(f"## Taste\n{taste}") + sections.append(_taste_section(taste)) return _join_sections(*sections) From 54d33f409d2e25d285a7df1fbceb4340431e185b Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Sun, 24 May 2026 00:46:44 -0700 Subject: [PATCH 38/41] continuous refactor: migration/src-continuous-refactoring-prompts-py-20260524T003318/phase-3-test-hardening.md Why: Improves test resilience and readability while preserving strict contract enforcement for prompt parsing and required sections. 
Validation: uv run pytest --- .../manifest.json | 6 +- tests/test_prompts.py | 69 ++++++++++++++++--- 2 files changed, 64 insertions(+), 11 deletions(-) diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json index a8cc3bd..c76d8dd 100644 --- a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json +++ b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json @@ -2,9 +2,9 @@ "awaiting_human_review": false, "cooldown_until": null, "created_at": "2026-05-24T00:33:18.149-07:00", - "current_phase": "test-hardening", + "current_phase": "final-verification-and-review-note", "human_review_reason": null, - "last_touch": "2026-05-24T00:43:13.180-07:00", + "last_touch": "2026-05-24T00:46:44.544-07:00", "name": "src-continuous-refactoring-prompts-py-20260524T003318", "phases": [ { @@ -24,7 +24,7 @@ "required_effort": "medium" }, { - "done": false, + "done": true, "effort_reason": "Requires careful test-shape redesign to reduce brittleness without weakening contract checks.", "file": "phase-3-test-hardening.md", "name": "test-hardening", diff --git a/tests/test_prompts.py b/tests/test_prompts.py index 0e8d32f..a97f4d5 100644 --- a/tests/test_prompts.py +++ b/tests/test_prompts.py @@ -79,6 +79,23 @@ ) +def _assert_contains_in_order(text: str, fragments: tuple[str, ...]) -> None: + cursor = 0 + for fragment in fragments: + index = text.find(fragment, cursor) + assert index != -1, f"Missing fragment: {fragment!r}" + cursor = index + len(fragment) + + +def _assert_output_contract_variants( + prompt: str, + label: str, + variants: tuple[str, ...], +) -> None: + for variant in variants: + assert f"{label}: {variant}" in prompt + + def _target() -> Target: return Target( description="Clean up auth module", @@ -134,8 +151,11 @@ def _terminal_ready_state(repo_root: Path, mig_root: Path) -> PlanningState: # 
--------------------------------------------------------------------------- def test_classifier_output_contract() -> None: - assert "decision: cohesive-cleanup" in CLASSIFIER_PROMPT - assert "decision: needs-plan" in CLASSIFIER_PROMPT + _assert_output_contract_variants( + CLASSIFIER_PROMPT, + "decision", + ("cohesive-cleanup", "needs-plan"), + ) def test_refactoring_prompt_defers_commit_to_driver() -> None: @@ -156,15 +176,19 @@ def test_phase_execution_prompt_has_status_block_contract() -> None: def test_final_review_output_contract() -> None: - assert "final-decision: approve-auto" in PLANNING_FINAL_REVIEW_PROMPT - assert "final-decision: approve-needs-human" in PLANNING_FINAL_REVIEW_PROMPT - assert "final-decision: reject" in PLANNING_FINAL_REVIEW_PROMPT + _assert_output_contract_variants( + PLANNING_FINAL_REVIEW_PROMPT, + "final-decision", + ("approve-auto", "approve-needs-human", "reject"), + ) def test_ready_check_output_contract() -> None: - assert "ready: yes" in PHASE_READY_CHECK_PROMPT - assert "ready: no" in PHASE_READY_CHECK_PROMPT - assert "ready: unverifiable" in PHASE_READY_CHECK_PROMPT + _assert_output_contract_variants( + PHASE_READY_CHECK_PROMPT, + "ready", + ("yes", "no", "unverifiable"), + ) def test_phase_ready_prompt_uses_precondition_terminology() -> None: @@ -529,6 +553,35 @@ def test_compose_full_prompt_includes_retry_context_heading() -> None: assert "validation failed after refactor" in result +def test_compose_full_prompt_keeps_section_order_contract() -> None: + result = compose_full_prompt( + base_prompt="BASE", + taste=_TASTE, + target=_target(), + scope_instruction="fallback scope", + validation_command="uv run pytest", + attempt=3, + retry_context="retry context", + fix_amendment="## Optional Amendment\nDetails", + ) + + _assert_contains_in_order( + result, + ( + "Attempt 3", + "BASE", + "All changes must keep the project in a state where all tests pass.", + "## Refactoring Taste", + "## Target Files", + "## Scope", + "## 
Validation", + "## Retry Context", + "Use this as focused context only. Do not copy raw failure text into code.", + "## Optional Amendment", + ), + ) + + def test_compose_full_prompt_omits_blank_retry_context() -> None: result = compose_full_prompt( base_prompt="base", From f90544212ae3c64a3195b2683a3ce5103c05baa6 Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Sun, 24 May 2026 01:13:56 -0700 Subject: [PATCH 39/41] fix migration? --- .../manifest.json | 8 ++-- ...se-4-final-verification-and-review-note.md | 40 +++++++++++++++++++ 2 files changed, 44 insertions(+), 4 deletions(-) diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json index c76d8dd..e4dde3d 100644 --- a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json +++ b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json @@ -2,9 +2,9 @@ "awaiting_human_review": false, "cooldown_until": null, "created_at": "2026-05-24T00:33:18.149-07:00", - "current_phase": "final-verification-and-review-note", + "current_phase": "", "human_review_reason": null, - "last_touch": "2026-05-24T00:46:44.544-07:00", + "last_touch": "2026-05-24T01:10:19.495-07:00", "name": "src-continuous-refactoring-prompts-py-20260524T003318", "phases": [ { @@ -32,7 +32,7 @@ "required_effort": "medium" }, { - "done": false, + "done": true, "effort_reason": null, "file": "phase-4-final-verification-and-review-note.md", "name": "final-verification-and-review-note", @@ -40,6 +40,6 @@ "required_effort": null } ], - "status": "ready", + "status": "done", "wake_up_on": null } diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-4-final-verification-and-review-note.md b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-4-final-verification-and-review-note.md index bae8739..f2e74c5 100644 --- 
a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-4-final-verification-and-review-note.md +++ b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-4-final-verification-and-review-note.md @@ -18,6 +18,46 @@ Perform final verification and produce a concise review note confirming contract - If no interface change occurred, explicitly state that in the review note. - If interface change occurred, name exact behavior/text-shape deltas and rationale. +## Interface-Sensitive Wording Delta Enumeration + +No intentional interface-level prompt contract change was introduced. + +Reviewed potential prompt-surface deltas from Phase 2: +- `compose_full_prompt()` now uses `_validation_section()` and + `_with_retry_context()`, but still renders the same `## Validation` block and + keeps retry context before the optional amendment. +- Classifier, scope-selection, planning, phase-ready, phase-execution, and + migration-review prompts now use `_taste_section()` where they already used a + `## Taste` section; the section heading and injected taste body are preserved. +- `compose_full_prompt()` still uses `## Refactoring Taste`; that separate + heading was intentionally not routed through `_taste_section()`. +- Phase execution now uses `_validation_section()` and `_with_retry_context()`, + preserving the rendered validation and retry-context contract shape. +- Migration review now uses `_WORKSPACE_MUTATION_CONTRACT_LINES`, preserving the + exact staged/work/live directory mutation guard lines. + +Reviewed potential test-surface deltas from Phase 3: +- Output-contract checks were consolidated through helper assertions only. +- New assertions lock section ordering for `compose_full_prompt()` and preserve + the Phase 1 contract inventory's sensitive anchors. +- No production prompt text is changed by Phase 3. 
+ +Contract-sensitive anchors confirmed as preserved: +- `BEGIN_CONTINUOUS_REFACTORING_STATUS` +- `END_CONTINUOUS_REFACTORING_STATUS` +- `decision: cohesive-cleanup` / `decision: needs-plan` +- `ready: yes` / `ready: no` / `ready: unverifiable` +- `final-decision: approve-auto` / `approve-needs-human` / `reject` +- `## Taste`, `## Refactoring Taste`, `## Validation`, and `## Retry Context` +- `Do not mutate the live migration directory.` + +## Final Review Note + +No human-review-triggering prompt wording delta remains. The migration changed +internal prompt assembly and prompt-test structure while preserving the rendered +contract points identified in Phase 1. Phase 4 can complete without escalating +for human review if the configured validation command passes. + ## Validation - Run full suite: - `uv run pytest` From 5cabd39fe6ae19e93b8226d57467a641ad3c2a97 Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Sun, 24 May 2026 21:50:03 -0700 Subject: [PATCH 40/41] completed migrations ? 
--- .../.planning/stages/approaches.stdout.md | 7 -- .../.planning/stages/expand.stdout.md | 14 --- .../.planning/stages/final-review.stdout.md | 11 --- .../.planning/stages/pick-best.stdout.md | 13 --- .../.planning/stages/review.stdout.md | 1 - .../.planning/state.json | 69 -------------- .../approaches/composable-prompt-builders.md | 37 -------- .../approaches/prompt-surface-hardening.md | 34 ------- .../approaches/split-prompts-by-domain.md | 37 -------- .../manifest.json | 45 ---------- .../phase-1-contract-inventory.md | 89 ------------------- .../phase-2-contract-consolidation.md | 35 -------- .../phase-3-test-hardening.md | 35 -------- ...se-4-final-verification-and-review-note.md | 72 --------------- .../plan.md | 51 ----------- 15 files changed, 550 deletions(-) delete mode 100644 migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/approaches.stdout.md delete mode 100644 migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/expand.stdout.md delete mode 100644 migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/final-review.stdout.md delete mode 100644 migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/pick-best.stdout.md delete mode 100644 migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/review.stdout.md delete mode 100644 migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/state.json delete mode 100644 migrations/src-continuous-refactoring-prompts-py-20260524T003318/approaches/composable-prompt-builders.md delete mode 100644 migrations/src-continuous-refactoring-prompts-py-20260524T003318/approaches/prompt-surface-hardening.md delete mode 100644 migrations/src-continuous-refactoring-prompts-py-20260524T003318/approaches/split-prompts-by-domain.md delete mode 100644 migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json delete mode 100644 
migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-1-contract-inventory.md delete mode 100644 migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-2-contract-consolidation.md delete mode 100644 migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-3-test-hardening.md delete mode 100644 migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-4-final-verification-and-review-note.md delete mode 100644 migrations/src-continuous-refactoring-prompts-py-20260524T003318/plan.md diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/approaches.stdout.md b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/approaches.stdout.md deleted file mode 100644 index 81b8404..0000000 --- a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/approaches.stdout.md +++ /dev/null @@ -1,7 +0,0 @@ -Created 3 approach files in the planning workspace: - -- `approaches/prompt-surface-hardening.md` -- `approaches/composable-prompt-builders.md` -- `approaches/split-prompts-by-domain.md` - -Each includes strategy, tradeoffs, phased plan with `required_effort`, and risk profile, with interface-change human-review triggers called out explicitly. 
diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/expand.stdout.md b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/expand.stdout.md deleted file mode 100644 index 96eaa43..0000000 --- a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/expand.stdout.md +++ /dev/null @@ -1,14 +0,0 @@ -Created the migration planning artifacts in the staged workspace: - -- `plan.md` -- `phase-1-contract-inventory.md` -- `phase-2-contract-consolidation.md` -- `phase-3-test-hardening.md` -- `phase-4-final-verification-and-review-note.md` - -What’s included: -- Numbered phases with strict ordering and risk-reduction flow. -- Explicit dependency graph and per-phase dependencies in `plan.md`. -- Each phase has clear scope, `## Precondition`, `## Definition of Done`, and validation steps. -- `required_effort`/`effort_reason` only on phases that need it (Phases 2 and 3, both `medium`). -- Preconditions are phase-local and do not restate baseline-green invariants. diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/final-review.stdout.md b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/final-review.stdout.md deleted file mode 100644 index 66d02d0..0000000 --- a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/final-review.stdout.md +++ /dev/null @@ -1,11 +0,0 @@ -Assessment: -- Safe for automatic execution: Yes, with guardrails already encoded. The phases are incremental, dependencies are linear, and risky wording drift is explicitly constrained and tested. -- Human judgment checkpoints: Conditional only. The plan correctly routes intentional interface-sensitive wording changes to human review in Phase 4; otherwise it can run unattended. -- Fundamental flaws: None found. 
- -Gate checks: -- Effort labels are lowest safe tiers: `low` (inventory/final verify), `medium` (consolidation/test hardening) are appropriate; no phase is over-labeled. -- Preconditions are valid: none require baseline-green or fresh validation evidence as a precondition; validation is in Definition of Done where it belongs. -- Automation safety: strong, because contract inventory precedes edits, tests are hardened before final verification, and full `uv run pytest` is required per phase completion. - -final-decision: approve-auto — phased, low-blast-radius plan with correct effort tiers and no invalid precondition coupling to baseline/validation evidence diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/pick-best.stdout.md b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/pick-best.stdout.md deleted file mode 100644 index a19b1b8..0000000 --- a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/pick-best.stdout.md +++ /dev/null @@ -1,13 +0,0 @@ -**Chosen approach: `prompt-surface-hardening`** - -Best balance is clear: it’s the lowest-risk path that still improves maintainability and verifiability in a load-bearing prompt module. - -Why this wins: -1. **Risk control:** `prompts.py` is contract-sensitive (tests, parsing, CLI/operator expectations). This approach minimizes blast radius and keeps behavior stable. -2. **Incremental verifiability:** each phase has tight checkpoints (inventory → consolidate internals → harden invariants → full suite), so regressions show up early. -3. **Clarity without over-architecture:** it reduces repetition via small helpers/anchors, matching taste guidance to prefer small abstractions and avoid speculative restructuring. -4. **Interface safety:** it explicitly preserves exports and semantics unless intentionally changed and surfaced for review, aligned with your “human-review for released interfaces” rule. 
- -Runner-ups (short): -- `composable-prompt-builders`: good design direction, but more churn and parity risk than needed for this migration. -- `split-prompts-by-domain`: strongest structural cleanup, but high churn/high coordination risk; wrong move unless we’re intentionally paying that cost now. diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/review.stdout.md b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/review.stdout.md deleted file mode 100644 index 9e06085..0000000 --- a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/review.stdout.md +++ /dev/null @@ -1 +0,0 @@ -1. no findings. diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/state.json b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/state.json deleted file mode 100644 index 5293eb5..0000000 --- a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/state.json +++ /dev/null @@ -1,69 +0,0 @@ -{ - "completed_steps": [ - { - "agent": "codex", - "completed_at": "2026-05-24T00:34:20.118-07:00", - "effort": "low", - "model": "gpt-5.3-codex", - "name": "approaches", - "outcome": "completed", - "outputs": { - "stdout": "migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/approaches.stdout.md" - } - }, - { - "agent": "codex", - "completed_at": "2026-05-24T00:34:33.019-07:00", - "effort": "low", - "model": "gpt-5.3-codex", - "name": "pick-best", - "outcome": "completed", - "outputs": { - "stdout": "migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/pick-best.stdout.md" - } - }, - { - "agent": "codex", - "completed_at": "2026-05-24T00:35:45.784-07:00", - "effort": "low", - "model": "gpt-5.3-codex", - "name": "expand", - "outcome": "completed", - "outputs": { - "stdout": 
"migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/expand.stdout.md" - } - }, - { - "agent": "codex", - "completed_at": "2026-05-24T00:36:25.825-07:00", - "effort": "low", - "model": "gpt-5.3-codex", - "name": "review", - "outcome": "clear", - "outputs": { - "stdout": "migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/review.stdout.md" - } - }, - { - "agent": "codex", - "completed_at": "2026-05-24T00:37:03.286-07:00", - "effort": "low", - "model": "gpt-5.3-codex", - "name": "final-review", - "outcome": "approve-auto", - "outputs": { - "stdout": "migrations/src-continuous-refactoring-prompts-py-20260524T003318/.planning/stages/final-review.stdout.md" - } - } - ], - "feedback": [], - "final_decision": "approve-auto", - "final_reason": "phased, low-blast-radius plan with correct effort tiers and no invalid precondition coupling to baseline/validation evidence", - "next_step": "terminal-ready", - "review_findings": null, - "revision_base_step_counts": [], - "schema_version": 1, - "started_at": "2026-05-24T00:33:18.149-07:00", - "target": "src/continuous_refactoring/prompts.py", - "updated_at": "2026-05-24T00:37:03.286-07:00" -} diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/approaches/composable-prompt-builders.md b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/approaches/composable-prompt-builders.md deleted file mode 100644 index 40ed306..0000000 --- a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/approaches/composable-prompt-builders.md +++ /dev/null @@ -1,37 +0,0 @@ -# Approach: Composable Prompt Builders - -## Strategy -Refactor `prompts.py` into clearer domain-focused builder sections (refactor/run prompts, planning prompts, migration review prompts, taste prompts), using typed intermediate builders that compose sections deterministically. Keep public function signatures and exported constants stable. 
- -## What Changes -- Introduce explicit section-builder helpers per prompt family. -- Normalize shared context rendering (taste block, work-dir/live-dir constraints, retry context) through one composition path. -- Reduce ad-hoc string concatenation in top-level composition functions. -- Update tests to validate both contract text and cross-family consistency. - -## Tradeoffs -- Pros: Better readability and change locality; easier future prompt edits with less accidental divergence. -- Cons: Moderate churn in a high-touch module; requires careful parity checks across many callers/tests. - -## Estimated Phases -1. Prompt family map + invariants spec - - Scope: identify families, shared clauses, and must-not-change boundary behavior. - - required_effort: `medium` -2. Introduce composable builders behind existing APIs - - Scope: refactor internals while preserving `__all__` exports and call contracts. - - required_effort: `high` -3. Consistency unification - - Scope: route duplicated context sections through shared builder utilities; remove dead local formatting paths. - - required_effort: `medium` -4. Test adaptation + parity checks - - Scope: keep existing assertions green; add parity checks for critical prompts and status block structure. - - required_effort: `medium` -5. Full validation + migration notes - - Scope: run full suite and surface interface-impact statement for human review if wording changed. - - required_effort: `low` - -## Risk Profile -- Overall risk: Medium. -- Primary risks: subtle text-order changes impacting tests or downstream parsing behavior. -- Mitigations: golden/parity assertions for high-risk prompts; incremental refactor by family; keep public exports unchanged. -- Human-review triggers: any changed clause that could alter operator expectations in planning/review flows. 
diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/approaches/prompt-surface-hardening.md b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/approaches/prompt-surface-hardening.md deleted file mode 100644 index a77d166..0000000 --- a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/approaches/prompt-surface-hardening.md +++ /dev/null @@ -1,34 +0,0 @@ -# Approach: Prompt Surface Hardening - -## Strategy -Keep `src/continuous_refactoring/prompts.py` as one module, but tighten its public contract and reduce regression risk by converting brittle literal checks into explicit, reusable output-contract anchors. Preserve all current CLI/runtime behavior and prompt semantics. - -## What Changes -- Add/normalize named prompt contract anchors (small constants/functions) for repeated required clauses. -- Replace duplicated inline phrase assembly with small local helpers where repetition is currently high. -- Keep all exported names stable unless dead exports are proven unused by repo-wide evidence. -- Expand `tests/test_prompts.py` around invariants that currently rely on scattered literal string checks. - -## Tradeoffs -- Pros: Lowest blast radius; fastest path to cleaner maintenance; strongest behavior preservation. -- Cons: Does not materially reduce module size or conceptual coupling; still string-template heavy. - -## Estimated Phases -1. Baseline + contract inventory - - Scope: map required clauses used by tests/callers; identify repetition and dead branches. - - required_effort: `low` -2. Internal prompt-contract consolidation - - Scope: extract repeated clauses to local helpers/constants without changing final rendered text contracts. - - required_effort: `medium` -3. Test hardening for prompt invariants - - Scope: add/adjust tests to validate stable contract points and taste injection expectations. - - required_effort: `medium` -4. 
Final verification + review note - - Scope: run targeted prompt tests and full pytest; document any intentional text-shape change for human review if present. - - required_effort: `low` - -## Risk Profile -- Overall risk: Low. -- Primary risks: accidental wording drift in prompts that breaks downstream parsing or brittle tests. -- Mitigations: keep contract strings centralized, preserve status-block delimiters verbatim, run full `uv run pytest` before completion. -- Human-review triggers: any intentional user-visible CLI prompt wording change beyond internal deduplication. diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/approaches/split-prompts-by-domain.md b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/approaches/split-prompts-by-domain.md deleted file mode 100644 index d1f7db0..0000000 --- a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/approaches/split-prompts-by-domain.md +++ /dev/null @@ -1,37 +0,0 @@ -# Approach: Split Prompts by Domain Modules - -## Strategy -Split `src/continuous_refactoring/prompts.py` into domain-focused modules (for example: `prompts_refactor.py`, `prompts_planning.py`, `prompts_phase.py`, `prompts_taste.py`) and keep a thin package-level `prompts.py` facade that preserves the current import surface. - -## What Changes -- Move prompt constants/composers into domain modules with explicit `__all__`. -- Keep `continuous_refactoring.prompts` as compatibility boundary for current imports. -- Remove dead helpers discovered during split. -- Expand tests to cover both package-level exports and representative domain-level behaviors. - -## Tradeoffs -- Pros: Stronger module boundaries; easier ownership and focused edits; lower per-file cognitive load. -- Cons: Highest churn; touches many imports/tests; increases coordination risk with package export uniqueness and boundary rules. - -## Estimated Phases -1. 
Boundary design + export contract plan - - Scope: define module split and exact preserved public API from `continuous_refactoring.prompts`. - - required_effort: `high` -2. Mechanical extraction with compatibility facade - - Scope: move code into domain modules and re-export through existing boundary. - - required_effort: `xhigh` -3. Caller/test update pass - - Scope: update internals only where beneficial; keep external import behavior stable. - - required_effort: `high` -4. Cleanup + dead path deletion - - Scope: remove obsolete helpers and duplication revealed by split. - - required_effort: `medium` -5. Full verification + interface review checkpoint - - Scope: run full suite; explicitly call out any import/behavior change requiring human decision. - - required_effort: `medium` - -## Risk Profile -- Overall risk: High. -- Primary risks: accidental API/export drift, import cycle regressions, broad test fallout. -- Mitigations: strict facade compatibility, staged extraction, explicit export tests, package import uniqueness checks. -- Human-review triggers: any change to public import paths, prompt text contracts, or CLI-facing behavior. 
diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json deleted file mode 100644 index e4dde3d..0000000 --- a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/manifest.json +++ /dev/null @@ -1,45 +0,0 @@ -{ - "awaiting_human_review": false, - "cooldown_until": null, - "created_at": "2026-05-24T00:33:18.149-07:00", - "current_phase": "", - "human_review_reason": null, - "last_touch": "2026-05-24T01:10:19.495-07:00", - "name": "src-continuous-refactoring-prompts-py-20260524T003318", - "phases": [ - { - "done": true, - "effort_reason": null, - "file": "phase-1-contract-inventory.md", - "name": "contract-inventory", - "precondition": "- Migration status is active for this phase and no earlier phase is incomplete. - `src/continuous_refactoring/prompts.py` and `tests/test_prompts.py` are present and readable. - No concurrent edit is in progress on the same migration workspace.", - "required_effort": null - }, - { - "done": true, - "effort_reason": "Consolidation changes many string-assembly sites where subtle wording drift could break parsing and tests.", - "file": "phase-2-contract-consolidation.md", - "name": "contract-consolidation", - "precondition": "- Phase 1 contract inventory is complete and available. - The consolidation target clauses/helpers are identified with explicit must-preserve constraints. - Current public symbols and call sites that depend on prompt text shape are identified.", - "required_effort": "medium" - }, - { - "done": true, - "effort_reason": "Requires careful test-shape redesign to reduce brittleness without weakening contract checks.", - "file": "phase-3-test-hardening.md", - "name": "test-hardening", - "precondition": "- Phase 2 consolidation is complete and prompt outputs are stable. - Invariant anchors to assert are finalized from the contract inventory. 
- No unresolved intentional interface-change decisions remain for this migration.", - "required_effort": "medium" - }, - { - "done": true, - "effort_reason": null, - "file": "phase-4-final-verification-and-review-note.md", - "name": "final-verification-and-review-note", - "precondition": "- Phases 1\u20133 are complete. - Consolidated prompt logic and hardened tests are merged in this migration workspace. - Any potential interface-sensitive wording deltas are enumerated.", - "required_effort": null - } - ], - "status": "done", - "wake_up_on": null -} diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-1-contract-inventory.md b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-1-contract-inventory.md deleted file mode 100644 index 49474c7..0000000 --- a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-1-contract-inventory.md +++ /dev/null @@ -1,89 +0,0 @@ -# Phase 1: Contract Inventory - -## Objective -Create a precise inventory of prompt contracts in `src/continuous_refactoring/prompts.py` so later consolidation work is constrained by explicit invariants. - -## Scope -- Inspect prompt builders/templates in `src/continuous_refactoring/prompts.py`. -- Map repeated required clauses and existing literal contract points. -- Identify test coverage gaps in `tests/test_prompts.py` relative to required invariants. -- No behavior changes. - -## Precondition -- Migration status is active for this phase and no earlier phase is incomplete. -- `src/continuous_refactoring/prompts.py` and `tests/test_prompts.py` are present and readable. -- No concurrent edit is in progress on the same migration workspace. - -## Implementation Notes -- Produce a short inventory artifact inside this phase document (or a linked note) listing: - - Required stable prompt sections (including `## Taste` requirements). - - Contract-sensitive delimiters/phrases used for parsing or downstream decisions. 
- - Repetition candidates safe for consolidation without semantic drift. - - Known risky areas where wording drift could break behavior/tests. -- Keep the inventory grounded in current code and tests, not speculative future shape. - -## Validation -- Confirm no code or behavior changed. -- Run targeted prompt tests if needed to confirm baseline understanding: - - `uv run pytest tests/test_prompts.py` - -## Definition of Done -- A concrete inventory of prompt contracts exists and is specific enough to drive Phase 2 edits. -- Repetition/consolidation candidates are identified with explicit “must-preserve” constraints. -- Any ambiguous contract points are called out for conservative handling in later phases. -- The configured validation command passes: - - `uv run pytest` - -## Contract Inventory (Phase 1 Artifact) - -### Stable sections that must be preserved -- `compose_full_prompt()` output section shape and order: - - `Attempt {n}` - - base prompt body - - `REQUIRED_PREAMBLE` literal - - `## Refactoring Taste` - - optional `## Target Files` - - optional `## Scope` - - `## Validation` - - optional retry-context triple (`## Retry Context`, retry body, warning line) - - optional fix amendment appended at end -- Taste-injected prompt constants must include both terms checked by tests: - - contains `"taste"` (case-insensitive) - - contains `"injected by the caller"` -- Planning expand/review prompts must preserve `## Precondition` vs `## Definition of Done` terminology split and explicit anti-conflation language. -- `DEFAULT_REFACTORING_PROMPT` and `PHASE_EXECUTION_PROMPT` must embed the status block delimiters. 
- -### Contract-sensitive delimiters and phrases -- Status block delimiters are strict literals: - - `BEGIN_CONTINUOUS_REFACTORING_STATUS` - - `END_CONTINUOUS_REFACTORING_STATUS` -- Output-contract literals used by downstream parsing/tests: - - classifier: `decision: cohesive-cleanup`, `decision: needs-plan` - - final review: `final-decision: approve-auto|approve-needs-human|reject` - - ready check: `ready: yes|no|unverifiable` -- Driver-ownership clause in default refactor prompt: - - contains `Do not create git commits yourself.` -- Phase-ready wording guardrails (fresh evidence is not a human-review blocker) are assertion-backed and must remain semantically intact. - -### Repetition candidates safe for consolidation (with must-preserve constraints) -- Repeated `"Refactoring taste is injected by the caller..."` lines across classifier/planning/phase/review prompts. - - Must preserve exact `"injected by the caller"` phrase where currently tested. -- Repeated work-dir/live-dir mutation guard lines in planning/review composed prompts. - - Must preserve: - - `Writable target: work dir only.` - - `Do not mutate the live migration directory.` -- Repeated status block contract text in refactoring/phase-execution prompts. - - Must preserve all field names and begin/end markers. -- Repeated planning artifact path references (`.planning/state.json`, `.planning/stages/`, approaches path forms). - - Safe to centralize as constants/helpers if resulting rendered text is unchanged. - -### Known risky areas for wording drift -- Contract lines with finite allowed tokens (`decision:*`, `ready:*`, `final-decision:*`) are parser/test sensitive. -- Taste mention tests are substring-based; removing `"injected by the caller"` from any `_TASTE_INJECTED_PROMPTS` member will fail tests. -- `compose_full_prompt()` section ordering is not fully snapshot-tested but is behavior-significant for readability and downstream operator expectations; treat as must-preserve in consolidation. 
-- `_CONTINUOUS_REFACTORING_STATUS_BLOCK` embedded guidance fields are partially assertion-backed (`commit_rationale`, `why the refactor`) and operationally load-bearing. - -### Coverage gaps to handle conservatively in Phase 2 -- No test asserts `compose_full_prompt()` section ordering or exact spacing/newline layout. -- No test asserts exact text for many untested prompt constants (e.g., interview/refine/upgrade templates) beyond indirect behavior; avoid semantic rewrites there during consolidation. -- No test directly locks helper-level behavior (`_join_sections`, `_first_scope`, `_retry_context_sections`); preserve behavior when extracting shared builders. diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-2-contract-consolidation.md b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-2-contract-consolidation.md deleted file mode 100644 index 9f75708..0000000 --- a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-2-contract-consolidation.md +++ /dev/null @@ -1,35 +0,0 @@ -# Phase 2: Contract Consolidation - -## Objective -Refactor internal prompt construction in `src/continuous_refactoring/prompts.py` to reduce duplication by introducing small local anchors/helpers while preserving rendered contract semantics. - -required_effort: medium -effort_reason: Consolidation changes many string-assembly sites where subtle wording drift could break parsing and tests. - -## Scope -- Add/normalize small local constants/helpers for repeated required clauses. -- Replace high-repetition inline phrase assembly with those helpers. -- Keep exports stable unless dead exports are proven unused and explicitly surfaced. -- Avoid module-split or architecture-level refactors. - -## Precondition -- Phase 1 contract inventory is complete and available. -- The consolidation target clauses/helpers are identified with explicit must-preserve constraints. 
-- Current public symbols and call sites that depend on prompt text shape are identified. - -## Implementation Notes -- Apply minimal, localized edits in `src/continuous_refactoring/prompts.py`. -- Preserve load-bearing delimiters/headers and required sections verbatim unless an intentional change is documented. -- Favor readability and short helper abstractions; avoid abstraction layers that obscure prompt outputs. - -## Validation -- Run prompt-focused tests: - - `uv run pytest tests/test_prompts.py` -- If any related tests exist for prompt parsing/decisions, run them as targeted smoke checks. - -## Definition of Done -- Repeated contract clauses are consolidated into clear local helpers/constants. -- Rendered prompt outputs preserve required semantics from Phase 1 inventory. -- Public/exported behavior remains unchanged unless explicitly documented for review. -- The configured validation command passes: - - `uv run pytest` diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-3-test-hardening.md b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-3-test-hardening.md deleted file mode 100644 index e3d0f46..0000000 --- a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-3-test-hardening.md +++ /dev/null @@ -1,35 +0,0 @@ -# Phase 3: Test Hardening - -## Objective -Strengthen `tests/test_prompts.py` so core prompt invariants are explicit and resilient to safe refactors while still detecting contract drift. - -required_effort: medium -effort_reason: Requires careful test-shape redesign to reduce brittleness without weakening contract checks. - -## Scope -- Add/adjust tests for required invariant anchors identified in Phase 1. -- Ensure `## Taste` injection coverage remains explicit across required prompt templates. -- Replace scattered brittle literal checks where appropriate with focused invariant assertions. 
-- Keep tests outcome-focused; avoid mock-heavy tests unless boundary isolation is necessary. - -## Precondition -- Phase 2 consolidation is complete and prompt outputs are stable. -- Invariant anchors to assert are finalized from the contract inventory. -- No unresolved intentional interface-change decisions remain for this migration. - -## Implementation Notes -- Prefer assertions on stable contract points (section headers, delimiters, mandatory clauses). -- Keep tests readable and maintainable; avoid overfitting to incidental formatting. -- Preserve or improve failure clarity so regressions identify which contract was broken. - -## Validation -- Run prompt tests first: - - `uv run pytest tests/test_prompts.py` -- Run any additional directly impacted tests. - -## Definition of Done -- Tests clearly encode the required prompt invariants and catch meaningful contract regressions. -- Brittle/duplicative literal checks are reduced where they do not represent true contracts. -- Taste injection requirements remain enforced by tests. -- The configured validation command passes: - - `uv run pytest` diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-4-final-verification-and-review-note.md b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-4-final-verification-and-review-note.md deleted file mode 100644 index f2e74c5..0000000 --- a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/phase-4-final-verification-and-review-note.md +++ /dev/null @@ -1,72 +0,0 @@ -# Phase 4: Final Verification and Review Note - -## Objective -Perform final verification and produce a concise review note confirming contract preservation or clearly surfacing any intentional interface-level wording change. - -## Scope -- Final pass on changed prompt/test files for readability and contract consistency. -- Execute full validation. 
-- Record review note for human review if any user-visible/interface-sensitive text shape changed intentionally. - -## Precondition -- Phases 1–3 are complete. -- Consolidated prompt logic and hardened tests are merged in this migration workspace. -- Any potential interface-sensitive wording deltas are enumerated. - -## Implementation Notes -- Validate against Phase 1 inventory and Phase 3 assertions. -- If no interface change occurred, explicitly state that in the review note. -- If interface change occurred, name exact behavior/text-shape deltas and rationale. - -## Interface-Sensitive Wording Delta Enumeration - -No intentional interface-level prompt contract change was introduced. - -Reviewed potential prompt-surface deltas from Phase 2: -- `compose_full_prompt()` now uses `_validation_section()` and - `_with_retry_context()`, but still renders the same `## Validation` block and - keeps retry context before the optional amendment. -- Classifier, scope-selection, planning, phase-ready, phase-execution, and - migration-review prompts now use `_taste_section()` where they already used a - `## Taste` section; the section heading and injected taste body are preserved. -- `compose_full_prompt()` still uses `## Refactoring Taste`; that separate - heading was intentionally not routed through `_taste_section()`. -- Phase execution now uses `_validation_section()` and `_with_retry_context()`, - preserving the rendered validation and retry-context contract shape. -- Migration review now uses `_WORKSPACE_MUTATION_CONTRACT_LINES`, preserving the - exact staged/work/live directory mutation guard lines. - -Reviewed potential test-surface deltas from Phase 3: -- Output-contract checks were consolidated through helper assertions only. -- New assertions lock section ordering for `compose_full_prompt()` and preserve - the Phase 1 contract inventory's sensitive anchors. -- No production prompt text is changed by Phase 3. 
- -Contract-sensitive anchors confirmed as preserved: -- `BEGIN_CONTINUOUS_REFACTORING_STATUS` -- `END_CONTINUOUS_REFACTORING_STATUS` -- `decision: cohesive-cleanup` / `decision: needs-plan` -- `ready: yes` / `ready: no` / `ready: unverifiable` -- `final-decision: approve-auto` / `approve-needs-human` / `reject` -- `## Taste`, `## Refactoring Taste`, `## Validation`, and `## Retry Context` -- `Do not mutate the live migration directory.` - -## Final Review Note - -No human-review-triggering prompt wording delta remains. The migration changed -internal prompt assembly and prompt-test structure while preserving the rendered -contract points identified in Phase 1. Phase 4 can complete without escalating -for human review if the configured validation command passes. - -## Validation -- Run full suite: - - `uv run pytest` - -## Definition of Done -- Final verification confirms the repository remains shippable after this migration. -- A review note exists that either: - - confirms no intentional interface-level prompt contract change, or - - explicitly documents each intentional interface-level change for human review. -- No open migration-phase blockers remain. -- The configured validation command passes: - - `uv run pytest` diff --git a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/plan.md b/migrations/src-continuous-refactoring-prompts-py-20260524T003318/plan.md deleted file mode 100644 index e7df86a..0000000 --- a/migrations/src-continuous-refactoring-prompts-py-20260524T003318/plan.md +++ /dev/null @@ -1,51 +0,0 @@ -# Migration Plan: Prompt Surface Hardening (`src/continuous_refactoring/prompts.py`) - -## Goal -Harden the prompt surface in `src/continuous_refactoring/prompts.py` by consolidating repeated contract clauses and strengthening invariant tests, while preserving runtime/CLI behavior and package interface expectations. 
- -## Scope -- In scope: - - `src/continuous_refactoring/prompts.py` - - `tests/test_prompts.py` - - Related tests only when needed to prove unchanged behavior -- Out of scope: - - Splitting `prompts.py` into multiple modules - - New runtime dependencies - - Behavioral changes to CLI/system interfaces unless intentionally surfaced for review - -## Phase Plan -1. [phase-1-contract-inventory.md](phase-1-contract-inventory.md) -2. [phase-2-contract-consolidation.md](phase-2-contract-consolidation.md) -3. [phase-3-test-hardening.md](phase-3-test-hardening.md) -4. [phase-4-final-verification-and-review-note.md](phase-4-final-verification-and-review-note.md) - -## Dependency Graph -```mermaid -graph TD - P1[Phase 1: Contract Inventory] --> P2[Phase 2: Contract Consolidation] - P2 --> P3[Phase 3: Test Hardening] - P3 --> P4[Phase 4: Final Verification + Review Note] -``` - -## Phase Dependencies -- Phase 1 has no migration-phase blockers. -- Phase 2 depends on completed Phase 1 inventory outputs. -- Phase 3 depends on consolidated prompt anchors/helpers from Phase 2. -- Phase 4 depends on completed Phase 3 tests and finalized text-shape verification. - -## Validation Strategy -- Per-phase validation (independent checks): - - Each phase specifies its own targeted validation commands and evidence. - - Each phase must leave the repo in a shippable state. -- Global validation gates: - - The configured validation command (`uv run pytest`) must pass for every completed phase. -- Contract-safety checks: - - Preserve required prompt sections and status-block formatting semantics. - - Preserve `## Taste` injection behavior across all prompt templates covered by `tests/test_prompts.py`. - - Preserve package/API expectations (no unreviewed export/behavior changes). - -## Risk Controls -- Keep changes incremental and small per phase. -- Prefer local helpers/constants over structural rewrites. 
-- Treat any intentional user-visible prompt wording change as human-review material and document it in Phase 4. -- If any contract cannot be preserved exactly, stop and surface the specific interface delta before proceeding. From e88c84f85d4b325bb266ab2efe330949be25678d Mon Sep 17 00:00:00 2001 From: Hiren Hiranandani Date: Sun, 24 May 2026 21:51:43 -0700 Subject: [PATCH 41/41] feat: Prepare 1.0 Release Why: Requests Release Please to cut v1.0.0 from this branch once it lands on main. Validation: Not run; release metadata-only commit. Release-As: 1.0.0 Co-Authored-By: Claude