From adef85f47f99eaf5a64662733d9ca23d6aff9fe7 Mon Sep 17 00:00:00 2001 From: Vittorio Distefano <101409677+blackopsrepl@users.noreply.github.com> Date: Mon, 20 Apr 2026 14:42:52 +0200 Subject: [PATCH] docs: add full-repo repetition refactor audit --- docs/refactor-audit-2026-04-20.md | 214 ++++++++++++++++++++++++++++++ 1 file changed, 214 insertions(+) create mode 100644 docs/refactor-audit-2026-04-20.md diff --git a/docs/refactor-audit-2026-04-20.md b/docs/refactor-audit-2026-04-20.md new file mode 100644 index 00000000..1e35f136 --- /dev/null +++ b/docs/refactor-audit-2026-04-20.md @@ -0,0 +1,214 @@ +# Refactor Audit (2026-04-20) + +## Scope and method + +This audit reviewed the entire Rust workspace under `crates/**` (57,387 non-comment, non-blank LOC) and focused on repetition patterns in production code and tests. + +Method used: + +1. Workspace-wide LOC and module concentration scan. +2. Similarity scans on high-churn directories (`solverforge-solver`, `solverforge-scoring`, `solverforge-macros`). +3. Repetition hotspot review by hand in representative files. +4. Conservative LOC-saving estimates (net reduction after introducing shared abstractions). + +## Baseline footprint + +- Workspace Rust LOC baseline: **57,387**. +- Largest crates by LOC: + - `solverforge-solver`: **32,409** + - `solverforge-scoring`: **14,537** + - `solverforge-core`: **3,920** + - `solverforge-macros`: **3,378** + +These four crates make up the vast majority of repetition opportunities. + +## Repetition hotspots and proposals + +### 1) Constraint-stream arity family consolidation (scoring) + +**Hotspot evidence** + +- `bi_stream.rs` (280), `tri_stream.rs` (295), `quad_stream.rs` (238), `penta_stream.rs` (239) are structurally similar and implement near-identical fluent operations across arities. +- `existence_stream.rs` (440) and `complemented_stream.rs` (392) repeat operation plumbing with slight type-shape differences. + +**Proposal** + +Create an internal arity-kernel layer (generic traits + helper macros) that centralizes: + +- propagation wiring, +- stream builder transitions, +- collector/weighting pass-through, +- shared impl blocks for transform/filter/penalize/reward style operators. + +Then keep arity-specific files as thin type aliases + minimal ergonomics wrappers. + +**Estimated net savings**: **900 LOC**. + +**Risk**: Medium (generic complexity). Keep public API unchanged; refactor internal glue only. + +--- + +### 2) Move-selector neighborhood unification (solver) + +**Hotspot evidence** + +Large, parallel implementations across neighborhood families: + +- move layer: `list_change.rs` (281), `list_swap.rs` (268), `sublist_change.rs` (299), `sublist_swap.rs` (402), `pillar_change.rs` (176), `pillar_swap.rs` (200), `ruin.rs` (181), `list_ruin.rs` (330). +- selector layer: `list_change.rs` (246), `list_swap.rs` (277), `sublist_change.rs` (318), `sublist_swap.rs` (330), `pillar.rs` (260), `ruin.rs` (252), `list_ruin.rs` (280), `nearby_list_change.rs` (383), `nearby_list_swap.rs` (319). + +**Proposal** + +Introduce a shared neighborhood traversal kernel with static-dispatch strategy structs: + +- candidate enumeration, +- filtering predicates, +- score impact hooks, +- move materialization adapters. + +Keep concrete move/selector types for API clarity, but delegate to shared kernels. + +**Estimated net savings**: **1,050 LOC**. + +**Risk**: Medium-high (performance-sensitive). Must benchmark and preserve monomorphization. + +--- + +### 3) Solver tests: fixture and assertion DSL extraction + +**Hotspot evidence** + +Solver test surface is large and repetitive (5,031 LOC sampled across move/selector/manager/realtime tests), especially in: + +- `heuristic/move/tests/*.rs` +- `heuristic/move/*_tests.rs` +- `heuristic/selector/*_tests.rs` +- `manager/*_tests.rs` + +Repeated patterns include: + +- almost-identical entity/route setup, +- repeated selector construction boilerplate, +- repeated step-and-assert sequences. + +**Proposal** + +Build a dedicated internal test harness module: + +- fixture builders (problem graph, entities, anchors), +- one-line scenario macros for common setup paths, +- assertion helpers for deterministic move sets / score deltas. + +Migrate existing tests incrementally without reducing coverage. + +**Estimated net savings**: **1,250 LOC**. + +**Risk**: Low-medium (test-only). Highest immediate ROI. + +--- + +### 4) Manager + realtime lifecycle scenario consolidation + +**Hotspot evidence** + +Scenario-style tests in manager/realtime duplicate orchestration setup and expected transitions: + +- `manager/mod_tests.rs` (309) +- `manager/phase_factory_tests.rs` (268) +- `manager/builder_tests.rs` (96) +- `realtime/problem_change_tests.rs` (89) +- `realtime/solver_handle_tests.rs` (94) + +**Proposal** + +Add lifecycle scenario builders: + +- `given_solver_manager()` presets, +- phase-pipeline scenario templates, +- reusable assertions for pause/resume/change behavior. + +Co-locate with existing `test_utils` to avoid fragmentation. + +**Estimated net savings**: **300 LOC**. + +**Risk**: Low. + +--- + +### 5) Scoring stream builder internals: flattened/existence/complemented boilerplate reduction + +**Hotspot evidence** + +Internal boilerplate for weighting and builder transitions is repeated in: + +- `flattened_bi_stream/base.rs` (190) +- `flattened_bi_stream/builder.rs` (158) +- `flattened_bi_stream/weighting.rs` (353) +- plus repeated fragments in `existence_stream.rs` and `complemented_stream.rs`. + +**Proposal** + +Introduce private helper macros for: + +- weighted stream factory wiring, +- collector construction patterns, +- common state-carrier structs and impl templates. + +Avoid public macro exposure; keep internal to scoring crate. + +**Estimated net savings**: **420 LOC**. + +**Risk**: Medium. + +--- + +### 6) Proc-macro codegen utility extraction (macros crate) + +**Hotspot evidence** + +TokenStream assembly patterns recur across planning entity/solution codegen paths: + +- `planning_entity/expand.rs` (493) +- `planning_entity/list_variable.rs` (369) +- `planning_solution/runtime.rs` (470) +- `planning_solution/list_operations.rs` (476) +- `planning_solution/expand.rs` (159) +- `planning_solution/shadow.rs` (174) + +**Proposal** + +Create internal codegen helper modules for: + +- repeated token fragments, +- common error emission helpers, +- shared parsing + normalization adapters. + +Retain explicit public macro behavior while reducing duplicate quote! blocks. + +**Estimated net savings**: **450 LOC**. + +**Risk**: Medium (macro diagnostics quality must be preserved). + +## Estimated total reduction + +Estimated aggregate net reduction if all 6 initiatives land: + +- 900 + 1,050 + 1,250 + 300 + 420 + 450 = **4,370 LOC**. + +Relative to current Rust LOC baseline (57,387), that is approximately **7.6%** reduction. + +## Recommended execution order + +1. **Test harness extraction** (initiative 3) — fastest, lowest risk. +2. **Manager/realtime scenario consolidation** (initiative 4) — low risk, immediate cleanup. +3. **Scoring builder boilerplate reduction** (initiative 5) — contained internal refactor. +4. **Stream arity family consolidation** (initiative 1) — larger architectural cleanup. +5. **Move-selector neighborhood unification** (initiative 2) — performance-sensitive, benchmark-gated. +6. **Proc-macro utility extraction** (initiative 6) — do after other internals stabilize. + +## Guardrails + +- Preserve zero-erasure and static dispatch in hot paths. +- Avoid introducing `Arc`, `Rc`, `Box` in solver/scoring paths. +- Validate each phase with `make build`, `make test`, and targeted microbenchmarks for selector/move refactors. +- Keep wireframes/docs in sync if any public API signatures move.