From 7375d4d47d945e8fc9b0ab3a6ed0d46a8f2c10b5 Mon Sep 17 00:00:00 2001
From: Vittorio Distefano <101409677+blackopsrepl@users.noreply.github.com>
Date: Mon, 20 Apr 2026 14:42:32 +0200
Subject: [PATCH] docs: add full repo repetition refactor audit proposal

---
 docs/repo-refactor-audit.md | 195 ++++++++++++++++++++++++++++++++++++
 1 file changed, 195 insertions(+)
 create mode 100644 docs/repo-refactor-audit.md

diff --git a/docs/repo-refactor-audit.md b/docs/repo-refactor-audit.md
new file mode 100644
index 00000000..f050f1cf
--- /dev/null
+++ b/docs/repo-refactor-audit.md
@@ -0,0 +1,195 @@
+# Repository Refactor Audit: Repetition Reduction Plan
+
+Date: 2026-04-20
+
+## Scope and method
+
+I ran a structural duplication audit across the Rust workspace focused on:
+
+1. near-identical test modules,
+2. arity-specialized code generation hand-written across multiple files,
+3. repeated builder/selector patterns that can be collapsed via typed helpers.
+
+Sampling and metrics used:
+
+- pairwise similarity over normalized non-comment lines,
+- family-level LOC totals (non-empty, non-comment),
+- conservative savings estimate that assumes we keep typed/monomorphized hot paths.
+
+The goal is **deleting repeated source**, not adding abstraction that harms runtime characteristics.
+
+---
+
+## Highest-yield refactor opportunities
+
+## 1) Remove duplicated test suites in `solverforge-solver` move selectors
+
+### Evidence
+
+The following pairs are nearly identical:
+
+- `heuristic/move/list_change_tests.rs` vs `heuristic/move/tests/list_change.rs` (~0.94 similarity),
+- `heuristic/move/sublist_change_tests.rs` vs `heuristic/move/tests/sublist_change.rs` (~0.94 similarity),
+- `heuristic/move/sublist_swap_tests.rs` vs `heuristic/move/tests/sublist_swap.rs` (~0.95 similarity).
+
+Family size today: **1202 LOC** across 6 files.
+
+### Proposal
+
+Pick one canonical organization for move tests and remove the parallel duplicate tree:
+
+- keep `heuristic/move/tests/*.rs` and delete the `*_tests.rs` duplicates, or
+- keep `*_tests.rs` and delete `heuristic/move/tests/*.rs`.
+
+Then centralize shared fixtures in one helper module (`tests/common.rs`) so additional move tests stop copying setup code.
+
+### Estimated savings
+
+- Direct duplicate deletion: **~592 LOC** (minimum-side deletion from the 3 strongest duplicate pairs).
+- Fixture extraction and de-dup inside kept tests: **~80 LOC**.
+
+**Subtotal: ~672 LOC saved.**
+
+---
+
+## 2) Remove duplicated test suites in `solverforge-solver` selectors
+
+### Evidence
+
+Strong duplicated pairs:
+
+- `heuristic/selector/pillar_tests.rs` vs `heuristic/selector/tests/pillar.rs` (~0.95 similarity),
+- `heuristic/selector/k_opt/tests.rs` vs `heuristic/selector/tests/k_opt.rs` (~0.94 similarity).
+
+Plus partial duplication:
+
+- `heuristic/selector/move_selector_tests.rs` vs `heuristic/selector/tests/move_selector.rs` (~0.61 similarity).
+
+Family size today: **1138 LOC** across 6 files.
+
+### Proposal
+
+- Collapse to a single selector test tree (prefer `heuristic/selector/tests/*`).
+- Split reusable cases into table-driven helper functions to avoid same assertions repeated per selector type.
+- Keep only one integration-style `move_selector` suite and derive specific variants from shared harness inputs.
+
+### Estimated savings
+
+- Direct file-level duplicate removal: **~386 LOC**.
+- Partial merge in `move_selector*` test logic: **~120 LOC**.
+
+**Subtotal: ~506 LOC saved.**
+
+---
+
+## 3) Replace hand-expanded n-ary stream macro files with one arity template
+
+### Evidence
+
+`solverforge-scoring/src/stream/arity_stream_macros/nary_stream/` has 4 large files (`bi`, `tri`, `quad`, `penta`) with heavy structural repetition.
+
+Family size today: **1243 LOC** across 4 files.
+
+### Proposal
+
+Introduce a single internal meta-macro for arity patterns and keep per-arity files as tiny declarative invocations only (or fold into one file with declarative entries).
+
+Constraints to preserve:
+
+- zero-erasure hot path,
+- concrete type propagation,
+- no `dyn`/boxing in stream execution path.
+
+### Estimated savings
+
+- compress repeated boilerplate while retaining typed expansions: **~640 LOC**.
+
+**Subtotal: ~640 LOC saved.**
+
+---
+
+## 4) Collapse higher-arity incremental constraints into generic scaffolding
+
+### Evidence
+
+`solverforge-scoring/src/constraint/nary_incremental/higher_arity/{tri,quad,penta}.rs` are strongly similar (0.63–0.68).
+
+Family size today: **990 LOC** across 3 files.
+
+### Proposal
+
+Create one typed scaffolding layer for shared incremental lifecycle steps and keep arity-specific type bindings as thin wrappers.
+
+Good fit:
+
+- shared tuple update bookkeeping,
+- repeated insert/retract dispatch structure,
+- repeated collector wiring.
+
+### Estimated savings
+
+- factor shared mechanics once, preserve arity entrypoints: **~430 LOC**.
+
+**Subtotal: ~430 LOC saved.**
+
+---
+
+## 5) Normalize repeated count-based termination implementations
+
+### Evidence
+
+`termination/move_count.rs` and `termination/score_calculation_count.rs` share substantial shape (similarity ~0.61) and follow the same state/check lifecycle structure.
+
+### Proposal
+
+Create a small generic counter-termination primitive with typed labels/config wrappers for each public termination mode.
+
+### Estimated savings
+
+**~90 LOC saved.**
+
+---
+
+## Aggregate LOC savings
+
+Conservative estimate from the five tracks above:
+
+- Move test duplication cleanup: **~672 LOC**
+- Selector test duplication cleanup: **~506 LOC**
+- N-ary stream macro consolidation: **~640 LOC**
+- Higher-arity incremental scaffolding: **~430 LOC**
+- Termination counter normalization: **~90 LOC**
+
+## **Total estimated savings: ~2338 LOC**
+
+A realistic delivery range is **~2.1k to ~2.6k LOC** depending on how aggressively we unify test harnesses.
+
+---
+
+## Implementation sequence (recommended)
+
+1. **Test de-dup first** (tracks 1 and 2)
+   - low product risk,
+   - immediate LOC reduction,
+   - faster CI compile/test feedback afterward.
+
+2. **N-ary stream macro consolidation** (track 3)
+   - highest maintenance burden reduction,
+   - can be validated with existing stream tests.
+
+3. **Higher-arity incremental scaffolding** (track 4)
+   - perform after stream macro cleanup to avoid overlapping churn.
+
+4. **Termination counter normalization** (track 5)
+   - quick cleanup pass at end.
+
+---
+
+## Guardrails for the refactor
+
+To keep behavior and performance stable while doing ground-up cleanup:
+
+- preserve zero-erasure boundaries and concrete typing in solver/scoring hot paths,
+- do not introduce trait-object indirection in frequently executed move/score loops,
+- verify with `make test` and targeted benchmark tests around scoring/director modules,
+- keep public API unchanged unless a follow-up API review is explicitly approved.