feat(aggregator): surgical edits when only some findings change (Plan B) by codeninja · Pull Request #17 · codeninja/wikifi

codeninja · 2026-05-04T17:10:39Z

Summary

Stacked on feat(cache): incremental persistence + full-cache-hit short-circuit (Plan A) #16 (Plan A). Re-target to main after Plan A lands.
Adds a third aggregation path between full cache hit and full rewrite: when a section's finding set overlaps the cached set within a tunable threshold, the LLM is asked to edit the cached body in place around the added/removed delta.
Preserves established prose. Cuts tokens on partial-change walks. Addresses the user's "potentially omitting or changing key details" concern by making the cached body the explicit anchor instead of re-deriving from scratch.
Cache schema v2 → v3 (adds finding_ids to CachedSection).

Plan: docs/plan-b-surgical-aggregation.md

Decision tree

Per primary section, in priority order:

Cache hit — notes_hash matches cached entry. Re-render cached bundle, no LLM call. (Plan A.)
Unchanged finding ids — notes_hash differs but ids match (line range moved, summary changed). Refresh cache key, re-render cached body, no LLM call.
Surgical — symmetric difference of finding ids ≤ surgical_edit_threshold (default 0.3). Send cached body + delta → LLM. Merge edit with cached claims (drop removed, add new resolved against added notes). Contradictions are fully replaced.
Rewrite — too much churn or no prior cached body. Original from-scratch aggregation. (Plan A.)

Identity model

finding_id = sha256(file_path::section_id::finding_text)[:16]. Stable across walks; reword of finding text → new id (semantically delete-plus-insert, since prose written around the old wording can no longer be assumed grounded by the new wording).

CachedSection.finding_ids: list[str] is aligned with note order so a 1-based claim.source_indices entry maps to finding_ids[idx-1].

What's new

New module wikifi/surgical.py — SectionChange, classify_section_change, SurgicalEdit Pydantic schema, SURGICAL_SYSTEM_PROMPT, surgical_aggregate (LLM call + claim re-anchoring).
wikifi/cache.py — compute_finding_id, note_finding_ids, CachedSection.finding_ids, schema bump.
wikifi/aggregator.py — decision tree with 4 paths; AggregationStats adds sections_edited + sections_rewritten.
wikifi/config.py — Settings.use_surgical_edits + Settings.surgical_edit_threshold.
wikifi/cli.py — wikifi walk --no-surgical opt-out flag; walk report shows new columns.
wikifi/orchestrator.py — passes the new settings through to aggregate_all.

Test plan

make test — 231 passed (was 212), 93% coverage
make lint — clean
New tests in tests/test_surgical.py:
- compute_finding_id / note_finding_ids determinism
- classifier truth table — no cache, legacy v2 (no finding_ids), identical sets, small delta → surgical, large delta → rewrite, threshold exactly equal → surgical (inclusive), disabled threshold → rewrite
- surgical_aggregate — claim merge keeps cached minus removed, adds new resolved against added notes; bad indices ignored; contradictions replaced
- aggregator records finding_ids on every cache write
- unchanged-finding-ids path takes no LLM call and refreshes notes_hash
- surgical path selected for small delta; rewrite for large delta
- use_surgical_edits=False collapses to Plan A's two-path tree
- surgical-path LLM failure falls back cleanly to rewrite (section is never empty)
- stability: when LLM honors the contract, unchanged paragraphs are byte-preserved through the merge — guards the mechanism (no rendering pass mangles the unchanged region); real-world prose stability depends on prompt quality
Manual: make walk → edit one source file that contributes a finding to one section → make walk → verify only that section is edited (sections_edited=1) and the rest of the wiki is byte-stable

Risk register

The model may not actually preserve unchanged paragraphs in every real-world case. The mocked stability test guards the mechanism but not the prompt quality. Mitigations:
- Tight prompt instructing verbatim preservation, with input body fenced in markdown
- --no-surgical opt-out for users who'd rather have full rewrites
- Threshold default 0.3 keeps surgical to genuinely-small deltas where preservation is the easy default
- (Future) post-edit critic that diffs edited body against cached body and rejects edits whose drift exceeds the delta size
Cache v2 → v3 invalidates v2 entries (one-time re-aggregation on upgrade). Same upgrade pattern as Plan A.

🤖 Generated with Claude Code

Copilot

Pull request overview

This PR adds a “Plan B” aggregation mode that can re-use cached section prose when only a small subset of a section’s findings changed, by asking the LLM to edit the cached body in place and then merging the structured evidence (claims/contradictions) with the cached evidence.

Changes:

Introduces a new surgical-edit aggregation path (unchanged / surgical edit / rewrite classification) and wires it into the stage-3 aggregator.
Extends the aggregation cache schema to store stable per-note finding_ids and bumps cache version to v3.
Adds settings + CLI flag to enable/disable surgical edits and tunes reporting/tests accordingly.

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
`wikifi/surgical.py`	New surgical edit classifier + prompt/schema + merge logic for edited bodies and evidence.
`wikifi/aggregator.py`	Implements the 4-path decision tree (cache hit / unchanged ids / surgical / rewrite) and records new stats + finding_ids in cache.
`wikifi/cache.py`	Cache schema v3 (`CachedSection.finding_ids`), plus helpers to compute and align finding ids per note.
`wikifi/config.py`	Adds `use_surgical_edits` and `surgical_edit_threshold` settings.
`wikifi/cli.py`	Adds `--no-surgical` opt-out and extends walk output with surgical-related stats.
`wikifi/orchestrator.py`	Threads surgical settings through to `aggregate_all`.
`tests/test_surgical.py`	New unit tests covering finding id determinism, classifier routing, surgical merge behavior, and aggregator path selection.
`.wikifi/*.md`	Regenerated wiki content reflecting the new behavior and updated narratives; `diagrams.md` currently includes a derivation error artifact.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+def compute_finding_id(*, file: str, section_id: str, finding: str) -> str:
+    """Stable identity for one note across walks.
+
+    Composition: ``sha256(file + "::" + section_id + "::" + finding)``,
+    truncated to 16 hex chars. Two walks that emit the same finding
+    text from the same file targeting the same section produce the
+    same id; any change in any of the three components is a fresh id.
+
+    A reword of ``finding`` (even one character) gets a new id, which
+    semantically counts as "removed-and-added" from the surgical
+    aggregator's point of view — the same as a delete-plus-insert in a
+    text diff. That's intentional: a paragraph the cached body wrote
+    around the old wording can no longer be assumed to still be
+    grounded by the new wording.
+    """
+    from wikifi.fingerprint import hash_text
+
+    return hash_text(f"{file}::{section_id}::{finding}")


+        compute_finding_id(
+            file=str(n.get("file", "")),
+            section_id=section_id,
+            finding=str(n.get("finding", "")),
+        )
+        for n in notes


+    collapses the decision tree back to the Plan A two-path version
+    (cache-hit or full rewrite).


+        # Path 2: finding ids unchanged but notes_hash differs (e.g. line
+        # ranges shifted). The cached body is still grounded; just
+        # refresh the cache key and re-render.
+        if change.decision == "unchanged" and cached_entry is not None:
+            bundle = EvidenceBundle(
+                body=cached_entry.body,
+                claims=[Claim.model_validate(c) for c in cached_entry.claims],
+                contradictions=[Contradiction.model_validate(c) for c in cached_entry.contradictions],
            )
-            bundle = _bundle_from(structured, notes)
-            rendered = render_section_body(bundle)
-        except Exception as exc:
-            log.warning("aggregation failed for %s: %s", section.id, exc)
-            rendered = _fallback_body(section, notes, error=str(exc))
-            bundle = None
-
-        write_section(layout, section, rendered)
+            write_section(layout, section, render_section_body(bundle))
+            stats.sections_cached += 1
+            stats.sections_written += 1
+            if cache is not None:
+                cache.record_aggregation(
+                    section.id,
+                    notes_hash=notes_hash,
+                    body=cached_entry.body,
+                    claims=cached_entry.claims,
+                    contradictions=cached_entry.contradictions,
+                    finding_ids=live_finding_ids,
+                )
+                if persist_cache is not None:
+                    persist_cache()
+            continue


 # Diagrams

-The three diagrams below cover the subdomain structure of the core domain, the structural relationships between the principal entities, and the end-to-end pipeline integration flow.
-
---
-
-## Domain Map
-
-The four subdomains that compose the core domain of automated documentation synthesis form a strict directed dependency chain. No subdomain reaches backward; the ordering is a first-class design constraint.
-
-```mermaid
-graph LR
-    RI["Repository Introspection"]
-    KE["Knowledge Extraction"]
-    SS_P["Section Synthesis - Primary"]
-    SS_D["Section Synthesis - Derivative"]
-    AP_W[("Working-State Persistence")]
-    AP_C[("Committed Wiki Persistence")]
-
-    RI --> KE
-    KE --> AP_W
-    KE --> SS_P
-    SS_P --> SS_D
-    SS_P --> AP_C
-    SS_D --> AP_C
-```
-
---
-
-## Entity Relationship View
-
-The diagram spans six concern areas: wiki structure, evidence tracing, extraction, pipeline orchestration, caching, and quality review. The self-referential edge on **Section** represents the directed acyclic graph of upstream declarations enforced by topological ordering at startup.
-
-```mermaid
-erDiagram
-    SECTION ||--o{ SECTION : "upstream-of"
-    SECTION ||--o{ LOADED_SECTION : "has loaded form"
-    SECTION ||--o{ SECTION_REPORT : "reported by"
-    WIKI_REPORT ||--o{ SECTION_REPORT : aggregates
-    EVIDENCE_BUNDLE ||--o{ CLAIM : contains
-    EVIDENCE_BUNDLE ||--o{ CONTRADICTION : contains
-    CLAIM ||--o{ SOURCE_REF : "backed by"
-    CONTRADICTION ||--|{ CLAIM : groups
-    FILE_FINDINGS ||--|{ SECTION_FINDING : groups
-    WALK_REPORT ||--|| INTROSPECTION_RESULT : carries
-    WALK_REPORT ||--|| WALK_CACHE : carries
-    WALK_REPORT ||--|| REPO_GRAPH : carries
-    REPO_GRAPH ||--o{ GRAPH_NODE : contains
-    WALK_CACHE ||--o{ CACHED_FINDINGS : stores
-    WALK_CACHE ||--o{ CACHED_SECTION : stores
-    WIKI_QUALITY_REPORT ||--o{ CRITIQUE : maps
-    REVIEW_OUTCOME ||--|{ CRITIQUE : tracks
-    CHAT_SESSION ||--o{ CHAT_MESSAGE : history
-    CHAT_SESSION ||--|| LLM_PROVIDER : uses
-```
-
---
-
-## Pipeline Integration Flow
-
-The sequence below traces a complete pipeline run triggered by the `walk` command. The provider abstraction is the sole contact point for all inference calls; the filesystem layout abstraction mediates all reads and writes; the cache layer short-circuits calls when prior results remain valid. Derivative sections are excluded from the aggregation stage and are handled exclusively in the derivation stage.
-
-```mermaid
-sequenceDiagram
-    actor Operator
-    participant CLI
-    participant Orchestrator
-    participant LLMProvider as LLM Provider
-    participant ImportGraph as Import Graph
-    participant Cache
-    participant Layout as Filesystem Layout
-
-    Operator->>CLI: walk command
-    CLI->>Orchestrator: initialise pipeline
-
-    Note over Orchestrator,LLMProvider: Stage 1 - Repository Introspection
-    Orchestrator->>LLMProvider: structured completion (classify repository paths)
-    LLMProvider-->>Orchestrator: IntrospectionResult
-
-    Note over Orchestrator,ImportGraph: Build Cross-File Import Graph
-    Orchestrator->>ImportGraph: traverse in-scope files
-    ImportGraph-->>Orchestrator: RepoGraph ready
-
-    Note over Orchestrator,Layout: Stage 2 - Knowledge Extraction (per file)
-    loop each in-scope file
-        Orchestrator->>Cache: check content fingerprint
-        Cache-->>Orchestrator: hit or miss
-        Orchestrator->>ImportGraph: fetch neighbour paths
-        Orchestrator->>LLMProvider: structured completion (findings schema)
-        LLMProvider-->>Orchestrator: FileFindings
-        Orchestrator->>Layout: append notes
-    end
-
-    Note over Orchestrator,Layout: Stage 3 - Section Aggregation (primary sections only)
-    loop each primary section
-        Orchestrator->>Cache: check notes-payload hash
-        Cache-->>Orchestrator: hit or miss
-        Orchestrator->>LLMProvider: structured completion (section-body schema)
-        LLMProvider-->>Orchestrator: EvidenceBundle
-        Orchestrator->>Layout: write section markdown
-    end
-
-    Note over Orchestrator,Layout: Stage 4 - Section Derivation (topological order)
-    loop each derivative section
-        Orchestrator->>Layout: read upstream section bodies
-        Orchestrator->>LLMProvider: structured completion (synthesis)
-        LLMProvider-->>Orchestrator: section body
-        Orchestrator->>LLMProvider: structured completion (quality critique)
-        LLMProvider-->>Orchestrator: Critique
-        Orchestrator->>Layout: write section markdown
-    end
-
-    Orchestrator-->>CLI: WalkReport
-    CLI-->>Operator: run summary
-```
-
-> **External capability providers**: The system is additionally configured as a client that fans out to multiple external capability providers via a tool-server protocol — a local AI utility, a local web crawler, a remote documentation context service, and a remote stitching/search service. The upstream sections do not specify at which pipeline call sites these providers are invoked.
+_Derivation failed for **Diagrams** (1 validation error for DerivedSection
+  Invalid JSON: EOF while parsing a string at line 1 column 4334 [type=json_invalid, input_value='{"body": "Three diagrams... INTRO[Introspection]\\', input_type=str]
+    For further information visit https://errors.pydantic.dev/2.13/v/json_invalid). Upstream evidence preserved below._
+
+> Brief: Mermaid diagrams that visualize structural and behavioral relationships across the system: a domain map (graph or classDiagram across domains), an entity relationship view (erDiagram across entities), and an integration flow (sequence or flowchart across integrations). Tech-agnostic — no reference to current stack.
+


Building on Plan A's cache layer, the aggregator gains a third path between full cache hit and full rewrite. When a section's finding set overlaps the cached set within a configurable threshold, the LLM is asked to *edit* the cached body around the added/removed delta rather than re-derive the section from scratch. Preserves established prose, addresses the "potentially omitting or changing key details" concern, and cuts tokens on partially-changed walks. Per-section decision tree (in priority order): 1. **Cache hit** — notes_hash matches cached entry exactly. Re-render cached bundle, no LLM call. (Plan A behavior, unchanged.) 2. **Unchanged** — finding ids match but notes_hash differs (a source line shifted, summary changed, etc.). Refresh the cache key, re- render the cached body, no LLM call. 3. **Surgical** — symmetric difference of finding ids is at or below `surgical_edit_threshold` (default 0.3). Send cached body + delta to LLM, merge edit with cached claims (drop removed, add new). 4. **Rewrite** — too much churn (or no prior body to anchor on). Run the original from-scratch aggregation prompt. (Plan A behavior.) Identity: each note gets a stable `finding_id = sha256(file::section_id::finding_text)[:16]`. The cache stores `finding_ids` aligned with note order so claim source_indices map back to ids. Reword of finding text → new id (semantically a delete-plus-insert, since prose written around the old wording can no longer be assumed grounded). Citation re-anchoring is mechanical: cached claims minus `removed_claim_indices` plus `new_claims` resolved against the added notes only. Contradictions are fully replaced from the model output since they're a small set and cleaner to re-emit than diff. New module `wikifi/surgical.py`: - `compute_finding_id` / `note_finding_ids` (in cache.py — adjacent to the other hash helpers) - `SectionChange` dataclass + `classify_section_change` for the diff - `SurgicalEdit` Pydantic schema + `SURGICAL_SYSTEM_PROMPT` - `surgical_aggregate` runs the LLM call and merges the result Cache schema bumped v2 → v3 to add `finding_ids: list[str]` to `CachedSection`. v2 entries load to empty so an upgraded wiki re-aggregates on the first walk; subsequent walks light up surgical. Configuration: - `Settings.use_surgical_edits: bool = True` - `Settings.surgical_edit_threshold: float = 0.3` - `wikifi walk --no-surgical` opt-out flag Walk report adds `sections_edited` and `sections_rewritten` columns alongside the existing `sections_cached`. Tests: 231 passing, 93% coverage. New `tests/test_surgical.py` covers: - `compute_finding_id` / `note_finding_ids` determinism - classifier truth table (no-cache, legacy v2, identical sets, small delta, large delta, threshold exactly equal, disabled threshold) - `surgical_aggregate` claim merging (drop removed, resolve new against added notes, replace contradictions, ignore bad indices) - aggregator decision tree records `finding_ids` on every cache write - unchanged-finding-ids path takes no LLM call and refreshes notes_hash - surgical path selected for small delta; rewrite for large delta - `use_surgical_edits=False` collapses to Plan A's two-path tree - surgical-path LLM failure falls back to rewrite cleanly - stability: when LLM honors the contract, unchanged paragraphs are byte-preserved through the merge Plan documented in `docs/plan-b-surgical-aggregation.md`. Stacked on Plan A (#16). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

First walk under cache schema v3 — the v2 cache from prior walks invalidates, so every section took the rewrite path (sections_rewritten=8, sections_edited=0). Subsequent walks with small per-file changes will exercise the surgical edit path. Walk stats: - Stage 1 introspection: include=1 exclude=12 langs=Python - Stage 2 extraction: 31 files seen, 29 contributed, 277 findings - Stage 3 aggregation: 8 sections written (all rewrite path) - Stage 4 derivation: personas + user_stories OK; diagrams hit a json_invalid response from the model (long mermaid output truncated by the SDK's parse layer) and fell back to upstream-bodies-preserved output — the deriver's per-section catch behaving as designed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 18 out of 18 changed files in this pull request and generated 9 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+def compute_finding_id(*, file: str, section_id: str, finding: str) -> str:
+    """Stable identity for one note across walks.
+
+    Composition: ``sha256(file + "::" + section_id + "::" + finding)``,
+    truncated to 16 hex chars. Two walks that emit the same finding
+    text from the same file targeting the same section produce the
+    same id; any change in any of the three components is a fresh id.
+
+    A reword of ``finding`` (even one character) gets a new id, which
+    semantically counts as "removed-and-added" from the surgical
+    aggregator's point of view — the same as a delete-plus-insert in a
+    text diff. That's intentional: a paragraph the cached body wrote
+    around the old wording can no longer be assumed to still be
+    grounded by the new wording.
+    """
+    from wikifi.fingerprint import hash_text
+
+    return hash_text(f"{file}::{section_id}::{finding}")
+


+    return [
+        compute_finding_id(
+            file=str(n.get("file", "")),
+            section_id=section_id,
+            finding=str(n.get("finding", "")),
+        )
+        for n in notes
+    ]
+
+


+    :func:`compute_finding_id`) that fed into the cached body, aligned
+    with the notes' position so a 1-based ``claim.source_indices``
+    entry maps to ``finding_ids[idx - 1]``. Surgical aggregation uses
+    this to detect which findings were added or removed between walks
+    and route an edit instead of a full rewrite.


+
+## Identity and Cache Rules
+
+**Finding identity** is defined as the SHA-256 digest of the concatenation `file::section_id::finding`, truncated to 16 hexadecimal characters. A single-character change to any of the three components produces a different identity, treated as a delete-plus-insert rather than an edit, invalidating any cached prose referencing the prior wording.


 # Diagrams

-The three diagrams below cover the subdomain structure of the core domain, the structural relationships between the principal entities, and the end-to-end pipeline integration flow.
-
---
-
-## Domain Map
-
-The four subdomains that compose the core domain of automated documentation synthesis form a strict directed dependency chain. No subdomain reaches backward; the ordering is a first-class design constraint.
-
-```mermaid
-graph LR
-    RI["Repository Introspection"]
-    KE["Knowledge Extraction"]
-    SS_P["Section Synthesis - Primary"]
-    SS_D["Section Synthesis - Derivative"]
-    AP_W[("Working-State Persistence")]
-    AP_C[("Committed Wiki Persistence")]
-
-    RI --> KE
-    KE --> AP_W
-    KE --> SS_P
-    SS_P --> SS_D
-    SS_P --> AP_C
-    SS_D --> AP_C
-```
-
---
-
-## Entity Relationship View
-
-The diagram spans six concern areas: wiki structure, evidence tracing, extraction, pipeline orchestration, caching, and quality review. The self-referential edge on **Section** represents the directed acyclic graph of upstream declarations enforced by topological ordering at startup.
-
-```mermaid
-erDiagram
-    SECTION ||--o{ SECTION : "upstream-of"
-    SECTION ||--o{ LOADED_SECTION : "has loaded form"
-    SECTION ||--o{ SECTION_REPORT : "reported by"
-    WIKI_REPORT ||--o{ SECTION_REPORT : aggregates
-    EVIDENCE_BUNDLE ||--o{ CLAIM : contains
-    EVIDENCE_BUNDLE ||--o{ CONTRADICTION : contains
-    CLAIM ||--o{ SOURCE_REF : "backed by"
-    CONTRADICTION ||--|{ CLAIM : groups
-    FILE_FINDINGS ||--|{ SECTION_FINDING : groups
-    WALK_REPORT ||--|| INTROSPECTION_RESULT : carries
-    WALK_REPORT ||--|| WALK_CACHE : carries
-    WALK_REPORT ||--|| REPO_GRAPH : carries
-    REPO_GRAPH ||--o{ GRAPH_NODE : contains
-    WALK_CACHE ||--o{ CACHED_FINDINGS : stores
-    WALK_CACHE ||--o{ CACHED_SECTION : stores
-    WIKI_QUALITY_REPORT ||--o{ CRITIQUE : maps
-    REVIEW_OUTCOME ||--|{ CRITIQUE : tracks
-    CHAT_SESSION ||--o{ CHAT_MESSAGE : history
-    CHAT_SESSION ||--|| LLM_PROVIDER : uses
-```
-
---
-
-## Pipeline Integration Flow
-
-The sequence below traces a complete pipeline run triggered by the `walk` command. The provider abstraction is the sole contact point for all inference calls; the filesystem layout abstraction mediates all reads and writes; the cache layer short-circuits calls when prior results remain valid. Derivative sections are excluded from the aggregation stage and are handled exclusively in the derivation stage.
-
-```mermaid
-sequenceDiagram
-    actor Operator
-    participant CLI
-    participant Orchestrator
-    participant LLMProvider as LLM Provider
-    participant ImportGraph as Import Graph
-    participant Cache
-    participant Layout as Filesystem Layout
-
-    Operator->>CLI: walk command
-    CLI->>Orchestrator: initialise pipeline
-
-    Note over Orchestrator,LLMProvider: Stage 1 - Repository Introspection
-    Orchestrator->>LLMProvider: structured completion (classify repository paths)
-    LLMProvider-->>Orchestrator: IntrospectionResult
-
-    Note over Orchestrator,ImportGraph: Build Cross-File Import Graph
-    Orchestrator->>ImportGraph: traverse in-scope files
-    ImportGraph-->>Orchestrator: RepoGraph ready
-
-    Note over Orchestrator,Layout: Stage 2 - Knowledge Extraction (per file)
-    loop each in-scope file
-        Orchestrator->>Cache: check content fingerprint
-        Cache-->>Orchestrator: hit or miss
-        Orchestrator->>ImportGraph: fetch neighbour paths
-        Orchestrator->>LLMProvider: structured completion (findings schema)
-        LLMProvider-->>Orchestrator: FileFindings
-        Orchestrator->>Layout: append notes
-    end
-
-    Note over Orchestrator,Layout: Stage 3 - Section Aggregation (primary sections only)
-    loop each primary section
-        Orchestrator->>Cache: check notes-payload hash
-        Cache-->>Orchestrator: hit or miss
-        Orchestrator->>LLMProvider: structured completion (section-body schema)
-        LLMProvider-->>Orchestrator: EvidenceBundle
-        Orchestrator->>Layout: write section markdown
-    end
-
-    Note over Orchestrator,Layout: Stage 4 - Section Derivation (topological order)
-    loop each derivative section
-        Orchestrator->>Layout: read upstream section bodies
-        Orchestrator->>LLMProvider: structured completion (synthesis)
-        LLMProvider-->>Orchestrator: section body
-        Orchestrator->>LLMProvider: structured completion (quality critique)
-        LLMProvider-->>Orchestrator: Critique
-        Orchestrator->>Layout: write section markdown
-    end
-
-    Orchestrator-->>CLI: WalkReport
-    CLI-->>Operator: run summary
-```
-
-> **External capability providers**: The system is additionally configured as a client that fans out to multiple external capability providers via a tool-server protocol — a local AI utility, a local web crawler, a remote documentation context service, and a remote stitching/search service. The upstream sections do not specify at which pipeline call sites these providers are invoked.
+_Derivation failed for **Diagrams** (1 validation error for DerivedSection
+  Invalid JSON: EOF while parsing a string at line 1 column 4334 [type=json_invalid, input_value='{"body": "Three diagrams... INTRO[Introspection]\\', input_type=str]
+    For further information visit https://errors.pydantic.dev/2.13/v/json_invalid). Upstream evidence preserved below._
+


+def _resolve_removed_claim_texts(cached: CachedSection, removed_ids: list[str]) -> list[dict]:
+    """Reconstruct (file, finding_text) pairs for every removed cached finding.
+
+    The cache stores ``finding_ids`` aligned with the prior walk's notes
+    order, but it doesn't store the original note bodies. The best we
+    can show the model for a removed finding is its file path — that's
+    enough context for the model to find the affected sentence in the
+    cached body and revise around it. When the cache only knows the id,
+    we surface the id itself so the prompt is at least debuggable.
+    """
+    out: list[dict] = []
+    for fid in removed_ids:
+        out.append({"finding_id": fid})
+    return out


+    collapses the decision tree back to the Plan A two-path version
+    (cache-hit or full rewrite).


+            "Preserves established prose and citation numbering. Disable to force the prior "
+            "(pre-Plan-B) behavior of full rewrites on any note change."


+        if self.total_live == 0:
+            return 0.0
+        return (len(self.added_indices) + len(self.removed_ids)) / self.total_live


Addresses every distinct concern raised by the AI review on PR #17. Comment IDs in parentheses; many were duplicated across the two review passes (pre-rebase + post-rebase). Identity / cache schema - Align finding_id length contract with the 12-char implementation. The Hard Specifications, the `compute_finding_id` docstring, and the Cache Entities table all said 16 hex chars but the function delegates to `fingerprint.hash_text` which truncates to FINGERPRINT_LENGTH=12. Updated the docs (cache.py, hard_specifications.md) to match what the code actually does — the wider hash isn't needed at typical scale and reusing one constant keeps the system coherent. (3183975867, 3184733330, 3184733407) - `note_finding_ids` now returns "" when a note is missing `file` or `finding`. Hashing empty inputs would otherwise produce a stable non-empty id and let two malformed notes from different walks compare equal as "unchanged", routing the section onto the cache/surgical paths despite having no real identity to anchor on. The classifier picks up the slack: any "" id in either set forces "rewrite" so two malformed notes can't accidentally cancel out via set equality. (3183975908, 3184733368) - Rewrote `CachedSection.finding_ids` docstring. The old text claimed the list aligned with `claim.source_indices`, but cached claims store fully-resolved SourceRefs, not indices. Documented the actual consumer: section-delta classification by the surgical classifier. (3184733384) Aggregator decision tree - Fixed Path 2's cache lie: stopped refreshing `notes_hash` to match live notes when the unchanged-finding-ids fast path fires. The cached claims still carry SourceRefs from the prior walk's notes (file, lines, fingerprint), and updating the cache key without refreshing those refs would let `aggregation_fully_cached` flag the section as fresh and let the orchestrator's stage-3 short-circuit lock the drifting citations in place. Path 2 still skips the LLM call this walk; the cache key only refreshes when a real Path 4 rewrite runs. (3183975957) - Clarified `use_surgical_edits` docs in both `aggregator.aggregate_all` and `Settings`. The flag only gates Path 3 (the LLM-side surgical edit); the no-LLM cache-reuse paths (full cache hit and unchanged- finding-ids re-render) still fire regardless. The old text claimed the flag collapsed the tree to Plan A's two paths, which never matched the actual code. (3183975934, 3184733492, 3184733517) Surgical helpers - Renamed `_resolve_removed_claim_texts` to `_removed_finding_descriptors` and rewrote the docstring to match the implementation: it builds the removed-notes payload as `{finding_id: ...}` dicts and does not use the `cached` parameter (the cache doesn't store original note bodies, only ids). The old name promised richer reconstruction. (3184733469) - `SectionChange.churn_ratio` now returns 1.0 when `total_live == 0` and there are removed ids, mirroring the classifier's `max(total_live, 1)` guard. The old 0.0 return for the remove-everything case would mislead any downstream caller checking `churn_ratio` directly. (3184733532) Wiki artifact - Restored `.wikifi/diagrams.md` to its pre-chore-commit version. The chore commit (refresh wiki via Plan B walk) committed an LLM JSON truncation as the diagrams body — `Derivation failed for **Diagrams** (... Invalid JSON: EOF while parsing ...)` rather than actual Mermaid. Forward-fix approach (revert just the broken file) rather than amending the chore commit; the next walk against this repo will regenerate it cleanly. (3183975985, 3184733450) Regression tests added - `note_finding_ids` returns "" for malformed notes (missing/blank file or finding, including null `finding`). - `classify_section_change` force-rewrites when any finding_id is empty in either set, even when the sets look identical. - `SectionChange.churn_ratio` is 1.0 for the remove-everything edge case and 0.0 for the truly-empty case. - Aggregator's Path 2 leaves the cache entry's `notes_hash` alone and `aggregation_fully_cached` correctly refuses to flag the section as fresh. Gate: 245 passed, 94% coverage, lint clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

codeninja requested a review from Copilot May 4, 2026 19:24

Copilot started reviewing on behalf of codeninja May 4, 2026 19:25 View session

Copilot AI reviewed May 4, 2026

View reviewed changes

codeninja and others added 2 commits May 4, 2026 15:52

codeninja force-pushed the feat/plan-b-surgical-aggregation branch from 1aa150a to abbf25c Compare May 4, 2026 20:54

codeninja requested a review from Copilot May 4, 2026 21:43

Copilot started reviewing on behalf of codeninja May 4, 2026 21:44 View session

Copilot AI reviewed May 4, 2026

View reviewed changes

codeninja merged commit 58ad367 into feat/plan-a-cache-persistence May 4, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(aggregator): surgical edits when only some findings change (Plan B)#17

feat(aggregator): surgical edits when only some findings change (Plan B)#17
codeninja merged 3 commits into
feat/plan-a-cache-persistencefrom
feat/plan-b-surgical-aggregation

codeninja commented May 4, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		collapses the decision tree back to the Plan A two-path version
		(cache-hit or full rewrite).


		## Identity and Cache Rules

		Finding identity is defined as the SHA-256 digest of the concatenation `file::section_id::finding`, truncated to 16 hexadecimal characters. A single-character change to any of the three components produces a different identity, treated as a delete-plus-insert rather than an edit, invalidating any cached prose referencing the prior wording.

		"Preserves established prose and citation numbering. Disable to force the prior "
		"(pre-Plan-B) behavior of full rewrites on any note change."

Conversation

codeninja commented May 4, 2026

Summary

Decision tree

Identity model

What's new

Test plan

Risk register

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants