feat(aggregator): surgical edits when only some findings change (Plan B)#17
Merged
codeninja merged 3 commits intoMay 4, 2026
Merged
Conversation
There was a problem hiding this comment.
Pull request overview
This PR adds a “Plan B” aggregation mode that can re-use cached section prose when only a small subset of a section’s findings changed, by asking the LLM to edit the cached body in place and then merging the structured evidence (claims/contradictions) with the cached evidence.
Changes:
- Introduces a new surgical-edit aggregation path (unchanged / surgical edit / rewrite classification) and wires it into the stage-3 aggregator.
- Extends the aggregation cache schema to store stable per-note
finding_idsand bumps cache version to v3. - Adds settings + CLI flag to enable/disable surgical edits and tunes reporting/tests accordingly.
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
wikifi/surgical.py |
New surgical edit classifier + prompt/schema + merge logic for edited bodies and evidence. |
wikifi/aggregator.py |
Implements the 4-path decision tree (cache hit / unchanged ids / surgical / rewrite) and records new stats + finding_ids in cache. |
wikifi/cache.py |
Cache schema v3 (CachedSection.finding_ids), plus helpers to compute and align finding ids per note. |
wikifi/config.py |
Adds use_surgical_edits and surgical_edit_threshold settings. |
wikifi/cli.py |
Adds --no-surgical opt-out and extends walk output with surgical-related stats. |
wikifi/orchestrator.py |
Threads surgical settings through to aggregate_all. |
tests/test_surgical.py |
New unit tests covering finding id determinism, classifier routing, surgical merge behavior, and aggregator path selection. |
.wikifi/*.md |
Regenerated wiki content reflecting the new behavior and updated narratives; diagrams.md currently includes a derivation error artifact. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+550
to
+567
| def compute_finding_id(*, file: str, section_id: str, finding: str) -> str: | ||
| """Stable identity for one note across walks. | ||
|
|
||
| Composition: ``sha256(file + "::" + section_id + "::" + finding)``, | ||
| truncated to 16 hex chars. Two walks that emit the same finding | ||
| text from the same file targeting the same section produce the | ||
| same id; any change in any of the three components is a fresh id. | ||
|
|
||
| A reword of ``finding`` (even one character) gets a new id, which | ||
| semantically counts as "removed-and-added" from the surgical | ||
| aggregator's point of view — the same as a delete-plus-insert in a | ||
| text diff. That's intentional: a paragraph the cached body wrote | ||
| around the old wording can no longer be assumed to still be | ||
| grounded by the new wording. | ||
| """ | ||
| from wikifi.fingerprint import hash_text | ||
|
|
||
| return hash_text(f"{file}::{section_id}::{finding}") |
Comment on lines
+581
to
+586
| compute_finding_id( | ||
| file=str(n.get("file", "")), | ||
| section_id=section_id, | ||
| finding=str(n.get("finding", "")), | ||
| ) | ||
| for n in notes |
Comment on lines
+140
to
+141
| collapses the decision tree back to the Plan A two-path version | ||
| (cache-hit or full rewrite). |
Comment on lines
+182
to
+205
| # Path 2: finding ids unchanged but notes_hash differs (e.g. line | ||
| # ranges shifted). The cached body is still grounded; just | ||
| # refresh the cache key and re-render. | ||
| if change.decision == "unchanged" and cached_entry is not None: | ||
| bundle = EvidenceBundle( | ||
| body=cached_entry.body, | ||
| claims=[Claim.model_validate(c) for c in cached_entry.claims], | ||
| contradictions=[Contradiction.model_validate(c) for c in cached_entry.contradictions], | ||
| ) | ||
| bundle = _bundle_from(structured, notes) | ||
| rendered = render_section_body(bundle) | ||
| except Exception as exc: | ||
| log.warning("aggregation failed for %s: %s", section.id, exc) | ||
| rendered = _fallback_body(section, notes, error=str(exc)) | ||
| bundle = None | ||
|
|
||
| write_section(layout, section, rendered) | ||
| write_section(layout, section, render_section_body(bundle)) | ||
| stats.sections_cached += 1 | ||
| stats.sections_written += 1 | ||
| if cache is not None: | ||
| cache.record_aggregation( | ||
| section.id, | ||
| notes_hash=notes_hash, | ||
| body=cached_entry.body, | ||
| claims=cached_entry.claims, | ||
| contradictions=cached_entry.contradictions, | ||
| finding_ids=live_finding_ids, | ||
| ) | ||
| if persist_cache is not None: | ||
| persist_cache() | ||
| continue |
Comment on lines
+1
to
+8
| # Diagrams | ||
|
|
||
| The three diagrams below cover the subdomain structure of the core domain, the structural relationships between the principal entities, and the end-to-end pipeline integration flow. | ||
|
|
||
| --- | ||
|
|
||
| ## Domain Map | ||
|
|
||
| The four subdomains that compose the core domain of automated documentation synthesis form a strict directed dependency chain. No subdomain reaches backward; the ordering is a first-class design constraint. | ||
|
|
||
| ```mermaid | ||
| graph LR | ||
| RI["Repository Introspection"] | ||
| KE["Knowledge Extraction"] | ||
| SS_P["Section Synthesis - Primary"] | ||
| SS_D["Section Synthesis - Derivative"] | ||
| AP_W[("Working-State Persistence")] | ||
| AP_C[("Committed Wiki Persistence")] | ||
|
|
||
| RI --> KE | ||
| KE --> AP_W | ||
| KE --> SS_P | ||
| SS_P --> SS_D | ||
| SS_P --> AP_C | ||
| SS_D --> AP_C | ||
| ``` | ||
|
|
||
| --- | ||
|
|
||
| ## Entity Relationship View | ||
|
|
||
| The diagram spans six concern areas: wiki structure, evidence tracing, extraction, pipeline orchestration, caching, and quality review. The self-referential edge on **Section** represents the directed acyclic graph of upstream declarations enforced by topological ordering at startup. | ||
|
|
||
| ```mermaid | ||
| erDiagram | ||
| SECTION ||--o{ SECTION : "upstream-of" | ||
| SECTION ||--o{ LOADED_SECTION : "has loaded form" | ||
| SECTION ||--o{ SECTION_REPORT : "reported by" | ||
| WIKI_REPORT ||--o{ SECTION_REPORT : aggregates | ||
| EVIDENCE_BUNDLE ||--o{ CLAIM : contains | ||
| EVIDENCE_BUNDLE ||--o{ CONTRADICTION : contains | ||
| CLAIM ||--o{ SOURCE_REF : "backed by" | ||
| CONTRADICTION ||--|{ CLAIM : groups | ||
| FILE_FINDINGS ||--|{ SECTION_FINDING : groups | ||
| WALK_REPORT ||--|| INTROSPECTION_RESULT : carries | ||
| WALK_REPORT ||--|| WALK_CACHE : carries | ||
| WALK_REPORT ||--|| REPO_GRAPH : carries | ||
| REPO_GRAPH ||--o{ GRAPH_NODE : contains | ||
| WALK_CACHE ||--o{ CACHED_FINDINGS : stores | ||
| WALK_CACHE ||--o{ CACHED_SECTION : stores | ||
| WIKI_QUALITY_REPORT ||--o{ CRITIQUE : maps | ||
| REVIEW_OUTCOME ||--|{ CRITIQUE : tracks | ||
| CHAT_SESSION ||--o{ CHAT_MESSAGE : history | ||
| CHAT_SESSION ||--|| LLM_PROVIDER : uses | ||
| ``` | ||
|
|
||
| --- | ||
|
|
||
| ## Pipeline Integration Flow | ||
|
|
||
| The sequence below traces a complete pipeline run triggered by the `walk` command. The provider abstraction is the sole contact point for all inference calls; the filesystem layout abstraction mediates all reads and writes; the cache layer short-circuits calls when prior results remain valid. Derivative sections are excluded from the aggregation stage and are handled exclusively in the derivation stage. | ||
|
|
||
| ```mermaid | ||
| sequenceDiagram | ||
| actor Operator | ||
| participant CLI | ||
| participant Orchestrator | ||
| participant LLMProvider as LLM Provider | ||
| participant ImportGraph as Import Graph | ||
| participant Cache | ||
| participant Layout as Filesystem Layout | ||
|
|
||
| Operator->>CLI: walk command | ||
| CLI->>Orchestrator: initialise pipeline | ||
|
|
||
| Note over Orchestrator,LLMProvider: Stage 1 - Repository Introspection | ||
| Orchestrator->>LLMProvider: structured completion (classify repository paths) | ||
| LLMProvider-->>Orchestrator: IntrospectionResult | ||
|
|
||
| Note over Orchestrator,ImportGraph: Build Cross-File Import Graph | ||
| Orchestrator->>ImportGraph: traverse in-scope files | ||
| ImportGraph-->>Orchestrator: RepoGraph ready | ||
|
|
||
| Note over Orchestrator,Layout: Stage 2 - Knowledge Extraction (per file) | ||
| loop each in-scope file | ||
| Orchestrator->>Cache: check content fingerprint | ||
| Cache-->>Orchestrator: hit or miss | ||
| Orchestrator->>ImportGraph: fetch neighbour paths | ||
| Orchestrator->>LLMProvider: structured completion (findings schema) | ||
| LLMProvider-->>Orchestrator: FileFindings | ||
| Orchestrator->>Layout: append notes | ||
| end | ||
|
|
||
| Note over Orchestrator,Layout: Stage 3 - Section Aggregation (primary sections only) | ||
| loop each primary section | ||
| Orchestrator->>Cache: check notes-payload hash | ||
| Cache-->>Orchestrator: hit or miss | ||
| Orchestrator->>LLMProvider: structured completion (section-body schema) | ||
| LLMProvider-->>Orchestrator: EvidenceBundle | ||
| Orchestrator->>Layout: write section markdown | ||
| end | ||
|
|
||
| Note over Orchestrator,Layout: Stage 4 - Section Derivation (topological order) | ||
| loop each derivative section | ||
| Orchestrator->>Layout: read upstream section bodies | ||
| Orchestrator->>LLMProvider: structured completion (synthesis) | ||
| LLMProvider-->>Orchestrator: section body | ||
| Orchestrator->>LLMProvider: structured completion (quality critique) | ||
| LLMProvider-->>Orchestrator: Critique | ||
| Orchestrator->>Layout: write section markdown | ||
| end | ||
|
|
||
| Orchestrator-->>CLI: WalkReport | ||
| CLI-->>Operator: run summary | ||
| ``` | ||
|
|
||
| > **External capability providers**: The system is additionally configured as a client that fans out to multiple external capability providers via a tool-server protocol — a local AI utility, a local web crawler, a remote documentation context service, and a remote stitching/search service. The upstream sections do not specify at which pipeline call sites these providers are invoked. | ||
| _Derivation failed for **Diagrams** (1 validation error for DerivedSection | ||
| Invalid JSON: EOF while parsing a string at line 1 column 4334 [type=json_invalid, input_value='{"body": "Three diagrams... INTRO[Introspection]\\', input_type=str] | ||
| For further information visit https://errors.pydantic.dev/2.13/v/json_invalid). Upstream evidence preserved below._ | ||
|
|
||
| > Brief: Mermaid diagrams that visualize structural and behavioral relationships across the system: a domain map (graph or classDiagram across domains), an entity relationship view (erDiagram across entities), and an integration flow (sequence or flowchart across integrations). Tech-agnostic — no reference to current stack. | ||
|
|
Building on Plan A's cache layer, the aggregator gains a third path between full cache hit and full rewrite. When a section's finding set overlaps the cached set within a configurable threshold, the LLM is asked to *edit* the cached body around the added/removed delta rather than re-derive the section from scratch. Preserves established prose, addresses the "potentially omitting or changing key details" concern, and cuts tokens on partially-changed walks. Per-section decision tree (in priority order): 1. **Cache hit** — notes_hash matches cached entry exactly. Re-render cached bundle, no LLM call. (Plan A behavior, unchanged.) 2. **Unchanged** — finding ids match but notes_hash differs (a source line shifted, summary changed, etc.). Refresh the cache key, re- render the cached body, no LLM call. 3. **Surgical** — symmetric difference of finding ids is at or below `surgical_edit_threshold` (default 0.3). Send cached body + delta to LLM, merge edit with cached claims (drop removed, add new). 4. **Rewrite** — too much churn (or no prior body to anchor on). Run the original from-scratch aggregation prompt. (Plan A behavior.) Identity: each note gets a stable `finding_id = sha256(file::section_id::finding_text)[:16]`. The cache stores `finding_ids` aligned with note order so claim source_indices map back to ids. Reword of finding text → new id (semantically a delete-plus-insert, since prose written around the old wording can no longer be assumed grounded). Citation re-anchoring is mechanical: cached claims minus `removed_claim_indices` plus `new_claims` resolved against the added notes only. Contradictions are fully replaced from the model output since they're a small set and cleaner to re-emit than diff. New module `wikifi/surgical.py`: - `compute_finding_id` / `note_finding_ids` (in cache.py — adjacent to the other hash helpers) - `SectionChange` dataclass + `classify_section_change` for the diff - `SurgicalEdit` Pydantic schema + `SURGICAL_SYSTEM_PROMPT` - `surgical_aggregate` runs the LLM call and merges the result Cache schema bumped v2 → v3 to add `finding_ids: list[str]` to `CachedSection`. v2 entries load to empty so an upgraded wiki re-aggregates on the first walk; subsequent walks light up surgical. Configuration: - `Settings.use_surgical_edits: bool = True` - `Settings.surgical_edit_threshold: float = 0.3` - `wikifi walk --no-surgical` opt-out flag Walk report adds `sections_edited` and `sections_rewritten` columns alongside the existing `sections_cached`. Tests: 231 passing, 93% coverage. New `tests/test_surgical.py` covers: - `compute_finding_id` / `note_finding_ids` determinism - classifier truth table (no-cache, legacy v2, identical sets, small delta, large delta, threshold exactly equal, disabled threshold) - `surgical_aggregate` claim merging (drop removed, resolve new against added notes, replace contradictions, ignore bad indices) - aggregator decision tree records `finding_ids` on every cache write - unchanged-finding-ids path takes no LLM call and refreshes notes_hash - surgical path selected for small delta; rewrite for large delta - `use_surgical_edits=False` collapses to Plan A's two-path tree - surgical-path LLM failure falls back to rewrite cleanly - stability: when LLM honors the contract, unchanged paragraphs are byte-preserved through the merge Plan documented in `docs/plan-b-surgical-aggregation.md`. Stacked on Plan A (#16). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
First walk under cache schema v3 — the v2 cache from prior walks invalidates, so every section took the rewrite path (sections_rewritten=8, sections_edited=0). Subsequent walks with small per-file changes will exercise the surgical edit path. Walk stats: - Stage 1 introspection: include=1 exclude=12 langs=Python - Stage 2 extraction: 31 files seen, 29 contributed, 277 findings - Stage 3 aggregation: 8 sections written (all rewrite path) - Stage 4 derivation: personas + user_stories OK; diagrams hit a json_invalid response from the model (long mermaid output truncated by the SDK's parse layer) and fell back to upstream-bodies-preserved output — the deriver's per-section catch behaving as designed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1aa150a to
abbf25c
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 18 out of 18 changed files in this pull request and generated 9 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+555
to
+573
| def compute_finding_id(*, file: str, section_id: str, finding: str) -> str: | ||
| """Stable identity for one note across walks. | ||
|
|
||
| Composition: ``sha256(file + "::" + section_id + "::" + finding)``, | ||
| truncated to 16 hex chars. Two walks that emit the same finding | ||
| text from the same file targeting the same section produce the | ||
| same id; any change in any of the three components is a fresh id. | ||
|
|
||
| A reword of ``finding`` (even one character) gets a new id, which | ||
| semantically counts as "removed-and-added" from the surgical | ||
| aggregator's point of view — the same as a delete-plus-insert in a | ||
| text diff. That's intentional: a paragraph the cached body wrote | ||
| around the old wording can no longer be assumed to still be | ||
| grounded by the new wording. | ||
| """ | ||
| from wikifi.fingerprint import hash_text | ||
|
|
||
| return hash_text(f"{file}::{section_id}::{finding}") | ||
|
|
Comment on lines
+585
to
+594
| return [ | ||
| compute_finding_id( | ||
| file=str(n.get("file", "")), | ||
| section_id=section_id, | ||
| finding=str(n.get("finding", "")), | ||
| ) | ||
| for n in notes | ||
| ] | ||
|
|
||
|
|
Comment on lines
+113
to
+117
| :func:`compute_finding_id`) that fed into the cached body, aligned | ||
| with the notes' position so a 1-based ``claim.source_indices`` | ||
| entry maps to ``finding_ids[idx - 1]``. Surgical aggregation uses | ||
| this to detect which findings were added or removed between walks | ||
| and route an edit instead of a full rewrite. |
|
|
||
| ## Identity and Cache Rules | ||
|
|
||
| **Finding identity** is defined as the SHA-256 digest of the concatenation `file::section_id::finding`, truncated to 16 hexadecimal characters. A single-character change to any of the three components produces a different identity, treated as a delete-plus-insert rather than an edit, invalidating any cached prose referencing the prior wording. |
Comment on lines
+1
to
+6
| # Diagrams | ||
|
|
||
| The three diagrams below cover the subdomain structure of the core domain, the structural relationships between the principal entities, and the end-to-end pipeline integration flow. | ||
|
|
||
| --- | ||
|
|
||
| ## Domain Map | ||
|
|
||
| The four subdomains that compose the core domain of automated documentation synthesis form a strict directed dependency chain. No subdomain reaches backward; the ordering is a first-class design constraint. | ||
|
|
||
| ```mermaid | ||
| graph LR | ||
| RI["Repository Introspection"] | ||
| KE["Knowledge Extraction"] | ||
| SS_P["Section Synthesis - Primary"] | ||
| SS_D["Section Synthesis - Derivative"] | ||
| AP_W[("Working-State Persistence")] | ||
| AP_C[("Committed Wiki Persistence")] | ||
|
|
||
| RI --> KE | ||
| KE --> AP_W | ||
| KE --> SS_P | ||
| SS_P --> SS_D | ||
| SS_P --> AP_C | ||
| SS_D --> AP_C | ||
| ``` | ||
|
|
||
| --- | ||
|
|
||
| ## Entity Relationship View | ||
|
|
||
| The diagram spans six concern areas: wiki structure, evidence tracing, extraction, pipeline orchestration, caching, and quality review. The self-referential edge on **Section** represents the directed acyclic graph of upstream declarations enforced by topological ordering at startup. | ||
|
|
||
| ```mermaid | ||
| erDiagram | ||
| SECTION ||--o{ SECTION : "upstream-of" | ||
| SECTION ||--o{ LOADED_SECTION : "has loaded form" | ||
| SECTION ||--o{ SECTION_REPORT : "reported by" | ||
| WIKI_REPORT ||--o{ SECTION_REPORT : aggregates | ||
| EVIDENCE_BUNDLE ||--o{ CLAIM : contains | ||
| EVIDENCE_BUNDLE ||--o{ CONTRADICTION : contains | ||
| CLAIM ||--o{ SOURCE_REF : "backed by" | ||
| CONTRADICTION ||--|{ CLAIM : groups | ||
| FILE_FINDINGS ||--|{ SECTION_FINDING : groups | ||
| WALK_REPORT ||--|| INTROSPECTION_RESULT : carries | ||
| WALK_REPORT ||--|| WALK_CACHE : carries | ||
| WALK_REPORT ||--|| REPO_GRAPH : carries | ||
| REPO_GRAPH ||--o{ GRAPH_NODE : contains | ||
| WALK_CACHE ||--o{ CACHED_FINDINGS : stores | ||
| WALK_CACHE ||--o{ CACHED_SECTION : stores | ||
| WIKI_QUALITY_REPORT ||--o{ CRITIQUE : maps | ||
| REVIEW_OUTCOME ||--|{ CRITIQUE : tracks | ||
| CHAT_SESSION ||--o{ CHAT_MESSAGE : history | ||
| CHAT_SESSION ||--|| LLM_PROVIDER : uses | ||
| ``` | ||
|
|
||
| --- | ||
|
|
||
| ## Pipeline Integration Flow | ||
|
|
||
| The sequence below traces a complete pipeline run triggered by the `walk` command. The provider abstraction is the sole contact point for all inference calls; the filesystem layout abstraction mediates all reads and writes; the cache layer short-circuits calls when prior results remain valid. Derivative sections are excluded from the aggregation stage and are handled exclusively in the derivation stage. | ||
|
|
||
| ```mermaid | ||
| sequenceDiagram | ||
| actor Operator | ||
| participant CLI | ||
| participant Orchestrator | ||
| participant LLMProvider as LLM Provider | ||
| participant ImportGraph as Import Graph | ||
| participant Cache | ||
| participant Layout as Filesystem Layout | ||
|
|
||
| Operator->>CLI: walk command | ||
| CLI->>Orchestrator: initialise pipeline | ||
|
|
||
| Note over Orchestrator,LLMProvider: Stage 1 - Repository Introspection | ||
| Orchestrator->>LLMProvider: structured completion (classify repository paths) | ||
| LLMProvider-->>Orchestrator: IntrospectionResult | ||
|
|
||
| Note over Orchestrator,ImportGraph: Build Cross-File Import Graph | ||
| Orchestrator->>ImportGraph: traverse in-scope files | ||
| ImportGraph-->>Orchestrator: RepoGraph ready | ||
|
|
||
| Note over Orchestrator,Layout: Stage 2 - Knowledge Extraction (per file) | ||
| loop each in-scope file | ||
| Orchestrator->>Cache: check content fingerprint | ||
| Cache-->>Orchestrator: hit or miss | ||
| Orchestrator->>ImportGraph: fetch neighbour paths | ||
| Orchestrator->>LLMProvider: structured completion (findings schema) | ||
| LLMProvider-->>Orchestrator: FileFindings | ||
| Orchestrator->>Layout: append notes | ||
| end | ||
|
|
||
| Note over Orchestrator,Layout: Stage 3 - Section Aggregation (primary sections only) | ||
| loop each primary section | ||
| Orchestrator->>Cache: check notes-payload hash | ||
| Cache-->>Orchestrator: hit or miss | ||
| Orchestrator->>LLMProvider: structured completion (section-body schema) | ||
| LLMProvider-->>Orchestrator: EvidenceBundle | ||
| Orchestrator->>Layout: write section markdown | ||
| end | ||
|
|
||
| Note over Orchestrator,Layout: Stage 4 - Section Derivation (topological order) | ||
| loop each derivative section | ||
| Orchestrator->>Layout: read upstream section bodies | ||
| Orchestrator->>LLMProvider: structured completion (synthesis) | ||
| LLMProvider-->>Orchestrator: section body | ||
| Orchestrator->>LLMProvider: structured completion (quality critique) | ||
| LLMProvider-->>Orchestrator: Critique | ||
| Orchestrator->>Layout: write section markdown | ||
| end | ||
|
|
||
| Orchestrator-->>CLI: WalkReport | ||
| CLI-->>Operator: run summary | ||
| ``` | ||
|
|
||
| > **External capability providers**: The system is additionally configured as a client that fans out to multiple external capability providers via a tool-server protocol — a local AI utility, a local web crawler, a remote documentation context service, and a remote stitching/search service. The upstream sections do not specify at which pipeline call sites these providers are invoked. | ||
| _Derivation failed for **Diagrams** (1 validation error for DerivedSection | ||
| Invalid JSON: EOF while parsing a string at line 1 column 4334 [type=json_invalid, input_value='{"body": "Three diagrams... INTRO[Introspection]\\', input_type=str] | ||
| For further information visit https://errors.pydantic.dev/2.13/v/json_invalid). Upstream evidence preserved below._ | ||
|
|
Comment on lines
+311
to
+324
| def _resolve_removed_claim_texts(cached: CachedSection, removed_ids: list[str]) -> list[dict]: | ||
| """Reconstruct (file, finding_text) pairs for every removed cached finding. | ||
|
|
||
| The cache stores ``finding_ids`` aligned with the prior walk's notes | ||
| order, but it doesn't store the original note bodies. The best we | ||
| can show the model for a removed finding is its file path — that's | ||
| enough context for the model to find the affected sentence in the | ||
| cached body and revise around it. When the cache only knows the id, | ||
| we surface the id itself so the prompt is at least debuggable. | ||
| """ | ||
| out: list[dict] = [] | ||
| for fid in removed_ids: | ||
| out.append({"finding_id": fid}) | ||
| return out |
Comment on lines
+140
to
+141
| collapses the decision tree back to the Plan A two-path version | ||
| (cache-hit or full rewrite). |
Comment on lines
+128
to
+129
| "Preserves established prose and citation numbering. Disable to force the prior " | ||
| "(pre-Plan-B) behavior of full rewrites on any note change." |
Comment on lines
+136
to
+138
| if self.total_live == 0: | ||
| return 0.0 | ||
| return (len(self.added_indices) + len(self.removed_ids)) / self.total_live |
Addresses every distinct concern raised by the AI review on PR #17. Comment IDs in parentheses; many were duplicated across the two review passes (pre-rebase + post-rebase). Identity / cache schema - Align finding_id length contract with the 12-char implementation. The Hard Specifications, the `compute_finding_id` docstring, and the Cache Entities table all said 16 hex chars but the function delegates to `fingerprint.hash_text` which truncates to FINGERPRINT_LENGTH=12. Updated the docs (cache.py, hard_specifications.md) to match what the code actually does — the wider hash isn't needed at typical scale and reusing one constant keeps the system coherent. (3183975867, 3184733330, 3184733407) - `note_finding_ids` now returns "" when a note is missing `file` or `finding`. Hashing empty inputs would otherwise produce a stable non-empty id and let two malformed notes from different walks compare equal as "unchanged", routing the section onto the cache/surgical paths despite having no real identity to anchor on. The classifier picks up the slack: any "" id in either set forces "rewrite" so two malformed notes can't accidentally cancel out via set equality. (3183975908, 3184733368) - Rewrote `CachedSection.finding_ids` docstring. The old text claimed the list aligned with `claim.source_indices`, but cached claims store fully-resolved SourceRefs, not indices. Documented the actual consumer: section-delta classification by the surgical classifier. (3184733384) Aggregator decision tree - Fixed Path 2's cache lie: stopped refreshing `notes_hash` to match live notes when the unchanged-finding-ids fast path fires. The cached claims still carry SourceRefs from the prior walk's notes (file, lines, fingerprint), and updating the cache key without refreshing those refs would let `aggregation_fully_cached` flag the section as fresh and let the orchestrator's stage-3 short-circuit lock the drifting citations in place. Path 2 still skips the LLM call this walk; the cache key only refreshes when a real Path 4 rewrite runs. (3183975957) - Clarified `use_surgical_edits` docs in both `aggregator.aggregate_all` and `Settings`. The flag only gates Path 3 (the LLM-side surgical edit); the no-LLM cache-reuse paths (full cache hit and unchanged- finding-ids re-render) still fire regardless. The old text claimed the flag collapsed the tree to Plan A's two paths, which never matched the actual code. (3183975934, 3184733492, 3184733517) Surgical helpers - Renamed `_resolve_removed_claim_texts` to `_removed_finding_descriptors` and rewrote the docstring to match the implementation: it builds the removed-notes payload as `{finding_id: ...}` dicts and does not use the `cached` parameter (the cache doesn't store original note bodies, only ids). The old name promised richer reconstruction. (3184733469) - `SectionChange.churn_ratio` now returns 1.0 when `total_live == 0` and there are removed ids, mirroring the classifier's `max(total_live, 1)` guard. The old 0.0 return for the remove-everything case would mislead any downstream caller checking `churn_ratio` directly. (3184733532) Wiki artifact - Restored `.wikifi/diagrams.md` to its pre-chore-commit version. The chore commit (refresh wiki via Plan B walk) committed an LLM JSON truncation as the diagrams body — `Derivation failed for **Diagrams** (... Invalid JSON: EOF while parsing ...)` rather than actual Mermaid. Forward-fix approach (revert just the broken file) rather than amending the chore commit; the next walk against this repo will regenerate it cleanly. (3183975985, 3184733450) Regression tests added - `note_finding_ids` returns "" for malformed notes (missing/blank file or finding, including null `finding`). - `classify_section_change` force-rewrites when any finding_id is empty in either set, even when the sets look identical. - `SectionChange.churn_ratio` is 1.0 for the remove-everything edge case and 0.0 for the truly-empty case. - Aggregator's Path 2 leaves the cache entry's `notes_hash` alone and `aggregation_fully_cached` correctly refuses to flag the section as fresh. Gate: 245 passed, 94% coverage, lint clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
mainafter Plan A lands.finding_idstoCachedSection).Plan: docs/plan-b-surgical-aggregation.md
Decision tree
Per primary section, in priority order:
notes_hashmatches cached entry. Re-render cached bundle, no LLM call. (Plan A.)notes_hashdiffers but ids match (line range moved, summary changed). Refresh cache key, re-render cached body, no LLM call.surgical_edit_threshold(default 0.3). Send cached body + delta → LLM. Merge edit with cached claims (drop removed, add new resolved against added notes). Contradictions are fully replaced.Identity model
finding_id = sha256(file_path::section_id::finding_text)[:16]. Stable across walks; reword of finding text → new id (semantically delete-plus-insert, since prose written around the old wording can no longer be assumed grounded by the new wording).CachedSection.finding_ids: list[str]is aligned with note order so a 1-basedclaim.source_indicesentry maps tofinding_ids[idx-1].What's new
wikifi/surgical.py—SectionChange,classify_section_change,SurgicalEditPydantic schema,SURGICAL_SYSTEM_PROMPT,surgical_aggregate(LLM call + claim re-anchoring).wikifi/cache.py—compute_finding_id,note_finding_ids,CachedSection.finding_ids, schema bump.wikifi/aggregator.py— decision tree with 4 paths;AggregationStatsaddssections_edited+sections_rewritten.wikifi/config.py—Settings.use_surgical_edits+Settings.surgical_edit_threshold.wikifi/cli.py—wikifi walk --no-surgicalopt-out flag; walk report shows new columns.wikifi/orchestrator.py— passes the new settings through toaggregate_all.Test plan
make test— 231 passed (was 212), 93% coveragemake lint— cleantests/test_surgical.py:compute_finding_id/note_finding_idsdeterminismsurgical_aggregate— claim merge keeps cached minus removed, adds new resolved against added notes; bad indices ignored; contradictions replacedfinding_idson every cache writeuse_surgical_edits=Falsecollapses to Plan A's two-path treemake walk→ edit one source file that contributes a finding to one section →make walk→ verify only that section is edited (sections_edited=1) and the rest of the wiki is byte-stableRisk register
--no-surgicalopt-out for users who'd rather have full rewrites🤖 Generated with Claude Code