Skip to content

feat(aggregator): surgical edits when only some findings change (Plan B)#17

Merged
codeninja merged 3 commits into
feat/plan-a-cache-persistencefrom
feat/plan-b-surgical-aggregation
May 4, 2026
Merged

feat(aggregator): surgical edits when only some findings change (Plan B)#17
codeninja merged 3 commits into
feat/plan-a-cache-persistencefrom
feat/plan-b-surgical-aggregation

Conversation

@codeninja
Copy link
Copy Markdown
Owner

Summary

  • Stacked on feat(cache): incremental persistence + full-cache-hit short-circuit (Plan A) #16 (Plan A). Re-target to main after Plan A lands.
  • Adds a third aggregation path between full cache hit and full rewrite: when a section's finding set overlaps the cached set within a tunable threshold, the LLM is asked to edit the cached body in place around the added/removed delta.
  • Preserves established prose. Cuts tokens on partial-change walks. Addresses the user's "potentially omitting or changing key details" concern by making the cached body the explicit anchor instead of re-deriving from scratch.
  • Cache schema v2 → v3 (adds finding_ids to CachedSection).

Plan: docs/plan-b-surgical-aggregation.md

Decision tree

Per primary section, in priority order:

  1. Cache hitnotes_hash matches cached entry. Re-render cached bundle, no LLM call. (Plan A.)
  2. Unchanged finding idsnotes_hash differs but ids match (line range moved, summary changed). Refresh cache key, re-render cached body, no LLM call.
  3. Surgical — symmetric difference of finding ids ≤ surgical_edit_threshold (default 0.3). Send cached body + delta → LLM. Merge edit with cached claims (drop removed, add new resolved against added notes). Contradictions are fully replaced.
  4. Rewrite — too much churn or no prior cached body. Original from-scratch aggregation. (Plan A.)

Identity model

finding_id = sha256(file_path::section_id::finding_text)[:16]. Stable across walks; reword of finding text → new id (semantically delete-plus-insert, since prose written around the old wording can no longer be assumed grounded by the new wording).

CachedSection.finding_ids: list[str] is aligned with note order so a 1-based claim.source_indices entry maps to finding_ids[idx-1].

What's new

  • New module wikifi/surgical.pySectionChange, classify_section_change, SurgicalEdit Pydantic schema, SURGICAL_SYSTEM_PROMPT, surgical_aggregate (LLM call + claim re-anchoring).
  • wikifi/cache.pycompute_finding_id, note_finding_ids, CachedSection.finding_ids, schema bump.
  • wikifi/aggregator.py — decision tree with 4 paths; AggregationStats adds sections_edited + sections_rewritten.
  • wikifi/config.pySettings.use_surgical_edits + Settings.surgical_edit_threshold.
  • wikifi/cli.pywikifi walk --no-surgical opt-out flag; walk report shows new columns.
  • wikifi/orchestrator.py — passes the new settings through to aggregate_all.

Test plan

  • make test — 231 passed (was 212), 93% coverage
  • make lint — clean
  • New tests in tests/test_surgical.py:
    • compute_finding_id / note_finding_ids determinism
    • classifier truth table — no cache, legacy v2 (no finding_ids), identical sets, small delta → surgical, large delta → rewrite, threshold exactly equal → surgical (inclusive), disabled threshold → rewrite
    • surgical_aggregate — claim merge keeps cached minus removed, adds new resolved against added notes; bad indices ignored; contradictions replaced
    • aggregator records finding_ids on every cache write
    • unchanged-finding-ids path takes no LLM call and refreshes notes_hash
    • surgical path selected for small delta; rewrite for large delta
    • use_surgical_edits=False collapses to Plan A's two-path tree
    • surgical-path LLM failure falls back cleanly to rewrite (section is never empty)
    • stability: when LLM honors the contract, unchanged paragraphs are byte-preserved through the merge — guards the mechanism (no rendering pass mangles the unchanged region); real-world prose stability depends on prompt quality
  • Manual: make walk → edit one source file that contributes a finding to one section → make walk → verify only that section is edited (sections_edited=1) and the rest of the wiki is byte-stable

Risk register

  • The model may not actually preserve unchanged paragraphs in every real-world case. The mocked stability test guards the mechanism but not the prompt quality. Mitigations:
    • Tight prompt instructing verbatim preservation, with input body fenced in markdown
    • --no-surgical opt-out for users who'd rather have full rewrites
    • Threshold default 0.3 keeps surgical to genuinely-small deltas where preservation is the easy default
    • (Future) post-edit critic that diffs edited body against cached body and rejects edits whose drift exceeds the delta size
  • Cache v2 → v3 invalidates v2 entries (one-time re-aggregation on upgrade). Same upgrade pattern as Plan A.

🤖 Generated with Claude Code

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a “Plan B” aggregation mode that can re-use cached section prose when only a small subset of a section’s findings changed, by asking the LLM to edit the cached body in place and then merging the structured evidence (claims/contradictions) with the cached evidence.

Changes:

  • Introduces a new surgical-edit aggregation path (unchanged / surgical edit / rewrite classification) and wires it into the stage-3 aggregator.
  • Extends the aggregation cache schema to store stable per-note finding_ids and bumps cache version to v3.
  • Adds settings + CLI flag to enable/disable surgical edits and tunes reporting/tests accordingly.

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
wikifi/surgical.py New surgical edit classifier + prompt/schema + merge logic for edited bodies and evidence.
wikifi/aggregator.py Implements the 4-path decision tree (cache hit / unchanged ids / surgical / rewrite) and records new stats + finding_ids in cache.
wikifi/cache.py Cache schema v3 (CachedSection.finding_ids), plus helpers to compute and align finding ids per note.
wikifi/config.py Adds use_surgical_edits and surgical_edit_threshold settings.
wikifi/cli.py Adds --no-surgical opt-out and extends walk output with surgical-related stats.
wikifi/orchestrator.py Threads surgical settings through to aggregate_all.
tests/test_surgical.py New unit tests covering finding id determinism, classifier routing, surgical merge behavior, and aggregator path selection.
.wikifi/*.md Regenerated wiki content reflecting the new behavior and updated narratives; diagrams.md currently includes a derivation error artifact.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread wikifi/cache.py
Comment on lines +550 to +567
def compute_finding_id(*, file: str, section_id: str, finding: str) -> str:
"""Stable identity for one note across walks.

Composition: ``sha256(file + "::" + section_id + "::" + finding)``,
truncated to 16 hex chars. Two walks that emit the same finding
text from the same file targeting the same section produce the
same id; any change in any of the three components is a fresh id.

A reword of ``finding`` (even one character) gets a new id, which
semantically counts as "removed-and-added" from the surgical
aggregator's point of view — the same as a delete-plus-insert in a
text diff. That's intentional: a paragraph the cached body wrote
around the old wording can no longer be assumed to still be
grounded by the new wording.
"""
from wikifi.fingerprint import hash_text

return hash_text(f"{file}::{section_id}::{finding}")
Comment thread wikifi/cache.py Outdated
Comment on lines +581 to +586
compute_finding_id(
file=str(n.get("file", "")),
section_id=section_id,
finding=str(n.get("finding", "")),
)
for n in notes
Comment thread wikifi/aggregator.py Outdated
Comment on lines +140 to +141
collapses the decision tree back to the Plan A two-path version
(cache-hit or full rewrite).
Comment thread wikifi/aggregator.py Outdated
Comment on lines +182 to +205
# Path 2: finding ids unchanged but notes_hash differs (e.g. line
# ranges shifted). The cached body is still grounded; just
# refresh the cache key and re-render.
if change.decision == "unchanged" and cached_entry is not None:
bundle = EvidenceBundle(
body=cached_entry.body,
claims=[Claim.model_validate(c) for c in cached_entry.claims],
contradictions=[Contradiction.model_validate(c) for c in cached_entry.contradictions],
)
bundle = _bundle_from(structured, notes)
rendered = render_section_body(bundle)
except Exception as exc:
log.warning("aggregation failed for %s: %s", section.id, exc)
rendered = _fallback_body(section, notes, error=str(exc))
bundle = None

write_section(layout, section, rendered)
write_section(layout, section, render_section_body(bundle))
stats.sections_cached += 1
stats.sections_written += 1
if cache is not None:
cache.record_aggregation(
section.id,
notes_hash=notes_hash,
body=cached_entry.body,
claims=cached_entry.claims,
contradictions=cached_entry.contradictions,
finding_ids=live_finding_ids,
)
if persist_cache is not None:
persist_cache()
continue
Comment thread .wikifi/diagrams.md Outdated
Comment on lines +1 to +8
# Diagrams

The three diagrams below cover the subdomain structure of the core domain, the structural relationships between the principal entities, and the end-to-end pipeline integration flow.

---

## Domain Map

The four subdomains that compose the core domain of automated documentation synthesis form a strict directed dependency chain. No subdomain reaches backward; the ordering is a first-class design constraint.

```mermaid
graph LR
RI["Repository Introspection"]
KE["Knowledge Extraction"]
SS_P["Section Synthesis - Primary"]
SS_D["Section Synthesis - Derivative"]
AP_W[("Working-State Persistence")]
AP_C[("Committed Wiki Persistence")]

RI --> KE
KE --> AP_W
KE --> SS_P
SS_P --> SS_D
SS_P --> AP_C
SS_D --> AP_C
```

---

## Entity Relationship View

The diagram spans six concern areas: wiki structure, evidence tracing, extraction, pipeline orchestration, caching, and quality review. The self-referential edge on **Section** represents the directed acyclic graph of upstream declarations enforced by topological ordering at startup.

```mermaid
erDiagram
SECTION ||--o{ SECTION : "upstream-of"
SECTION ||--o{ LOADED_SECTION : "has loaded form"
SECTION ||--o{ SECTION_REPORT : "reported by"
WIKI_REPORT ||--o{ SECTION_REPORT : aggregates
EVIDENCE_BUNDLE ||--o{ CLAIM : contains
EVIDENCE_BUNDLE ||--o{ CONTRADICTION : contains
CLAIM ||--o{ SOURCE_REF : "backed by"
CONTRADICTION ||--|{ CLAIM : groups
FILE_FINDINGS ||--|{ SECTION_FINDING : groups
WALK_REPORT ||--|| INTROSPECTION_RESULT : carries
WALK_REPORT ||--|| WALK_CACHE : carries
WALK_REPORT ||--|| REPO_GRAPH : carries
REPO_GRAPH ||--o{ GRAPH_NODE : contains
WALK_CACHE ||--o{ CACHED_FINDINGS : stores
WALK_CACHE ||--o{ CACHED_SECTION : stores
WIKI_QUALITY_REPORT ||--o{ CRITIQUE : maps
REVIEW_OUTCOME ||--|{ CRITIQUE : tracks
CHAT_SESSION ||--o{ CHAT_MESSAGE : history
CHAT_SESSION ||--|| LLM_PROVIDER : uses
```

---

## Pipeline Integration Flow

The sequence below traces a complete pipeline run triggered by the `walk` command. The provider abstraction is the sole contact point for all inference calls; the filesystem layout abstraction mediates all reads and writes; the cache layer short-circuits calls when prior results remain valid. Derivative sections are excluded from the aggregation stage and are handled exclusively in the derivation stage.

```mermaid
sequenceDiagram
actor Operator
participant CLI
participant Orchestrator
participant LLMProvider as LLM Provider
participant ImportGraph as Import Graph
participant Cache
participant Layout as Filesystem Layout

Operator->>CLI: walk command
CLI->>Orchestrator: initialise pipeline

Note over Orchestrator,LLMProvider: Stage 1 - Repository Introspection
Orchestrator->>LLMProvider: structured completion (classify repository paths)
LLMProvider-->>Orchestrator: IntrospectionResult

Note over Orchestrator,ImportGraph: Build Cross-File Import Graph
Orchestrator->>ImportGraph: traverse in-scope files
ImportGraph-->>Orchestrator: RepoGraph ready

Note over Orchestrator,Layout: Stage 2 - Knowledge Extraction (per file)
loop each in-scope file
Orchestrator->>Cache: check content fingerprint
Cache-->>Orchestrator: hit or miss
Orchestrator->>ImportGraph: fetch neighbour paths
Orchestrator->>LLMProvider: structured completion (findings schema)
LLMProvider-->>Orchestrator: FileFindings
Orchestrator->>Layout: append notes
end

Note over Orchestrator,Layout: Stage 3 - Section Aggregation (primary sections only)
loop each primary section
Orchestrator->>Cache: check notes-payload hash
Cache-->>Orchestrator: hit or miss
Orchestrator->>LLMProvider: structured completion (section-body schema)
LLMProvider-->>Orchestrator: EvidenceBundle
Orchestrator->>Layout: write section markdown
end

Note over Orchestrator,Layout: Stage 4 - Section Derivation (topological order)
loop each derivative section
Orchestrator->>Layout: read upstream section bodies
Orchestrator->>LLMProvider: structured completion (synthesis)
LLMProvider-->>Orchestrator: section body
Orchestrator->>LLMProvider: structured completion (quality critique)
LLMProvider-->>Orchestrator: Critique
Orchestrator->>Layout: write section markdown
end

Orchestrator-->>CLI: WalkReport
CLI-->>Operator: run summary
```

> **External capability providers**: The system is additionally configured as a client that fans out to multiple external capability providers via a tool-server protocol — a local AI utility, a local web crawler, a remote documentation context service, and a remote stitching/search service. The upstream sections do not specify at which pipeline call sites these providers are invoked.
_Derivation failed for **Diagrams** (1 validation error for DerivedSection
Invalid JSON: EOF while parsing a string at line 1 column 4334 [type=json_invalid, input_value='{"body": "Three diagrams... INTRO[Introspection]\\', input_type=str]
For further information visit https://errors.pydantic.dev/2.13/v/json_invalid). Upstream evidence preserved below._

> Brief: Mermaid diagrams that visualize structural and behavioral relationships across the system: a domain map (graph or classDiagram across domains), an entity relationship view (erDiagram across entities), and an integration flow (sequence or flowchart across integrations). Tech-agnostic — no reference to current stack.

codeninja and others added 2 commits May 4, 2026 15:52
Building on Plan A's cache layer, the aggregator gains a third path
between full cache hit and full rewrite. When a section's finding set
overlaps the cached set within a configurable threshold, the LLM is
asked to *edit* the cached body around the added/removed delta rather
than re-derive the section from scratch. Preserves established prose,
addresses the "potentially omitting or changing key details" concern,
and cuts tokens on partially-changed walks.

Per-section decision tree (in priority order):

1. **Cache hit** — notes_hash matches cached entry exactly. Re-render
   cached bundle, no LLM call. (Plan A behavior, unchanged.)
2. **Unchanged** — finding ids match but notes_hash differs (a source
   line shifted, summary changed, etc.). Refresh the cache key, re-
   render the cached body, no LLM call.
3. **Surgical** — symmetric difference of finding ids is at or below
   `surgical_edit_threshold` (default 0.3). Send cached body + delta
   to LLM, merge edit with cached claims (drop removed, add new).
4. **Rewrite** — too much churn (or no prior body to anchor on). Run
   the original from-scratch aggregation prompt. (Plan A behavior.)

Identity: each note gets a stable `finding_id =
sha256(file::section_id::finding_text)[:16]`. The cache stores
`finding_ids` aligned with note order so claim source_indices map
back to ids. Reword of finding text → new id (semantically a
delete-plus-insert, since prose written around the old wording can no
longer be assumed grounded).

Citation re-anchoring is mechanical: cached claims minus
`removed_claim_indices` plus `new_claims` resolved against the added
notes only. Contradictions are fully replaced from the model output
since they're a small set and cleaner to re-emit than diff.

New module `wikifi/surgical.py`:
- `compute_finding_id` / `note_finding_ids` (in cache.py — adjacent to
  the other hash helpers)
- `SectionChange` dataclass + `classify_section_change` for the diff
- `SurgicalEdit` Pydantic schema + `SURGICAL_SYSTEM_PROMPT`
- `surgical_aggregate` runs the LLM call and merges the result

Cache schema bumped v2 → v3 to add `finding_ids: list[str]` to
`CachedSection`. v2 entries load to empty so an upgraded wiki
re-aggregates on the first walk; subsequent walks light up surgical.

Configuration:
- `Settings.use_surgical_edits: bool = True`
- `Settings.surgical_edit_threshold: float = 0.3`
- `wikifi walk --no-surgical` opt-out flag

Walk report adds `sections_edited` and `sections_rewritten` columns
alongside the existing `sections_cached`.

Tests: 231 passing, 93% coverage. New `tests/test_surgical.py` covers:
- `compute_finding_id` / `note_finding_ids` determinism
- classifier truth table (no-cache, legacy v2, identical sets, small
  delta, large delta, threshold exactly equal, disabled threshold)
- `surgical_aggregate` claim merging (drop removed, resolve new
  against added notes, replace contradictions, ignore bad indices)
- aggregator decision tree records `finding_ids` on every cache write
- unchanged-finding-ids path takes no LLM call and refreshes notes_hash
- surgical path selected for small delta; rewrite for large delta
- `use_surgical_edits=False` collapses to Plan A's two-path tree
- surgical-path LLM failure falls back to rewrite cleanly
- stability: when LLM honors the contract, unchanged paragraphs are
  byte-preserved through the merge

Plan documented in `docs/plan-b-surgical-aggregation.md`. Stacked on
Plan A (#16).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
First walk under cache schema v3 — the v2 cache from prior walks
invalidates, so every section took the rewrite path
(sections_rewritten=8, sections_edited=0). Subsequent walks with
small per-file changes will exercise the surgical edit path.

Walk stats:
- Stage 1 introspection: include=1 exclude=12 langs=Python
- Stage 2 extraction: 31 files seen, 29 contributed, 277 findings
- Stage 3 aggregation: 8 sections written (all rewrite path)
- Stage 4 derivation: personas + user_stories OK; diagrams hit a
  json_invalid response from the model (long mermaid output truncated
  by the SDK's parse layer) and fell back to upstream-bodies-preserved
  output — the deriver's per-section catch behaving as designed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codeninja codeninja force-pushed the feat/plan-b-surgical-aggregation branch from 1aa150a to abbf25c Compare May 4, 2026 20:54
@codeninja codeninja requested a review from Copilot May 4, 2026 21:43
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 18 out of 18 changed files in this pull request and generated 9 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread wikifi/cache.py
Comment on lines +555 to +573
def compute_finding_id(*, file: str, section_id: str, finding: str) -> str:
"""Stable identity for one note across walks.

Composition: ``sha256(file + "::" + section_id + "::" + finding)``,
truncated to 16 hex chars. Two walks that emit the same finding
text from the same file targeting the same section produce the
same id; any change in any of the three components is a fresh id.

A reword of ``finding`` (even one character) gets a new id, which
semantically counts as "removed-and-added" from the surgical
aggregator's point of view — the same as a delete-plus-insert in a
text diff. That's intentional: a paragraph the cached body wrote
around the old wording can no longer be assumed to still be
grounded by the new wording.
"""
from wikifi.fingerprint import hash_text

return hash_text(f"{file}::{section_id}::{finding}")

Comment thread wikifi/cache.py Outdated
Comment on lines +585 to +594
return [
compute_finding_id(
file=str(n.get("file", "")),
section_id=section_id,
finding=str(n.get("finding", "")),
)
for n in notes
]


Comment thread wikifi/cache.py Outdated
Comment on lines +113 to +117
:func:`compute_finding_id`) that fed into the cached body, aligned
with the notes' position so a 1-based ``claim.source_indices``
entry maps to ``finding_ids[idx - 1]``. Surgical aggregation uses
this to detect which findings were added or removed between walks
and route an edit instead of a full rewrite.
Comment thread .wikifi/hard_specifications.md Outdated

## Identity and Cache Rules

**Finding identity** is defined as the SHA-256 digest of the concatenation `file::section_id::finding`, truncated to 16 hexadecimal characters. A single-character change to any of the three components produces a different identity, treated as a delete-plus-insert rather than an edit, invalidating any cached prose referencing the prior wording.
Comment thread .wikifi/diagrams.md Outdated
Comment on lines +1 to +6
# Diagrams

The three diagrams below cover the subdomain structure of the core domain, the structural relationships between the principal entities, and the end-to-end pipeline integration flow.

---

## Domain Map

The four subdomains that compose the core domain of automated documentation synthesis form a strict directed dependency chain. No subdomain reaches backward; the ordering is a first-class design constraint.

```mermaid
graph LR
RI["Repository Introspection"]
KE["Knowledge Extraction"]
SS_P["Section Synthesis - Primary"]
SS_D["Section Synthesis - Derivative"]
AP_W[("Working-State Persistence")]
AP_C[("Committed Wiki Persistence")]

RI --> KE
KE --> AP_W
KE --> SS_P
SS_P --> SS_D
SS_P --> AP_C
SS_D --> AP_C
```

---

## Entity Relationship View

The diagram spans six concern areas: wiki structure, evidence tracing, extraction, pipeline orchestration, caching, and quality review. The self-referential edge on **Section** represents the directed acyclic graph of upstream declarations enforced by topological ordering at startup.

```mermaid
erDiagram
SECTION ||--o{ SECTION : "upstream-of"
SECTION ||--o{ LOADED_SECTION : "has loaded form"
SECTION ||--o{ SECTION_REPORT : "reported by"
WIKI_REPORT ||--o{ SECTION_REPORT : aggregates
EVIDENCE_BUNDLE ||--o{ CLAIM : contains
EVIDENCE_BUNDLE ||--o{ CONTRADICTION : contains
CLAIM ||--o{ SOURCE_REF : "backed by"
CONTRADICTION ||--|{ CLAIM : groups
FILE_FINDINGS ||--|{ SECTION_FINDING : groups
WALK_REPORT ||--|| INTROSPECTION_RESULT : carries
WALK_REPORT ||--|| WALK_CACHE : carries
WALK_REPORT ||--|| REPO_GRAPH : carries
REPO_GRAPH ||--o{ GRAPH_NODE : contains
WALK_CACHE ||--o{ CACHED_FINDINGS : stores
WALK_CACHE ||--o{ CACHED_SECTION : stores
WIKI_QUALITY_REPORT ||--o{ CRITIQUE : maps
REVIEW_OUTCOME ||--|{ CRITIQUE : tracks
CHAT_SESSION ||--o{ CHAT_MESSAGE : history
CHAT_SESSION ||--|| LLM_PROVIDER : uses
```

---

## Pipeline Integration Flow

The sequence below traces a complete pipeline run triggered by the `walk` command. The provider abstraction is the sole contact point for all inference calls; the filesystem layout abstraction mediates all reads and writes; the cache layer short-circuits calls when prior results remain valid. Derivative sections are excluded from the aggregation stage and are handled exclusively in the derivation stage.

```mermaid
sequenceDiagram
actor Operator
participant CLI
participant Orchestrator
participant LLMProvider as LLM Provider
participant ImportGraph as Import Graph
participant Cache
participant Layout as Filesystem Layout

Operator->>CLI: walk command
CLI->>Orchestrator: initialise pipeline

Note over Orchestrator,LLMProvider: Stage 1 - Repository Introspection
Orchestrator->>LLMProvider: structured completion (classify repository paths)
LLMProvider-->>Orchestrator: IntrospectionResult

Note over Orchestrator,ImportGraph: Build Cross-File Import Graph
Orchestrator->>ImportGraph: traverse in-scope files
ImportGraph-->>Orchestrator: RepoGraph ready

Note over Orchestrator,Layout: Stage 2 - Knowledge Extraction (per file)
loop each in-scope file
Orchestrator->>Cache: check content fingerprint
Cache-->>Orchestrator: hit or miss
Orchestrator->>ImportGraph: fetch neighbour paths
Orchestrator->>LLMProvider: structured completion (findings schema)
LLMProvider-->>Orchestrator: FileFindings
Orchestrator->>Layout: append notes
end

Note over Orchestrator,Layout: Stage 3 - Section Aggregation (primary sections only)
loop each primary section
Orchestrator->>Cache: check notes-payload hash
Cache-->>Orchestrator: hit or miss
Orchestrator->>LLMProvider: structured completion (section-body schema)
LLMProvider-->>Orchestrator: EvidenceBundle
Orchestrator->>Layout: write section markdown
end

Note over Orchestrator,Layout: Stage 4 - Section Derivation (topological order)
loop each derivative section
Orchestrator->>Layout: read upstream section bodies
Orchestrator->>LLMProvider: structured completion (synthesis)
LLMProvider-->>Orchestrator: section body
Orchestrator->>LLMProvider: structured completion (quality critique)
LLMProvider-->>Orchestrator: Critique
Orchestrator->>Layout: write section markdown
end

Orchestrator-->>CLI: WalkReport
CLI-->>Operator: run summary
```

> **External capability providers**: The system is additionally configured as a client that fans out to multiple external capability providers via a tool-server protocol — a local AI utility, a local web crawler, a remote documentation context service, and a remote stitching/search service. The upstream sections do not specify at which pipeline call sites these providers are invoked.
_Derivation failed for **Diagrams** (1 validation error for DerivedSection
Invalid JSON: EOF while parsing a string at line 1 column 4334 [type=json_invalid, input_value='{"body": "Three diagrams... INTRO[Introspection]\\', input_type=str]
For further information visit https://errors.pydantic.dev/2.13/v/json_invalid). Upstream evidence preserved below._

Comment thread wikifi/surgical.py Outdated
Comment on lines +311 to +324
def _resolve_removed_claim_texts(cached: CachedSection, removed_ids: list[str]) -> list[dict]:
"""Reconstruct (file, finding_text) pairs for every removed cached finding.

The cache stores ``finding_ids`` aligned with the prior walk's notes
order, but it doesn't store the original note bodies. The best we
can show the model for a removed finding is its file path — that's
enough context for the model to find the affected sentence in the
cached body and revise around it. When the cache only knows the id,
we surface the id itself so the prompt is at least debuggable.
"""
out: list[dict] = []
for fid in removed_ids:
out.append({"finding_id": fid})
return out
Comment thread wikifi/aggregator.py Outdated
Comment on lines +140 to +141
collapses the decision tree back to the Plan A two-path version
(cache-hit or full rewrite).
Comment thread wikifi/config.py Outdated
Comment on lines +128 to +129
"Preserves established prose and citation numbering. Disable to force the prior "
"(pre-Plan-B) behavior of full rewrites on any note change."
Comment thread wikifi/surgical.py
Comment on lines +136 to +138
if self.total_live == 0:
return 0.0
return (len(self.added_indices) + len(self.removed_ids)) / self.total_live
Addresses every distinct concern raised by the AI review on PR #17.
Comment IDs in parentheses; many were duplicated across the two review
passes (pre-rebase + post-rebase).

Identity / cache schema
- Align finding_id length contract with the 12-char implementation. The
  Hard Specifications, the `compute_finding_id` docstring, and the
  Cache Entities table all said 16 hex chars but the function delegates
  to `fingerprint.hash_text` which truncates to FINGERPRINT_LENGTH=12.
  Updated the docs (cache.py, hard_specifications.md) to match what the
  code actually does — the wider hash isn't needed at typical scale and
  reusing one constant keeps the system coherent. (3183975867,
  3184733330, 3184733407)
- `note_finding_ids` now returns "" when a note is missing `file` or
  `finding`. Hashing empty inputs would otherwise produce a stable
  non-empty id and let two malformed notes from different walks compare
  equal as "unchanged", routing the section onto the cache/surgical
  paths despite having no real identity to anchor on. The classifier
  picks up the slack: any "" id in either set forces "rewrite" so two
  malformed notes can't accidentally cancel out via set equality.
  (3183975908, 3184733368)
- Rewrote `CachedSection.finding_ids` docstring. The old text claimed
  the list aligned with `claim.source_indices`, but cached claims store
  fully-resolved SourceRefs, not indices. Documented the actual
  consumer: section-delta classification by the surgical classifier.
  (3184733384)

Aggregator decision tree
- Fixed Path 2's cache lie: stopped refreshing `notes_hash` to match
  live notes when the unchanged-finding-ids fast path fires. The cached
  claims still carry SourceRefs from the prior walk's notes (file,
  lines, fingerprint), and updating the cache key without refreshing
  those refs would let `aggregation_fully_cached` flag the section as
  fresh and let the orchestrator's stage-3 short-circuit lock the
  drifting citations in place. Path 2 still skips the LLM call this
  walk; the cache key only refreshes when a real Path 4 rewrite runs.
  (3183975957)
- Clarified `use_surgical_edits` docs in both `aggregator.aggregate_all`
  and `Settings`. The flag only gates Path 3 (the LLM-side surgical
  edit); the no-LLM cache-reuse paths (full cache hit and unchanged-
  finding-ids re-render) still fire regardless. The old text claimed
  the flag collapsed the tree to Plan A's two paths, which never
  matched the actual code. (3183975934, 3184733492, 3184733517)

Surgical helpers
- Renamed `_resolve_removed_claim_texts` to `_removed_finding_descriptors`
  and rewrote the docstring to match the implementation: it builds the
  removed-notes payload as `{finding_id: ...}` dicts and does not use
  the `cached` parameter (the cache doesn't store original note bodies,
  only ids). The old name promised richer reconstruction. (3184733469)
- `SectionChange.churn_ratio` now returns 1.0 when `total_live == 0`
  and there are removed ids, mirroring the classifier's
  `max(total_live, 1)` guard. The old 0.0 return for the
  remove-everything case would mislead any downstream caller checking
  `churn_ratio` directly. (3184733532)

Wiki artifact
- Restored `.wikifi/diagrams.md` to its pre-chore-commit version. The
  chore commit (refresh wiki via Plan B walk) committed an LLM JSON
  truncation as the diagrams body — `Derivation failed for **Diagrams**
  (... Invalid JSON: EOF while parsing ...)` rather than actual
  Mermaid. Forward-fix approach (revert just the broken file) rather
  than amending the chore commit; the next walk against this repo will
  regenerate it cleanly. (3183975985, 3184733450)

Regression tests added
- `note_finding_ids` returns "" for malformed notes (missing/blank
  file or finding, including null `finding`).
- `classify_section_change` force-rewrites when any finding_id is
  empty in either set, even when the sets look identical.
- `SectionChange.churn_ratio` is 1.0 for the remove-everything edge
  case and 0.0 for the truly-empty case.
- Aggregator's Path 2 leaves the cache entry's `notes_hash` alone and
  `aggregation_fully_cached` correctly refuses to flag the section as
  fresh.

Gate: 245 passed, 94% coverage, lint clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codeninja codeninja merged commit 58ad367 into feat/plan-a-cache-persistence May 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants