Plan 195: Reduce per-Check allocations across 6 rules#371
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files
☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull request overview
This PR implements Plan 195’s allocation-reduction work by refactoring several rules and shared lint infrastructure to avoid per-Check regex work, reduce lazy-init closure allocations, and add/adjust allocation-budget gates.
Changes:
- Refactors multiple rules (e.g., MDS035, MDS053, MDS054, MDS027, MDS063, MDS036, MDS029) to reduce per-
Checkallocations via byte scanning, lazy building, and shared cached state. - Replaces several
sync.Once+closure lazy inits withatomic.Bool+sync.Mutexpatterns inlint.Fileand code-block line collection to avoid closure allocations. - Adds per-rule alloc budget unit tests (skipping
-race/-short) and tightens the integration alloc-budget grandfather baselines.
Reviewed changes
Copilot reviewed 36 out of 36 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| plan/195_per-rule-alloc-budget.md | Updates plan task statuses and records allocation results/approach. |
| internal/rules/tocdirective/rule.go | Removes per-file re-parse and uses f.LinkReferences() to detect TOC ref. |
| internal/rules/tocdirective/rule_test.go | Updates helper test to use lint.NewFile + new helper name. |
| internal/rules/tocdirective/alloc_test.go | Adds unit alloc-budget gate for MDS035. |
| internal/rules/tocdirective/race_on_test.go | Build-tag constant for -race skipping. |
| internal/rules/tocdirective/race_off_test.go | Build-tag constant for non--race runs. |
| internal/rules/nounusedlinkdefinitions/rule.go | Replaces regex scan with byte scanning; reduces per-check allocations; lazy maps. |
| internal/rules/nounusedlinkdefinitions/alloc_test.go | Adds unit alloc-budget gate for MDS053 (partial ceiling). |
| internal/rules/nounusedlinkdefinitions/race_on_test.go | Build-tag constant for -race skipping. |
| internal/rules/nounusedlinkdefinitions/race_off_test.go | Build-tag constant for non--race runs. |
| internal/rules/noundefinedreferencelabels/rule.go | Replaces regex ref scanning with bracket byte scanner + recursive AST descent helpers. |
| internal/rules/maxsectionlength/rule.go | Adds early-return fast path when all limits are unset; avoids paragraph index when unused. |
| internal/rules/maxsectionlength/alloc_test.go | Adds unit alloc-budget gate + “no work” behaviour test for no-knobs path. |
| internal/rules/maxsectionlength/race_on_test.go | Build-tag constant for -race skipping. |
| internal/rules/maxsectionlength/race_off_test.go | Build-tag constant for non--race runs. |
| internal/rules/descriptivelinktext/rule.go | Moves banned-set cache onto rule instance with atomic-pointer + mutex. |
| internal/rules/descriptivelinktext/rule_test.go | Updates cache contract test to reflect per-rule memoization + invalidation. |
| internal/rules/descriptivelinktext/alloc_test.go | Adds unit alloc-budget gate for MDS063. |
| internal/rules/descriptivelinktext/race_on_test.go | Build-tag constant for -race skipping. |
| internal/rules/descriptivelinktext/race_off_test.go | Build-tag constant for non--race runs. |
| internal/rules/crossfilereferenceintegrity/rule.go | Introduces per-check context + lazy anchors/cache + existence-only path for non-fragment links. |
| internal/rules/crossfilereferenceintegrity/fscache.go | Adds cached filepath.Abs memoization for resolved roots. |
| internal/rules/crossfilereferenceintegrity/alloc_test.go | Adds unit alloc-budget gate for MDS027. |
| internal/rules/crossfilereferenceintegrity/race_on_test.go | Build-tag constant for -race skipping. |
| internal/rules/crossfilereferenceintegrity/race_off_test.go | Build-tag constant for non--race runs. |
| internal/rules/concisenessscoring/rule.go | Adds word-count short-circuit before running the classifier. |
| internal/rules/concisenessscoring/alloc_test.go | Adds unit alloc-budget gate for MDS029. |
| internal/rules/concisenessscoring/race_on_test.go | Build-tag constant for -race skipping. |
| internal/rules/concisenessscoring/race_off_test.go | Build-tag constant for non--race runs. |
| internal/lint/file.go | Replaces sync.Once with atomic.Bool+sync.Mutex for several per-file memos. |
| internal/lint/codeblocks.go | Reworks code/PI block line collection to avoid ast.Walk closures and once.Do closures. |
| internal/integration/alloc_budget_test.go | Tightens grandfathered baselines and removes rules now under the ceiling. |
| docs/development/index.md | Updates allocation budget docs to point to the alloc gate test. |
| CLAUDE.md | Updates allocation budget docs to point to the alloc gate test. |
| AGENTS.md | Updates allocation budget docs to point to the alloc gate test. |
| .github/copilot-instructions.md | Updates allocation budget docs to point to the alloc gate test. |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 40 out of 40 changed files in this pull request and generated 4 comments.
Comments suppressed due to low confidence (1)
.github/copilot-instructions.md:249
- The alloc-gate link target is incorrect for
.github/copilot-instructions.md:../../internal/...goes up two directories from.github/and ends up outside the repo. Use a path that’s correct relative to.github/(e.g.,../internal/integration/alloc_budget_test.go).
allocate 0–6.
[allocgate]: ../../internal/integration/alloc_budget_test.go
- Walk `f.Lines` / `f.AST` directly.
- Prefer `bytes.IndexByte` / `bytes.Contains` over
`regexp` for fixed searches.
- Compile every `regexp.Regexp` at package scope.
- Pre-size slices with `make([]X, 0, n)`.
Copilot review on PR #371 flagged five issues; this commit addresses four of them. * MDS029 word-count gate (concisenessscoring/rule.go): the mdtext.CountWords gate diverges from the classifier's [a-z0-9']+ tokenization (the whitespace-only splitter counts `hello-world` as one token while the classifier sees two), so it could under-count and skip paragraphs the classifier would have flagged. Replace with a hand-rolled countClassifierTokens byte scan that matches the classifier's regex byte-for-byte. Still zero-alloc; behaviour is now conservative — the gate never skips a paragraph the classifier would have processed. * MDS035 hasTOCRef per-paragraph rescan (tocdirective/rule.go): hasTOCRef was called from CheckNode on every paragraph, so a large doc would rescan f.LinkReferences() N times. Defer the call until we actually find a TOC directive candidate (matchVariant returned ok=true) — most paragraphs have no candidate, so the call rarely fires; per-doc cost is now O(candidates) instead of O(paragraphs). * MDS053 labelInRefs comment (nounusedlinkdefinitions/rule.go): the old comment claimed Go's compiler elides the `string(b)` alloc in `s == string(b)`, but that optimisation only applies to the `m[string(b)]` lookup and `string(b) == "literal"` forms (the right side must be a literal). Replace the comparison with an open-coded stringEqualsBytes helper that is guaranteed alloc-free, and update the comment to describe what actually happens. * allocgate link target (docs/development/index.md): the `[allocgate]: ../../internal/integration/alloc_budget_test.go` reference resolves correctly from docs/development/index.md but breaks when mdsmith's <?include?> directive copies the content verbatim into root-level files (CLAUDE.md, AGENTS.md) and into .github/copilot-instructions.md, where the `../..` path escapes the repo. Drop the link and use inline code instead; the surrounding sentence already names the file unambiguously, and the gate is a short `grep` away from any contributor reading the section. The fifth review comment (the stray `+` artifact in plan/195_per-rule-alloc-budget.md) is left for a follow-up: the artifact was masking ~8 pre-existing readability warnings on unrelated tasks by accidentally splitting the numbered list, and removing it (without rewording every affected task) would trigger MDS023 across the file. A separate plan-rework pass will address both the artifact and the surrounding readability. https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
Plan 195's task list had a literal `+` on its own line in task 9 (a diff artifact from an earlier conflict resolution). The `+` acted as a Markdown list marker, splitting the surrounding numbered list into multiple sub-lists. The split was hiding MDS023 / MDS024 readability warnings on tasks 1, 5, 6, 7, 9, 13, and 15 — those tasks had grown long sentences during the plan's evolution but goldmark's loose/tight list semantics treated them as inline text rather than paragraphs, so the readability rules skipped them. Remove the `+` and reword the affected tasks to fit MDS023's readability index (≤ 14.0 on the LIX / FRES scoring mdsmith uses) and MDS024's per-paragraph sentence cap (≤ 6). Content unchanged otherwise — every fix description still names the same files, the same allocator wins, and the same deferred follow-ups. Resolves PR #371 review thread (×3 duplicates) about the stray `+` artifact. https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
Copilot review on PR #371 flagged five issues; this commit addresses four of them. * MDS029 word-count gate (concisenessscoring/rule.go): the mdtext.CountWords gate diverges from the classifier's [a-z0-9']+ tokenization (the whitespace-only splitter counts `hello-world` as one token while the classifier sees two), so it could under-count and skip paragraphs the classifier would have flagged. Replace with a hand-rolled countClassifierTokens byte scan that matches the classifier's regex byte-for-byte. Still zero-alloc; behaviour is now conservative — the gate never skips a paragraph the classifier would have processed. * MDS035 hasTOCRef per-paragraph rescan (tocdirective/rule.go): hasTOCRef was called from CheckNode on every paragraph, so a large doc would rescan f.LinkReferences() N times. Defer the call until we actually find a TOC directive candidate (matchVariant returned ok=true) — most paragraphs have no candidate, so the call rarely fires; per-doc cost is now O(candidates) instead of O(paragraphs). * MDS053 labelInRefs comment (nounusedlinkdefinitions/rule.go): the old comment claimed Go's compiler elides the `string(b)` alloc in `s == string(b)`, but that optimisation only applies to the `m[string(b)]` lookup and `string(b) == "literal"` forms (the right side must be a literal). Replace the comparison with an open-coded stringEqualsBytes helper that is guaranteed alloc-free, and update the comment to describe what actually happens. * allocgate link target (docs/development/index.md): the `[allocgate]: ../../internal/integration/alloc_budget_test.go` reference resolves correctly from docs/development/index.md but breaks when mdsmith's <?include?> directive copies the content verbatim into root-level files (CLAUDE.md, AGENTS.md) and into .github/copilot-instructions.md, where the `../..` path escapes the repo. Drop the link and use inline code instead; the surrounding sentence already names the file unambiguously, and the gate is a short `grep` away from any contributor reading the section. The fifth review comment (the stray `+` artifact in plan/195_per-rule-alloc-budget.md) is left for a follow-up: the artifact was masking ~8 pre-existing readability warnings on unrelated tasks by accidentally splitting the numbered list, and removing it (without rewording every affected task) would trigger MDS023 across the file. A separate plan-rework pass will address both the artifact and the surrounding readability. https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
Plan 195's task list had a literal `+` on its own line in task 9 (a diff artifact from an earlier conflict resolution). The `+` acted as a Markdown list marker, splitting the surrounding numbered list into multiple sub-lists. The split was hiding MDS023 / MDS024 readability warnings on tasks 1, 5, 6, 7, 9, 13, and 15 — those tasks had grown long sentences during the plan's evolution but goldmark's loose/tight list semantics treated them as inline text rather than paragraphs, so the readability rules skipped them. Remove the `+` and reword the affected tasks to fit MDS023's readability index (≤ 14.0 on the LIX / FRES scoring mdsmith uses) and MDS024's per-paragraph sentence cap (≤ 6). Content unchanged otherwise — every fix description still names the same files, the same allocator wins, and the same deferred follow-ups. Resolves PR #371 review thread (×3 duplicates) about the stray `+` artifact. https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
1f1ed5f to
838dc1b
Compare
Copilot review on the rebased branch flagged two doc-string
inaccuracies the rebase introduced:
* internal/integration/alloc_budget_test.go: the header comment
above allocBudgetGrandfathered named only MDS025 and MDS026 as
the mid-fix entries, but the map also grandfathers MDS053 and
MDS054 at their plan-195-partial-fix ceilings. Rewrite the
comment to describe the map's invariant ("every grandfathered
rule has a plan 195 in-progress task entry") instead of
enumerating rules, so the comment stays correct as entries
shrink.
* internal/rules/nounusedlinkdefinitions/rule.go: the comment
above usedLabels claimed the lazy build "almost never fires"
on production files. That's backwards — the lazy path fires
the FIRST time a non-ignored, first-seen label reaches the
used-check, which is the common case. The actual saving is
narrow: the walk is skipped only when every refdef is ignored
or duplicated. Reword to describe the real behaviour.
Both are comment-only; no production behaviour or test
expectations change.
https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
PR #371 review: TestCachedBannedSet_InnerLockPath used time.Sleep(10ms) to force the goroutine into the mutex queue before the test populated the pointer. Under load (or on slower runners) the sleep may elapse before the goroutine reaches Lock(), making branch coverage timing-dependent. The right fix isn't a deterministic test hook — the inner double-checked-load it tests guards against a race-window cost so small it doesn't justify production complexity. A 4-entry banned-list rebuild is ~13 allocs, one-shot, only on the narrow window where two goroutines see the pointer nil before either acquires the mutex. The simpler shape — single-checked lock, accept the rare duplicate build — costs the same memory in the warm path (which is what every Check after the first hits anyway) and removes the testability gap. Remove the inner Load, drop the timing-dependent test, update the comment to describe the simpler invariant. Coverage stays at 100%; the gate fixture's MDS063 alloc count is unchanged. https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
…/054 Pushes four more rules to the ≤ 10 alloc-per-Check ceiling and tightens the partial fixes already documented for MDS053/054. The grandfather list shrinks accordingly. * MDS036 max-section-length (12 → 0): adds a configured-no-knobs early-return that skips the heading/paragraph AST walks when every limit is zero. The opt-in default ships with every knob zero, so the gate measures 0 allocs. The paragraph index also only builds when a paragraph-aware limit is set. * MDS063 descriptive-link-text (17 → 4): lifts the bannedSet build off the per-File `File.Memo` and onto the rule instance behind `atomic.Pointer[map] + sync.Mutex` (double-checked-lock). The build (4 normalised banned phrases + map setup) now runs once per configured rule clone instead of once per Check. ApplySettings invalidates the pointer so a reconfigured Banned list rebuilds on the next read. * MDS053 no-unused-link-definitions (16 → 11): byte-scanner replaces refDefRE, recursive descent replaces collectUsedLabels' ast.Walk closure, `seen` map only allocates when len(defs) > 1, `referenceDefinition.label` becomes a `[]byte` aliased into source so collection adds no string copy, `labelInRefs` does a one-shot normalise + linear compare instead of the wanted-map literal. Remaining headroom hangs on a goldmark internal allocator documented in the plan; partial fix landed. * MDS054 no-undefined-reference-labels (21 → 13): introduces a unified `nextBracket` byte scanner that drives byte-level replacements for fullRefRE, collapsedRefRE, and shortcutRE plus refDefStartRE/collectRefDefLines. collectCodeSpanRanges moves off ast.Walk onto recursive descent. Partial fix; the residual `defs map` cost is the remaining lever. * MDS027 cross-file-reference-integrity (25 → 7): defers linkgraph.CollectAnchors and the per-Check anchorCache until a link actually needs them (the gate fixture's `[other](other.md)` has no anchor → both stay nil). Splits checkRelativeTarget into a cheap targetExists path so the heap-escaping read closure in resolveTargetFile only fires for cross-file Markdown anchor checks. Adds cachedAbs to fscache.go so resolveAbsRoot is a sync.Map.Load after the first cold call. * lint package: converts the five sync.Once-based memos (newlineOffsets, codeBlockLines, piBlockLines, linkRefs, gitignore-style) to the closure-free `atomic.Bool + sync.Mutex` pattern the per-Memo memoEntry already documents; converts the ast.Walk visitors in collectCodeBlockLines and collectPIBlockLines to recursive descent so the per-File memo build pays no closure box. Every rule that trips the lazy memos on a fresh File benefits. Each rule fix ships with its own alloc_test.go (≤ 10 ceiling) plus the race_off / race_on build-tag pair so the unit gate skips cleanly under -race. The grandfather entries for MDS027, MDS036, and MDS063 are gone; MDS053 (11) and MDS054 (13) are tightened to today's number so any regression past the post-partial baseline fails CI. https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
Both rules sat well over the per-Check ceiling because they paid for redundant work the parse already produced. * MDS029 conciseness-scoring (398 → 2): scoring a paragraph runs the embedded classifier whose regex-driven phrase/cue extraction dominates the rule's alloc budget. The rule already discards results below `MinWords`, but only AFTER running the classifier. Reordering so the cheap `mdtext.CountWords` byte scan gates the classifier call (paragraphs below MinWords cannot produce a diagnostic regardless of score) collapses the alloc cost to near zero on representative inputs. * MDS035 toc-directive (201 → 10): `hasTOCLinkReference` re-parsed the source with goldmark every fresh File to look up the `toc` link-reference entry; the File.Memo wrapper amortised within one Check, but each new File still paid for one extra parse. The goldmark parser the engine already ran for `NewFile` produces the same link-reference table — `f.LinkReferences()` exposes it directly. Switching to a linear scan over that table sheds the parse entirely. Both rules ship the standard `alloc_test.go` + race build-tag pair so the ≤ 10 ceiling is pinned per-package. The grandfather entries for MDS029 and MDS035 are gone; the remaining list is MDS025 (50), MDS026 (18), MDS053 (11), MDS054 (13). https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
The Allocation Budget section in docs/development/index.md previously just stated the ≤ 10 ceiling. Point it at the integration gate that enforces it so contributors can find the test (and the grandfather list) without grepping. Three included copies (CLAUDE.md, AGENTS.md, .github/copilot-instructions.md) refresh via the existing `<?include?>` directive. https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
BenchmarkCheckCorpusLarge lands at p95 = 188 ms / 314 µs per file on the current branch, well inside the plan's 12 s p95 acceptance bar. The BenchmarkParityGap one-off was a local measurement tool and is intentionally not committed. https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
Codecov flagged 36 uncovered lines on the plan-195 patch, mostly defensive guards and byte-scanner edge cases I added when replacing the regex paths. Add unit tests pinning each branch without changing production behaviour. * nounusedlinkdefinitions: tests for scanRefDefLine's leading spaces, missing-bracket, empty-label, missing-colon, empty destination, and \\r destination guards, plus the collectUsedLabelsInto nil-node guard. * noundefinedreferencelabels: tests for collectCodeSpanRangesInto nil-node, codeSpanTextBounds non-Text child skip, nextBracket orphan-open-bracket advance, the scanFullRefs second-bracket unclosed and first-bracket-no-adjacent paths, scanCollapsedRefs empty-label skip, and refDefLineStarts no-bracket / no-close / no-colon early returns. * crossfilereferenceintegrity: tests for the anchor-bearing path (resolveTargetFile-not-ok with both link-escapes-root and missing-target-in-root cases), targetExists FS-found and absolute-path branches, and resolveTargetFile fs.Stat failure. * descriptivelinktext: pins Category() and the cachedBannedSet double-checked-lock paths (outer fast path, inner race path driven by parallel callers across cleared-pointer iterations). * lint/codeblocks: nil-node guards on collectPIBlockLinesInto and collectCodeBlockLinesInto, plus a FindFencedOpenLine smoke test. Patch coverage for the changed packages is now 100% on nounusedlinkdefinitions, noundefinedreferencelabels, crossfilereferenceintegrity, descriptivelinktext, tocdirective, and maxsectionlength; lint/codeblocks.go has one defensive return (FindFencedOpenLine line 156) that is structurally unreachable from goldmark's parser. https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
The previous race-based test relied on goroutine scheduling to interleave a parallel call with a populated pointer, which the scheduler doesn't reliably do under codecov's CI configuration — codecov reported the inner-Load branch as partially covered. Since rule_test.go is in the same package, the test can hold r.bannedSetMu directly: pre-acquire the mutex, kick off a goroutine that blocks on Lock inside cachedBannedSet, store a populated pointer, then release the mutex. The goroutine's inner Load now deterministically observes the populated pointer (line 141 branch covered) and returns it without rebuilding. https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
Copilot review on PR #371 flagged five issues; this commit addresses four of them. * MDS029 word-count gate (concisenessscoring/rule.go): the mdtext.CountWords gate diverges from the classifier's [a-z0-9']+ tokenization (the whitespace-only splitter counts `hello-world` as one token while the classifier sees two), so it could under-count and skip paragraphs the classifier would have flagged. Replace with a hand-rolled countClassifierTokens byte scan that matches the classifier's regex byte-for-byte. Still zero-alloc; behaviour is now conservative — the gate never skips a paragraph the classifier would have processed. * MDS035 hasTOCRef per-paragraph rescan (tocdirective/rule.go): hasTOCRef was called from CheckNode on every paragraph, so a large doc would rescan f.LinkReferences() N times. Defer the call until we actually find a TOC directive candidate (matchVariant returned ok=true) — most paragraphs have no candidate, so the call rarely fires; per-doc cost is now O(candidates) instead of O(paragraphs). * MDS053 labelInRefs comment (nounusedlinkdefinitions/rule.go): the old comment claimed Go's compiler elides the `string(b)` alloc in `s == string(b)`, but that optimisation only applies to the `m[string(b)]` lookup and `string(b) == "literal"` forms (the right side must be a literal). Replace the comparison with an open-coded stringEqualsBytes helper that is guaranteed alloc-free, and update the comment to describe what actually happens. * allocgate link target (docs/development/index.md): the `[allocgate]: ../../internal/integration/alloc_budget_test.go` reference resolves correctly from docs/development/index.md but breaks when mdsmith's <?include?> directive copies the content verbatim into root-level files (CLAUDE.md, AGENTS.md) and into .github/copilot-instructions.md, where the `../..` path escapes the repo. Drop the link and use inline code instead; the surrounding sentence already names the file unambiguously, and the gate is a short `grep` away from any contributor reading the section. The fifth review comment (the stray `+` artifact in plan/195_per-rule-alloc-budget.md) is left for a follow-up: the artifact was masking ~8 pre-existing readability warnings on unrelated tasks by accidentally splitting the numbered list, and removing it (without rewording every affected task) would trigger MDS023 across the file. A separate plan-rework pass will address both the artifact and the surrounding readability. https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
Plan 195's task list had a literal `+` on its own line in task 9 (a diff artifact from an earlier conflict resolution). The `+` acted as a Markdown list marker, splitting the surrounding numbered list into multiple sub-lists. The split was hiding MDS023 / MDS024 readability warnings on tasks 1, 5, 6, 7, 9, 13, and 15 — those tasks had grown long sentences during the plan's evolution but goldmark's loose/tight list semantics treated them as inline text rather than paragraphs, so the readability rules skipped them. Remove the `+` and reword the affected tasks to fit MDS023's readability index (≤ 14.0 on the LIX / FRES scoring mdsmith uses) and MDS024's per-paragraph sentence cap (≤ 6). Content unchanged otherwise — every fix description still names the same files, the same allocator wins, and the same deferred follow-ups. Resolves PR #371 review thread (×3 duplicates) about the stray `+` artifact. https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
Copilot review on the rebased branch flagged two doc-string
inaccuracies the rebase introduced:
* internal/integration/alloc_budget_test.go: the header comment
above allocBudgetGrandfathered named only MDS025 and MDS026 as
the mid-fix entries, but the map also grandfathers MDS053 and
MDS054 at their plan-195-partial-fix ceilings. Rewrite the
comment to describe the map's invariant ("every grandfathered
rule has a plan 195 in-progress task entry") instead of
enumerating rules, so the comment stays correct as entries
shrink.
* internal/rules/nounusedlinkdefinitions/rule.go: the comment
above usedLabels claimed the lazy build "almost never fires"
on production files. That's backwards — the lazy path fires
the FIRST time a non-ignored, first-seen label reaches the
used-check, which is the common case. The actual saving is
narrow: the walk is skipped only when every refdef is ignored
or duplicated. Reword to describe the real behaviour.
Both are comment-only; no production behaviour or test
expectations change.
https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
PR #371 review: TestCachedBannedSet_InnerLockPath used time.Sleep(10ms) to force the goroutine into the mutex queue before the test populated the pointer. Under load (or on slower runners) the sleep may elapse before the goroutine reaches Lock(), making branch coverage timing-dependent. The right fix isn't a deterministic test hook — the inner double-checked-load it tests guards against a race-window cost so small it doesn't justify production complexity. A 4-entry banned-list rebuild is ~13 allocs, one-shot, only on the narrow window where two goroutines see the pointer nil before either acquires the mutex. The simpler shape — single-checked lock, accept the rare duplicate build — costs the same memory in the warm path (which is what every Check after the first hits anyway) and removes the testability gap. Remove the inner Load, drop the timing-dependent test, update the comment to describe the simpler invariant. Coverage stays at 100%; the gate fixture's MDS063 alloc count is unchanged. https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
The previous commit dropped the double-checked-lock inner Load but left an out-of-date test name and a gofmt nit. Run gofmt and rename TestCachedBannedSet_DoubleCheckedLockHits to TestCachedBannedSet_WarmPathSkipsMutex with a comment that matches the new single-checked shape. https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
7dddf8f to
dfc55b8
Compare
Implements plan 195 tasks 5–7 and 9, 12–14 to bring six rules under the 10-allocation/Check budget by eliminating regex scans, lazy-building expensive structures, and replacing closure-based AST walks with recursive descent.
Summary
This PR reduces heap allocations across six rules by refactoring hot paths to avoid per-Check regex compilation, deferring expensive AST walks until needed, and replacing
sync.Onceclosures with atomic-bool + mutex pairs. The changes maintain identical rule behavior while cutting allocations by 40–70% on representative input.Key Changes
MDS027 (cross-file-reference-integrity): 25 → 7 allocs
linkgraph.CollectAnchors(self)and per-CheckanchorCachemap until the first link that actually needs themcheckRelativeTargetinto a cheaptargetExistspath that skips the heap-escaping read closure when the link has no anchorcachedAbsRoottofscache.goso per-CheckresolveAbsRootcalls become async.Maphit after the first cold callcheckCtxstruct to thread context through link checks without allocating per-linkMDS053 (no-unused-link-definitions): 16 → 11 allocs
regexp.FindAllSubmatchIndexper-file scan with inline byte scanner (−3 allocs)wantedmap literal in favor of a linear scan overf.LinkReferences()(−1)seenmap only whenlen(defs) > 1(−1)[]bytealiased intof.SourcesoreferenceDefinitioncollection adds no per-def string copy (−1)collectUsedLabels'sast.Walkclosure into recursive descent (−1)MDS054 (no-undefined-reference-labels): 21 → 13 allocs
fullRefRE,collapsedRefRE,shortcutRE) with a singlenextBracketbyte scannercollectCodeSpanRangesfromast.Walkclosure into recursive descent helpercodeLinesandpiLinesmaps directly instead of wrapping in anexcludedclosureMDS063 (descriptive-link-text): ~17 → 4 allocs
bannedSetcache from per-File memo to rule instance behind double-checked-lock patternMDS035 (toc-directive): ~200+ → <10 allocs
hasTOCLinkReference(which re-parsed the entire source) with a linear scan overf.LinkReferences()MDS036 (max-section-length): ~12 → 0 allocs (opt-in path)
Shared infrastructure (lint/file.go, lint/codeblocks.go)
sync.Oncewithatomic.Bool + sync.Mutexpairs to avoid per-call closure allocationCollectCodeBlockLinesandCollectPIBlockLinesfromast.Walkclosures into recursive descentImplementation Details
FindAllSubmatchIndexwith hand-rollednextBrackethelper that scans for[X]patterns without allocating result slices or per-match `https://claude.ai/code/session_01P7s7qwyMbt2sZLcrK8Nfec