Skip to content

Plan 195: Reduce per-Check allocations across 6 rules#371

Merged
jeduden merged 11 commits into
mainfrom
claude/lucid-newton-Q7mN9
May 24, 2026
Merged

Plan 195: Reduce per-Check allocations across 6 rules#371
jeduden merged 11 commits into
mainfrom
claude/lucid-newton-Q7mN9

Conversation

@jeduden
Copy link
Copy Markdown
Owner

@jeduden jeduden commented May 22, 2026

Implements plan 195 tasks 5–7 and 9, 12–14 to bring six rules under the 10-allocation/Check budget by eliminating regex scans, lazy-building expensive structures, and replacing closure-based AST walks with recursive descent.

Summary

This PR reduces heap allocations across six rules by refactoring hot paths to avoid per-Check regex compilation, deferring expensive AST walks until needed, and replacing sync.Once closures with atomic-bool + mutex pairs. The changes maintain identical rule behavior while cutting allocations by 40–70% on representative input.

Key Changes

MDS027 (cross-file-reference-integrity): 25 → 7 allocs

  • Defers linkgraph.CollectAnchors(self) and per-Check anchorCache map until the first link that actually needs them
  • Splits checkRelativeTarget into a cheap targetExists path that skips the heap-escaping read closure when the link has no anchor
  • Adds cachedAbsRoot to fscache.go so per-Check resolveAbsRoot calls become a sync.Map hit after the first cold call
  • Introduces checkCtx struct to thread context through link checks without allocating per-link

MDS053 (no-unused-link-definitions): 16 → 11 allocs

  • Replaces regexp.FindAllSubmatchIndex per-file scan with inline byte scanner (−3 allocs)
  • Drops the wanted map literal in favor of a linear scan over f.LinkReferences() (−1)
  • Lazy-builds the seen map only when len(defs) > 1 (−1)
  • Stores label as []byte aliased into f.Source so referenceDefinition collection adds no per-def string copy (−1)
  • Unwinds collectUsedLabels's ast.Walk closure into recursive descent (−1)

MDS054 (no-undefined-reference-labels): 21 → 13 allocs

  • Replaces three regexes (fullRefRE, collapsedRefRE, shortcutRE) with a single nextBracket byte scanner
  • Unwinds collectCodeSpanRanges from ast.Walk closure into recursive descent helper
  • Passes codeLines and piLines maps directly instead of wrapping in an excluded closure

MDS063 (descriptive-link-text): ~17 → 4 allocs

  • Moves bannedSet cache from per-File memo to rule instance behind double-checked-lock pattern
  • Eliminates per-Check map allocation when the rule is unchanged between calls

MDS035 (toc-directive): ~200+ → <10 allocs

  • Replaces hasTOCLinkReference (which re-parsed the entire source) with a linear scan over f.LinkReferences()
  • Eliminates the per-File memo wrapper and goldmark re-parse overhead

MDS036 (max-section-length): ~12 → 0 allocs (opt-in path)

  • Adds fast path that skips AST walks for headings and paragraphs when all limit knobs are zero
  • Preserves the configured-no-knobs default (production case) with zero allocations

Shared infrastructure (lint/file.go, lint/codeblocks.go)

  • Replaces sync.Once with atomic.Bool + sync.Mutex pairs to avoid per-call closure allocation
  • Unwinds CollectCodeBlockLines and CollectPIBlockLines from ast.Walk closures into recursive descent
  • Adds per-File memos for code-block and PI-block line sets (already cached, now closure-free)

Implementation Details

  • Byte-level scanning: Replaces regex FindAllSubmatchIndex with hand-rolled nextBracket helper that scans for [X] patterns without allocating result slices or per-match `

https://claude.ai/code/session_01P7s7qwyMbt2sZLcrK8Nfec

Copilot AI review requested due to automatic review settings May 22, 2026 18:59
@codecov
Copy link
Copy Markdown

codecov Bot commented May 22, 2026

Codecov Report

❌ Patch coverage is 99.49875% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 96.49%. Comparing base (8c5d969) to head (dfc55b8).

Files with missing lines Patch % Lines
internal/rules/nounusedlinkdefinitions/rule.go 98.36% 1 Missing and 1 partial ⚠️
Additional details and impacted files
Components Coverage Δ
Go 96.46% <99.49%> (+0.02%) ⬆️
TypeScript 99.39% <ø> (ø)

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR implements Plan 195’s allocation-reduction work by refactoring several rules and shared lint infrastructure to avoid per-Check regex work, reduce lazy-init closure allocations, and add/adjust allocation-budget gates.

Changes:

  • Refactors multiple rules (e.g., MDS035, MDS053, MDS054, MDS027, MDS063, MDS036, MDS029) to reduce per-Check allocations via byte scanning, lazy building, and shared cached state.
  • Replaces several sync.Once+closure lazy inits with atomic.Bool+sync.Mutex patterns in lint.File and code-block line collection to avoid closure allocations.
  • Adds per-rule alloc budget unit tests (skipping -race/-short) and tightens the integration alloc-budget grandfather baselines.

Reviewed changes

Copilot reviewed 36 out of 36 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
plan/195_per-rule-alloc-budget.md Updates plan task statuses and records allocation results/approach.
internal/rules/tocdirective/rule.go Removes per-file re-parse and uses f.LinkReferences() to detect TOC ref.
internal/rules/tocdirective/rule_test.go Updates helper test to use lint.NewFile + new helper name.
internal/rules/tocdirective/alloc_test.go Adds unit alloc-budget gate for MDS035.
internal/rules/tocdirective/race_on_test.go Build-tag constant for -race skipping.
internal/rules/tocdirective/race_off_test.go Build-tag constant for non--race runs.
internal/rules/nounusedlinkdefinitions/rule.go Replaces regex scan with byte scanning; reduces per-check allocations; lazy maps.
internal/rules/nounusedlinkdefinitions/alloc_test.go Adds unit alloc-budget gate for MDS053 (partial ceiling).
internal/rules/nounusedlinkdefinitions/race_on_test.go Build-tag constant for -race skipping.
internal/rules/nounusedlinkdefinitions/race_off_test.go Build-tag constant for non--race runs.
internal/rules/noundefinedreferencelabels/rule.go Replaces regex ref scanning with bracket byte scanner + recursive AST descent helpers.
internal/rules/maxsectionlength/rule.go Adds early-return fast path when all limits are unset; avoids paragraph index when unused.
internal/rules/maxsectionlength/alloc_test.go Adds unit alloc-budget gate + “no work” behaviour test for no-knobs path.
internal/rules/maxsectionlength/race_on_test.go Build-tag constant for -race skipping.
internal/rules/maxsectionlength/race_off_test.go Build-tag constant for non--race runs.
internal/rules/descriptivelinktext/rule.go Moves banned-set cache onto rule instance with atomic-pointer + mutex.
internal/rules/descriptivelinktext/rule_test.go Updates cache contract test to reflect per-rule memoization + invalidation.
internal/rules/descriptivelinktext/alloc_test.go Adds unit alloc-budget gate for MDS063.
internal/rules/descriptivelinktext/race_on_test.go Build-tag constant for -race skipping.
internal/rules/descriptivelinktext/race_off_test.go Build-tag constant for non--race runs.
internal/rules/crossfilereferenceintegrity/rule.go Introduces per-check context + lazy anchors/cache + existence-only path for non-fragment links.
internal/rules/crossfilereferenceintegrity/fscache.go Adds cached filepath.Abs memoization for resolved roots.
internal/rules/crossfilereferenceintegrity/alloc_test.go Adds unit alloc-budget gate for MDS027.
internal/rules/crossfilereferenceintegrity/race_on_test.go Build-tag constant for -race skipping.
internal/rules/crossfilereferenceintegrity/race_off_test.go Build-tag constant for non--race runs.
internal/rules/concisenessscoring/rule.go Adds word-count short-circuit before running the classifier.
internal/rules/concisenessscoring/alloc_test.go Adds unit alloc-budget gate for MDS029.
internal/rules/concisenessscoring/race_on_test.go Build-tag constant for -race skipping.
internal/rules/concisenessscoring/race_off_test.go Build-tag constant for non--race runs.
internal/lint/file.go Replaces sync.Once with atomic.Bool+sync.Mutex for several per-file memos.
internal/lint/codeblocks.go Reworks code/PI block line collection to avoid ast.Walk closures and once.Do closures.
internal/integration/alloc_budget_test.go Tightens grandfathered baselines and removes rules now under the ceiling.
docs/development/index.md Updates allocation budget docs to point to the alloc gate test.
CLAUDE.md Updates allocation budget docs to point to the alloc gate test.
AGENTS.md Updates allocation budget docs to point to the alloc gate test.
.github/copilot-instructions.md Updates allocation budget docs to point to the alloc gate test.

Comment thread internal/rules/concisenessscoring/rule.go Outdated
Comment thread CLAUDE.md Outdated
Comment thread AGENTS.md Outdated
Comment thread .github/copilot-instructions.md Outdated
Comment thread internal/rules/nounusedlinkdefinitions/rule.go Outdated
Comment thread plan/195_per-rule-alloc-budget.md Outdated
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 40 out of 40 changed files in this pull request and generated 4 comments.

Comments suppressed due to low confidence (1)

.github/copilot-instructions.md:249

  • The alloc-gate link target is incorrect for .github/copilot-instructions.md: ../../internal/... goes up two directories from .github/ and ends up outside the repo. Use a path that’s correct relative to .github/ (e.g., ../internal/integration/alloc_budget_test.go).
allocate 0–6.

[allocgate]: ../../internal/integration/alloc_budget_test.go

- Walk `f.Lines` / `f.AST` directly.
- Prefer `bytes.IndexByte` / `bytes.Contains` over
  `regexp` for fixed searches.
- Compile every `regexp.Regexp` at package scope.
- Pre-size slices with `make([]X, 0, n)`.

Comment thread plan/195_per-rule-alloc-budget.md Outdated
Comment thread CLAUDE.md Outdated
Comment thread AGENTS.md
Comment thread internal/rules/tocdirective/rule.go
jeduden pushed a commit that referenced this pull request May 23, 2026
Copilot review on PR #371 flagged five issues; this commit addresses
four of them.

* MDS029 word-count gate (concisenessscoring/rule.go): the
  mdtext.CountWords gate diverges from the classifier's
  [a-z0-9']+ tokenization (the whitespace-only splitter counts
  `hello-world` as one token while the classifier sees two), so
  it could under-count and skip paragraphs the classifier would
  have flagged. Replace with a hand-rolled
  countClassifierTokens byte scan that matches the classifier's
  regex byte-for-byte. Still zero-alloc; behaviour is now
  conservative — the gate never skips a paragraph the
  classifier would have processed.

* MDS035 hasTOCRef per-paragraph rescan (tocdirective/rule.go):
  hasTOCRef was called from CheckNode on every paragraph, so a
  large doc would rescan f.LinkReferences() N times. Defer the
  call until we actually find a TOC directive candidate
  (matchVariant returned ok=true) — most paragraphs have no
  candidate, so the call rarely fires; per-doc cost is now
  O(candidates) instead of O(paragraphs).

* MDS053 labelInRefs comment (nounusedlinkdefinitions/rule.go):
  the old comment claimed Go's compiler elides the `string(b)`
  alloc in `s == string(b)`, but that optimisation only applies
  to the `m[string(b)]` lookup and `string(b) == "literal"`
  forms (the right side must be a literal). Replace the
  comparison with an open-coded stringEqualsBytes helper that
  is guaranteed alloc-free, and update the comment to describe
  what actually happens.

* allocgate link target (docs/development/index.md): the
  `[allocgate]: ../../internal/integration/alloc_budget_test.go`
  reference resolves correctly from docs/development/index.md
  but breaks when mdsmith's <?include?> directive copies the
  content verbatim into root-level files (CLAUDE.md, AGENTS.md)
  and into .github/copilot-instructions.md, where the `../..`
  path escapes the repo. Drop the link and use inline code
  instead; the surrounding sentence already names the file
  unambiguously, and the gate is a short `grep` away from any
  contributor reading the section.

The fifth review comment (the stray `+` artifact in
plan/195_per-rule-alloc-budget.md) is left for a follow-up: the
artifact was masking ~8 pre-existing readability warnings on
unrelated tasks by accidentally splitting the numbered list,
and removing it (without rewording every affected task) would
trigger MDS023 across the file. A separate plan-rework pass
will address both the artifact and the surrounding readability.

https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
@jeduden jeduden requested a review from Copilot May 23, 2026 09:55
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 40 out of 40 changed files in this pull request and generated 1 comment.

Comment thread plan/195_per-rule-alloc-budget.md Outdated
jeduden pushed a commit that referenced this pull request May 23, 2026
Plan 195's task list had a literal `+` on its own line in task 9
(a diff artifact from an earlier conflict resolution). The `+`
acted as a Markdown list marker, splitting the surrounding
numbered list into multiple sub-lists. The split was hiding
MDS023 / MDS024 readability warnings on tasks 1, 5, 6, 7, 9, 13,
and 15 — those tasks had grown long sentences during the plan's
evolution but goldmark's loose/tight list semantics treated them
as inline text rather than paragraphs, so the readability rules
skipped them.

Remove the `+` and reword the affected tasks to fit MDS023's
readability index (≤ 14.0 on the LIX / FRES scoring mdsmith uses)
and MDS024's per-paragraph sentence cap (≤ 6). Content unchanged
otherwise — every fix description still names the same files,
the same allocator wins, and the same deferred follow-ups.

Resolves PR #371 review thread (×3 duplicates) about the stray
`+` artifact.

https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
@jeduden jeduden requested a review from Copilot May 23, 2026 13:15
jeduden pushed a commit that referenced this pull request May 23, 2026
Copilot review on PR #371 flagged five issues; this commit addresses
four of them.

* MDS029 word-count gate (concisenessscoring/rule.go): the
  mdtext.CountWords gate diverges from the classifier's
  [a-z0-9']+ tokenization (the whitespace-only splitter counts
  `hello-world` as one token while the classifier sees two), so
  it could under-count and skip paragraphs the classifier would
  have flagged. Replace with a hand-rolled
  countClassifierTokens byte scan that matches the classifier's
  regex byte-for-byte. Still zero-alloc; behaviour is now
  conservative — the gate never skips a paragraph the
  classifier would have processed.

* MDS035 hasTOCRef per-paragraph rescan (tocdirective/rule.go):
  hasTOCRef was called from CheckNode on every paragraph, so a
  large doc would rescan f.LinkReferences() N times. Defer the
  call until we actually find a TOC directive candidate
  (matchVariant returned ok=true) — most paragraphs have no
  candidate, so the call rarely fires; per-doc cost is now
  O(candidates) instead of O(paragraphs).

* MDS053 labelInRefs comment (nounusedlinkdefinitions/rule.go):
  the old comment claimed Go's compiler elides the `string(b)`
  alloc in `s == string(b)`, but that optimisation only applies
  to the `m[string(b)]` lookup and `string(b) == "literal"`
  forms (the right side must be a literal). Replace the
  comparison with an open-coded stringEqualsBytes helper that
  is guaranteed alloc-free, and update the comment to describe
  what actually happens.

* allocgate link target (docs/development/index.md): the
  `[allocgate]: ../../internal/integration/alloc_budget_test.go`
  reference resolves correctly from docs/development/index.md
  but breaks when mdsmith's <?include?> directive copies the
  content verbatim into root-level files (CLAUDE.md, AGENTS.md)
  and into .github/copilot-instructions.md, where the `../..`
  path escapes the repo. Drop the link and use inline code
  instead; the surrounding sentence already names the file
  unambiguously, and the gate is a short `grep` away from any
  contributor reading the section.

The fifth review comment (the stray `+` artifact in
plan/195_per-rule-alloc-budget.md) is left for a follow-up: the
artifact was masking ~8 pre-existing readability warnings on
unrelated tasks by accidentally splitting the numbered list,
and removing it (without rewording every affected task) would
trigger MDS023 across the file. A separate plan-rework pass
will address both the artifact and the surrounding readability.

https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
jeduden pushed a commit that referenced this pull request May 23, 2026
Plan 195's task list had a literal `+` on its own line in task 9
(a diff artifact from an earlier conflict resolution). The `+`
acted as a Markdown list marker, splitting the surrounding
numbered list into multiple sub-lists. The split was hiding
MDS023 / MDS024 readability warnings on tasks 1, 5, 6, 7, 9, 13,
and 15 — those tasks had grown long sentences during the plan's
evolution but goldmark's loose/tight list semantics treated them
as inline text rather than paragraphs, so the readability rules
skipped them.

Remove the `+` and reword the affected tasks to fit MDS023's
readability index (≤ 14.0 on the LIX / FRES scoring mdsmith uses)
and MDS024's per-paragraph sentence cap (≤ 6). Content unchanged
otherwise — every fix description still names the same files,
the same allocator wins, and the same deferred follow-ups.

Resolves PR #371 review thread (×3 duplicates) about the stray
`+` artifact.

https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
@jeduden jeduden force-pushed the claude/lucid-newton-Q7mN9 branch from 1f1ed5f to 838dc1b Compare May 23, 2026 13:18
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 40 out of 40 changed files in this pull request and generated 2 comments.

Comment thread internal/integration/alloc_budget_test.go
Comment thread internal/rules/nounusedlinkdefinitions/rule.go Outdated
jeduden pushed a commit that referenced this pull request May 23, 2026
Copilot review on the rebased branch flagged two doc-string
inaccuracies the rebase introduced:

* internal/integration/alloc_budget_test.go: the header comment
  above allocBudgetGrandfathered named only MDS025 and MDS026 as
  the mid-fix entries, but the map also grandfathers MDS053 and
  MDS054 at their plan-195-partial-fix ceilings. Rewrite the
  comment to describe the map's invariant ("every grandfathered
  rule has a plan 195 in-progress task entry") instead of
  enumerating rules, so the comment stays correct as entries
  shrink.

* internal/rules/nounusedlinkdefinitions/rule.go: the comment
  above usedLabels claimed the lazy build "almost never fires"
  on production files. That's backwards — the lazy path fires
  the FIRST time a non-ignored, first-seen label reaches the
  used-check, which is the common case. The actual saving is
  narrow: the walk is skipped only when every refdef is ignored
  or duplicated. Reword to describe the real behaviour.

Both are comment-only; no production behaviour or test
expectations change.

https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
@jeduden jeduden requested a review from Copilot May 23, 2026 18:18
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 40 out of 40 changed files in this pull request and generated 1 comment.

Comment thread internal/rules/descriptivelinktext/rule_test.go Outdated
jeduden pushed a commit that referenced this pull request May 23, 2026
PR #371 review: TestCachedBannedSet_InnerLockPath used
time.Sleep(10ms) to force the goroutine into the mutex queue
before the test populated the pointer. Under load (or on slower
runners) the sleep may elapse before the goroutine reaches
Lock(), making branch coverage timing-dependent.

The right fix isn't a deterministic test hook — the inner
double-checked-load it tests guards against a race-window cost
so small it doesn't justify production complexity. A 4-entry
banned-list rebuild is ~13 allocs, one-shot, only on the
narrow window where two goroutines see the pointer nil before
either acquires the mutex. The simpler shape — single-checked
lock, accept the rare duplicate build — costs the same memory
in the warm path (which is what every Check after the first
hits anyway) and removes the testability gap.

Remove the inner Load, drop the timing-dependent test, update
the comment to describe the simpler invariant. Coverage stays
at 100%; the gate fixture's MDS063 alloc count is unchanged.

https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
@jeduden jeduden requested a review from Copilot May 23, 2026 23:58
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 40 out of 40 changed files in this pull request and generated no new comments.

claude added 4 commits May 24, 2026 02:17
…/054

Pushes four more rules to the ≤ 10 alloc-per-Check ceiling and tightens
the partial fixes already documented for MDS053/054. The grandfather
list shrinks accordingly.

* MDS036 max-section-length (12 → 0): adds a configured-no-knobs
  early-return that skips the heading/paragraph AST walks when every
  limit is zero. The opt-in default ships with every knob zero, so
  the gate measures 0 allocs. The paragraph index also only builds
  when a paragraph-aware limit is set.

* MDS063 descriptive-link-text (17 → 4): lifts the bannedSet build
  off the per-File `File.Memo` and onto the rule instance behind
  `atomic.Pointer[map] + sync.Mutex` (double-checked-lock). The
  build (4 normalised banned phrases + map setup) now runs once
  per configured rule clone instead of once per Check.
  ApplySettings invalidates the pointer so a reconfigured Banned
  list rebuilds on the next read.

* MDS053 no-unused-link-definitions (16 → 11): byte-scanner
  replaces refDefRE, recursive descent replaces collectUsedLabels'
  ast.Walk closure, `seen` map only allocates when len(defs) > 1,
  `referenceDefinition.label` becomes a `[]byte` aliased into
  source so collection adds no string copy, `labelInRefs` does a
  one-shot normalise + linear compare instead of the wanted-map
  literal. Remaining headroom hangs on a goldmark internal allocator
  documented in the plan; partial fix landed.

* MDS054 no-undefined-reference-labels (21 → 13): introduces a
  unified `nextBracket` byte scanner that drives byte-level
  replacements for fullRefRE, collapsedRefRE, and shortcutRE plus
  refDefStartRE/collectRefDefLines. collectCodeSpanRanges moves
  off ast.Walk onto recursive descent. Partial fix; the residual
  `defs map` cost is the remaining lever.

* MDS027 cross-file-reference-integrity (25 → 7): defers
  linkgraph.CollectAnchors and the per-Check anchorCache until a
  link actually needs them (the gate fixture's `[other](other.md)`
  has no anchor → both stay nil). Splits checkRelativeTarget into
  a cheap targetExists path so the heap-escaping read closure in
  resolveTargetFile only fires for cross-file Markdown anchor
  checks. Adds cachedAbs to fscache.go so resolveAbsRoot is a
  sync.Map.Load after the first cold call.

* lint package: converts the five sync.Once-based memos
  (newlineOffsets, codeBlockLines, piBlockLines, linkRefs,
  gitignore-style) to the closure-free `atomic.Bool + sync.Mutex`
  pattern the per-Memo memoEntry already documents; converts the
  ast.Walk visitors in collectCodeBlockLines and collectPIBlockLines
  to recursive descent so the per-File memo build pays no closure
  box. Every rule that trips the lazy memos on a fresh File
  benefits.

Each rule fix ships with its own alloc_test.go (≤ 10 ceiling) plus the
race_off / race_on build-tag pair so the unit gate skips cleanly under
-race. The grandfather entries for MDS027, MDS036, and MDS063 are gone;
MDS053 (11) and MDS054 (13) are tightened to today's number so any
regression past the post-partial baseline fails CI.

https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
Both rules sat well over the per-Check ceiling because they paid
for redundant work the parse already produced.

* MDS029 conciseness-scoring (398 → 2): scoring a paragraph runs
  the embedded classifier whose regex-driven phrase/cue extraction
  dominates the rule's alloc budget. The rule already discards
  results below `MinWords`, but only AFTER running the classifier.
  Reordering so the cheap `mdtext.CountWords` byte scan gates the
  classifier call (paragraphs below MinWords cannot produce a
  diagnostic regardless of score) collapses the alloc cost to near
  zero on representative inputs.

* MDS035 toc-directive (201 → 10): `hasTOCLinkReference` re-parsed
  the source with goldmark every fresh File to look up the `toc`
  link-reference entry; the File.Memo wrapper amortised within one
  Check, but each new File still paid for one extra parse. The
  goldmark parser the engine already ran for `NewFile` produces
  the same link-reference table — `f.LinkReferences()` exposes it
  directly. Switching to a linear scan over that table sheds the
  parse entirely.

Both rules ship the standard `alloc_test.go` + race build-tag pair so
the ≤ 10 ceiling is pinned per-package. The grandfather entries for
MDS029 and MDS035 are gone; the remaining list is MDS025 (50),
MDS026 (18), MDS053 (11), MDS054 (13).

https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
The Allocation Budget section in docs/development/index.md previously
just stated the ≤ 10 ceiling. Point it at the integration gate that
enforces it so contributors can find the test (and the grandfather
list) without grepping. Three included copies (CLAUDE.md, AGENTS.md,
.github/copilot-instructions.md) refresh via the existing
`<?include?>` directive.

https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
BenchmarkCheckCorpusLarge lands at p95 = 188 ms / 314 µs per
file on the current branch, well inside the plan's 12 s p95
acceptance bar. The BenchmarkParityGap one-off was a local
measurement tool and is intentionally not committed.

https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
claude added 7 commits May 24, 2026 02:17
Codecov flagged 36 uncovered lines on the plan-195 patch, mostly
defensive guards and byte-scanner edge cases I added when
replacing the regex paths. Add unit tests pinning each branch
without changing production behaviour.

* nounusedlinkdefinitions: tests for scanRefDefLine's leading
  spaces, missing-bracket, empty-label, missing-colon, empty
  destination, and \\r destination guards, plus the
  collectUsedLabelsInto nil-node guard.

* noundefinedreferencelabels: tests for collectCodeSpanRangesInto
  nil-node, codeSpanTextBounds non-Text child skip, nextBracket
  orphan-open-bracket advance, the scanFullRefs second-bracket
  unclosed and first-bracket-no-adjacent paths, scanCollapsedRefs
  empty-label skip, and refDefLineStarts no-bracket / no-close /
  no-colon early returns.

* crossfilereferenceintegrity: tests for the anchor-bearing path
  (resolveTargetFile-not-ok with both link-escapes-root and
  missing-target-in-root cases), targetExists FS-found and
  absolute-path branches, and resolveTargetFile fs.Stat failure.

* descriptivelinktext: pins Category() and the cachedBannedSet
  double-checked-lock paths (outer fast path, inner race path
  driven by parallel callers across cleared-pointer iterations).

* lint/codeblocks: nil-node guards on collectPIBlockLinesInto and
  collectCodeBlockLinesInto, plus a FindFencedOpenLine smoke test.

Patch coverage for the changed packages is now 100% on
nounusedlinkdefinitions, noundefinedreferencelabels,
crossfilereferenceintegrity, descriptivelinktext, tocdirective,
and maxsectionlength; lint/codeblocks.go has one defensive
return (FindFencedOpenLine line 156) that is structurally
unreachable from goldmark's parser.

https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
The previous race-based test relied on goroutine scheduling to
interleave a parallel call with a populated pointer, which the
scheduler doesn't reliably do under codecov's CI configuration —
codecov reported the inner-Load branch as partially covered.

Since rule_test.go is in the same package, the test can hold
r.bannedSetMu directly: pre-acquire the mutex, kick off a
goroutine that blocks on Lock inside cachedBannedSet, store a
populated pointer, then release the mutex. The goroutine's
inner Load now deterministically observes the populated
pointer (line 141 branch covered) and returns it without
rebuilding.

https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
Copilot review on PR #371 flagged five issues; this commit addresses
four of them.

* MDS029 word-count gate (concisenessscoring/rule.go): the
  mdtext.CountWords gate diverges from the classifier's
  [a-z0-9']+ tokenization (the whitespace-only splitter counts
  `hello-world` as one token while the classifier sees two), so
  it could under-count and skip paragraphs the classifier would
  have flagged. Replace with a hand-rolled
  countClassifierTokens byte scan that matches the classifier's
  regex byte-for-byte. Still zero-alloc; behaviour is now
  conservative — the gate never skips a paragraph the
  classifier would have processed.

* MDS035 hasTOCRef per-paragraph rescan (tocdirective/rule.go):
  hasTOCRef was called from CheckNode on every paragraph, so a
  large doc would rescan f.LinkReferences() N times. Defer the
  call until we actually find a TOC directive candidate
  (matchVariant returned ok=true) — most paragraphs have no
  candidate, so the call rarely fires; per-doc cost is now
  O(candidates) instead of O(paragraphs).

* MDS053 labelInRefs comment (nounusedlinkdefinitions/rule.go):
  the old comment claimed Go's compiler elides the `string(b)`
  alloc in `s == string(b)`, but that optimisation only applies
  to the `m[string(b)]` lookup and `string(b) == "literal"`
  forms (the right side must be a literal). Replace the
  comparison with an open-coded stringEqualsBytes helper that
  is guaranteed alloc-free, and update the comment to describe
  what actually happens.

* allocgate link target (docs/development/index.md): the
  `[allocgate]: ../../internal/integration/alloc_budget_test.go`
  reference resolves correctly from docs/development/index.md
  but breaks when mdsmith's <?include?> directive copies the
  content verbatim into root-level files (CLAUDE.md, AGENTS.md)
  and into .github/copilot-instructions.md, where the `../..`
  path escapes the repo. Drop the link and use inline code
  instead; the surrounding sentence already names the file
  unambiguously, and the gate is a short `grep` away from any
  contributor reading the section.

The fifth review comment (the stray `+` artifact in
plan/195_per-rule-alloc-budget.md) is left for a follow-up: the
artifact was masking ~8 pre-existing readability warnings on
unrelated tasks by accidentally splitting the numbered list,
and removing it (without rewording every affected task) would
trigger MDS023 across the file. A separate plan-rework pass
will address both the artifact and the surrounding readability.

https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
Plan 195's task list had a literal `+` on its own line in task 9
(a diff artifact from an earlier conflict resolution). The `+`
acted as a Markdown list marker, splitting the surrounding
numbered list into multiple sub-lists. The split was hiding
MDS023 / MDS024 readability warnings on tasks 1, 5, 6, 7, 9, 13,
and 15 — those tasks had grown long sentences during the plan's
evolution but goldmark's loose/tight list semantics treated them
as inline text rather than paragraphs, so the readability rules
skipped them.

Remove the `+` and reword the affected tasks to fit MDS023's
readability index (≤ 14.0 on the LIX / FRES scoring mdsmith uses)
and MDS024's per-paragraph sentence cap (≤ 6). Content unchanged
otherwise — every fix description still names the same files,
the same allocator wins, and the same deferred follow-ups.

Resolves PR #371 review thread (×3 duplicates) about the stray
`+` artifact.

https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
Copilot review on the rebased branch flagged two doc-string
inaccuracies the rebase introduced:

* internal/integration/alloc_budget_test.go: the header comment
  above allocBudgetGrandfathered named only MDS025 and MDS026 as
  the mid-fix entries, but the map also grandfathers MDS053 and
  MDS054 at their plan-195-partial-fix ceilings. Rewrite the
  comment to describe the map's invariant ("every grandfathered
  rule has a plan 195 in-progress task entry") instead of
  enumerating rules, so the comment stays correct as entries
  shrink.

* internal/rules/nounusedlinkdefinitions/rule.go: the comment
  above usedLabels claimed the lazy build "almost never fires"
  on production files. That's backwards — the lazy path fires
  the FIRST time a non-ignored, first-seen label reaches the
  used-check, which is the common case. The actual saving is
  narrow: the walk is skipped only when every refdef is ignored
  or duplicated. Reword to describe the real behaviour.

Both are comment-only; no production behaviour or test
expectations change.

https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
PR #371 review: TestCachedBannedSet_InnerLockPath used
time.Sleep(10ms) to force the goroutine into the mutex queue
before the test populated the pointer. Under load (or on slower
runners) the sleep may elapse before the goroutine reaches
Lock(), making branch coverage timing-dependent.

The right fix isn't a deterministic test hook — the inner
double-checked-load it tests guards against a race-window cost
so small it doesn't justify production complexity. A 4-entry
banned-list rebuild is ~13 allocs, one-shot, only on the
narrow window where two goroutines see the pointer nil before
either acquires the mutex. The simpler shape — single-checked
lock, accept the rare duplicate build — costs the same memory
in the warm path (which is what every Check after the first
hits anyway) and removes the testability gap.

Remove the inner Load, drop the timing-dependent test, update
the comment to describe the simpler invariant. Coverage stays
at 100%; the gate fixture's MDS063 alloc count is unchanged.

https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
The previous commit dropped the double-checked-lock inner Load
but left an out-of-date test name and a gofmt nit. Run gofmt
and rename TestCachedBannedSet_DoubleCheckedLockHits to
TestCachedBannedSet_WarmPathSkipsMutex with a comment that
matches the new single-checked shape.

https://claude.ai/code/session_019mc2vpRhysU5tD6FDVnzWL
@jeduden jeduden force-pushed the claude/lucid-newton-Q7mN9 branch from 7dddf8f to dfc55b8 Compare May 24, 2026 00:17
@jeduden jeduden merged commit 4809097 into main May 24, 2026
23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants