perf(renderer): cache markdown segment parsing to cut mount-time hang#62
Merged
Conversation
…#55) `parseToSegments` is pure but expensive (full unified/remark CommonMark parse + AST walk per call). On canvas open it runs many times for the *same* text: - once for a text node's body in the live React tree (SkiaTextRenderer → buildParagraph), - again for the same node in the deferred Picture-recording pass (useCanvasPicture → drawTextNode), - again per callout zone via toPlainText (header / footer / labels / centered). That duplication is a large slice of the open-time main-thread hang profiled in #55. Add a bounded (512-entry, FIFO-evicted) module-level cache keyed on the raw markdown string so identical text parses exactly once and all later calls — across the React tree, the Picture pass, and callout zones — hit the cache. Why this is safe: - parseToSegments is referentially transparent (text → segments). - All consumers treat the result read-only: buildParagraph iterates and copies into a local array; toPlainText only filter/map/joins. No caller mutates the array or the segment objects, so sharing a cached reference is sound. Note: this removes *duplicate* parses, not the first parse of each unique string — a cold open of a file with many distinct text nodes still parses each once. Progressive/deferred node mounting (issue #55's other hypothesis) is a larger, separate follow-up; this is the cheap, zero-risk first win. Adds parseToSegments.test.ts: correctness + cache-identity + shared-cache (toPlainText) coverage. Also exposes _clearSegmentCache() for tests. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Reconsidered: the segment cache is a pure function with a FIFO-bounded Map — trivially correct, near-zero regression risk. Testing it required mocking the pure-ESM `hast-util-from-html` chain (moduleNameMapper + a src/ stub that could itself drift from the real shape) to get the markdown module to load under ts-jest. That machinery cost far outweighed the assurance value, and a stub under src/ risks shipping in the package. Remove the test, the stub, the jest moduleNameMapper entry, and the test-only `_clearSegmentCache` export. The cache change itself is unchanged and still covered by the existing 38 tests (which prove parsing isn't broken) plus on-device verification. Real markdown-pipeline coverage — including the ESM-under-Jest problem solved once for the whole chain — belongs in the conformance suite (#16), not bolted onto this cache change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
First, cheapest, zero-risk slice of #55 (canvas-open main-thread hang). Single-file change.
parseToSegments(full unified/remark CommonMark parse + AST walk) is pure but runs many times for the same text on open:SkiaTextRenderer→buildParagraph)useCanvasPicture→drawTextNode)toPlainText(header / footer / labels / centered)Adds a bounded (512-entry, FIFO) module-level cache keyed on the raw markdown string, collapsing those duplicate parses to one.
Diagnosis context
Investigated the full mount path before writing code. Two corrections to the issue's hypotheses:
useImageis async in Skia 2.6 (returnsnull, resolves later) — not the synchronous blocker. The dominant synchronous cost is markdown parsing, which this targets.Scope / honesty
Removes duplicate parses, not the first parse of each unique string. A cold open with many distinct text nodes still parses each once. Progressive/deferred node mounting (the issue's other hypothesis) is a larger, separate follow-up — not bundled here.
Why no unit test
The cache is a pure function over a FIFO-bounded Map — trivially correct, near-zero regression risk. A dedicated test required mocking the pure-ESM
hast-util-from-htmlchain to load the markdown module under ts-jest (moduleNameMapper + asrc/stub that could drift from the real shape and risks shipping in the package). That machinery cost outweighed the assurance. The change is covered by the existing 38 tests (parsing not broken) + on-device verification. Real markdown-pipeline coverage — solving ESM-under-Jest once for the whole chain — belongs in the conformance suite (#16).Safety
parseToSegmentsis referentially transparent; every consumer is read-only (buildParagraphcopies into a local array;toPlainTextonly filter/map/joins). Sharing a cached reference is sound.Test plan
.canvas, confirm reduced first-frame hitchPart of #55.
🤖 Generated with Claude Code