Skip to content

perf(renderer): cache markdown segment parsing to cut mount-time hang#62

Merged
LeslieOA merged 2 commits into
developfrom
perf/markdown-parse-cache
May 30, 2026
Merged

perf(renderer): cache markdown segment parsing to cut mount-time hang#62
LeslieOA merged 2 commits into
developfrom
perf/markdown-parse-cache

Conversation

@LeslieOA

@LeslieOA LeslieOA commented May 30, 2026

Copy link
Copy Markdown
Member

Summary

First, cheapest, zero-risk slice of #55 (canvas-open main-thread hang). Single-file change.

parseToSegments (full unified/remark CommonMark parse + AST walk) is pure but runs many times for the same text on open:

  • text node body in the live React tree (SkiaTextRendererbuildParagraph)
  • the same node again in the deferred Picture-recording pass (useCanvasPicturedrawTextNode)
  • once per callout zone via toPlainText (header / footer / labels / centered)

Adds a bounded (512-entry, FIFO) module-level cache keyed on the raw markdown string, collapsing those duplicate parses to one.

Diagnosis context

Investigated the full mount path before writing code. Two corrections to the issue's hypotheses:

  • Culling can't help the open-hang. Open does fit-to-all-content, so every node is genuinely on-screen on frame 1 — viewport culling excludes nothing at that moment. (It still earns its keep when zoomed in + panning.)
  • useImage is async in Skia 2.6 (returns null, resolves later) — not the synchronous blocker. The dominant synchronous cost is markdown parsing, which this targets.

Scope / honesty

Removes duplicate parses, not the first parse of each unique string. A cold open with many distinct text nodes still parses each once. Progressive/deferred node mounting (the issue's other hypothesis) is a larger, separate follow-up — not bundled here.

Why no unit test

The cache is a pure function over a FIFO-bounded Map — trivially correct, near-zero regression risk. A dedicated test required mocking the pure-ESM hast-util-from-html chain to load the markdown module under ts-jest (moduleNameMapper + a src/ stub that could drift from the real shape and risks shipping in the package). That machinery cost outweighed the assurance. The change is covered by the existing 38 tests (parsing not broken) + on-device verification. Real markdown-pipeline coverage — solving ESM-under-Jest once for the whole chain — belongs in the conformance suite (#16).

Safety

parseToSegments is referentially transparent; every consumer is read-only (buildParagraph copies into a local array; toPlainText only filter/map/joins). Sharing a cached reference is sound.

Test plan

  • typecheck clean
  • lint 0 errors
  • 38/38 existing tests pass
  • On-device (mobile): open a text-heavy .canvas, confirm reduced first-frame hitch

Part of #55.

🤖 Generated with Claude Code

…#55)

`parseToSegments` is pure but expensive (full unified/remark CommonMark parse
+ AST walk per call). On canvas open it runs many times for the *same* text:

  - once for a text node's body in the live React tree (SkiaTextRenderer →
    buildParagraph),
  - again for the same node in the deferred Picture-recording pass
    (useCanvasPicture → drawTextNode),
  - again per callout zone via toPlainText (header / footer / labels /
    centered).

That duplication is a large slice of the open-time main-thread hang profiled
in #55. Add a bounded (512-entry, FIFO-evicted) module-level cache keyed on
the raw markdown string so identical text parses exactly once and all later
calls — across the React tree, the Picture pass, and callout zones — hit the
cache.

Why this is safe:
  - parseToSegments is referentially transparent (text → segments).
  - All consumers treat the result read-only: buildParagraph iterates and
    copies into a local array; toPlainText only filter/map/joins. No caller
    mutates the array or the segment objects, so sharing a cached reference is
    sound.

Note: this removes *duplicate* parses, not the first parse of each unique
string — a cold open of a file with many distinct text nodes still parses each
once. Progressive/deferred node mounting (issue #55's other hypothesis) is a
larger, separate follow-up; this is the cheap, zero-risk first win.

Adds parseToSegments.test.ts: correctness + cache-identity + shared-cache
(toPlainText) coverage. Also exposes _clearSegmentCache() for tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@LeslieOA LeslieOA marked this pull request as draft May 30, 2026 13:46
Reconsidered: the segment cache is a pure function with a FIFO-bounded Map —
trivially correct, near-zero regression risk. Testing it required mocking the
pure-ESM `hast-util-from-html` chain (moduleNameMapper + a src/ stub that could
itself drift from the real shape) to get the markdown module to load under
ts-jest. That machinery cost far outweighed the assurance value, and a stub
under src/ risks shipping in the package.

Remove the test, the stub, the jest moduleNameMapper entry, and the
test-only `_clearSegmentCache` export. The cache change itself is unchanged and
still covered by the existing 38 tests (which prove parsing isn't broken) plus
on-device verification.

Real markdown-pipeline coverage — including the ESM-under-Jest problem solved
once for the whole chain — belongs in the conformance suite (#16), not bolted
onto this cache change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@LeslieOA LeslieOA marked this pull request as ready for review May 30, 2026 15:32
@LeslieOA LeslieOA merged commit 57adf78 into develop May 30, 2026
1 check passed
@LeslieOA LeslieOA deleted the perf/markdown-parse-cache branch May 30, 2026 21:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant