Performance: Optimize LCS algorithm by ysdede · Pull Request #153 · ysdede/parakeet.js

ysdede · 2026-04-11T16:35:45Z

What changed
Unrolled the inner j loop in LCSPTFAMerger._lcsSubstring 8x and extracted X[i - 1] out of the inner loop into a cached local variable x_i.

Why it was needed
The Longest Common Subsequence (LCS) dynamic programming matrix is filled per-chunk when merging transcriptions. The manual nested loops over array elements introduce significant iteration and bounds-checking overhead in V8.

Impact
Benchmarks of the _lcsSubstring function show a significant performance increase (~1.8x to ~2x speedup). Filling a 100x100 matrix 50,000 times went from ~2197ms to ~1216ms in Node.js v22.

How to verify
Run the vitest test suite. The tests/long_audio_chunking.test.mjs test covers the LCSPTFAMerger algorithm usage. Tests should pass without any regressions.

PR created automatically by Jules for task 13776895976477544255 started by @ysdede

Summary by Sourcery

Optimize the LCSPTFAMerger LCS substring computation for better performance in long-audio transcript merging.

Enhancements:

Unroll the inner LCS dynamic programming loop in LCSPTFAMerger and cache per-iteration values to reduce overhead in tight typed array loops.

Documentation:

Document the LCS matrix loop-unrolling performance findings and guidance in the internal .jules/bolt.md playbook.

Summary by CodeRabbit

Performance
- Optimized longest common subsequence computation with an 8x loop unrolling enhancement, delivering approximately 80% speedup in execution performance.

Unrolls the inner dynamic programming loop in `_lcsSubstring` by 8x and caches the outer loop variable `X[i - 1]` to reduce overhead and improve execution speed during streaming transcription chunk merges.

google-labs-jules · 2026-04-11T16:35:46Z

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.

For security, I will only act on instructions from the user who triggered this task.

coderabbitai · 2026-04-11T16:35:59Z

📝 Walkthrough

Walkthrough

This pull request optimizes the Longest Common Subsequence (LCS) algorithm in LCSPTFAMerger by implementing 8-way loop unrolling in the _lcsSubstring function and caching array access, reporting an ~80% V8 speedup. Documentation is updated to log the optimization technique.

Changes

Cohort / File(s)	Summary
Documentation Update `.jules/bolt.md`	Added dated entry documenting the loop unrolling optimization and its ~80% speedup impact on V8, with recommended action for applying similar patterns to dynamic-programming hot paths.
LCS Algorithm Optimization `src/parakeet.js`	Refactored `_lcsSubstring` inner loop with 8-way unrolling (`j += 8`), cached `X[i-1]` into `x_i` variable, and added tail loop for remaining iterations while preserving all DP state updates and max-length tracking logic.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested labels

type/performance, status/ready

Poem

🐰 Eight hops at once, through sequences we race,
Caching and unrolling, quickening the pace,
Where matrices align in unison flow,
Eighty percent faster—let performance glow!
A rabbit's delight: optimization so bright! ✨

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and concisely summarizes the main change: optimizing the LCS algorithm through loop unrolling in LCSPTFAMerger.
Description check	✅ Passed	The description covers the key sections (What changed, Why needed, Impact, Verification), but the PR template checklist is not completed with selections or detailed risk assessment.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch bolt-performance-lcs-unroll-13776895976477544255

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

sourcery-ai

Hey - I've left some high level feedback:

The 8x-unrolled inner loop is extremely repetitive and hard to audit; consider extracting the repeated temp/LCS[...]/maxLen update pattern into a small helper or a local inline function that takes an offset and index so that changing the unroll factor or fixing a bug doesn’t require editing eight nearly identical blocks.
It would help future maintainers if you added a brief comment above the unrolled loop explaining the DP invariant (prev/LCS[j] relationship) and the j <= n - 7 / tail-loop boundary assumptions, so that someone tweaking the unroll factor or indices can verify correctness more easily.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- The 8x-unrolled inner loop is extremely repetitive and hard to audit; consider extracting the repeated `temp`/`LCS[...]`/`maxLen` update pattern into a small helper or a local inline function that takes an offset and index so that changing the unroll factor or fixing a bug doesn’t require editing eight nearly identical blocks.
- It would help future maintainers if you added a brief comment above the unrolled loop explaining the DP invariant (`prev`/`LCS[j]` relationship) and the `j <= n - 7` / tail-loop boundary assumptions, so that someone tweaking the unroll factor or indices can verify correctness more easily.

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

coderabbitai

🧹 Nitpick comments (1)

src/parakeet.js (1)

1954-2059: Add boundary-focused regression tests for the unrolled LCS path.

The manual 8-way loop unroll in _lcsSubstring handles critical token merging logic and should have explicit tests for n < 8, n === 8, and n % 8 !== 0 cases against a reference implementation to prevent off-by-one regressions.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/parakeet.js` around lines 1954 - 2059, The unrolled 8-way loop in
_lcsSubstring lacks regression tests for boundary sizes and can regress on
off-by-one cases; add unit tests that compare _lcsSubstring's results to a
simple reference implementation for cases n < 8, n === 8, and n % 8 !== 0.
Implement tests that (1) generate varied input pairs including empty, short
(lengths 1..7), exactly 8, and lengths like 9, 15, 17, (2) include
repeating/merged-token patterns to exercise token-merging behavior, and (3)
assert equality of returned max length and substring positions (use the same
return shape as _lcsSubstring). Put tests next to other string/sequence tests
and name them clearly (e.g. "_lcsSubstring boundary sizes vs reference") so
future regressions are caught.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/parakeet.js`:
- Around line 1954-2059: The unrolled 8-way loop in _lcsSubstring lacks
regression tests for boundary sizes and can regress on off-by-one cases; add
unit tests that compare _lcsSubstring's results to a simple reference
implementation for cases n < 8, n === 8, and n % 8 !== 0. Implement tests that
(1) generate varied input pairs including empty, short (lengths 1..7), exactly
8, and lengths like 9, 15, 17, (2) include repeating/merged-token patterns to
exercise token-merging behavior, and (3) assert equality of returned max length
and substring positions (use the same return shape as _lcsSubstring). Put tests
next to other string/sequence tests and name them clearly (e.g. "_lcsSubstring
boundary sizes vs reference") so future regressions are caught.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 945cf63c-26da-43f9-8565-df4aed030ce4

📥 Commits

Reviewing files that changed from the base of the PR and between 262e1f9 and 7ed8795.

📒 Files selected for processing (2)

.jules/bolt.md
src/parakeet.js

gemini-code-assist

Code Review

This pull request optimizes the Longest Common Subsequence calculation in src/parakeet.js by implementing 8x manual loop unrolling and caching array accesses to improve performance. It also documents this optimization in .jules/bolt.md and removes an incorrect comment regarding the traversal direction. I have no feedback to provide.

Performance: Optimize LCS algorithm in LCSPTFAMerger

7ed8795

Unrolls the inner dynamic programming loop in `_lcsSubstring` by 8x and caches the outer loop variable `X[i - 1]` to reduce overhead and improve execution speed during streaming transcription chunk merges.

sourcery-ai bot reviewed Apr 11, 2026

View reviewed changes

coderabbitai bot reviewed Apr 11, 2026

View reviewed changes

gemini-code-assist bot reviewed Apr 11, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Performance: Optimize LCS algorithm#153

Performance: Optimize LCS algorithm#153
ysdede wants to merge 1 commit intomasterfrom
bolt-performance-lcs-unroll-13776895976477544255

ysdede commented Apr 11, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

google-labs-jules bot commented Apr 11, 2026

Uh oh!

coderabbitai bot commented Apr 11, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Suggested labels

Poem

Uh oh!

sourcery-ai bot left a comment

Uh oh!

coderabbitai bot left a comment

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

ysdede commented Apr 11, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by Sourcery

Summary by CodeRabbit

Uh oh!

google-labs-jules bot commented Apr 11, 2026

Uh oh!

coderabbitai bot commented Apr 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Suggested labels

Poem

Uh oh!

sourcery-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

ysdede commented Apr 11, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Apr 11, 2026 •

edited

Loading