Performance: Optimize VAD segment energy tracking to zero-allocation running sum#240
Performance: Optimize VAD segment energy tracking to zero-allocation running sum#240
Conversation
Replaces `speechEnergies` and `silenceEnergies` arrays in `AudioSegmentProcessor` with a running `speechEnergySum` and `speechEnergyCount`. This eliminates per-chunk array allocations, `.push()` calls, and `.reduce()` operations in the hot loop, reducing CPU overhead and memory allocations.
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
📝 WalkthroughWalkthroughThe pull request optimizes AudioSegmentProcessor energy averaging by replacing per-chunk energy arrays with running totals (sum and count). This eliminates continuous array allocations and reduces garbage collection pressure in the streaming hot path while preserving functional behavior. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~15 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Review Summary by QodoOptimize VAD energy tracking to zero-allocation running sum
WalkthroughsDescription• Replace array-based energy tracking with O(1) running sum • Eliminate per-chunk array allocations and .reduce() calls • Remove unused silenceEnergies array tracking • Reduce GC pressure in VAD hot loop Diagramflowchart LR
A["Array push per chunk"] -->|"Replace with"| B["Running sum += energy"]
C["reduce() on finalize"] -->|"Replace with"| D["sum / count division"]
E["speechEnergies array"] -->|"Remove"| F["speechEnergySum number"]
G["silenceEnergies array"] -->|"Remove"| H["Unused tracking"]
B --> I["Zero-allocation VAD loop"]
D --> I
File Changes1. src/lib/audio/AudioSegmentProcessor.ts
|
Code Review by Qodo🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (0)
Great, no issues found!Qodo reviewed your code and found no material issues that require reviewⓘ The new review experience is currently in Beta. Learn more |
There was a problem hiding this comment.
Hey - I've left some high level feedback:
- Consider explicitly resetting
speechEnergySumandspeechEnergyCountwhen a speech segment ends (right after computingavgEnergy) so the state stays self-consistent and future changes to the state machine can’t accidentally reuse stale energy data. - Now that
silenceEnergiesand any silence energy accumulation are gone, it may be worth adding a short code comment nearstartSilenceor the silence-handling branches to document that silence energy is intentionally ignored and not used in any downstream statistics.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider explicitly resetting `speechEnergySum` and `speechEnergyCount` when a speech segment ends (right after computing `avgEnergy`) so the state stays self-consistent and future changes to the state machine can’t accidentally reuse stale energy data.
- Now that `silenceEnergies` and any silence energy accumulation are gone, it may be worth adding a short code comment near `startSilence` or the silence-handling branches to document that silence energy is intentionally ignored and not used in any downstream statistics.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/lib/audio/AudioSegmentProcessor.ts`:
- Around line 252-253: The energy for a chunk seeded by a proactive split
restart is being double-counted: when startSpeech(...) seeds the split-boundary
chunk, the subsequent increments to this.state.speechEnergySum and
this.state.speechEnergyCount (the two spots reported) add the same chunk again
and distort avgEnergy/energyIntegral. Fix by tracking when a chunk was seeded by
startSpeech (e.g., add a boolean flag on this.state like
splitSeededOrSeededByStartSpeech) that startSpeech(...) sets for the seeded
chunk, then guard the increments to this.state.speechEnergySum and
this.state.speechEnergyCount so they only run when that flag is false; after the
first legitimate use clear the flag so future chunks are counted normally.
Ensure you apply the same guard in both increment sites (the two diffs
reported).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: c3660bf9-6227-44ce-a433-2d242e612f1c
📒 Files selected for processing (2)
.jules/bolt.mdsrc/lib/audio/AudioSegmentProcessor.ts
| this.state.speechEnergySum += energy; | ||
| this.state.speechEnergyCount++; |
There was a problem hiding this comment.
Prevent split-boundary chunk from being counted twice.
After proactive split restart (Line 209), this same chunk is already seeded by startSpeech(...) and then re-added here, skewing avgEnergy/energyIntegral for split segments.
💡 Suggested fix
processAudioData(
chunk: Float32Array,
currentTime: number,
energy: number
): ProcessedSegment[] {
@@
- // --- Proactive Segment Splitting ---
+ let speechStartedThisChunk = false;
+
+ // --- Proactive Segment Splitting ---
if (this.state.inSpeech && this.state.speechStartTime !== null) {
@@
// Start new segment immediately
this.startSpeech(currentTime, energy);
+ speechStartedThisChunk = true;
}
}
@@
this.startSpeech(realStartTime, energy);
+ speechStartedThisChunk = true;
@@
if (silenceDuration < this.options.maxSilenceWithinSpeech) {
// Not yet enough silence to consider it a break
- this.state.speechEnergySum += energy;
- this.state.speechEnergyCount++;
+ if (!speechStartedThisChunk) {
+ this.state.speechEnergySum += energy;
+ this.state.speechEnergyCount++;
+ }
} else if (isConfirmedSilence) {
@@
} else {
// Continue in current state
- if (this.state.inSpeech) {
+ if (this.state.inSpeech && !speechStartedThisChunk) {
this.state.speechEnergySum += energy;
this.state.speechEnergyCount++;
} else {
// Silence, do nothing with energies
}
}Also applies to: 283-284
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/lib/audio/AudioSegmentProcessor.ts` around lines 252 - 253, The energy
for a chunk seeded by a proactive split restart is being double-counted: when
startSpeech(...) seeds the split-boundary chunk, the subsequent increments to
this.state.speechEnergySum and this.state.speechEnergyCount (the two spots
reported) add the same chunk again and distort avgEnergy/energyIntegral. Fix by
tracking when a chunk was seeded by startSpeech (e.g., add a boolean flag on
this.state like splitSeededOrSeededByStartSpeech) that startSpeech(...) sets for
the seeded chunk, then guard the increments to this.state.speechEnergySum and
this.state.speechEnergyCount so they only run when that flag is false; after the
first legitimate use clear the flag so future chunks are counted normally.
Ensure you apply the same guard in both increment sites (the two diffs
reported).
There was a problem hiding this comment.
Code Review
This pull request optimizes the AudioSegmentProcessor by replacing array-based energy accumulation with a running sum and count. This change converts the average energy calculation from an O(n) operation with frequent memory allocations to an O(1) zero-allocation operation, reducing GC pressure in the high-frequency audio processing path. The documentation in .jules/bolt.md has been updated to reflect this optimization. I have no feedback to provide.
What changed
Replaced
speechEnergiesandsilenceEnergiesarrays inAudioSegmentProcessorwith a runningspeechEnergySumandspeechEnergyCount.Why it was needed (bottleneck evidence)
The
processAudioDatafunction runs very frequently (every ~80ms per 16kHz chunk). It was pushingenergyfloats into arrays (speechEnergiesandsilenceEnergies) and later calling.reduce()to calculate the average. This caused unnecessary object allocations and array iterations, contributing to main-thread GC churn during active speech.Impact
.push()per audio chunk..reduce()) to calculate the average energy when a segment finalizes.silenceEnergies).How to verify
npx vitest run src/lib/audio/AudioSegmentProcessor.test.tsto confirm VAD functionality remains identical.PR created automatically by Jules for task 10749343366007178863 started by @ysdede
Summary by Sourcery
Optimize VAD energy tracking in AudioSegmentProcessor to avoid per-chunk allocations and O(N) averaging in the hot path.
Enhancements:
Summary by CodeRabbit