fix(memory-core): stream embedding cache seed to avoid V8 heap OOM #1
Merged
lesaai merged 1 commit on Apr 24, 2026
… heap OOM

The embedding_cache table sync in MemoryManager.seedEmbeddingCache called .all() on SELECT * FROM embedding_cache, materializing the full result set into a JS array. embedding_cache rows contain serialized embedding text (~20 KB each on text-embedding-3-small) and can grow into hundreds of thousands of rows on long-running deployed databases.

On a local 16 GB main.sqlite (435,136 rows, 8.68 GB of embedding text), the .all() call exceeds V8's ~4 GB default heap limit and aborts the gateway with:

    FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
    ...
    node::sqlite::StatementSync::All
    ...

Switching .all() -> .iterate() streams rows one at a time through the same BEGIN/COMMIT upsert transaction. Peak V8 heap stays bounded by a single row (~20 KB) plus the prepared statement, not the whole table.

Also drops the empty-check on the materialized array's .length; an empty iterator commits a no-op transaction, which is cheap and preserves the observable behavior for empty caches.

Scope note: this is the primary R2.A target (seedEmbeddingCache); a follow-up patch will address the secondary listChunks / keyword fallback .all() path in manager-search.ts.

Validation:
- pnpm tsgo:prod: green (core + extensions graphs)
- pnpm test extensions/memory-core: 512 passed, 3 skipped, 0 failed
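A minimal sketch of the .all() to .iterate() switch described above, under stated assumptions. The CacheRow shape, the RowStatement and TargetDb interfaces, the ROLLBACK on failure, and the in-memory stand-ins are all hypothetical and exist only so the example runs without a database; the actual seedEmbeddingCache code is not shown in this PR text. The only API fact relied on is that node:sqlite's StatementSync offers both all() and iterate().

```typescript
// Hypothetical row shape; the real embedding_cache schema is not shown here.
interface CacheRow {
  key: string;
  embedding: string; // serialized embedding text (~20 KB per row in the report)
}

// Structural stand-in for a prepared SELECT statement (assumption).
interface RowStatement {
  all(): CacheRow[]; // materializes every row into one array
  iterate(): IterableIterator<CacheRow>; // yields rows one at a time
}

// Structural stand-in for the database being seeded (assumption).
interface TargetDb {
  exec(sql: string): void;
  upsert(row: CacheRow): void;
}

// Streams rows through a single transaction. No emptiness check is needed:
// an empty iterator simply produces BEGIN followed by COMMIT.
function seedStreaming(source: RowStatement, target: TargetDb): number {
  let copied = 0;
  target.exec("BEGIN");
  try {
    for (const row of source.iterate()) {
      target.upsert(row); // only the current row needs to be live
      copied += 1;
    }
    target.exec("COMMIT");
  } catch (err) {
    target.exec("ROLLBACK"); // assumption: the original may handle errors differently
    throw err;
  }
  return copied;
}

// In-memory stand-ins used only to exercise the sketch.
function makeLazySource(count: number): RowStatement {
  function* rows(): IterableIterator<CacheRow> {
    for (let i = 0; i < count; i++) {
      yield { key: `k${i}`, embedding: `vec-${i}` };
    }
  }
  return { all: () => Array.from(rows()), iterate: () => rows() };
}

class RecordingTarget implements TargetDb {
  readonly statements: string[] = [];
  readonly rows = new Map<string, CacheRow>();
  exec(sql: string): void {
    this.statements.push(sql);
  }
  upsert(row: CacheRow): void {
    this.rows.set(row.key, row);
  }
}

const demoTarget = new RecordingTarget();
const demoCopied = seedStreaming(makeLazySource(3), demoTarget);
console.log(demoCopied, demoTarget.statements);
```

With an empty source the same function still issues BEGIN and COMMIT and copies nothing, which matches the no-op transaction the commit message describes.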
a315280
into
cc-mini/chat-completions-upstream-20260423
93 of 97 checks passed
Canary candidate. Do not npm-link to live Lēsa until canary passes.
Summary
Primary R2.A OOM fix.
seedEmbeddingCache no longer materializes the full embedding_cache table with .all(); rows now stream one at a time through .iterate() inside the same BEGIN/COMMIT upsert transaction.
Repro (before this patch)
On a long-running deployed main.sqlite (observed: 16 GB, 435,136 embedding_cache rows × ~20 KB serialized text each = 8.68 GB), SELECT * FROM embedding_cache via .all() materializes the entire result set into a JS array. V8's default ~4 GB heap limit is exceeded and the gateway aborts with:
    FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
This reproduced locally on broad memory-heavy operations (e.g. recall/status compilation during a history review). LaunchAgent respawn keeps the service alive, but it is a hard crash on every qualifying read.
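The arithmetic behind the repro can be checked with a few lines. This is a rough estimate only: it reads "GB" as decimal gigabytes and ignores per-string and per-object overhead in V8, both of which are assumptions and would make the real footprint larger, not smaller.

```typescript
// Figures quoted in the PR description.
const ROW_COUNT = 435136;
const TOTAL_TEXT_BYTES = 8.68e9; // "8.68 GB", read as decimal gigabytes (assumption)
const HEAP_LIMIT_BYTES = 4e9; // "~4 GB" default heap limit, as stated in the PR

// Average serialized text per row.
const bytesPerRow = TOTAL_TEXT_BYTES / ROW_COUNT;

// How far the fully materialized result set overshoots the heap limit.
const overshoot = TOTAL_TEXT_BYTES / HEAP_LIMIT_BYTES;

// Approximate row count at which .all() would already hit the limit.
const rowsBeforeLimit = Math.floor(HEAP_LIMIT_BYTES / bytesPerRow);

console.log(`average row: ${Math.round(bytesPerRow)} bytes`);
console.log(`materialized set is ${overshoot.toFixed(2)}x the heap limit`);
console.log(`limit reached after roughly ${rowsBeforeLimit} rows`);
```

Under these assumptions the average row is just under 20 KB, the materialized set is a little over twice the heap limit, and a table of roughly 200,000 rows would already be enough to trigger the abort.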
Fix
- .all() → .iterate() on the SELECT
- Drop the if (!rows.length) return gate (iterators don't have .length); an empty iterator commits a no-op transaction, which is cheap and behaviorally equivalent
Validation
- pnpm tsgo:prod: green (core + extensions graphs)
- pnpm test extensions/memory-core: 512 passed, 3 skipped, 0 failed
Scope / not in this patch
.all() at extensions/memory-core/src/memory/manager-search.ts:246-252 (listChunks, used by the keyword-fallback ranker) is a bigger surgery. The caller needs the full candidate set for accurate cosine-similarity top-K ranking, so converting to streaming requires refactoring the caller to a bounded top-K heap. Follow-up PR once this canaries clean.
Canary plan
- … seedEmbeddingCache on gateway init
- … Abort trap: 6, no PID change, no V8 heap-limit FATAL, openclaw doctor clean
- … listChunks path
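For the listChunks follow-up mentioned under Scope, a bounded top-K heap could look roughly like the sketch below. It is not the planned implementation: the TopK class, the chunk shape with a numeric embedding array, and the topKBySimilarity helper are all hypothetical names chosen for illustration. The idea is that a min-heap of size K keeps only the K best candidates seen so far, so rows can be streamed and ranked without holding the whole candidate set.

```typescript
interface Scored<T> {
  item: T;
  score: number;
}

// Min-heap on score, capped at k entries; heap[0] is the weakest kept candidate.
class TopK<T> {
  private heap: Scored<T>[] = [];
  constructor(private readonly k: number) {}

  offer(item: T, score: number): void {
    if (this.k <= 0) return;
    if (this.heap.length < this.k) {
      this.heap.push({ item, score });
      this.siftUp(this.heap.length - 1);
    } else if (score > this.heap[0].score) {
      this.heap[0] = { item, score }; // evict the current weakest
      this.siftDown(0);
    }
  }

  // Kept candidates, best first.
  result(): Scored<T>[] {
    return [...this.heap].sort((a, b) => b.score - a.score);
  }

  private siftUp(i: number): void {
    while (i > 0) {
      const parent = (i - 1) >> 1;
      if (this.heap[parent].score <= this.heap[i].score) break;
      [this.heap[parent], this.heap[i]] = [this.heap[i], this.heap[parent]];
      i = parent;
    }
  }

  private siftDown(i: number): void {
    const n = this.heap.length;
    for (;;) {
      const left = 2 * i + 1;
      const right = left + 1;
      let smallest = i;
      if (left < n && this.heap[left].score < this.heap[smallest].score) smallest = left;
      if (right < n && this.heap[right].score < this.heap[smallest].score) smallest = right;
      if (smallest === i) break;
      [this.heap[smallest], this.heap[i]] = [this.heap[i], this.heap[smallest]];
      i = smallest;
    }
  }
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Ranks a stream of chunks while keeping at most k of them in memory.
function topKBySimilarity<T extends { embedding: number[] }>(
  query: number[],
  chunks: Iterable<T>,
  k: number,
): Scored<T>[] {
  const top = new TopK<T>(k);
  for (const chunk of chunks) {
    top.offer(chunk, cosineSimilarity(query, chunk.embedding));
  }
  return top.result();
}
```

Because every candidate is still scored, the ranking is the same as sorting the full set and taking the first K (up to tie order); only the memory held at any moment changes, from the whole candidate set to K entries plus the current row.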