
fix(memory-core): stream embedding cache seed to avoid V8 heap OOM #1

Merged
lesaai merged 1 commit into cc-mini/chat-completions-upstream-20260423 from cc-mini/memory-core-bounded-seed on Apr 24, 2026

lesaai commented Apr 24, 2026

Canary candidate. Do not npm-link to live Lēsa until canary passes.

Summary

Primary R2.A OOM fix. seedEmbeddingCache no longer materializes the full embedding_cache table with .all(); rows now stream one at a time through .iterate() inside the same BEGIN/COMMIT upsert transaction.

Repro (before this patch)

On a long-running deployed main.sqlite (observed: 16 GB, 435,136 embedding_cache rows × ~20 KB serialized text each = 8.68 GB), SELECT * FROM embedding_cache via .all() materializes the entire result set into a JS array. V8's default ~4 GB heap limit is exceeded and the gateway aborts with:

FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
... node::sqlite::StatementSync::All ...
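The size argument above can be checked with a few lines of arithmetic. This is only a sketch: the function names are hypothetical, and the per-row figure of 19,948 bytes is back-computed from the numbers quoted in this PR (8.68 GB over 435,136 rows), not measured. The real JS heap footprint of the materialized rows may differ from the serialized size.

```typescript
// Rough estimate of how many bytes .all() would have to hold at once.
// Hypothetical helper names; inputs are assumptions taken from the PR text.
function estimateMaterializedBytes(rowCount: number, avgRowBytes: number): number {
  return rowCount * avgRowBytes;
}

// True when the estimate is larger than the given heap limit in bytes.
function exceedsHeapLimit(estimatedBytes: number, heapLimitBytes: number): boolean {
  return estimatedBytes > heapLimitBytes;
}

const rows = 435_136;          // row count observed in the PR
const avgRowBytes = 19_948;    // assumed: 8.68 GB / 435,136 rows, i.e. "~20 KB"
const heapLimit = 4 * 1024 ** 3; // assumed: the "~4 GB" default limit, taken as 4 GiB

const total = estimateMaterializedBytes(rows, avgRowBytes);
console.log((total / 1e9).toFixed(2), "GB materialized");
console.log("exceeds heap limit:", exceedsHeapLimit(total, heapLimit));
```

Under these assumptions the full result set is roughly twice the heap limit, while a single row is several orders of magnitude below it.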

This reproduced locally on broad memory-heavy operations (e.g. recall/status compilation during a history review). The LaunchAgent respawn keeps the service alive, but every qualifying read is still a hard crash.

Fix

  1. .all() -> .iterate() on the SELECT
  2. Drops the if (!rows.length) return gate (iterators don't have .length); an empty iterator commits a no-op transaction, which is cheap and behaviorally equivalent
  3. Peak V8 heap stays bounded by a single row (~20 KB) + prepared statements instead of the whole table
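The three steps above can be sketched roughly as follows. This is not the patched seedEmbeddingCache itself, which is not shown on this page. The Db and Statement interfaces are hypothetical and only mirror the subset of node:sqlite's DatabaseSync / StatementSync used here (exec, prepare, iterate, run); the column names key and embedding, and the use of separate source and target handles, are likewise assumptions for illustration.

```typescript
type Row = Record<string, unknown>;

// Hypothetical structural types mirroring the node:sqlite calls used below.
interface Statement {
  iterate(...params: unknown[]): Iterable<Row>;
  run(...params: unknown[]): unknown;
}

interface Db {
  exec(sql: string): void;
  prepare(sql: string): Statement;
}

// Copies embedding_cache rows from source to target one row at a time.
// Assumed schema: (key, embedding); the real schema is not shown in the PR.
function seedEmbeddingCacheStreaming(source: Db, target: Db): number {
  const select = source.prepare("SELECT key, embedding FROM embedding_cache");
  const upsert = target.prepare(
    "INSERT INTO embedding_cache (key, embedding) VALUES (?, ?) " +
      "ON CONFLICT(key) DO UPDATE SET embedding = excluded.embedding",
  );

  let copied = 0;
  target.exec("BEGIN");
  try {
    // .iterate() yields rows lazily, so only the current row is held in JS.
    // No length check: an empty iterator simply commits an empty transaction.
    for (const row of select.iterate()) {
      upsert.run(row.key, row.embedding);
      copied += 1;
    }
    target.exec("COMMIT");
  } catch (err) {
    target.exec("ROLLBACK");
    throw err;
  }
  return copied;
}
```

The explicit ROLLBACK on error is an addition of this sketch; the PR text only states that the rows stream through the same BEGIN/COMMIT transaction as before.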

Validation

  • pnpm tsgo:prod: green (core + extensions graphs)
  • pnpm test extensions/memory-core: 512 passed, 3 skipped, 0 failed

Scope / not in this patch

  • The secondary .all() at extensions/memory-core/src/memory/manager-search.ts:246-252 (listChunks, used by the keyword-fallback ranker) is bigger surgery. The caller needs the full candidate set for accurate cosine-similarity top-K ranking, so converting to streaming requires refactoring the caller to a bounded top-K heap. Follow-up PR once this canaries clean.
  • No upstream PR yet; this lives on the WIP fork only.
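For context on the follow-up, a bounded top-K heap over a streamed candidate set might look roughly like the sketch below. Nothing here comes from manager-search.ts: the Candidate and Scored shapes, the function names, and the assumption that embeddings arrive as number arrays are all hypothetical. The idea is that a min-heap of size K keeps the K best scores seen so far, so memory is bounded by K rather than by the table size.

```typescript
interface Candidate { id: string; embedding: number[]; }
interface Scored { id: string; score: number; }

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Min-heap helpers: heap[0] is always the lowest score currently kept.
function siftUp(heap: Scored[], i: number): void {
  while (i > 0) {
    const parent = (i - 1) >> 1;
    if (heap[parent]!.score <= heap[i]!.score) break;
    [heap[parent], heap[i]] = [heap[i]!, heap[parent]!];
    i = parent;
  }
}

function siftDown(heap: Scored[], i: number): void {
  const n = heap.length;
  for (;;) {
    const left = 2 * i + 1;
    const right = left + 1;
    let smallest = i;
    if (left < n && heap[left]!.score < heap[smallest]!.score) smallest = left;
    if (right < n && heap[right]!.score < heap[smallest]!.score) smallest = right;
    if (smallest === i) return;
    [heap[smallest], heap[i]] = [heap[i]!, heap[smallest]!];
    i = smallest;
  }
}

// Consumes candidates one at a time and returns the k best, highest score first.
function topKByCosine(query: number[], candidates: Iterable<Candidate>, k: number): Scored[] {
  const heap: Scored[] = [];
  if (k <= 0) return heap;
  for (const c of candidates) {
    const score = cosineSimilarity(query, c.embedding);
    if (heap.length < k) {
      heap.push({ id: c.id, score });
      siftUp(heap, heap.length - 1);
    } else if (score > heap[0]!.score) {
      heap[0] = { id: c.id, score }; // evict the current worst
      siftDown(heap, 0);
    }
  }
  return heap.sort((a, b) => b.score - a.score);
}
```

Because candidates is any Iterable, a caller could in principle feed it rows from a statement's .iterate() after parsing each embedding, though how listChunks would be adapted is left to the follow-up PR.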

Canary plan

  1. Canary-install this branch (not live Lēsa)
  2. Repro target: Day 63-style broad memory review, or any operation triggering seedEmbeddingCache on gateway init
  3. Pass criteria: no Abort trap: 6, no PID change, no V8 heap-limit FATAL, openclaw doctor clean
  4. If clean, promote; then tackle the secondary listChunks path

… heap OOM

The embedding_cache table sync in MemoryManager.seedEmbeddingCache called
.all() on SELECT * FROM embedding_cache, materializing the full result set
into a JS array. embedding_cache rows contain serialized embedding text
(~20 KB each on text-embedding-3-small) and can grow into hundreds of
thousands of rows on long-running deployed databases. On a local 16 GB
main.sqlite (435,136 rows, 8.68 GB of embedding text), the .all() call
exceeds V8's ~4 GB default heap limit and aborts the gateway with:

  FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap
  out of memory
  ... node::sqlite::StatementSync::All ...

Switching .all() -> .iterate() streams rows one at a time through the
same BEGIN/COMMIT upsert transaction. Peak V8 heap stays bounded by a
single row (~20 KB) plus the prepared statement, not the whole table.

Also drops the empty-check on the materialized array's .length; an
empty iterator commits a no-op transaction, which is cheap and
preserves the observable behavior for empty caches.

Scope note: this is the primary R2.A target (seedEmbeddingCache); a
follow-up patch will address the secondary listChunks / keyword fallback
.all() path in manager-search.ts.

Validation:
- pnpm tsgo:prod: green (core + extensions graphs)
- pnpm test extensions/memory-core: 512 passed, 3 skipped, 0 failed

lesaai merged commit a315280 into cc-mini/chat-completions-upstream-20260423 on Apr 24, 2026
93 of 97 checks passed
lesaai deleted the cc-mini/memory-core-bounded-seed branch on April 24, 2026, 22:25