Repro
On a brain with ~2000+ staged files (~/.gstack-brain-worktree/ + transcripts), /sync-gbrain --full reliably fails the memory stage:
gstack-gbrain-sync (full):
OK code registered + synced gstack-code-… (18.1s)
ERR memory staged 1989 pages → gbrain import (exit 143) after 2100.2s
OK brain-sync curated artifacts pushed (2.4s)
exit 143 = 128 + 15, i.e. the child was killed by signal 15 (SIGTERM). The 2100.2s runtime matches the 35-minute timeout (35 × 60 = 2100s).
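For anyone triaging similar verdict lines, the decoding above can be sketched as a small helper. This assumes the usual shell convention that a status above 128 means "killed by signal (status - 128)"; `describeExit` and the signal table are illustrative names, not part of gstack.

```typescript
// Hypothetical helper: turn a shell-style exit status into a readable verdict.
// Assumes the common convention status = 128 + signal number for signal deaths.
const SIGNAL_NAMES = new Map<number, string>([
  [2, "SIGINT"],
  [9, "SIGKILL"],
  [15, "SIGTERM"],
]);

function describeExit(status: number): string {
  if (status > 128) {
    const signal = status - 128;
    const name = SIGNAL_NAMES.get(signal) ?? `signal ${signal}`;
    return `killed by ${name}`;
  }
  return status === 0 ? "ok" : `failed with exit code ${status}`;
}

console.log(describeExit(143)); // killed by SIGTERM
```

A verdict line that said "timed out" rather than a bare "exit 143" would make this failure mode easier to recognize.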
Root cause
bin/gstack-gbrain-sync.ts:756 hardcodes:
const result = spawnSync("bun", ingestArgs, {
encoding: "utf-8",
timeout: 35 * 60 * 1000, // ← hardcoded; no escape hatch for big brains
env: buildGbrainEnv({ announce: false }),
});
(Same value also at line 602 for the code stage.)
What happens after the timeout
gbrain leaves its checkpoint on disk:
$ cat ~/.gbrain/import-checkpoint.json
{
"dir": "/Users/x/.gstack/.staging-ingest-42625-1779217370565",
"totalFiles": 1989,
"processedIndex": 1000, ← made it 50% through
"completedFiles": 1000,
"timestamp": "2026-05-19T19:30:05.008Z"
}
But the next /sync-gbrain doesn't resume from this checkpoint — the memory-ingest child cleans up the staging dir on SIGTERM, so the checkpoint references a dir that no longer exists. The user re-runs and the next bulk pass also gets killed at 35 min on a fresh set of files. Progress is real (1000 pages got embedded into the default source) but the verdict reads as ERR.
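The stale-checkpoint condition described above can be detected with a few lines. This is a minimal sketch, assuming the checkpoint has exactly the fields shown in the JSON dump; `inspectCheckpoint` and `CheckpointStatus` are hypothetical names, and nothing here reflects how gbrain itself reads the file.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Field names copied from the checkpoint dump in this issue.
interface ImportCheckpoint {
  dir: string;
  totalFiles: number;
  processedIndex: number;
  completedFiles: number;
  timestamp: string;
}

type CheckpointStatus =
  | { kind: "missing" }
  | { kind: "stale"; checkpoint: ImportCheckpoint }
  | { kind: "resumable"; checkpoint: ImportCheckpoint; remaining: number };

const DEFAULT_CHECKPOINT = path.join(os.homedir(), ".gbrain", "import-checkpoint.json");

// Hypothetical helper: classify the checkpoint before starting a new bulk pass.
function inspectCheckpoint(checkpointPath: string = DEFAULT_CHECKPOINT): CheckpointStatus {
  if (!fs.existsSync(checkpointPath)) return { kind: "missing" };
  const checkpoint = JSON.parse(fs.readFileSync(checkpointPath, "utf-8")) as ImportCheckpoint;
  // The failure mode in this issue: the staging dir was removed on SIGTERM.
  if (!fs.existsSync(checkpoint.dir)) return { kind: "stale", checkpoint };
  return {
    kind: "resumable",
    checkpoint,
    remaining: checkpoint.totalFiles - checkpoint.completedFiles,
  };
}
```

With the checkpoint shown above, a run where the staging dir is gone would come back as "stale"; if the dir were still on disk it would come back as "resumable" with 989 files remaining.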
Two asks
1. Make the timeout configurable
const memoryTimeoutMs = parseInt(
process.env.GSTACK_SYNC_MEMORY_TIMEOUT_MS ?? `${35 * 60 * 1000}`,
10,
);
const codeTimeoutMs = parseInt(
process.env.GSTACK_SYNC_CODE_TIMEOUT_MS ?? `${35 * 60 * 1000}`,
10,
);
The default stays unchanged. Big-brain users set GSTACK_SYNC_MEMORY_TIMEOUT_MS=7200000 (2h) in their shell rc and the stage finishes instead of dying mid-import. The docs need an honest update as well: the embedded estimate "~25-35 min for ~11.7K transcripts = ~150ms/page synchronous" is wildly off in practice (see #1612; actual throughput is ~2.1s/file because gbrain isn't batching embeddings).
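One caveat with a bare parseInt: a typo such as GSTACK_SYNC_MEMORY_TIMEOUT_MS=2h yields NaN. A slightly more defensive variant could fall back to the default instead. This is a sketch only; `parseTimeoutMs` and `estimateStageMs` are hypothetical helpers, and the per-file figures are the ones quoted in this issue.

```typescript
const DEFAULT_TIMEOUT_MS = 35 * 60 * 1000; // current hardcoded value

// Hypothetical helper: accept only positive integer millisecond values,
// otherwise keep the default so a malformed env var cannot break the stage.
function parseTimeoutMs(raw: string | undefined, fallbackMs: number = DEFAULT_TIMEOUT_MS): number {
  if (raw === undefined || raw.trim() === "") return fallbackMs;
  const parsed = Number(raw);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallbackMs;
}

// Hypothetical helper: rough wall-clock estimate for a synchronous import.
function estimateStageMs(fileCount: number, msPerFile: number): number {
  return fileCount * msPerFile;
}

// Intended call site (assumption): parseTimeoutMs(process.env.GSTACK_SYNC_MEMORY_TIMEOUT_MS)
console.log(parseTimeoutMs("7200000")); // 7200000
console.log(parseTimeoutMs("2h")); // 2100000 (falls back to the default)

// 1989 staged files at the observed ~2.1s/file: 4176900 ms, about 70 minutes.
console.log(estimateStageMs(1989, 2100));
// The documented case, 11.7K transcripts at 150ms/page: 1755000 ms, about 29 minutes.
console.log(estimateStageMs(11700, 150));
```

At the observed rate, the 1989-file run in the repro needs roughly twice the 35-minute cap, which is why it dies around the halfway mark.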
2. Resume from ~/.gbrain/import-checkpoint.json on next run
If the memory-ingest stage exits with SIGTERM, the staging dir should NOT be cleaned up by the SIGTERM handler — leave it on disk so the next run's gbrain import picks up at processedIndex+1. Alternatively, persist the staging dir path to ~/.gstack/.gbrain-sync-state.json so the next --full consults it first.
Net: a /sync-gbrain --full that SIGTERMs at the 35-min wall today loses 35 minutes of work. With resume, it loses zero — next run picks up where it left off.
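A rough sketch of the second option (persisting the staging dir path), to make the ask concrete. Everything here is an assumption: `finishStaging`, `pendingStagingDir`, and the `pendingStagingDir` JSON field are hypothetical, and the sketch treats the state file as holding only that one field. Only the state file path comes from the proposal above.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Path proposed in this issue; the JSON shape is hypothetical.
const DEFAULT_STATE_FILE = path.join(os.homedir(), ".gstack", ".gbrain-sync-state.json");

interface SyncState {
  pendingStagingDir?: string;
}

// Called when the memory-ingest stage ends. On interruption (e.g. SIGTERM),
// keep the staging dir and remember where it is; on success, clean up both.
function finishStaging(
  stagingDir: string,
  interrupted: boolean,
  stateFile: string = DEFAULT_STATE_FILE,
): void {
  if (interrupted) {
    fs.mkdirSync(path.dirname(stateFile), { recursive: true });
    const state: SyncState = { pendingStagingDir: stagingDir };
    fs.writeFileSync(stateFile, JSON.stringify(state, null, 2));
    return;
  }
  fs.rmSync(stagingDir, { recursive: true, force: true });
  fs.rmSync(stateFile, { force: true });
}

// Called at the start of the next --full run: returns a staging dir to resume
// from, or null if there is nothing usable on disk.
function pendingStagingDir(stateFile: string = DEFAULT_STATE_FILE): string | null {
  if (!fs.existsSync(stateFile)) return null;
  const state = JSON.parse(fs.readFileSync(stateFile, "utf-8")) as SyncState;
  const dir = state.pendingStagingDir;
  return dir !== undefined && fs.existsSync(dir) ? dir : null;
}
```

The next run would call pendingStagingDir() first and hand that directory back to gbrain import, so the existing checkpoint still points at real files. Whether gbrain then resumes on its own from the checkpoint is not something I have verified.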
Environment
- macOS 15.x, Apple Silicon (M-series Mac mini)
- gstack v1.40.0.0
- gbrain v0.33.1.0
- Brain size: ~188k markdown files in ~/brain/ (mostly emails from iCloud + Gmail backfill), ~2000 in ~/.gstack-brain-worktree/
- Engine: Supabase Session Pooler (postgres)
Adjacent issue
The 2.1s/file throughput points at a deeper gbrain-side issue (single-file vs batched OpenAI embedding calls). Filed separately at garrytan/gbrain.
Happy to PR the timeout-knob change if it helps.