fix: raise phantom container memory limit from 2G to 8G#52
Merged
Conversation
The 2 GiB cgroup ceiling OOM-killed Claude Code judge subprocesses under evolution load. A post-session evolution cycle spawns up to five concurrent `bun` + `cli.js` subprocesses via `runJudgeQuery` (observation, regression, constitution, safety, consolidation), each holding 300 to 500 MiB RSS, on top of the main phantom process and whatever agent query subprocess is in flight. Peak concurrent demand is 2.5 to 4 GiB, which crossed the 2 GiB ceiling and triggered SIGKILLs that phase 1's `runJudgeQuery` caught and reported as "Claude Code process terminated by signal SIGKILL", failing closed on the safety and constitution gates and dropping to heuristics on observation and regression.

Raising the limit to 8 GiB gives generous headroom for peak judge concurrency on a host with 30 GiB total (Hetzner CX53 default), leaving 14 GiB free after the phantom (8G), qdrant (4G), and ollama (4G) caps. The reservation is bumped from 256 MiB to 512 MiB to match the healthier steady-state baseline.

Root cause observed on the wehshi VM: the SIGKILL cascade began within 20 minutes of enabling LLM judges, the journalctl kernel log showed "Memory cgroup out of memory" events charged to the phantom container's memcg, and `docker stats` reported phantom pinned at 2 GiB / 2 GiB at 99.98 percent while the host sat at 27 GiB free.
Summary
Raises the phantom container memory limit from 2 GiB to 8 GiB (and the reservation from 256 MiB to 512 MiB) in `docker-compose.yaml` and `docker-compose.user.yaml`.

Root cause
The 2 GiB cgroup ceiling could not hold peak LLM-judge concurrency. A post-session evolution cycle spawns up to five concurrent `bun` + `cli.js` subprocesses via `runJudgeQuery` (observation, regression, constitution, safety, consolidation). Each judge subprocess holds 300 to 500 MiB RSS, and they run on top of the main phantom process plus whatever agent query subprocess is in flight. Peak concurrent demand lands between 2.5 and 4 GiB, which crosses the 2 GiB ceiling and triggers the container's memcg OOM killer.

Phase 1's `runJudgeQuery` catches the resulting `SIGKILL`, the engine correctly fails closed on the safety and constitution gates, and the other judges fall back to heuristics, so the main phantom process never crashes. But every LLM judge call after the first kill fails, which defeats the point of having judges enabled at all.

Evidence captured live
Observed on the wehshi Specter VM within 20 minutes of enabling LLM judges (post `claude login` + restart):

- `docker stats`: `phantom 2GiB / 2GiB 99.98% 178.54%`
- `docker inspect phantom .HostConfig.Memory`: `2147483648`
- `journalctl -k`: repeated `Memory cgroup out of memory: Killed process <pid> (bun)` events charged to the phantom container's `oom_memcg`, with `anon-rss` per killed subprocess ranging from 99 MiB to 502 MiB
- `Claude Code process terminated by signal SIGKILL` messages from the observation, regression, constitution, safety, and consolidation judges
- `free -h`: 30 GiB total, 27 GiB available at the time of the kills, so this was strictly a container cap, not a VM sizing problem

Sizing rationale
Hetzner CX53 (the Specter default) ships with 30 GiB RAM. With phantom at 8 GiB, qdrant at 4 GiB, and ollama at 4 GiB, total committed ceilings are 16 GiB, leaving 14 GiB of host headroom for the OS, Docker daemon, and any transient bursts. Actual steady-state phantom RSS is well under 1 GiB, so the 8 GiB cap is a generous upper bound rather than a sustained reservation.
Test plan
- `docker compose up -d phantom` on wehshi against the new compose
- `docker inspect phantom --format '{{.HostConfig.Memory}}'` reports `8589934592`
- `docker stats` steady-state shows phantom usage under the new ceiling
- No new `SIGKILL` log lines from `runJudgeQuery`
- `journalctl -k --since "10 min ago"` shows no new `Memory cgroup out of memory` events charged to the phantom memcg
- `docker compose up -d phantom` on each

Notes
Both compose files are updated because new Specter deploys use `docker-compose.user.yaml` (Docker Hub image), while source-built deploys use `docker-compose.yaml`. Keeping them consistent means every future deployment path inherits the new ceiling.
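For concreteness, the shape of the change in both files is roughly the following. This is a sketch: the service name `phantom` and the `mem_limit`/`mem_reservation` keys are assumed from standard Compose syntax, not copied from the actual diff.

```yaml
services:
  phantom:
    mem_limit: 8g          # was 2g: hard ceiling, now clears peak judge concurrency
    mem_reservation: 512m  # was 256m: matches the healthier steady-state baseline
```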