security: enforce process limit to prevent fork bombs in sandbox#871
security: enforce process limit to prevent fork bombs in sandbox#871kagura-agent wants to merge 2 commits intoNVIDIA:mainfrom
Conversation
…DIA#809) The sandbox user currently has ulimit -u unlimited, leaving no protection against fork bombs or process exhaustion attacks. A prompt-injected agent could DoS the host by spawning processes until the system runs out of PIDs. Defense in depth across three paths: 1. Dockerfile: set nproc hard limit (512) via /etc/security/limits.d/ so the sandbox user is capped regardless of how the process is started. 2. nemoclaw-start.sh (non-root): explicit ulimit -u 512 since limits.conf only applies when PAM is involved (login/su/sudo), not direct exec. 3. nemoclaw-start.sh (root, NEMOCLAW_CMD): wrap gosu sandbox with ulimit so ad-hoc commands also inherit the process cap. The gateway user is intentionally left uncapped — it needs to manage multiple child processes and should not be restricted. 512 processes is generous for normal agent workloads (node, python, shell pipelines) while preventing exponential fork spawning. Fixes NVIDIA#809
📝 WalkthroughWalkthroughThis PR adds a hard nproc limit for the Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
scripts/nemoclaw-start.sh (1)
230-233: Consider adding a warning on ulimit failure for consistency.In the non-root path (line 192), a warning is printed if
ulimitfails. Here, the failure is silently ignored. While root should always succeed at setting ulimits, adding a warning would provide consistent observability across both execution paths.♻️ Optional: Add warning on failure
if [ ${`#NEMOCLAW_CMD`[@]} -gt 0 ]; then - exec gosu sandbox bash -c 'ulimit -u 512 2>/dev/null; exec "$@"' _ "${NEMOCLAW_CMD[@]}" + exec gosu sandbox bash -c 'ulimit -u 512 2>/dev/null || echo "[SECURITY WARNING] Could not set process limit (ulimit -u)" >&2; exec "$@"' _ "${NEMOCLAW_CMD[@]}" fi🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/nemoclaw-start.sh` around lines 230 - 233, The exec line that runs commands as the sandbox user currently silences ulimit failures (the gosu bash -c 'ulimit -u 512 2>/dev/null; exec "$@"' _ "${NEMOCLAW_CMD[@]}")—add the same warning behavior used in the non-root path by checking ulimit's exit status and printing a warning if it fails; modify the bash -c script to run ulimit -u 512, capture its return code, and echo a warning to stderr (or use the existing warning function) when that return code is non-zero before continuing to exec "$@" so failures are observable while preserving the exec flow.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@scripts/nemoclaw-start.sh`:
- Around line 230-233: The exec line that runs commands as the sandbox user
currently silences ulimit failures (the gosu bash -c 'ulimit -u 512 2>/dev/null;
exec "$@"' _ "${NEMOCLAW_CMD[@]}")—add the same warning behavior used in the
non-root path by checking ulimit's exit status and printing a warning if it
fails; modify the bash -c script to run ulimit -u 512, capture its return code,
and echo a warning to stderr (or use the existing warning function) when that
return code is non-zero before continuing to exec "$@" so failures are
observable while preserving the exec flow.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: b8e98884-c4c6-4b93-bc38-b94be3a18037
📒 Files selected for processing (2)
Dockerfilescripts/nemoclaw-start.sh
Address CodeRabbit nitpick: the non-root path already warns on ulimit failure, but the root/gosu path silently ignored it.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
scripts/nemoclaw-start.sh (1)
187-193: Consider centralizing the nproc policy to avoid drift.The same limit and warning logic appears twice; a shared helper + single constant would keep behavior consistent and easier to maintain.
♻️ Suggested refactor
@@ OPENCLAW="$(command -v openclaw)" # Resolve once, use absolute path everywhere +NPROC_LIMIT=512 + +apply_nproc_limit() { + ulimit -u "${NPROC_LIMIT}" 2>/dev/null || \ + echo "[SECURITY WARNING] Could not set process limit (ulimit -u)" >&2 +} @@ - ulimit -u 512 2>/dev/null || echo "[SECURITY WARNING] Could not set process limit (ulimit -u)" + apply_nproc_limit @@ - exec gosu sandbox bash -c 'ulimit -u 512 2>/dev/null || echo "[SECURITY WARNING] Could not set process limit (ulimit -u)" >&2; exec "$@"' _ "${NEMOCLAW_CMD[@]}" + exec gosu sandbox bash -c "$(declare -f apply_nproc_limit); apply_nproc_limit; exec \"\$@\"" _ "${NEMOCLAW_CMD[@]}"Also applies to: 230-233
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/nemoclaw-start.sh` around lines 187 - 193, Extract the duplicated ulimit logic into a single constant and helper in the script: define a MAX_USER_PROCESSES variable (e.g. MAX_USER_PROCESSES=512) and a function (e.g. set_nproc_limit()) that runs ulimit -u "$MAX_USER_PROCESSES" 2>/dev/null || echo "[SECURITY WARNING] Could not set process limit (ulimit -u)"; then replace both inline occurrences of the ulimit command with a call to set_nproc_limit to centralize the policy and avoid drift.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@scripts/nemoclaw-start.sh`:
- Around line 187-193: Extract the duplicated ulimit logic into a single
constant and helper in the script: define a MAX_USER_PROCESSES variable (e.g.
MAX_USER_PROCESSES=512) and a function (e.g. set_nproc_limit()) that runs ulimit
-u "$MAX_USER_PROCESSES" 2>/dev/null || echo "[SECURITY WARNING] Could not set
process limit (ulimit -u)"; then replace both inline occurrences of the ulimit
command with a call to set_nproc_limit to centralize the policy and avoid drift.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: aa01b982-96c3-404c-bbb7-21656f654354
📒 Files selected for processing (1)
scripts/nemoclaw-start.sh
|
Closing — #830 (merged) covers process limits with a more comprehensive approach (also removes gcc/netcat and drops capabilities). Thanks for the review! |
Summary
Fixes #809 — sandbox user has
ulimit -u unlimited, allowing fork bombs to exhaust host PID space.Changes
Defense in depth across three paths:
nprochard limit (512) via/etc/security/limits.d/ulimit -u 512limits.confonly applies with PAM (login/su/sudo), not direct execgosu sandboxwithulimitThe gateway user is intentionally left uncapped — it manages multiple child processes and should not be restricted.
Why 512?
Generous for normal agent workloads (node, python, shell pipelines) while blocking exponential fork spawning. For reference, a typical OpenClaw agent session uses ~20-50 processes.
Testing
Summary by CodeRabbit