fix(jailer): replace hardcoded FD limits with dynamic closure strategies by lilongen · Pull Request #406 · boxlite-ai/boxlite

lilongen · 2026-03-26T03:22:44Z

Summary

Replace hardcoded FD upper bounds (1024 on Linux, 4096 on macOS) with dynamic limits from getrlimit(RLIMIT_NOFILE), fixing FD leakage on systems with raised ulimit -n
Add /proc/self/fd enumeration via raw getdents64 syscall as an efficient middle strategy on Linux (zero heap allocation, async-signal-safe)
Restructure close_fds_from() into a 3-strategy cascade: close_range → /proc/self/fd → brute-force with dynamic limit

Problem

The FD cleanup in the pre_exec hook used hardcoded upper bounds:

Linux: for fd in first_fd..1024
macOS: for fd in first_fd..4096

On production systems with ulimit -n 65536 or higher, any FDs above these limits leaked into the jailed process, potentially exposing credentials, database connections, or network sockets inherited from the parent.

Solution

Linux — 3-strategy cascade

Strategy	Condition	Complexity	Description
1. `close_range`	Linux 5.9+	O(1)	Kernel closes all FDs in range (already existed)
2. `/proc/self/fd`	`/proc` mounted	O(open FDs)	Enumerate via raw `getdents64`, close only open FDs
3. Brute-force	Fallback	O(ulimit)	Close `first_fd..getrlimit(RLIMIT_NOFILE)`

macOS

Brute-force close with dynamic limit from getrlimit(RLIMIT_NOFILE) (replaces hardcoded 4096).

Key constraints

All operations are async-signal-safe (required for pre_exec context):

No heap allocation — stack buffer for getdents64, byte literals for paths
No io::Error, String, Vec, CString — raw syscalls only
getrlimit is POSIX async-signal-safe
RLIM_INFINITY handled by capping to 1,048,576 (Linux nr_open default)

Test plan

test_close_high_numbered_fd — creates FD 2000 (above old limits), verifies closure
test_get_max_fd_returns_positive — validates dynamic limit query
test_parse_fd_from_name (Linux) — covers valid FDs, ./.., empty, i32 overflow
All 4 existing tests continue to pass
cargo clippy -D warnings — zero warnings on both platforms
cargo fmt --check — clean
Tested on macOS ARM64 (6/6 passed) and Linux ARM64 via Lima (7/7 passed)

🤖 Generated with Claude Code

Copilot

Pull request overview

Updates the jailer’s pre-exec FD cleanup to avoid leaking inherited file descriptors on hosts with high ulimit -n, while maintaining the async-signal-safe constraint required by CommandExt::pre_exec.

Changes:

Replace hardcoded FD upper bounds with a dynamic limit from getrlimit(RLIMIT_NOFILE).
Add a Linux-only middle strategy that enumerates /proc/self/fd via raw getdents64 to close only actually-open FDs.
Add tests covering high-numbered FD closure and Linux FD-name parsing.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-03-26T03:26:11Z

+    loop {
+        let nread = unsafe {
+            libc::syscall(
+                libc::SYS_getdents64,
+                dir_fd,
+                buf.as_mut_ptr(),
+                buf.len() as libc::c_uint,
+            )
+        };
+
+        if nread <= 0 {
+            break;
+        }
+
+        let mut offset = 0usize;
+        while offset < nread as usize {
+            // SAFETY: getdents64 returns packed linux_dirent64 structs.
+            // d_reclen is at byte offset 16. Use read_unaligned because the
+            // buffer is a byte array and the u16 field may not be 2-byte aligned.
+            let d_reclen =
+                unsafe { buf.as_ptr().add(offset + 16).cast::<u16>().read_unaligned() } as usize;
+
+            if d_reclen == 0 || offset + d_reclen > nread as usize {
+                break;
+            }


close_fds_via_proc currently returns true even if getdents64 fails (e.g., syscall returns -1 for EINTR/EBADF). That would skip the brute-force fallback and can leave FDs open. Consider treating nread < 0 (and also malformed d_reclen conditions) as a failure: close dir_fd and return false so brute_force_close_fds() runs.

Copilot · 2026-03-26T03:26:12Z

+    if result == 0 && rlim.rlim_cur > 0 {
+        if rlim.rlim_cur < i32::MAX as u64 {
+            rlim.rlim_cur as i32
+        } else {
+            // rlim_cur is RLIM_INFINITY or very large; cap to a safe maximum.
+            // 1048576 (2^20) matches Linux's default /proc/sys/fs/nr_open.
+            1_048_576
+        }


get_max_fd() uses if rlim.rlim_cur < i32::MAX as u64 which incorrectly treats an exact i32::MAX soft limit as “too large” and caps it to 1,048,576. This looks like an off-by-one comparison; use <= if the intent is “fits in i32”.

Copilot · 2026-03-26T03:26:12Z

+        // getrlimit failed or returned 0; safe default per POSIX OPEN_MAX
+        1024


If getrlimit(RLIMIT_NOFILE) fails (or returns 0), get_max_fd() falls back to 1024, which can reintroduce the original FD-leak problem on hosts with a higher limit (since brute-force will stop at 1024). Consider a fallback that still errs on the side of closing more FDs (e.g., a higher cap like the existing 1,048,576) to preserve the security invariant even when getrlimit is blocked or fails.

Suggested change

// getrlimit failed or returned 0; safe default per POSIX OPEN_MAX

1024

// getrlimit failed or returned 0; use the same conservative cap to

// ensure we still attempt to close a wide range of FDs.

1_048_576

DorianZheng · 2026-04-17T07:25:50Z

Same getdents64 bug as #407: in fd.rs:71-118, nread < 0 is treated as success so close_fds_via_proc() returns true on failure and the getrlimit-based fallback never kicks in. Fix is to return false when getdents64 fails. Would also be good to add a test that exercises this error path.

The FD cleanup in pre_exec used hardcoded upper bounds (1024 on Linux, 4096 on macOS). On systems with raised ulimit -n, FDs above these limits leaked into jailed processes, potentially exposing credentials, database connections, or network sockets. Replace with a 3-strategy cascade (Linux): 1. close_range(first_fd, ~0U, 0) — O(1), Linux 5.9+ 2. /proc/self/fd enumeration via raw getdents64 — no heap allocation 3. Brute-force close with dynamic limit from getrlimit(RLIMIT_NOFILE) macOS uses brute-force with dynamic getrlimit limit (replaces hardcoded 4096). All operations remain async-signal-safe for the pre_exec context. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Address PR boxlite-ai#406 review feedback from Copilot and DorianZheng: - Return false from close_fds_via_proc when getdents64 fails (nread < 0) so the brute-force fallback runs instead of silently skipping FDs - Fix off-by-one: use <= i32::MAX instead of < i32::MAX in get_max_fd() - Raise getrlimit failure fallback from 1024 to 1_048_576 to preserve the security invariant on hosts where getrlimit is blocked - Add test for brute-force fallback path Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

lilongen · 2026-04-21T06:07:54Z

Addressed all review feedback in commit f90c941:

@copilot + @DorianZheng — close_fds_via_proc returns false on getdents64 failure

Split nread <= 0 into two checks: nread < 0 (error → return false, triggers brute-force fallback) and nread == 0 (normal EOF)

@copilot — Off-by-one in get_max_fd()

Changed < i32::MAX to <= i32::MAX so exact i32::MAX soft limit is handled correctly

@copilot — Fallback too low when getrlimit fails

Raised from 1024 to 1_048_576 to preserve security invariant

@DorianZheng — Added test for fallback path

New test_brute_force_fallback_closes_fds exercises brute_force_close_fds() directly

All tests pass, clippy clean.

Copilot

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 3 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-21T06:11:51Z

+        // Raise the soft FD limit so we can dup2 to a high FD number.
+        // Some systems default to ulimit -n 1024, which would prevent dup2 to FD 2000.
+        let new_limit = libc::rlimit {
+            rlim_cur: 4096,
+            rlim_max: 4096,
+        };
+        // SAFETY: setrlimit is async-signal-safe. Raising soft limit within hard limit is allowed.
+        if unsafe { libc::setrlimit(libc::RLIMIT_NOFILE, &new_limit) } != 0 {
+            // Can't raise limit (hard limit too low) — skip test gracefully
+            return 0;
+        }
+
+        // dup2 stdout to a high FD number that exceeds the old hardcoded limits.
+        // Use 2000 on all platforms — above old Linux limit (1024) and reasonable.
+        let high_fd: i32 = 2000;
+        let result = unsafe { libc::dup2(STDOUT_FD, high_fd) };
+        if result != high_fd {
+            // dup2 failed — unexpected after raising ulimit
+            return 1;
+        }
+
+        // Verify the high FD is open
+        if unsafe { libc::fcntl(high_fd, libc::F_GETFD) } < 0 {
+            return 2;
+        }
+
+        // Close all inherited FDs
+        if close_inherited_fds_raw().is_err() {
+            return 3;
+        }
+
+        // The high FD should now be closed
+        if unsafe { libc::fcntl(high_fd, libc::F_GETFD) } != -1 {
+            return 4;


child_close_high_numbered_fd returns 0 (passing the test) when setrlimit(RLIMIT_NOFILE, 4096) fails, which can make the test vacuously succeed on systems where the hard limit is low and the regression would otherwise go untested. Consider deriving a high_fd from the current hard limit (e.g., raise rlim_cur up to rlim_max, then pick high_fd = min(2000, rlim_cur - 1) and only skip if rlim_cur can’t exceed the old hardcoded bound). This keeps the test meaningful across more CI environments.

Suggested change

// Raise the soft FD limit so we can dup2 to a high FD number.

// Some systems default to ulimit -n 1024, which would prevent dup2 to FD 2000.

let new_limit = libc::rlimit {

rlim_cur: 4096,

rlim_max: 4096,

};

// SAFETY: setrlimit is async-signal-safe. Raising soft limit within hard limit is allowed.

if unsafe { libc::setrlimit(libc::RLIMIT_NOFILE, &new_limit) } != 0 {

// Can't raise limit (hard limit too low) — skip test gracefully

return 0;

}

// dup2 stdout to a high FD number that exceeds the old hardcoded limits.

// Use 2000 on all platforms — above old Linux limit (1024) and reasonable.

let high_fd: i32 = 2000;

let result = unsafe { libc::dup2(STDOUT_FD, high_fd) };

if result != high_fd {

// dup2 failed — unexpected after raising ulimit

return 1;

}

// Verify the high FD is open

if unsafe { libc::fcntl(high_fd, libc::F_GETFD) } < 0 {

return 2;

}

// Close all inherited FDs

if close_inherited_fds_raw().is_err() {

return 3;

}

// The high FD should now be closed

if unsafe { libc::fcntl(high_fd, libc::F_GETFD) } != -1 {

return 4;

const OLD_HARDCODED_LOWER_BOUND: i32 = 1024;

const TARGET_HIGH_FD: i32 = 2000;

let mut rlim = libc::rlimit {

rlim_cur: 0,

rlim_max: 0,

};

// SAFETY: getrlimit is async-signal-safe and rlim points to valid stack storage.

if unsafe { libc::getrlimit(libc::RLIMIT_NOFILE, &mut rlim) } != 0 {

return 1;

}

// Raise the soft limit as far as the current hard limit allows so we can

// test a high-numbered FD even on systems where 4096 is not permitted.

if rlim.rlim_cur < rlim.rlim_max {

let raised_limit = libc::rlimit {

rlim_cur: rlim.rlim_max,

rlim_max: rlim.rlim_max,

};

// SAFETY: setrlimit is async-signal-safe. Raising rlim_cur up to rlim_max is valid.

if unsafe { libc::setrlimit(libc::RLIMIT_NOFILE, &raised_limit) } == 0 {

rlim = raised_limit;

} else {

// Re-read the effective limit in case another policy or prior state changed it.

// SAFETY: getrlimit is async-signal-safe and rlim points to valid stack storage.

if unsafe { libc::getrlimit(libc::RLIMIT_NOFILE, &mut rlim) } != 0 {

return 2;

}

}

}

let effective_limit = if rlim.rlim_cur > i32::MAX as libc::rlim_t {

i32::MAX

} else {

rlim.rlim_cur as i32

};

if effective_limit <= OLD_HARDCODED_LOWER_BOUND {

// Can't exercise the regression meaningfully on this system.

return 0;

}

let high_fd = std::cmp::min(TARGET_HIGH_FD, effective_limit - 1);

// dup2 stdout to a high FD number that exceeds the old hardcoded limits.

let result = unsafe { libc::dup2(STDOUT_FD, high_fd) };

if result != high_fd {

return 3;

}

// Verify the high FD is open

if unsafe { libc::fcntl(high_fd, libc::F_GETFD) } < 0 {

return 4;

}

// Close all inherited FDs

if close_inherited_fds_raw().is_err() {

return 5;

}

// The high FD should now be closed

if unsafe { libc::fcntl(high_fd, libc::F_GETFD) } != -1 {

return 6;

Copilot · 2026-04-21T06:11:52Z

+            if d_reclen == 0 || offset + d_reclen > nread as usize {
+                break;
+            }


In close_fds_via_proc, if d_reclen is 0 or would run past nread, the code breaks out of the inner loop but still leaves success as true and continues the outer loop. That can cause the function to return true (skipping the brute-force fallback) even though directory parsing was incomplete, potentially leaving some FDs unclosed. Consider treating this as a hard failure: set success = false and break out of the outer read loop when the record looks malformed/inconsistent, so callers reliably fall back to the brute-force strategy.

Copilot · 2026-04-21T06:11:52Z

+            // getdents64 failed (e.g., EINTR, EBADF); signal failure so
+            // the brute-force fallback runs instead.


getdents64 can return -1 with EINTR; currently that is treated as a failure and triggers the brute-force fallback. That keeps correctness, but can cause a large (and avoidable) performance hit in pre_exec if an interrupt occurs. Consider checking errno and retrying the getdents64 call on EINTR, only falling back on other errors.

Suggested change

// getdents64 failed (e.g., EINTR, EBADF); signal failure so

// the brute-force fallback runs instead.

// SAFETY: errno is thread-local and may be read immediately after

// the failing syscall to distinguish EINTR from real failures.

let errno = unsafe { *libc::__errno_location() };

if errno == libc::EINTR {

continue;

}

// getdents64 failed for a non-retryable reason (e.g., EBADF);

// signal failure so the brute-force fallback runs instead.

Copilot AI review requested due to automatic review settings March 26, 2026 03:22

Copilot started reviewing on behalf of lilongen March 26, 2026 03:23 View session

Copilot AI reviewed Mar 26, 2026

View reviewed changes

lilongen force-pushed the fix/jailer-dynamic-fd-closure branch from 28e2ce4 to 5bf706b Compare April 2, 2026 14:37

lile and others added 2 commits April 21, 2026 14:04

Copilot AI review requested due to automatic review settings April 21, 2026 06:06

lilongen force-pushed the fix/jailer-dynamic-fd-closure branch from 5bf706b to f90c941 Compare April 21, 2026 06:06

Copilot started reviewing on behalf of lilongen April 21, 2026 06:07 View session

Copilot AI reviewed Apr 21, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(jailer): replace hardcoded FD limits with dynamic closure strategies#406

fix(jailer): replace hardcoded FD limits with dynamic closure strategies#406
lilongen wants to merge 2 commits intoboxlite-ai:mainfrom
lilongen:fix/jailer-dynamic-fd-closure

lilongen commented Mar 26, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Mar 26, 2026

Uh oh!

Copilot AI Mar 26, 2026

Uh oh!

Copilot AI Mar 26, 2026

Uh oh!

DorianZheng commented Apr 17, 2026

Uh oh!

lilongen commented Apr 21, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Apr 21, 2026

Uh oh!

Copilot AI Apr 21, 2026

Uh oh!

Copilot AI Apr 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

		// getrlimit failed or returned 0; safe default per POSIX OPEN_MAX
		1024

		// getdents64 failed (e.g., EINTR, EBADF); signal failure so
		// the brute-force fallback runs instead.

-            // getdents64 failed (e.g., EINTR, EBADF); signal failure so
-            // the brute-force fallback runs instead.
+            // SAFETY: errno is thread-local and may be read immediately after
+            // the failing syscall to distinguish EINTR from real failures.
+            let errno = unsafe { *libc::__errno_location() };
+            if errno == libc::EINTR {
+                continue;
+            }
+            // getdents64 failed for a non-retryable reason (e.g., EBADF);
+            // signal failure so the brute-force fallback runs instead.

Conversation

lilongen commented Mar 26, 2026

Summary

Problem

Solution

Linux — 3-strategy cascade

macOS

Key constraints

Test plan

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

Uh oh!

DorianZheng commented Apr 17, 2026

Uh oh!

lilongen commented Apr 21, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants