Skip to content

Add safe daemon Unix socket listen#7

Merged
mariusvniekerk merged 6 commits into
mainfrom
daemon-safe-unix-listen
Jun 2, 2026
Merged

Add safe daemon Unix socket listen#7
mariusvniekerk merged 6 commits into
mainfrom
daemon-safe-unix-listen

Conversation

@mariusvniekerk
Copy link
Copy Markdown
Contributor

Summary

  • add daemon.Listen(ctx, ep, opts) for daemon server startup
  • serialize Unix stale socket probing/removal and bind under an inter-process listen lock
  • reject live Unix sockets and non-socket paths instead of removing them
  • add RuntimeStore.ListenLockPath() so server listen locking stays separate from Manager.Ensure auto-start locking
  • cover stale Unix socket cleanup, live/non-socket refusal, and concurrent startup behavior

Kata API

Kata daemon server startup should use:

listener, err := daemon.Listen(ctx, endpoint, daemon.ListenOptions{Store: runtimeStore})

If kata does not have a RuntimeStore in that startup path, pass a stable explicit LockPath or rely on the default lock derived from the Unix socket path.

Validation

  • go test ./daemon
  • go test ./...

@mariusvniekerk mariusvniekerk force-pushed the daemon-safe-unix-listen branch from ea65f75 to e4df953 Compare June 2, 2026 02:32
Daemon servers need a kit-owned startup primitive so applications do not reimplement stale Unix socket cleanup around net.Listen. Without serialization, concurrent starts can both classify a socket as stale and one process can remove the other process's newly bound socket.

This adds a daemon.Listen helper that holds an inter-process listen lock across stale socket probing, removal, and bind. The listen lock is separate from the Manager auto-start lock so detached child daemons can bind while the parent waits for discovery.

Validation: go test ./daemon; go test ./...

🤖 Generated with [OpenAI Codex](https://openai.com/codex)
Co-authored-by: OpenAI Codex <noreply@openai.com>
@mariusvniekerk mariusvniekerk force-pushed the daemon-safe-unix-listen branch from e4df953 to a349861 Compare June 2, 2026 02:34
@roborev-ci
Copy link
Copy Markdown

roborev-ci Bot commented Jun 2, 2026

roborev: Combined Review (a349861)

Synthesis unavailable. Showing individual review outputs.

codex — default (done)

Review Findings

  • Severity: Medium
  • Location: daemon/lock.go:29
  • Problem: Relative lock paths are treated as directories to harden, so LockPath: "daemon.lock" or a relative Unix endpoint that derives daemon.sock.lock makes ensurePrivateRuntimeDir(".") potentially chmod the process working directory to 0700.
  • Fix: Reject relative lock paths for this helper, or normalize them to an intended private runtime directory before calling ensurePrivateRuntimeDir.

Summary

The change adds a locked Unix-socket listen helper with stale socket cleanup and refactors daemon start locking through the same lock helper.


codex — security (done)

No issues found.

Daemon lock acquisition now runs safefile private-directory hardening before taking the file lock. A bare relative lock path would make the lock directory resolve to the process working directory, so the hardening step could unexpectedly chmod the caller's cwd.

Rejecting relative lock paths preserves the safefile protection without letting arbitrary process state become the lock directory. The regression covers both explicit relative LockPath input and relative Unix endpoint-derived lock paths.

Validation: go test ./daemon; go test ./...

🤖 Generated with [OpenAI Codex](https://openai.com/codex)
Co-authored-by: OpenAI Codex <noreply@openai.com>
@roborev-ci
Copy link
Copy Markdown

roborev-ci Bot commented Jun 2, 2026

roborev: Fail

Medium-risk issues found; no Critical or High findings were reported.

Medium

  • daemon/listen.go:61
    ListenOptions.Store.Dir can move the listen lock into the private store path while leaving the Unix socket itself in an unvalidated shared directory such as /tmp. This allows binding daemon API sockets in locations other local users may access.
    Fix: Always validate the Unix socket parent directory before stale probing/removal and bind, independent of Store or LockPath, or explicitly create/chmod the socket with restrictive permissions and reject shared/world-writable socket directories.

  • daemon/listen.go:80
    Relative Unix socket paths are still accepted when the caller provides an absolute LockPath or Store, so stale cleanup and binding can happen relative to the process working directory, bypassing the absolute-path invariant enforced by ParseEndpoint.
    Fix: Reject non-absolute ep.Address for Unix endpoints before acquiring the lock or touching the filesystem.

  • daemon/listen_unix.go:10
    The stale probe only treats ECONNREFUSED as stale. If an existing listener closes and unlinks the socket between Lstat and Dial, the dial can return ENOENT, causing startup to fail even though the path is now free.
    Fix: Treat os.ErrNotExist/syscall.ENOENT during the probe as a missing stale socket and proceed to bind, with removal skipped or ErrNotExist ignored.


Panel: ci_default_security | Synthesis: codex, 10s, ~$0.02 | Members: codex_default (codex/default, done, 5m24s, ~$0.69), codex_security (codex/security, done, 2m21s, ~$0.46) | Total: 7m55s, ~$1.18 | Job: 19549

The Unix listen helper now validates the socket path itself before using any store or explicit lock path. That keeps callers from placing the API socket in a shared directory while the lock lives somewhere private, and preserves ParseEndpoint's absolute-path invariant even for manually constructed Endpoint values.

The stale probe also treats ENOENT as a free socket race, which lets startup proceed when another listener closes and unlinks between Lstat and Dial.

Validation: go test ./daemon; go test ./...

🤖 Generated with [OpenAI Codex](https://openai.com/codex)
Co-authored-by: OpenAI Codex <noreply@openai.com>
@roborev-ci
Copy link
Copy Markdown

roborev-ci Bot commented Jun 2, 2026

roborev: Fail

High-risk issue remains: the daemon can chmod arbitrary existing runtime parent directories before path validation.

High

  • daemon/listen.go:68 and daemon/lock.go:32
    The new listen and lock paths use ensurePrivateRuntimeDir on arbitrary parent directories, and that helper chmods any directory owned by the current user to 0700. A daemon running as root with a socket or lock under /tmp, /run, or /var/run can silently change system/shared directory permissions instead of rejecting the path.
    Fix: Use non-mutating validation for existing socket/lock parent directories, or only create/chmod a dedicated leaf directory owned by this code; reject existing shared/system directories that are not already private.

Medium

  • daemon/runtime.go:119
    Store-derived listen lock paths still call s.prepareDir() before the later absolute-path check in acquireDaemonLock, so ListenOptions{Store: RuntimeStore{Dir: "."}} can harden/chmod the current working directory before failing because the lock path is relative.
    Fix: Reject or canonicalize relative RuntimeStore.Dir before calling prepareDir() in lock-path helpers, and add coverage for store-derived relative lock paths.

Panel: ci_default_security | Synthesis: codex, 18s | Members: codex_default (codex/default, done, 4m51s), codex_security (codex/security, done, 24s) | Total: 5m33s | Job: 19552

mariusvniekerk and others added 3 commits June 2, 2026 08:49
Daemon Unix socket parent checks must reject unsafe directories without chmodding arbitrary caller paths such as $HOME or /tmp. Add a validation-only safefile primitive and use it for socket parents, while keeping repair-capable EnsurePrivateDir for runtime directories that kit intentionally owns.

RuntimeStore now rejects relative directories before preparation so store-derived lock paths cannot trigger caller-relative directory creation or chmod.

Validation: go test ./daemon ./safefileio; go test ./...

🤖 Generated with [OpenAI Codex](https://openai.com/codex)
Co-authored-by: OpenAI Codex <noreply@openai.com>
Validation-only private directory checks on Windows must enforce the same trust boundary that EnsurePrivateDir repairs to. Otherwise a current-user-owned directory with broad DACL grants could be accepted as private.

ValidatePrivateDir now rejects unprotected DACLs, deny/unknown ACEs, and allowed ACEs for principals outside the current user, token owner, SYSTEM, and Administrators. The Windows test covers a deliberately broadened DACL.

Validation: go test ./safefileio; GOOS=windows GOARCH=amd64 go test -c ./safefileio -o /tmp/kit-safefileio-windows.test.exe; go test ./...

🤖 Generated with [OpenAI Codex](https://openai.com/codex)
Co-authored-by: OpenAI Codex <noreply@openai.com>
daemon.Listen is a public API meant for callers like kata, so requiring an explicit options struct for normal use makes the common path noisier than it needs to be. The branch is not released yet, so prefer the conventional Go WithXYZ option style before consumers adopt it.

This keeps the listen configuration extensible without preserving the clunkier ListenOptions call shape.

Validation: go test ./daemon ./safefileio; GOOS=windows GOARCH=amd64 go test -c ./daemon -o /tmp/kit-daemon-windows.test.exe; GOOS=windows GOARCH=amd64 go test -c ./safefileio -o /tmp/kit-safefileio-windows.test.exe; go test ./...

Generated with OpenAI Codex
Co-authored-by: OpenAI Codex <noreply@openai.com>
@roborev-ci
Copy link
Copy Markdown

roborev-ci Bot commented Jun 2, 2026

roborev: Combined Review (25b6fd9)

No issues found.


Synthesized from 2 reviews (agents: codex | types: default, security)


Panel: ci_default_security | Synthesis: codex | Members: codex_default (codex/default, done, 2m51s), codex_security (codex/security, done, 2m41s) | Total: 5m32s | Job: 19579

@mariusvniekerk mariusvniekerk merged commit 98a91e0 into main Jun 2, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant