jamesburchill/safeagent

Safe Agent Sandbox Starter Pack

Project home: https://safeagent.ca

This repository is a reference scaffold for running an OpenClaw-style coding agent inside a disposable sandbox with constrained networking, a simple policy check, and audit logging.

It is intended for people who want to experiment with coding agents more safely than running them directly on a host machine or a valuable working copy. It is not a production-ready security boundary, but it is a practical starting point for reducing blast radius during testing and exploration.

safeagent.ca currently redirects here so the project is easier to share and reference.

What is included

  • docker-compose.yml -- control plane stack for the sample API
  • control-plane/app.py -- minimal FastAPI control plane
  • control-plane/requirements.txt -- Python dependencies
  • control-plane/Dockerfile -- control plane image build
  • sandbox-images/Dockerfile -- non-root sandbox image with a small toolset
  • configs/policy.yaml -- inspect-only default policy
  • configs/policy.execution.yaml -- opt-in policy for repo-defined test and build execution
  • configs/nftables-agent.conf -- host-side egress example
  • scripts/launch-sandbox.sh -- disposable sandbox launcher
  • scripts/apply-nftables.sh -- helper to load nftables rules
  • .env.example -- basic environment variables

High-level design

User/Job -> Control Plane -> Disposable Sandbox -> Result

Every tool request and policy decision is also written to a separate audit log.

What it aims to do

The default layout aims to reduce blast radius by using:

  • one sandbox per job
  • non-root sandbox user
  • read-only container root filesystem
  • writable workspace only
  • no privileged mode
  • no Docker socket in the sandbox
  • host-enforced egress filtering
  • no long-lived secrets in the sandbox
  • approval required for risky commands
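As an illustration of how several of these constraints map onto Docker options, here is a minimal sketch that builds a docker run argument list. It is not taken from scripts/launch-sandbox.sh: the function name, the container name pattern, and the 1000:1000 UID/GID are hypothetical, and the --cap-drop and --security-opt lines are assumed extra hardening that the README does not mention.

```python
from typing import List, Optional


def build_sandbox_argv(
    job_id: str,
    host_workspace: str,
    network: str = "none",
    image: str = "agent-safe-sandbox:latest",
    command: Optional[List[str]] = None,
) -> List[str]:
    """Build a `docker run` argv for one disposable sandbox (illustrative only)."""
    argv = [
        "docker", "run",
        "--rm",                                  # one sandbox per job, removed on exit
        "--name", f"sandbox-{job_id}",           # hypothetical naming scheme
        "--user", "1000:1000",                   # non-root user (UID/GID assumed)
        "--read-only",                           # read-only container root filesystem
        "--cap-drop", "ALL",                     # assumption: not stated in the README
        "--security-opt", "no-new-privileges",   # assumption: not stated in the README
        "--network", network,                    # "none" is Docker's built-in none network
        "--volume", f"{host_workspace}:/workspace:rw",  # the only writable path
        "--workdir", "/workspace",
        image,
    ]
    # No --privileged flag and no Docker socket mount are ever added.
    # The command is appended as separate arguments, so no shell is involved.
    return argv + list(command or [])
```

Passing the resulting list to a process launcher without a shell keeps the reviewed command and the executed command identical.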

What it does

  • It provides a safer place to experiment than running an agent directly on a primary machine or a non-disposable repository.
  • It shows how to separate a control plane from a disposable job sandbox.
  • It demonstrates a simple pattern for running one sandbox per job with a non-root user, a read-only root filesystem, and a writable workspace.
  • It includes an inspect-only default policy and a separate opt-in execution policy so readers can choose a safer starting point.
  • It validates the source repository before copying it into the job workspace and rejects symlinks or other special file types by default.
  • It shows how outbound access can be narrowed with Docker network profiles and host-side nftables rules.
  • It records policy decisions, approval requests, approval decisions, sandbox launches, and command output in an audit log, which makes the flow easier to inspect and test.
  • It is small enough to read end to end, which makes it useful for experimentation, review, and adaptation.

What it does not guarantee

  • It does not provide high-assurance isolation.
  • It does not prevent repo-defined commands such as pytest, make test, or similar build and test commands from executing arbitrary code inside the sandbox.
  • It does not replace careful host firewalling, credential handling, or repo-specific policy design.
  • It does not remove the need for a stronger runtime, such as a microVM, if the threat model includes a determined rogue agent.

In short: this repository is useful for cautious experimentation and learning. It is meaningfully safer than running an agent directly on a host, but it is not a complete defence against a determined rogue agent.

Choosing a policy

The default policy in configs/policy.yaml is inspect-only. It allows read and inspection commands such as ls, find, cat, grep, git status, and git diff, but it does not allow repo-defined test or build commands to run automatically.

If you want to allow commands such as pytest, python3 -m pytest, or make test, set POLICY_PATH=/app/configs/policy.execution.yaml before starting the stack. That policy is explicitly less conservative because those commands can execute arbitrary code from the repository being analysed.
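The difference between the two policies can be pictured as a three-way decision: allow, require approval, or deny. The sketch below is a simplified model, not the logic in control-plane/app.py. The dictionary layout and the allow key are hypothetical; only require_approval is a key named in this README, and the rm entry is inferred from the quick start example.

```python
import shlex
from typing import Dict, List

# Hypothetical in-memory stand-ins for configs/policy.yaml and
# configs/policy.execution.yaml; the real file schema may differ.
INSPECT_ONLY: Dict[str, List[str]] = {
    "allow": ["ls", "find", "cat", "grep", "git status", "git diff"],
    "require_approval": ["rm"],
}
EXECUTION: Dict[str, List[str]] = {
    "allow": INSPECT_ONLY["allow"] + ["pytest", "python3 -m pytest", "make test"],
    "require_approval": ["rm"],
}


def _matches(tokens: List[str], rule: str) -> bool:
    """True when the command's leading tokens equal the rule's tokens."""
    rule_tokens = rule.split()
    return tokens[: len(rule_tokens)] == rule_tokens


def decide(command: str, policy: Dict[str, List[str]]) -> str:
    """Return "allow", "require_approval", or "deny" for a command string."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return "deny"  # unbalanced quotes and similar parse errors
    if not tokens:
        return "deny"
    if any(_matches(tokens, r) for r in policy.get("require_approval", [])):
        return "require_approval"
    if any(_matches(tokens, r) for r in policy.get("allow", [])):
        return "allow"
    return "deny"  # default-deny for anything not listed
```

Matching on whole tokens rather than string prefixes means that lsof does not slip through a rule written for ls.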

The sandbox image included here provides a small base toolset, including pytest and make. If you want to run other language-specific toolchains, extend sandbox-images/Dockerfile and update the policy to match.

The control plane enforces the policy's path rules for a subset of file-oriented commands, truncates API and audit output to max_output_kb per stream, and records the number of changed workspace files. If max_file_writes is exceeded, the job is marked as failed after execution and the audit log records the policy violation.
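To make the two limits concrete, the following sketch shows one way to truncate each output stream and to count changed workspace files after a command has run. The function names and the hash-based snapshot approach are assumptions for illustration; the README does not say how the control plane detects changes.

```python
import hashlib
from pathlib import Path
from typing import Dict


def truncate_stream(data: bytes, max_output_kb: int) -> bytes:
    """Cut a single stream (stdout or stderr) to at most max_output_kb KiB."""
    return data[: max_output_kb * 1024]


def snapshot(workspace: Path) -> Dict[str, str]:
    """Map each regular file's relative path to a SHA-256 digest of its content."""
    digests: Dict[str, str] = {}
    for path in sorted(workspace.rglob("*")):
        if path.is_file() and not path.is_symlink():
            rel = str(path.relative_to(workspace))
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def count_changed_files(before: Dict[str, str], after: Dict[str, str]) -> int:
    """Count files that were created, deleted, or modified between snapshots."""
    names = set(before) | set(after)
    return sum(1 for name in names if before.get(name) != after.get(name))


def exceeds_write_limit(
    before: Dict[str, str], after: Dict[str, str], max_file_writes: int
) -> bool:
    """Post-execution check: detects excess changes, it cannot prevent them."""
    return count_changed_files(before, after) > max_file_writes
```

Because the comparison happens after the command finishes, this kind of check can only flag a job as failed; it does not stop the extra writes from happening.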

Workspace safety checks

Before a job workspace is created, the control plane validates the source repository tree. By default it rejects symlinks and special file types such as device nodes, sockets, and named pipes. This keeps the copied workspace simpler and reduces the chance that a repository can smuggle unexpected filesystem behaviour into the sandbox.

If a repository fails this validation, the job is rejected and the reason is written to the audit log.
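A check of this kind can be sketched with the standard library alone. The function below is a hypothetical stand-in for whatever the control plane actually does: it walks the tree without following links and reports anything that is not a regular file or a directory.

```python
import os
import stat
from typing import List


def find_disallowed_entries(root: str) -> List[str]:
    """Return a description of every symlink or special file under root."""
    problems: List[str] = []
    # followlinks=False: symlinked directories are listed but never entered.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            mode = os.lstat(path).st_mode  # lstat inspects the link itself
            if stat.S_ISLNK(mode):
                problems.append(f"symlink: {path}")
            elif not (stat.S_ISREG(mode) or stat.S_ISDIR(mode)):
                # Device nodes, sockets, and named pipes all land here.
                problems.append(f"special file: {path}")
    return problems
```

An empty result means the tree contains only regular files and directories; a non-empty result gives the caller a reason string that can be written to the audit log when the job is rejected.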

Approval workflow

When a command matches require_approval, the control plane creates a pending approval record instead of running the job immediately. The approval record stores the reviewed command, the repository path, and the requested network profile.

The minimal approval API is:

  • GET /approvals/{approval_id} to inspect a pending or completed approval record
  • POST /approvals/{approval_id}/approve to approve and execute the exact reviewed command
  • POST /approvals/{approval_id}/deny to reject the request without running it

Approval records are stored on disk under APPROVALS_ROOT. This is intentionally simple and easy to inspect, not a full multi-user approval system.

The approval endpoints require an X-Approval-Token header whose value matches APPROVAL_TOKEN. A single shared token is enough for local experimentation, but it is not a production approval system.
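The workflow above can be modelled in a few lines. This is a sketch under stated assumptions, not the code in control-plane/app.py: the JSON file layout, the field names, and all three function names are hypothetical. What it preserves from the description is that the record stores the reviewed command, the repository path, and the network profile, and that approval acts on that stored command.

```python
import hmac
import json
import uuid
from pathlib import Path
from typing import Optional


def token_is_valid(presented: Optional[str], expected: Optional[str]) -> bool:
    """Compare the X-Approval-Token value with APPROVAL_TOKEN in constant time."""
    if not presented or not expected:
        return False
    return hmac.compare_digest(presented.encode(), expected.encode())


def create_approval(approvals_root: Path, command: str, repo_path: str,
                    network_profile: str) -> dict:
    """Write a pending approval record to disk instead of running the job."""
    record = {
        "approval_id": uuid.uuid4().hex,
        "status": "pending",
        "command": command,            # the exact command a reviewer will see
        "repo_path": repo_path,
        "network_profile": network_profile,
    }
    approvals_root.mkdir(parents=True, exist_ok=True)
    path = approvals_root / f"{record['approval_id']}.json"
    path.write_text(json.dumps(record, indent=2))
    return record


def resolve_approval(approvals_root: Path, approval_id: str, approved: bool,
                     reviewer: str, note: str = "") -> dict:
    """Mark a pending record approved or denied; a record resolves only once."""
    if not approval_id.isalnum():
        raise ValueError("invalid approval id")  # blocks path traversal via the id
    path = approvals_root / f"{approval_id}.json"
    record = json.loads(path.read_text())
    if record["status"] != "pending":
        raise ValueError("approval already resolved")
    record.update(status="approved" if approved else "denied",
                  reviewer=reviewer, note=note)
    path.write_text(json.dumps(record, indent=2))
    return record  # on approval, the caller runs record["command"], nothing else
```

Executing the stored command rather than anything supplied with the approve request is what keeps the reviewed command and the executed command the same.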

Quick start

  1. Copy .env.example to .env and adjust values. Set APPROVAL_TOKEN to a non-default value before exposing the API beyond a local test environment.
  2. Review configs/policy.yaml and keep it as the default unless you deliberately want to allow repo-defined test or build execution.
  3. Review configs/nftables-agent.conf and replace placeholder IPs and interfaces.
  4. Create a source repository inside the mounted workspace root so it is visible both to the control plane container and to the host Docker daemon:
     mkdir -p ./workspaces/example-repo
     printf 'hello\n' > ./workspaces/example-repo/hello.txt
     printf 'delete me\n' > ./workspaces/example-repo/delete-me.txt
  5. Build the sandbox image:
     docker build -t agent-safe-sandbox:latest ./sandbox-images
  6. Build and start the stack:
     docker compose up --build -d
  7. Apply host firewall rules from the host, not from inside a container:
     sudo ./scripts/apply-nftables.sh
  8. Submit an inspect-only job to the control plane:
     curl -X POST http://localhost:8080/jobs \
       -H 'Content-Type: application/json' \
       -d '{
         "repo_path": "/opt/agent-stack/workspaces/example-repo",
         "command": "ls",
         "network_profile": "none"
       }'
  9. Submit a command that requires approval:
     curl -X POST http://localhost:8080/jobs \
       -H 'Content-Type: application/json' \
       -d '{
         "repo_path": "/opt/agent-stack/workspaces/example-repo",
         "command": "rm delete-me.txt",
         "network_profile": "none"
       }'
  10. Inspect the approval request:
     curl http://localhost:8080/approvals/<approval_id> \
       -H 'X-Approval-Token: change-me-approval-token'
  11. Approve the request and run the reviewed command:
     curl -X POST http://localhost:8080/approvals/<approval_id>/approve \
       -H 'Content-Type: application/json' \
       -H 'X-Approval-Token: change-me-approval-token' \
       -d '{
         "reviewer": "operator",
         "note": "approved for test run"
       }'

Operational notes

  • The control plane is intentionally small. It reads the policy from POLICY_PATH, normalises the requested command, rejects shell control operators, checks it against the selected policy, and launches the sandbox without a shell.
  • repo_path must point to a location that exists inside the control plane container. In the default compose setup, that means a path under /opt/agent-stack/workspaces.
  • Before copying the repository into /workspace, the control plane validates the source tree and rejects symlinks and special file types.
  • The control plane enforces path restrictions on ls, find, cat, grep, rm, pytest, and python3 -m pytest, and denies dangerous find actions such as -exec or -delete.
  • git diff --no-index is denied so that git diff cannot be used as an unrestricted file comparison escape hatch outside /workspace.
  • The control plane uses HOST_WORKSPACES_ROOT when it asks the host Docker daemon to bind-mount a job workspace into the sandbox. In the default compose setup, this should point to the absolute host path for ./workspaces.
  • Approval records are stored on disk under APPROVALS_ROOT, and the approval endpoints execute the exact normalised command that was reviewed.
  • Approval endpoints require X-Approval-Token, which must match APPROVAL_TOKEN.
  • Network profiles map to Docker networks through SANDBOX_NETWORK_MODEL_ONLY, SANDBOX_NETWORK_GIT_READ, and SANDBOX_NETWORK_PACKAGE_MIRROR. The control plane creates a missing profile network on demand; none uses Docker's built-in none network.
  • The sample compose file includes only the control plane. The sandbox image is built separately and launched by the control plane through the host Docker socket.
  • max_output_kb is applied per stream to API responses and audit records. max_file_writes is checked after the command finishes, so it detects excessive workspace changes rather than preventing the first extra write.
  • The host-side nftables example is part of the security model. Review and adapt it before relying on outbound restrictions.
  • The control plane has access to the Docker socket in order to launch disposable sandboxes. Treat the control plane as a high-trust component.
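Several of the notes above describe checks applied to the command before launch. The sketch below shows one conservative way such checks could look; it is not the repository's implementation. The operator list, the extra find actions beyond -exec and -delete, and all function names are assumptions. The path-checked command set is simplified to single-word commands, so python3 -m pytest is left out.

```python
import posixpath
import shlex
from typing import List

WORKSPACE = "/workspace"
# Assumed operator list; a substring test is deliberately strict and will
# also reject operators that appear inside quotes.
SHELL_CONTROL = (";", "&", "|", ">", "<", "`", "$(", "\n")
# The README names -exec and -delete; the others are assumed additions.
DENIED_FIND_ACTIONS = {"-exec", "-execdir", "-ok", "-okdir", "-delete"}
PATH_CHECKED = {"ls", "find", "cat", "grep", "rm", "pytest"}


def normalise(command: str) -> List[str]:
    """Reject shell control operators, then split into an argv list."""
    if any(op in command for op in SHELL_CONTROL):
        raise ValueError("shell control operators are not allowed")
    tokens = shlex.split(command)
    if not tokens:
        raise ValueError("empty command")
    return tokens


def inside_workspace(arg: str) -> bool:
    """True if arg, resolved relative to /workspace, stays under /workspace."""
    resolved = posixpath.normpath(posixpath.join(WORKSPACE, arg))
    return resolved == WORKSPACE or resolved.startswith(WORKSPACE + "/")


def check(command: str) -> List[str]:
    """Return the normalised argv, or raise ValueError when a rule is violated."""
    tokens = normalise(command)
    if tokens[0] == "find" and DENIED_FIND_ACTIONS.intersection(tokens):
        raise ValueError("dangerous find action")
    if tokens[:2] == ["git", "diff"] and "--no-index" in tokens:
        raise ValueError("git diff --no-index is denied")
    if tokens[0] in PATH_CHECKED:
        # Every non-option argument is treated as a path. This is stricter
        # than needed (a grep pattern is checked too) but fails closed.
        for arg in tokens[1:]:
            if not arg.startswith("-") and not inside_workspace(arg):
                raise ValueError(f"path escapes {WORKSPACE}: {arg}")
    return tokens
```

The returned list can be handed to the sandbox launcher as separate arguments, which is what launching without a shell means in practice.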

How to read this repository

  • Use it as a starting point for cautious experimentation, discussion, or prototyping.
  • Read configs/policy.yaml as a conservative default and configs/policy.execution.yaml as an explicit opt-in for repo-defined execution.
  • Treat the workspace copy step as intentionally conservative: repositories with symlinks or special files are rejected rather than partially copied.
  • Read configs/nftables-agent.conf as an example egress control, not as a drop-in firewall policy.
  • Assume that stronger controls are required before exposing this pattern to untrusted workloads.

Filesystem layout

/opt/agent-stack/
  control-plane/
  sandbox-images/
  audit/
  approvals/
  workspaces/
  configs/
  scripts/

Disclaimer

This package is a reference scaffold. It may reduce risk, but it does not eliminate risk.

About

SafeAgent is a Dockerized execution layer for AI agents that enforces boundaries, controls access, and keeps agent behaviour inside systems you own.
