feat(agent-server): add docker runtime mode for per-conversation containers#3403

Draft
rbren wants to merge 3 commits into main from feat/agent-server-docker-runtime

Conversation

@rbren
Member

@rbren rbren commented May 27, 2026

This PR was created by an AI agent (OpenHands) on behalf of the user.

Motivation

Today every conversation runs in-process on the agent-server, sharing the same filesystem, tmux server, environment, and (most importantly) blast radius. There's no isolation between conversations and no way to give each conversation a clean, throwaway environment without restarting the whole server.

This PR adds an opt-in docker runtime mode to openhands-agent-server: the outer server keeps serving the frontend, auth, settings, profiles, etc., but conversation-scoped work is offloaded into a fresh per-conversation Docker container running a second agent-server (in local mode). The outer server just reverse-proxies HTTP and WebSocket traffic to the right container.

What changes

  • New Config.conversation_runtime: Literal["local", "docker"] (default "local", so existing deployments are untouched).
  • New Config.conversation_image, conversation_container_network, conversation_container_volumes, conversation_container_forward_env, conversation_container_platform, conversation_container_startup_timeout knobs.
  • New package openhands.agent_server.docker_runtime:
    • container_manager.ContainerManager — owns the in-memory conversation_id -> RunningContainer registry, spawns containers via docker run, allocates host ports in the same 30000–39999 range DockerWorkspace uses, mints a per-container session API key, polls /health until ready, tears down on failure.
    • proxy.proxy_http — streams HTTP request bodies and response bodies between the outer and inner servers, replacing the client's X-Session-API-Key with the container's.
    • proxy.bridge_websocket — bidirectional WebSocket bridge using websockets, honoring text/binary frames and either-side close.
    • routers.py — docker-mode replacements for conversation_router, event_router, workspace_router, and the conversation half of sockets_router.
  • api.py learns to install the docker routers (and skip the in-process conversation/event/workspace/bash/git/file/vscode/desktop/skills/hooks/mcp routers) when in docker mode, and to start/stop a ContainerManager in the lifespan.
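
The "polls /health until ready, tears down on failure" step can be sketched roughly as below. This is a minimal illustration, not the PR's ContainerManager code: `wait_until_healthy` and `http_probe` are hypothetical names, and the injectable `clock` / `sleep` parameters are an assumption made so the loop can be tested without real waiting.

```python
import time
import urllib.request
from typing import Callable


def wait_until_healthy(
    probe: Callable[[], bool],
    timeout: float,
    interval: float = 0.5,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `probe` until it returns True or `timeout` seconds elapse."""
    deadline = clock() + timeout
    while True:
        try:
            if probe():
                return True
        except OSError:
            # Connection refused/reset while the container is still booting.
            # urllib's URLError is a subclass of OSError, so it lands here too.
            pass
        if clock() >= deadline:
            return False
        sleep(interval)


def http_probe(url: str) -> Callable[[], bool]:
    """Build a probe that reports True when `url` answers with HTTP 200."""

    def _probe() -> bool:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status == 200

    return _probe
```

A caller would run something like `wait_until_healthy(http_probe(f"http://127.0.0.1:{port}/health"), timeout=60)` and stop the container when it returns False.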

Endpoint coverage

Every conversation-scoped endpoint is preserved by the catch-all proxy route, so all of these continue to work in docker mode (just executed inside the container):

| Surface | How it's handled in docker mode |
| --- | --- |
| POST /api/conversations | spawn container, mint key, rewrite workspace.working_dir to /workspace, forward |
| GET /api/conversations / /search / /count | fan out across all running containers, aggregate items |
| GET/PATCH /api/conversations/{cid} | proxied via catch-all root route |
| DELETE /api/conversations/{cid} | proxy DELETE, then stop container |
| Everything under /api/conversations/{cid}/... (run, pause, interrupt, secrets, confirmation_policy, switch_profile, switch_llm, condense, fork, agent_final_response, events/, workspace/ including the trajectory download and the static workspace file server) | catch-all /{cid}/{tail:path} route streams request and response |
| WS /sockets/events/{cid} | bridged to the inner container's /sockets/events/{cid} |
| Settings, profiles, workspaces, auth, cloud proxy, server info, static frontend | unchanged; these aren't conversation-scoped |
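
As a rough illustration of the catch-all row, the outer server has to turn `{cid}` plus `{tail:path}` back into a URL on the container's published port. The helper below is a sketch under assumptions: the name `upstream_url` is hypothetical, and it assumes the inner server is reached on 127.0.0.1 and mounts conversations under the same /api/conversations prefix.

```python
from __future__ import annotations

from urllib.parse import quote, urlencode


def upstream_url(
    host_port: int,
    cid: str,
    tail: str = "",
    query: list[tuple[str, str]] | None = None,
) -> str:
    """Rebuild the inner-container URL for a proxied conversation request."""
    # The conversation id is a single path segment, so escape "/" inside it.
    path = f"/api/conversations/{quote(cid, safe='')}"
    if tail:
        # The tail may span several segments (e.g. "events/search").
        path += "/" + quote(tail.lstrip("/"), safe="/")
    url = f"http://127.0.0.1:{host_port}{path}"
    if query:
        # Forward the original query string unchanged in meaning.
        url += "?" + urlencode(query)
    return url
```

For example, `upstream_url(30450, "abc", "events/search", [("limit", "10")])` yields `http://127.0.0.1:30450/api/conversations/abc/events/search?limit=10`.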

Non-goals (intentionally)

  • No pre-warming / no container pool. Each conversation gets a fresh docker run; cold-start is whatever your image's cold-start is.
  • No persistence. The conversation→container map is in-memory; restarting the outer server forgets every running container. Durable mapping is a follow-up.
  • No pagination of fan-out list/search. Each container hosts at most one conversation, so we just concatenate. If a deployment ever holds enough containers that this matters, paginating is a small follow-up.

Tests

  • tests/agent_server/docker_runtime/test_container_manager.py (6 tests) — stubs subprocess.run, exercises real urlopen against a tiny localhost HTTP server, covers happy-path start, idempotency, Docker-unavailable, container-died-during-startup cleanup, single stop, and shutdown() of multiple containers.
  • tests/agent_server/docker_runtime/test_docker_routers.py (9 tests) — boots a real FastAPI "inner" app on an ephemeral port via uvicorn and a stub ContainerManager that points every conversation at it, then exercises the outer app's HTTP and WebSocket surface end-to-end (including the catch-all subpath, fan-out list, count, delete teardown, missing-container 404, and a websocket round-trip). Also asserts local-mode routes are unchanged.

All 1111 existing agent-server tests still pass (verified with the env-isolated run — two pre-existing flakes in test_webhook_subscriber and one host-env contamination in test_terminal_service are unrelated to this PR; each passes in isolation).

Running it

export OH_CONVERSATION_RUNTIME=docker
export OH_CONVERSATION_IMAGE=ghcr.io/openhands/agent-server:latest-python
uvicorn openhands.agent_server.api:create_app --factory

The host needs Docker available; the outer server will fail fast on POST /api/conversations with a 503 if it isn't.
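
For readers who want to see how the opt-in switch behaves, here is a small sketch of reading OH_CONVERSATION_RUNTIME with the documented default of "local". It is not the real Config loader (which may validate differently); `read_runtime` is a hypothetical helper, and rejecting unknown values with ValueError is an assumption.

```python
import os
from typing import Literal, Mapping, cast

Runtime = Literal["local", "docker"]


def read_runtime(env: Mapping[str, str] = os.environ) -> Runtime:
    """Return the conversation runtime, defaulting to "local" when unset."""
    value = env.get("OH_CONVERSATION_RUNTIME", "local").strip()
    if value not in ("local", "docker"):
        # Assumption: unknown values are rejected rather than silently ignored.
        raise ValueError(
            f"OH_CONVERSATION_RUNTIME must be 'local' or 'docker', got {value!r}"
        )
    return cast(Runtime, value)
```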

Co-authored-by: openhands <openhands@all-hands.dev>


Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

| Variant | Architectures | Base Image | Docs / Tags |
| --- | --- | --- | --- |
| java | amd64, arm64 | eclipse-temurin:17-jdk | Link |
| python | amd64, arm64 | nikolaik/python-nodejs:python3.13-nodejs22-slim | Link |
| golang | amd64, arm64 | golang:1.21-bookworm | Link |

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:151cd52-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-151cd52-python \
  ghcr.io/openhands/agent-server:151cd52-python

All tags pushed for this build

ghcr.io/openhands/agent-server:151cd52-golang-amd64
ghcr.io/openhands/agent-server:151cd5256f33d2a191db11573a54520450d32668-golang-amd64
ghcr.io/openhands/agent-server:feat-agent-server-docker-runtime-golang-amd64
ghcr.io/openhands/agent-server:151cd52-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:151cd52-golang-arm64
ghcr.io/openhands/agent-server:151cd5256f33d2a191db11573a54520450d32668-golang-arm64
ghcr.io/openhands/agent-server:feat-agent-server-docker-runtime-golang-arm64
ghcr.io/openhands/agent-server:151cd52-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:151cd52-java-amd64
ghcr.io/openhands/agent-server:151cd5256f33d2a191db11573a54520450d32668-java-amd64
ghcr.io/openhands/agent-server:feat-agent-server-docker-runtime-java-amd64
ghcr.io/openhands/agent-server:151cd52-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:151cd52-java-arm64
ghcr.io/openhands/agent-server:151cd5256f33d2a191db11573a54520450d32668-java-arm64
ghcr.io/openhands/agent-server:feat-agent-server-docker-runtime-java-arm64
ghcr.io/openhands/agent-server:151cd52-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:151cd52-python-amd64
ghcr.io/openhands/agent-server:151cd5256f33d2a191db11573a54520450d32668-python-amd64
ghcr.io/openhands/agent-server:feat-agent-server-docker-runtime-python-amd64
ghcr.io/openhands/agent-server:151cd52-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:151cd52-python-arm64
ghcr.io/openhands/agent-server:151cd5256f33d2a191db11573a54520450d32668-python-arm64
ghcr.io/openhands/agent-server:feat-agent-server-docker-runtime-python-arm64
ghcr.io/openhands/agent-server:151cd52-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:151cd52-golang
ghcr.io/openhands/agent-server:151cd5256f33d2a191db11573a54520450d32668-golang
ghcr.io/openhands/agent-server:feat-agent-server-docker-runtime-golang
ghcr.io/openhands/agent-server:151cd52-golang_tag_1.21-bookworm
ghcr.io/openhands/agent-server:151cd52-java
ghcr.io/openhands/agent-server:151cd5256f33d2a191db11573a54520450d32668-java
ghcr.io/openhands/agent-server:feat-agent-server-docker-runtime-java
ghcr.io/openhands/agent-server:151cd52-eclipse-temurin_tag_17-jdk
ghcr.io/openhands/agent-server:151cd52-python
ghcr.io/openhands/agent-server:151cd5256f33d2a191db11573a54520450d32668-python
ghcr.io/openhands/agent-server:feat-agent-server-docker-runtime-python
ghcr.io/openhands/agent-server:151cd52-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim

About Multi-Architecture Support

  • Each variant tag (e.g., 151cd52-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 151cd52-python-amd64) are also available if needed

…ainers

When Config.conversation_runtime == 'docker', every conversation runs
in its own Docker container hosting another agent-server (in local mode).
The outer agent-server acts as a thin reverse proxy in front of the
per-conversation containers:

* POST /api/conversations spawns a fresh container, mints a session
  key, and forwards the create request.
* All other /api/conversations/{cid}/... HTTP routes — including
  /run, /pause, /events/..., /workspace/..., the
  trajectory download, secrets, etc. — are forwarded verbatim to the
  matching container via a catch-all proxy route.
* The /sockets/events/{cid} WebSocket is bridged to the inner
  container with the same session key.
* DELETE /api/conversations/{cid} proxies the delete and then stops
  the container.
* GET /api/conversations, /search and /count fan out across
  the registered containers.

Default behavior (conversation_runtime == 'local') is unchanged.

No container pre-warming, no pools: each conversation gets a fresh
container at first use and an in-memory registry tracks the host port +
session key. Implementation lives entirely in docker_runtime/;
api.py only learns how to install the routers and start/stop the
container manager.

Co-authored-by: openhands <openhands@all-hands.dev>
@github-actions
Contributor

github-actions Bot commented May 27, 2026

Python API breakage checks — ✅ PASSED

Result: PASSED

Action log

@github-actions
Contributor

github-actions Bot commented May 27, 2026

REST API breakage checks (OpenAPI) — ✅ PASSED

Result: PASSED

Action log

@github-actions
Contributor

github-actions Bot commented May 27, 2026

Coverage

Coverage Report

| File | Stmts | Miss | Cover | Missing |
| --- | --- | --- | --- | --- |
| openhands-agent-server/openhands/agent_server | | | | |
| api.py | 269 | 25 | 90% | 107, 109–114, 116, 118, 120, 154, 166, 181, 187, 223–225, 234, 517, 520, 524–526, 528, 534 |
| config.py | 78 | 2 | 97% | 29, 42 |
| conversation_service.py | 661 | 124 | 81% | 144–145, 154, 180–181, 185–186, 191, 359–360, 416, 422–423, 426, 434–435, 440, 474, 477, 480–481, 484, 495, 499, 502, 511, 538, 544, 567, 634, 663, 669, 674, 680, 688–689, 698–701, 710, 722, 730, 753–754, 793, 822–826, 828–829, 832–833, 843, 855–860, 958, 965–969, 972–973, 977–981, 984–985, 989–993, 996–997, 1019–1020, 1024–1025, 1027–1029, 1031, 1034, 1042–1046, 1049, 1056–1061, 1063–1064, 1092, 1102, 1106, 1108–1109, 1114–1115, 1121–1122, 1130, 1145–1146, 1176, 1191, 1223, 1517, 1520 |
| openhands-agent-server/openhands/agent_server/docker_runtime | | | | |
| proxy.py | 93 | 19 | 79% | 102, 117–120, 179, 182, 184, 187, 198, 201–202, 208, 212, 229–233 |
| registry.py | 70 | 43 | 38% | 53–55, 59, 62, 65, 76–79, 81–83, 86–91, 96–103, 116–118, 123–125, 127, 135, 140, 151, 153, 183, 188, 193, 198, 205 |
| routers.py | 150 | 22 | 85% | 65, 124–125, 133–134, 142–143, 171, 176, 179–181, 205, 228–229, 253–255, 352–353, 365, 465 |
| openhands-workspace/openhands/workspace/docker | | | | |
| workspace.py | 195 | 52 | 73% | 31–33, 51, 162, 168, 174, 194, 215, 218, 221–222, 225–226, 233, 242, 244, 247–248, 253, 263, 290, 318, 327, 330, 334–335, 339–340, 344, 347–348, 350–356, 359–360, 369–371, 375–377, 381, 385, 411, 431, 448 |
| TOTAL | 29005 | 12143 | 58% | |

Collaborator

@all-hands-bot all-hands-bot left a comment

⚠️ QA Report: PASS WITH ISSUES

Docker runtime mode works for the core create/proxy/WebSocket/delete flow, but I found two conversation API compatibility regressions in docker mode.

Does this PR achieve its stated goal?

Partially. I verified a real outer uvicorn agent-server in OH_CONVERSATION_RUNTIME=docker mode pulled the documented ghcr.io/openhands/agent-server:latest-python image, created a per-conversation Docker container, rewrote the workspace to /workspace, proxied GET /api/conversations/{id}, bridged /sockets/events/{id}, and removed the container on DELETE. However, two claimed preserved endpoints do not match local-mode behavior: GET /api/conversations?ids=<id> returns 500 in docker mode, and /api/conversations/count changes the response shape from a raw number to an object.

| Phase | Result |
| --- | --- |
| Environment Setup | make build succeeded; Docker daemon was available (28.0.4) and the documented runtime image pulled successfully. |
| CI Status | ⚠️ At check time, pre-commit was failing and several jobs were still pending; multiple tests/checks were green. I did not rerun CI tests. |
| Functional Verification | ⚠️ Core docker runtime path works, but list-by-ids and count compatibility issues were reproduced with real HTTP requests. |

Functional Verification

Test 1: Baseline local-mode API contract

Step 1 — Establish baseline (local mode):
Started the server with OH_CONVERSATION_RUNTIME=local and created a conversation using the normal HTTP API. Then queried the existing list/count endpoints:

curl "http://127.0.0.1:18081/api/conversations?ids=$LCID"
# HTTP 200, body: [{"id":"526d00e9-fefa-45a2-b355-dfdc9f53802f", ...}]

curl "http://127.0.0.1:18081/api/conversations/count"
# HTTP 200, body: 1

This establishes the existing client-visible contract: ids lookup returns a JSON array, and count returns a raw JSON number.

Test 2: PR docker runtime core flow

Step 2 — Apply PR behavior:
Started the PR server with:

OH_CONVERSATION_RUNTIME=docker OH_CONVERSATION_CONTAINER_STARTUP_TIMEOUT=90 \
  uv run uvicorn openhands.agent_server.api:create_app --factory --host 127.0.0.1 --port 18080

Created a conversation through the outer server:

curl -H 'Content-Type: application/json' --data @/tmp/pr-start.json \
  http://127.0.0.1:18080/api/conversations
# HTTP 201, id=1e74d784-b1c0-4fad-b142-27e7c1bc7343,
# workspace.working_dir=/workspace

docker ps --filter 'name=oh-conv-'
# ebbdda2c8f99 oh-conv-1e74d784b1c04fadb14227e7c1bc7343-79ca09c1 ... 0.0.0.0:30450->8000/tcp

This confirms the PR creates a real per-conversation container and rewrites the workspace path into the container.

Step 3 — Exercise proxied traffic:

curl "http://127.0.0.1:18080/api/conversations/$CID"
# HTTP 200, returned the created conversation with workspace.working_dir=/workspace

curl "http://127.0.0.1:18080/api/conversations/search"
# HTTP 200, returned items containing id=1e74d784-b1c0-4fad-b142-27e7c1bc7343

uv run python /tmp/qa_ws_check.py
# connected
# {"id":"5d0e05be-01b2-441e-9f76-975d9f00673c","timestamp":"2026-05-27T14...

curl -X DELETE "http://127.0.0.1:18080/api/conversations/$CID"
# HTTP 200, body: {"success":true}

docker ps --filter 'name=oh-conv-'
# no remaining QA containers

This confirms root HTTP proxying, search aggregation, WebSocket bridging, and DELETE cleanup work in a real Docker-backed run.

Test 3: Reproduced docker-mode compatibility regressions

Step 1 — Baseline: local mode returned HTTP 200 with a JSON array for GET /api/conversations?ids=<id> and raw 1 for /count.

Step 2 — PR docker mode: the same user-facing endpoints behaved differently:

curl "http://127.0.0.1:18080/api/conversations?ids=$CID"
# HTTP 500
# {"detail":"Internal Server Error","exception":"'list' object has no attribute 'get'"}

curl "http://127.0.0.1:18080/api/conversations/count"
# HTTP 200
# {"count":1}

This shows docker mode does not fully preserve the existing conversation endpoint contract promised in the PR description.

Issues Found

  • 🟠 Issue: GET /api/conversations?ids=<conversation_id> returns 500 in docker mode instead of the local-mode JSON array response.
  • 🟠 Issue: GET /api/conversations/count changes response shape from raw JSON number (1) to object ({"count":1}).

This review was created by an AI agent (OpenHands) on behalf of the user.

Comment thread openhands-agent-server/openhands/agent_server/docker_runtime/routers.py Outdated
Comment thread openhands-agent-server/openhands/agent_server/docker_runtime/routers.py Outdated
Collaborator

@all-hands-bot all-hands-bot left a comment

🟡 Acceptable direction, but I found a few docker-mode issues that need attention before this is safe to merge: auth bypass, exposed inner servers, and REST/auth contract regressions.

This review was created by an AI agent (OpenHands) on behalf of the user.

[RISK ASSESSMENT]

  • [Overall PR] ⚠️ Risk Assessment: 🔴 HIGH — this is opt-in, but it changes request routing/authentication and starts network-reachable per-conversation servers.

VERDICT: ❌ Needs rework before merging.


Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26516796937

Comment thread openhands-agent-server/openhands/agent_server/docker_runtime/container_manager.py Outdated
Comment thread openhands-agent-server/openhands/agent_server/docker_runtime/routers.py Outdated
Comment thread openhands-agent-server/openhands/agent_server/docker_runtime/routers.py Outdated
Comment thread openhands-agent-server/openhands/agent_server/docker_runtime/routers.py Outdated
Comment thread openhands-agent-server/openhands/agent_server/api.py Outdated
Six fixes for the per-conversation docker runtime, driven by reviewer
findings on PR #3403:

1. Bind inner container ports to loopback only (-p 127.0.0.1:HOST:8000)
   so the per-conversation agent-servers can only be reached through the
   outer server's authenticated proxy. (R3311480573)

2. Authenticate the WebSocket bridge against the OUTER server's session
   keys before opening the upstream connection. Reuses the existing
   sockets.py helper (header / query / first-message auth), and the
   bridge no longer calls accept() a second time. (R3311480598)

3. Preserve the local GET /api/conversations?ids=... contract: route is
   batch-get-by-id, requires the ids query param, returns
   list[ConversationInfo | None]. Looks each id up in the registry and
   fetches from its container (None for missing). (R3311480555,
   R3311480542)

4. Preserve the local /api/conversations/count contract: returns a bare
   JSON integer (not {"count": N}), honors ?status= by forwarding the
   query to each inner container and summing their integers.
   (R3311480576, R3311480571)

5. ContainerManager.start() now returns (running, is_new). The POST
   route only tears down the container on inner 4xx / connection error
   when is_new=True, so a retried create against an existing
   conversation can no longer kill the live container. (R3311480570)

6. Workspace static-file routes mount under the workspace-cookie auth
   group in docker mode via a new docker_workspace_router. The
   workspace router is now registered before the header-only api_router
   so the more specific path wins; browser iframe/<img> embeds with the
   oh_workspace_session_key cookie continue to work. (R3311480585)
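
The two restored contracts (items 3 and 4) can be pictured with the sketch below. It is illustrative only: `batch_get`, `count_body`, and the `fetch` callback are hypothetical names, and the registry is modeled as a plain dict from conversation id to some container handle.

```python
from __future__ import annotations

import json
from typing import Any, Callable, Iterable


def batch_get(
    ids: list[str],
    registry: dict[str, Any],
    fetch: Callable[[Any, str], dict | None],
) -> list[dict | None]:
    """Return one slot per requested id, in order; unknown ids become None."""
    results: list[dict | None] = []
    for cid in ids:
        container = registry.get(cid)
        # No registered container means the slot is null in the JSON response.
        results.append(fetch(container, cid) if container is not None else None)
    return results


def count_body(per_container_counts: Iterable[int]) -> str:
    """Sum the integers returned by each container and emit a bare JSON number."""
    return json.dumps(sum(per_container_counts))
```

With one known and one unknown id, `json.dumps(batch_get(...))` serializes to an array with a `null` in the second slot, and `count_body([1, 0, 2])` is the string `3` rather than an object.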

Tests:
* test_container_manager: assert loopback port binding; updated for the
  (running, is_new) return tuple, plus an explicit is_new=False assert
  on the idempotent second start.
* test_docker_routers: new tests for batch-get-by-ids (incl. 422 on
  missing ids, null slots for unknown ids), bare-int /count contract,
  WS rejects wrong key, WS rejects missing first-message auth, WS
  accepts with valid outer key, POST retry preserves existing container
  on inner 4xx, fresh-create cleans up on inner 4xx, workspace route
  registered before the catch-all. Fake inner app reordered so /search
  and /count aren't shadowed by /{cid}.

22 docker_runtime tests pass; 144 tests in api / conversation /
workspace / docker_runtime all green.

Co-authored-by: openhands <openhands@all-hands.dev>
@rbren
Member Author

rbren commented May 27, 2026

Pushed e7ec1a7 addressing all 8 review threads (now resolved). Summary:

Critical (security)

  • WS bridge now authenticates against the OUTER server's session_api_keys before opening the upstream connection (reuses sockets._accept_authenticated_websocket; no double-accept()). Wrong-key rejection happens pre-accept; missing-key falls through to first-message-auth and closes 4001 on timeout / bad frame.
  • Inner container ports now bind to 127.0.0.1 only — the per-conversation agent-server is only reachable through the outer auth proxy.

Important (API contracts)

  • GET /api/conversations?ids=... restored to local contract: required ids query, returns list[ConversationInfo | None] (not a page object), missing ids slot in as null. List/search aggregation lives only at /search.
  • GET /api/conversations/count restored to local contract: bare JSON integer, honors ?status= by forwarding the query to each container and summing.
  • ContainerManager.start() now returns (running, is_new); the POST handler only tears down on inner 4xx / connect-error when is_new=True, so a retried create against an existing conversation can't kill the live container.
  • Workspace static-file routes (/conversations/{cid}/workspace/...) are now served by a new docker_workspace_router mounted on workspace_api_router (cookie-or-header auth), and workspace_api_router is registered before the header-only api_router so the more specific path wins over the catch-all. Browser iframe/<img> embeds with oh_workspace_session_key continue to work in docker mode.
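
The `(running, is_new)` rule for POST can be stated as a one-line predicate. The sketch below is an assumption-laden illustration: `should_teardown` is a hypothetical name, and a connection error is modeled as `status=None`.

```python
from __future__ import annotations


def should_teardown(is_new: bool, status: int | None) -> bool:
    """Return True when a failed create should stop the container.

    `status` is the inner server's HTTP status, or None if the connection failed.
    """
    if not is_new:
        # A retried create hit an existing conversation: never kill the live container.
        return False
    return status is None or 400 <= status < 500
```

So a fresh container is cleaned up on an inner 4xx or a connect error, while a pre-existing one survives the same failure.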

Tests
8 new regression tests (1 per review finding) plus updates to the existing ones. 22 docker_runtime tests + 144 tests across test_api / test_conversation_router / test_workspace_router / docker_runtime/ all pass; ruff + pyright clean.

This comment was posted by an AI agent (OpenHands) on behalf of the user.

@rbren rbren marked this pull request as draft May 27, 2026 17:06
…n-out for shared-disk read-only metadata

* Drop the 339-line ContainerManager and its bespoke docker run
  wrapping; replace with DockerConversationRegistry (190 lines)
  that's a thin shell around DockerWorkspace. Image pulls, GPU,
  network, port allocation, log streaming, healthchecks, lifecycle
  cleanup are all delegated.
* Add bind_host to DockerWorkspace so the outer can publish the
  inner agent-server on 127.0.0.1 only (defense-in-depth: only the
  outer reaches the inner; other hosts on the network can't bypass
  outer auth).
* Replace the docker fan-out across containers (batch_get / count /
  search) with shared-disk metadata reads. Outer's ConversationService
  runs in a new read_only_metadata mode: skips lease acquisition
  and EventService startup; get / search / count /
  batch_get re-read meta.json / base_state.json off disk on
  every call so sub-container writes show up immediately.
* Bind-mount layout: per-cid conversations/{cid_hex} is the only
  conversation dir each sub-container can see (the outer sees all of
  them and reads on-disk metadata). Settings/secrets dir is shared
  via OH_PERSISTENCE_DIR so cipher keys match.
* Global per-host routers (bash/git/file/vscode/desktop/hooks/mcp/
  skills/tools/llm) are reverse-proxied via a required ?cid=…
  query parameter — registered one route per prefix so the catch-all
  doesn't shadow /api/conversations, /api/settings, etc.
* Auth: outer and inner share OH_SESSION_API_KEYS_0 via
  conversation_container_forward_env. The proxy synthesizes the
  X-Session-API-Key header from the shared workspace key when the
  inbound request authenticated via the workspace-session cookie (so
  iframe/<img> embeds still reach the inner static file server).
* Drop the fan-out tests; add tests for the read-only mode + ?cid=
  routing. Net diff: -132 LoC across the package while adding new
  behaviour-level tests.

Co-authored-by: openhands <openhands@all-hands.dev>
@rbren
Member Author

rbren commented May 27, 2026

Pushed 151cd525 — simplification rewrite per the feedback that the previous design duplicated a lot of DockerWorkspace. Two structural changes:

1. Drop ContainerManager, use DockerWorkspace directly

docker_runtime/container_manager.py (339 lines of bespoke docker run wrapping, port allocation, healthchecks, log streaming, lifecycle cleanup, image management) is deleted and replaced with docker_runtime/registry.py (190 lines) which is a thin shell around DockerWorkspace. Everything ContainerManager did is something DockerWorkspace was already doing for its other use cases:

| Concern | ContainerManager | DockerWorkspace |
| --- | --- | --- |
| docker run wrapping | bespoke argv builder | yes |
| Free port allocation | bespoke | yes |
| Image pulls & cleanup | bespoke | yes (cleanup_image) |
| Network / GPU / platform | partial | yes |
| Volume mounts | bespoke | yes |
| Forwarded env | bespoke | yes (forward_env + extra_env) |
| Log streaming | bespoke | yes (detach_logs) |
| Healthcheck wait | bespoke urlopen loop | yes (health_check_timeout) |
| Lifecycle / cleanup | bespoke | yes (cleanup) |

I added one small field to DockerWorkspace to cover the one capability that wasn't already there:

  • bind_host: str — host interface to publish on. Default "" keeps -p HOST_PORT:8000; setting "127.0.0.1" gives -p 127.0.0.1:HOST_PORT:8000. The docker registry pins this to 127.0.0.1 so only the outer agent-server can reach the inner — defense-in-depth around the proxy auth.
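
The effect of `bind_host` on the publish flag can be sketched as follows. `publish_args` is a hypothetical helper written for illustration, not the method DockerWorkspace uses; only the two resulting `-p` forms come from the description above.

```python
def publish_args(host_port: int, bind_host: str = "", container_port: int = 8000) -> list[str]:
    """Build the `docker run` publish arguments for the inner agent-server port."""
    mapping = f"{host_port}:{container_port}"
    if bind_host:
        # Prefixing an interface restricts the published port to that address.
        mapping = f"{bind_host}:{mapping}"
    return ["-p", mapping]
```

With the default, `publish_args(30450)` gives `["-p", "30450:8000"]`; with `bind_host="127.0.0.1"` it gives `["-p", "127.0.0.1:30450:8000"]`.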

2. Drop fan-out across containers; read shared disk instead

Per the review pushback, fan-out was the wrong shape — it was N container hops for what's fundamentally a cheap directory walk. The outer's ConversationService now has a read_only_metadata mode that:

  • Skips EventService startup in __aenter__ (no leases acquired, no in-memory state, no lease-renewal task).
  • On every get / search / count / batch_get, re-walks conversations_path and reads meta.json + base_state.json straight off disk. Falls back to a synthesized state for conversations whose base_state.json hasn't been flushed yet.
  • Mutation methods aren't expected to be called (the docker proxy router intercepts them before they reach ConversationService).

Bind-mount layout is per-cid: each sub-container only sees its own conversations/{cid_hex} subdirectory. The outer sees all of them. The .openhands settings/secrets dir is shared so OH_SECRET_KEY round-trips correctly.
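
To make the read-only mode concrete, here is a minimal sketch of the "re-walk and re-read on every call" idea. It is not the ConversationService implementation: `read_conversation_metadata` is a hypothetical name, the returned dict shape and the contents of the synthesized fallback state are assumptions, and only the file names meta.json and base_state.json come from the description.

```python
from __future__ import annotations

import json
from pathlib import Path


def read_conversation_metadata(conversations_path: Path) -> list[dict]:
    """Walk the shared conversations directory and read metadata straight off disk."""
    results: list[dict] = []
    if not conversations_path.is_dir():
        return results
    for conv_dir in sorted(p for p in conversations_path.iterdir() if p.is_dir()):
        try:
            meta = json.loads((conv_dir / "meta.json").read_text())
        except (OSError, json.JSONDecodeError):
            # Missing or half-written meta.json: skip now, pick it up on a later call.
            continue
        try:
            state = json.loads((conv_dir / "base_state.json").read_text())
        except (OSError, json.JSONDecodeError):
            # Assumed fallback shape for a state file that has not been flushed yet.
            state = {"id": conv_dir.name, "synthesized": True}
        results.append({"meta": meta, "state": state})
    return results
```

Because nothing is cached, a write from a sub-container is visible to the next call without any coordination with the outer server.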

Other changes asked for in the review

  • ?cid= for global routers (bash/git/file/vscode/desktop/hooks/mcp/skills/tools/llm): registered one specific route per prefix in docker_global_proxy_router so the catch-all doesn't shadow /api/conversations / /api/settings / etc. Missing ?cid= → clear 400 telling the client what they need to do.
  • X-Session-API-Key forwarding: outer and inner share the same OH_SESSION_API_KEYS_0 via conversation_container_forward_env (now includes that key in the default list). The proxy passes through whatever header the client sent; for cookie-authed workspace static files it synthesizes the header from workspace.api_key (read out of the outer's env) so the inner static file server is happy.
  • No outer-side services touched for the simpler approach. The outer still runs tmux / vscode / desktop / sockets / settings / profiles in-process — those just aren't conversation-scoped.
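
The first two bullets can be sketched as two small helpers. Both names (`require_cid`, `forward_headers`) and the `MissingCid` exception are hypothetical, and dropping the inbound Host header is an assumption about what a reverse proxy would normally do, not something stated in this PR.

```python
from __future__ import annotations


class MissingCid(Exception):
    """Hypothetical error that the outer server would map to an HTTP 400."""

    status_code = 400


def require_cid(query: dict[str, str]) -> str:
    """Return the ?cid= value, or raise with a message telling the client what to add."""
    cid = query.get("cid", "").strip()
    if not cid:
        raise MissingCid(
            "this route runs inside a conversation container in docker mode; "
            "add ?cid=<conversation_id>"
        )
    return cid


def forward_headers(inbound: dict[str, str], shared_key: str | None) -> dict[str, str]:
    """Pass client headers through, synthesizing the session key for cookie-authed requests."""
    # Assumption: the Host header is rebuilt for the upstream, so drop the inbound one.
    out = {k: v for k, v in inbound.items() if k.lower() != "host"}
    has_key = any(k.lower() == "x-session-api-key" for k in out)
    if not has_key and shared_key:
        # Cookie-authenticated request: the inner server still expects the header.
        out["X-Session-API-Key"] = shared_key
    return out
```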

Verification

The new architecture still answers every API the user asked about:

| Endpoint group | Where it runs in docker mode |
| --- | --- |
| POST /api/conversations | proxy → sub-container (spawns it first) |
| `GET /api/conversations[/count /search | |
| Per-cid mutations (/run, /pause, /events, …) | proxy → sub-container |
| Workspace static files (/conversations/{cid}/workspace/…) | proxy → sub-container, cookie-auth preserved |
| Global routers (/bash, /git, /file, …) | proxy → sub-container, requires ?cid= |
| WS /sockets/events/{cid} | outer authenticates, then bridges to sub-container |

Stats

13 files changed, 1083 insertions(+), 1215 deletions(-)

Net -132 LoC even though new tests were added. Locally:

tests/agent_server/test_conversation_service.py        80 passed (4 new read-only-mode tests)
tests/agent_server/test_conversation_router.py         69 passed (no changes)
tests/agent_server/docker_runtime/test_docker_routers  17 passed (rewritten for new registry)

ruff check, ruff format, and pyright all clean on the changed files.


This comment was created by an AI agent (OpenHands) on behalf of the PR author.

3 participants