feat(agent-server): add docker runtime mode for per-conversation containers by rbren · Pull Request #3403 · OpenHands/software-agent-sdk

rbren · 2026-05-27T14:15:06Z

This PR was created by an AI agent (OpenHands) on behalf of the user.

Motivation

Today every conversation runs in-process on the agent-server, sharing the same filesystem, tmux server, environment, and (most importantly) blast radius. There's no isolation between conversations and no way to give each conversation a clean, throwaway environment without restarting the whole server.

This PR adds an opt-in docker runtime mode to openhands-agent-server: the outer server keeps serving the frontend, auth, settings, profiles, etc., but conversation-scoped work is offloaded into a fresh per-conversation Docker container running a second agent-server (in local mode). The outer server just reverse-proxies HTTP and WebSocket traffic to the right container.

What changes

- New Config.conversation_runtime: Literal["local", "docker"] (default "local", so existing deployments are untouched).
- New Config.conversation_image, conversation_container_network, conversation_container_volumes, conversation_container_forward_env, conversation_container_platform, conversation_container_startup_timeout knobs.
- New package openhands.agent_server.docker_runtime:
  - container_manager.ContainerManager: owns the in-memory conversation_id -> RunningContainer registry, spawns containers via docker run, allocates host ports in the same 30000–39999 range DockerWorkspace uses, mints a per-container session API key, polls /health until ready, tears down on failure.
  - proxy.proxy_http: streams HTTP request bodies and response bodies between the outer and inner servers, replacing the client's X-Session-API-Key with the container's.
  - proxy.bridge_websocket: bidirectional WebSocket bridge using websockets, honoring text/binary frames and either-side close.
  - routers.py: docker-mode replacements for conversation_router, event_router, workspace_router, and the conversation half of sockets_router.
- api.py learns to install the docker routers (and skip the in-process conversation/event/workspace/bash/git/file/vscode/desktop/skills/hooks/mcp routers) when in docker mode, and to start/stop a ContainerManager in the lifespan.
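
As an illustration of the header swap that proxy.proxy_http is described as doing, here is a minimal sketch. It is not the PR's implementation: `rewrite_headers` and the `_DROPPED` set are hypothetical names, and dropping hop-by-hop headers plus Host and Content-Length is an assumption about what a typical reverse proxy would do.

```python
# Hypothetical helper; not the PR's actual proxy.proxy_http code.
SESSION_HEADER = "X-Session-API-Key"

# Assumption: headers a proxy usually does not forward unchanged
# (hop-by-hop headers, plus Host and Content-Length, which the upstream
# HTTP client normally recomputes).
_DROPPED = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade", "host", "content-length",
}


def rewrite_headers(client_headers: dict[str, str], container_key: str) -> dict[str, str]:
    """Build headers for the inner request, carrying the container's key."""
    forwarded = {
        name: value
        for name, value in client_headers.items()
        if name.lower() not in _DROPPED
        and name.lower() != SESSION_HEADER.lower()  # discard the client's key
    }
    forwarded[SESSION_HEADER] = container_key  # inject the per-container key
    return forwarded
```

The client's key is matched case-insensitively so that a lowercase `x-session-api-key` from the browser cannot leak through next to the injected one.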

Endpoint coverage

Every conversation-scoped endpoint is preserved by the catch-all proxy route, so all of these continue to work in docker mode (just executed inside the container):

| Surface | How it's handled in docker mode |
| --- | --- |
| `POST /api/conversations` | spawn container, mint key, rewrite `workspace.working_dir` to `/workspace`, forward |
| `GET /api/conversations` / `/search` / `/count` | fan out across all running containers, aggregate `items` |
| `GET/PATCH /api/conversations/{cid}` | proxied via catch-all root route |
| `DELETE /api/conversations/{cid}` | proxy DELETE, then stop container |
| Everything under `/api/conversations/{cid}/...` (run, pause, interrupt, secrets, confirmation_policy, switch_profile, switch_llm, condense, fork, agent_final_response, events/, workspace/ including the trajectory download and the static workspace file server) | catch-all `/{cid}/{tail:path}` route streams request and response |
| `WS /sockets/events/{cid}` | bridged to the inner container's `/sockets/events/{cid}` |
| Settings, profiles, workspaces, auth, cloud proxy, server info, static frontend | unchanged; these aren't conversation-scoped |
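
The `workspace.working_dir` rewrite in the `POST /api/conversations` row could look roughly like the sketch below. `rewrite_create_payload` is a hypothetical name and the payload shape (a `workspace` object with a `working_dir` field) is inferred from the table, not copied from the PR.

```python
import copy

CONTAINER_WORKDIR = "/workspace"  # path named in the table above


def rewrite_create_payload(payload: dict) -> dict:
    """Return a copy of the create request with the in-container workdir.

    Hypothetical helper: the caller's payload is left untouched so the
    outer server can still log or inspect what the client originally sent.
    """
    rewritten = copy.deepcopy(payload)
    workspace = rewritten.get("workspace") or {}  # tolerate a missing/null workspace
    rewritten["workspace"] = {**workspace, "working_dir": CONTAINER_WORKDIR}
    return rewritten
```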

Non-goals (intentionally)

No pre-warming / no container pool. Each conversation gets a fresh docker run; cold-start is whatever your image's cold-start is.
No persistence. The conversation→container map is in-memory; restarting the outer server forgets every running container. Durable mapping is a follow-up.
No pagination of fan-out list/search. Each container hosts at most one conversation, so we just concatenate. If a deployment ever holds enough containers that this matters, paginating is a small follow-up.

Tests

tests/agent_server/docker_runtime/test_container_manager.py (6 tests) — stubs subprocess.run, exercises real urlopen against a tiny localhost HTTP server, covers happy-path start, idempotency, Docker-unavailable, container-died-during-startup cleanup, single stop, and shutdown() of multiple containers.
tests/agent_server/docker_runtime/test_docker_routers.py (9 tests) — boots a real FastAPI "inner" app on an ephemeral port via uvicorn and a stub ContainerManager that points every conversation at it, then exercises the outer app's HTTP and WebSocket surface end-to-end (including the catch-all subpath, fan-out list, count, delete teardown, missing-container 404, and a websocket round-trip). Also asserts local-mode routes are unchanged.

All 1111 existing agent-server tests still pass (verified with the env-isolated run — two pre-existing flakes in test_webhook_subscriber and one host-env contamination in test_terminal_service are unrelated to this PR; each passes in isolation).

Running it

export OH_CONVERSATION_RUNTIME=docker
export OH_CONVERSATION_IMAGE=ghcr.io/openhands/agent-server:latest-python
uvicorn openhands.agent_server.api:create_app --factory

The host needs Docker available; the outer server will fail fast on POST /api/conversations with a 503 if it isn't.

Co-authored-by: openhands <openhands@all-hands.dev>

Agent Server images for this PR

• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

| Variant | Architectures | Base Image | Docs / Tags |
| --- | --- | --- | --- |
| java | amd64, arm64 | `eclipse-temurin:17-jdk` | Link |
| python | amd64, arm64 | `nikolaik/python-nodejs:python3.13-nodejs22-slim` | Link |
| golang | amd64, arm64 | `golang:1.21-bookworm` | Link |

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:151cd52-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-151cd52-python \
  ghcr.io/openhands/agent-server:151cd52-python

All tags pushed for this build

ghcr.io/openhands/agent-server:151cd52-golang-amd64
ghcr.io/openhands/agent-server:151cd5256f33d2a191db11573a54520450d32668-golang-amd64
ghcr.io/openhands/agent-server:feat-agent-server-docker-runtime-golang-amd64
ghcr.io/openhands/agent-server:151cd52-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:151cd52-golang-arm64
ghcr.io/openhands/agent-server:151cd5256f33d2a191db11573a54520450d32668-golang-arm64
ghcr.io/openhands/agent-server:feat-agent-server-docker-runtime-golang-arm64
ghcr.io/openhands/agent-server:151cd52-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:151cd52-java-amd64
ghcr.io/openhands/agent-server:151cd5256f33d2a191db11573a54520450d32668-java-amd64
ghcr.io/openhands/agent-server:feat-agent-server-docker-runtime-java-amd64
ghcr.io/openhands/agent-server:151cd52-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:151cd52-java-arm64
ghcr.io/openhands/agent-server:151cd5256f33d2a191db11573a54520450d32668-java-arm64
ghcr.io/openhands/agent-server:feat-agent-server-docker-runtime-java-arm64
ghcr.io/openhands/agent-server:151cd52-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:151cd52-python-amd64
ghcr.io/openhands/agent-server:151cd5256f33d2a191db11573a54520450d32668-python-amd64
ghcr.io/openhands/agent-server:feat-agent-server-docker-runtime-python-amd64
ghcr.io/openhands/agent-server:151cd52-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:151cd52-python-arm64
ghcr.io/openhands/agent-server:151cd5256f33d2a191db11573a54520450d32668-python-arm64
ghcr.io/openhands/agent-server:feat-agent-server-docker-runtime-python-arm64
ghcr.io/openhands/agent-server:151cd52-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:151cd52-golang
ghcr.io/openhands/agent-server:151cd5256f33d2a191db11573a54520450d32668-golang
ghcr.io/openhands/agent-server:feat-agent-server-docker-runtime-golang
ghcr.io/openhands/agent-server:151cd52-golang_tag_1.21-bookworm
ghcr.io/openhands/agent-server:151cd52-java
ghcr.io/openhands/agent-server:151cd5256f33d2a191db11573a54520450d32668-java
ghcr.io/openhands/agent-server:feat-agent-server-docker-runtime-java
ghcr.io/openhands/agent-server:151cd52-eclipse-temurin_tag_17-jdk
ghcr.io/openhands/agent-server:151cd52-python
ghcr.io/openhands/agent-server:151cd5256f33d2a191db11573a54520450d32668-python
ghcr.io/openhands/agent-server:feat-agent-server-docker-runtime-python
ghcr.io/openhands/agent-server:151cd52-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim

About Multi-Architecture Support

- Each variant tag (e.g., 151cd52-python) is a multi-arch manifest supporting both amd64 and arm64.
- Docker automatically pulls the correct architecture for your platform.
- Individual architecture tags (e.g., 151cd52-python-amd64) are also available if needed.

…ainers

When Config.conversation_runtime == 'docker', every conversation runs in its own Docker container hosting another agent-server (in local mode). The outer agent-server acts as a thin reverse proxy in front of the per-conversation containers:

* POST /api/conversations spawns a fresh container, mints a session key, and forwards the create request.
* All other /api/conversations/{cid}/... HTTP routes (including /run, /pause, /events/..., /workspace/..., the trajectory download, secrets, etc.) are forwarded verbatim to the matching container via a catch-all proxy route.
* The /sockets/events/{cid} WebSocket is bridged to the inner container with the same session key.
* DELETE /api/conversations/{cid} proxies the delete and then stops the container.
* GET /api/conversations, /search and /count fan out across the registered containers.

Default behavior (conversation_runtime == 'local') is unchanged. No container pre-warming, no pools: each conversation gets a fresh container at first use and an in-memory registry tracks the host port + session key.

Implementation lives entirely in docker_runtime/; api.py only learns how to install the routers and start/stop the container manager.

Co-authored-by: openhands <openhands@all-hands.dev>

github-actions · 2026-05-27T14:15:38Z

Python API breakage checks — ✅ PASSED

Result: ✅ PASSED

Action log

github-actions · 2026-05-27T14:15:46Z

REST API breakage checks (OpenAPI) — ✅ PASSED

Result: ✅ PASSED

Action log

github-actions · 2026-05-27T14:17:44Z

Coverage Report

File	Stmts	Miss	Cover	Missing
openhands-agent-server/openhands/agent_server
api.py	269	25	90%	107, 109–114, 116, 118, 120, 154, 166, 181, 187, 223–225, 234, 517, 520, 524–526, 528, 534
config.py	78	2	97%	29, 42
conversation_service.py	661	124	81%	144–145, 154, 180–181, 185–186, 191, 359–360, 416, 422–423, 426, 434–435, 440, 474, 477, 480–481, 484, 495, 499, 502, 511, 538, 544, 567, 634, 663, 669, 674, 680, 688–689, 698–701, 710, 722, 730, 753–754, 793, 822–826, 828–829, 832–833, 843, 855–860, 958, 965–969, 972–973, 977–981, 984–985, 989–993, 996–997, 1019–1020, 1024–1025, 1027–1029, 1031, 1034, 1042–1046, 1049, 1056–1061, 1063–1064, 1092, 1102, 1106, 1108–1109, 1114–1115, 1121–1122, 1130, 1145–1146, 1176, 1191, 1223, 1517, 1520
openhands-agent-server/openhands/agent_server/docker_runtime
proxy.py	93	19	79%	102, 117–120, 179, 182, 184, 187, 198, 201–202, 208, 212, 229–233
registry.py	70	43	38%	53–55, 59, 62, 65, 76–79, 81–83, 86–91, 96–103, 116–118, 123–125, 127, 135, 140, 151, 153, 183, 188, 193, 198, 205
routers.py	150	22	85%	65, 124–125, 133–134, 142–143, 171, 176, 179–181, 205, 228–229, 253–255, 352–353, 365, 465
openhands-workspace/openhands/workspace/docker
workspace.py	195	52	73%	31–33, 51, 162, 168, 174, 194, 215, 218, 221–222, 225–226, 233, 242, 244, 247–248, 253, 263, 290, 318, 327, 330, 334–335, 339–340, 344, 347–348, 350–356, 359–360, 369–371, 375–377, 381, 385, 411, 431, 448
TOTAL	29005	12143	58%

all-hands-bot

⚠️ QA Report: PASS WITH ISSUES

Docker runtime mode works for the core create/proxy/WebSocket/delete flow, but I found two conversation API compatibility regressions in docker mode.

Does this PR achieve its stated goal?

Partially. I verified a real outer uvicorn agent-server in OH_CONVERSATION_RUNTIME=docker mode pulled the documented ghcr.io/openhands/agent-server:latest-python image, created a per-conversation Docker container, rewrote the workspace to /workspace, proxied GET /api/conversations/{id}, bridged /sockets/events/{id}, and removed the container on DELETE. However, two claimed preserved endpoints do not match local-mode behavior: GET /api/conversations?ids=<id> returns 500 in docker mode, and /api/conversations/count changes the response shape from a raw number to an object.

| Phase | Result |
| --- | --- |
| Environment Setup | ✅ `make build` succeeded; Docker daemon was available (`28.0.4`) and the documented runtime image pulled successfully. |
| CI Status | ⚠️ At check time, `pre-commit` was failing and several jobs were still pending; multiple tests/checks were green. I did not rerun CI tests. |
| Functional Verification | ⚠️ Core docker runtime path works, but list-by-ids and count compatibility issues were reproduced with real HTTP requests. |

Functional Verification

Test 1: Baseline local-mode API contract

Step 1 — Establish baseline (local mode):
Started the server with OH_CONVERSATION_RUNTIME=local and created a conversation using the normal HTTP API. Then queried the existing list/count endpoints:

curl "http://127.0.0.1:18081/api/conversations?ids=$LCID"
# HTTP 200, body: [{"id":"526d00e9-fefa-45a2-b355-dfdc9f53802f", ...}]

curl "http://127.0.0.1:18081/api/conversations/count"
# HTTP 200, body: 1

This establishes the existing client-visible contract: ids lookup returns a JSON array, and count returns a raw JSON number.

Test 2: PR docker runtime core flow

Step 2 — Apply PR behavior:
Started the PR server with:

OH_CONVERSATION_RUNTIME=docker OH_CONVERSATION_CONTAINER_STARTUP_TIMEOUT=90 uv run uvicorn openhands.agent_server.api:create_app --factory --host 127.0.0.1 --port 18080

Created a conversation through the outer server:

curl -H 'Content-Type: application/json' --data @/tmp/pr-start.json http://127.0.0.1:18080/api/conversations
# HTTP 201, id=1e74d784-b1c0-4fad-b142-27e7c1bc7343,
# workspace.working_dir=/workspace

docker ps --filter 'name=oh-conv-'
# ebbdda2c8f99 oh-conv-1e74d784b1c04fadb14227e7c1bc7343-79ca09c1 ... 0.0.0.0:30450->8000/tcp

This confirms the PR creates a real per-conversation container and rewrites the workspace path into the container.

Step 3 — Exercise proxied traffic:

curl "http://127.0.0.1:18080/api/conversations/$CID"
# HTTP 200, returned the created conversation with workspace.working_dir=/workspace

curl "http://127.0.0.1:18080/api/conversations/search"
# HTTP 200, returned items containing id=1e74d784-b1c0-4fad-b142-27e7c1bc7343

uv run python /tmp/qa_ws_check.py
# connected
# {"id":"5d0e05be-01b2-441e-9f76-975d9f00673c","timestamp":"2026-05-27T14...

curl -X DELETE "http://127.0.0.1:18080/api/conversations/$CID"
# HTTP 200, body: {"success":true}

docker ps --filter 'name=oh-conv-'
# no remaining QA containers

This confirms root HTTP proxying, search aggregation, WebSocket bridging, and DELETE cleanup work in a real Docker-backed run.

Test 3: Reproduced docker-mode compatibility regressions

Step 1 — Baseline: local mode returned HTTP 200 with a JSON array for GET /api/conversations?ids=<id> and raw 1 for /count.

Step 2 — PR docker mode: the same user-facing endpoints behaved differently:

curl "http://127.0.0.1:18080/api/conversations?ids=$CID"
# HTTP 500
# {"detail":"Internal Server Error","exception":"'list' object has no attribute 'get'"}

curl "http://127.0.0.1:18080/api/conversations/count"
# HTTP 200
# {"count":1}

This shows docker mode does not fully preserve the existing conversation endpoint contract promised in the PR description.

Issues Found

🟠 Issue: GET /api/conversations?ids=<conversation_id> returns 500 in docker mode instead of the local-mode JSON array response.
🟠 Issue: GET /api/conversations/count changes response shape from raw JSON number (1) to object ({"count":1}).
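
The 500's message (`'list' object has no attribute 'get'`) is consistent with aggregation code calling `.get(...)` on a bare JSON array, though the exact call site was not confirmed in this run. The sketch below only illustrates the two response shapes observed above and how a caller could tell them apart; `extract_items` and `extract_count` are hypothetical names, not functions from the PR.

```python
from typing import Any


def extract_items(payload: Any) -> list:
    """Return conversation entries from either observed response shape."""
    if isinstance(payload, list):   # local-mode ?ids=... returns a bare array
        return payload
    if isinstance(payload, dict):   # a page-style object carrying "items"
        return list(payload.get("items", []))
    raise TypeError(f"unexpected payload type: {type(payload).__name__}")


def extract_count(payload: Any) -> int:
    """Accept the local-mode bare integer or the docker-mode {"count": N}."""
    if isinstance(payload, bool):   # bool is an int subclass; reject it explicitly
        raise TypeError("count must be an integer, not a boolean")
    if isinstance(payload, int):
        return payload
    if isinstance(payload, dict) and isinstance(payload.get("count"), int):
        return payload["count"]
    raise TypeError(f"unexpected count payload: {payload!r}")
```

Tolerating both shapes on the client is a workaround at best; the cleaner outcome is for docker mode to return exactly what local mode returns.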

This review was created by an AI agent (OpenHands) on behalf of the user.

all-hands-bot

🟡 Acceptable direction, but I found a few docker-mode issues that need attention before this is safe to merge: auth bypass, exposed inner servers, and REST/auth contract regressions.

This review was created by an AI agent (OpenHands) on behalf of the user.

[RISK ASSESSMENT]

[Overall PR] ⚠️ Risk Assessment: 🔴 HIGH — this is opt-in, but it changes request routing/authentication and starts network-reachable per-conversation servers.

VERDICT: ❌ Needs rework before merging.

Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26516796937

Six fixes for the per-conversation docker runtime, driven by reviewer findings on PR #3403:

1. Bind inner container ports to loopback only (-p 127.0.0.1:HOST:8000) so the per-conversation agent-servers can only be reached through the outer server's authenticated proxy. (R3311480573)
2. Authenticate the WebSocket bridge against the OUTER server's session keys before opening the upstream connection. Reuses the existing sockets.py helper (header / query / first-message auth), and the bridge no longer calls accept() a second time. (R3311480598)
3. Preserve the local GET /api/conversations?ids=... contract: route is batch-get-by-id, requires the ids query param, returns list[ConversationInfo | None]. Looks each id up in the registry and fetches from its container (None for missing). (R3311480555, R3311480542)
4. Preserve the local /api/conversations/count contract: returns a bare JSON integer (not {"count": N}), honors ?status= by forwarding the query to each inner container and summing their integers. (R3311480576, R3311480571)
5. ContainerManager.start() now returns (running, is_new). The POST route only tears down the container on inner 4xx / connection error when is_new=True, so a retried create against an existing conversation can no longer kill the live container. (R3311480570)
6. Workspace static-file routes mount under the workspace-cookie auth group in docker mode via a new docker_workspace_router. The workspace router is now registered before the header-only api_router so the more specific path wins; browser iframe/<img> embeds with the oh_workspace_session_key cookie continue to work. (R3311480585)

Tests:

* test_container_manager: assert loopback port binding; updated for the (running, is_new) return tuple, plus an explicit is_new=False assert on the idempotent second start.
* test_docker_routers: new tests for batch-get-by-ids (incl. 422 on missing ids, null slots for unknown ids), bare-int /count contract, WS rejects wrong key, WS rejects missing first-message auth, WS accepts with valid outer key, POST retry preserves existing container on inner 4xx, fresh-create cleans up on inner 4xx, workspace route registered before the catch-all. Fake inner app reordered so /search and /count aren't shadowed by /{cid}.

22 docker_runtime tests pass; 144 tests in api / conversation / workspace / docker_runtime all green.

Co-authored-by: openhands <openhands@all-hands.dev>
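
To make the (running, is_new) teardown rule from this commit concrete, here is a toy model with no Docker involved. `ToyRegistry`, `create_conversation`, and the placeholder port numbers are all hypothetical; only the rule itself (tear down on inner 4xx or connection error, and only when the container is new) comes from the commit message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunningContainer:
    conversation_id: str
    host_port: int


class ToyRegistry:
    """In-memory stand-in for the container manager; starts nothing real."""

    def __init__(self) -> None:
        self.running: dict[str, RunningContainer] = {}

    def start(self, cid: str) -> tuple[RunningContainer, bool]:
        existing = self.running.get(cid)
        if existing is not None:
            return existing, False  # idempotent second start: is_new=False
        # Placeholder port; a real manager would probe for a free one.
        container = RunningContainer(cid, 30000 + len(self.running))
        self.running[cid] = container
        return container, True

    def stop(self, cid: str) -> None:
        self.running.pop(cid, None)


def create_conversation(
    registry: ToyRegistry, cid: str, inner_status: Optional[int]
) -> None:
    """inner_status is the inner server's HTTP status, or None on connection error."""
    _, is_new = registry.start(cid)
    failed = inner_status is None or 400 <= inner_status < 500
    if failed and is_new:  # never kill a container this request did not create
        registry.stop(cid)
```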

rbren · 2026-05-27T15:15:38Z

Pushed e7ec1a7 addressing all 8 review threads (now resolved). Summary:

Critical (security)

WS bridge now authenticates against the OUTER server's session_api_keys before opening the upstream connection (reuses sockets._accept_authenticated_websocket; no double-accept()). Wrong-key rejection happens pre-accept; missing-key falls through to first-message-auth and closes 4001 on timeout / bad frame.
Inner container ports now bind to 127.0.0.1 only — the per-conversation agent-server is only reachable through the outer auth proxy.

Important (API contracts)

GET /api/conversations?ids=... restored to local contract: required ids query, returns list[ConversationInfo | None] (not a page object), missing ids slot in as null. List/search aggregation lives only at /search.
GET /api/conversations/count restored to local contract: bare JSON integer, honors ?status= by forwarding the query to each container and summing.
ContainerManager.start() now returns (running, is_new); the POST handler only tears down on inner 4xx / connect-error when is_new=True, so a retried create against an existing conversation can't kill the live container.
Workspace static-file routes (/conversations/{cid}/workspace/...) are now served by a new docker_workspace_router mounted on workspace_api_router (cookie-or-header auth), and workspace_api_router is registered before the header-only api_router so the more specific path wins over the catch-all. Browser iframe/<img> embeds with oh_workspace_session_key continue to work in docker mode.
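
A sketch of the two restored contracts, under stated assumptions: `batch_get` and `total_count` are hypothetical names, the registry is modelled as a plain `cid -> host port` dict, and the `fetch` / `fetch_count` callables stand in for the HTTP calls to each inner container.

```python
from typing import Callable, Iterable, Optional


def batch_get(
    ids: list[str],
    registry: dict[str, int],
    fetch: Callable[[int, str], dict],
) -> list[Optional[dict]]:
    """One slot per requested id, in request order; None for unknown ids."""
    return [fetch(registry[cid], cid) if cid in registry else None for cid in ids]


def total_count(
    ports: Iterable[int],
    fetch_count: Callable[[int, Optional[str]], int],
    status: Optional[str] = None,
) -> int:
    """Sum each container's bare-integer count, forwarding ?status= unchanged."""
    return sum(fetch_count(port, status) for port in ports)
```

Returning a positional list with `None` slots keeps the response aligned with the `ids` the client sent, which is what lets callers zip the two together.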

Tests
8 new regression tests (1 per review finding) plus updates to the existing ones. 22 docker_runtime tests + 144 tests across test_api / test_conversation_router / test_workspace_router / docker_runtime/ all pass; ruff + pyright clean.

This comment was posted by an AI agent (OpenHands) on behalf of the user.

…n-out for shared-disk read-only metadata

* Drop the 339-line ContainerManager and its bespoke docker run wrapping; replace with DockerConversationRegistry (190 lines) that's a thin shell around DockerWorkspace. Image pulls, GPU, network, port allocation, log streaming, healthchecks, lifecycle cleanup are all delegated.
* Add bind_host to DockerWorkspace so the outer can publish the inner agent-server on 127.0.0.1 only (defense-in-depth: only the outer reaches the inner; other hosts on the network can't bypass outer auth).
* Replace the docker fan-out across containers (batch_get / count / search) with shared-disk metadata reads. Outer's ConversationService runs in a new read_only_metadata mode: skips lease acquisition and EventService startup; get / search / count / batch_get re-read meta.json / base_state.json off disk on every call so sub-container writes show up immediately.
* Bind-mount layout: per-cid conversations/{cid_hex} is the only conversation dir each sub-container can see (the outer sees all of them and reads on-disk metadata). Settings/secrets dir is shared via OH_PERSISTENCE_DIR so cipher keys match.
* Global per-host routers (bash/git/file/vscode/desktop/hooks/mcp/skills/tools/llm) are reverse-proxied via a required ?cid=… query parameter, registered one route per prefix so the catch-all doesn't shadow /api/conversations, /api/settings, etc.
* Auth: outer and inner share OH_SESSION_API_KEYS_0 via conversation_container_forward_env. The proxy synthesizes the X-Session-API-Key header from the shared workspace key when the inbound request authenticated via the workspace-session cookie (so iframe/<img> embeds still reach the inner static file server).
* Drop the fan-out tests; add tests for the read-only mode + ?cid= routing.

Net diff: -132 LoC across the package while adding new behaviour-level tests.

Co-authored-by: openhands <openhands@all-hands.dev>

rbren · 2026-05-27T19:17:39Z

Pushed 151cd525 — simplification rewrite per the feedback that the previous design duplicated a lot of DockerWorkspace. Two structural changes:

1. Drop `ContainerManager`, use `DockerWorkspace` directly

docker_runtime/container_manager.py (339 lines of bespoke docker run wrapping, port allocation, healthchecks, log streaming, lifecycle cleanup, image management) is deleted and replaced with docker_runtime/registry.py (190 lines) which is a thin shell around DockerWorkspace. Everything ContainerManager did is something DockerWorkspace was already doing for its other use cases:

| Concern | `ContainerManager` | `DockerWorkspace` |
| --- | --- | --- |
| `docker run` wrapping | bespoke argv builder | yes |
| Free port allocation | bespoke | yes |
| Image pulls & cleanup | bespoke | yes (`cleanup_image`) |
| Network / GPU / platform | partial | yes |
| Volume mounts | bespoke | yes |
| Forwarded env | bespoke | yes (`forward_env` + `extra_env`) |
| Log streaming | bespoke | yes (`detach_logs`) |
| Healthcheck wait | bespoke `urlopen` loop | yes (`health_check_timeout`) |
| Lifecycle / cleanup | bespoke | yes (`cleanup`) |

I added one small field to DockerWorkspace to cover the one capability that wasn't already there:

bind_host: str — host interface to publish on. Default "" keeps -p HOST_PORT:8000; setting "127.0.0.1" gives -p 127.0.0.1:HOST_PORT:8000. The docker registry pins this to 127.0.0.1 so only the outer agent-server can reach the inner — defense-in-depth around the proxy auth.
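
The effect of bind_host on the published port can be sketched as below. `publish_arg` is a hypothetical helper, not the actual DockerWorkspace code; it only reproduces the two `-p` forms described above.

```python
def publish_arg(host_port: int, container_port: int = 8000, bind_host: str = "") -> list[str]:
    """Build the `docker run` publish flag for the inner agent-server.

    An empty bind_host keeps Docker's default of publishing on all host
    interfaces; a non-empty one pins the mapping to that interface.
    """
    mapping = f"{host_port}:{container_port}"
    if bind_host:
        mapping = f"{bind_host}:{mapping}"
    return ["-p", mapping]
```

For example, `publish_arg(30450, bind_host="127.0.0.1")` yields `["-p", "127.0.0.1:30450:8000"]`, while `publish_arg(30450)` yields `["-p", "30450:8000"]`.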

2. Drop fan-out across containers; read shared disk instead

Per the review pushback, fan-out was the wrong shape — it was N container hops for what's fundamentally a cheap directory walk. The outer's ConversationService now has a read_only_metadata mode that:

Skips EventService startup in __aenter__ (no leases acquired, no in-memory state, no lease-renewal task).
On every get / search / count / batch_get, re-walks conversations_path and reads meta.json + base_state.json straight off disk. Falls back to a synthesized state for conversations whose base_state.json hasn't been flushed yet.
Mutation methods aren't expected to be called (the docker proxy router intercepts them before they reach ConversationService).

Bind-mount layout is per-cid: each sub-container only sees its own conversations/{cid_hex} subdirectory. The outer sees all of them. The .openhands settings/secrets dir is shared so OH_SECRET_KEY round-trips correctly.
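
Roughly, the read-only walk amounts to the sketch below. It assumes one subdirectory per conversation containing `meta.json` and optionally `base_state.json`, as described above; the function name, the returned dict shape, and the contents of the synthesized fallback are placeholders, not the real ConversationService models.

```python
import json
from pathlib import Path


def read_conversation_metadata(conversations_path: Path) -> list[dict]:
    """Re-read every conversation's on-disk metadata; nothing is cached."""
    results = []
    for conv_dir in sorted(p for p in conversations_path.iterdir() if p.is_dir()):
        meta_file = conv_dir / "meta.json"
        if not meta_file.is_file():
            continue  # not (yet) a conversation directory
        meta = json.loads(meta_file.read_text())
        state_file = conv_dir / "base_state.json"
        if state_file.is_file():
            state = json.loads(state_file.read_text())
        else:
            # Placeholder fallback for state that hasn't been flushed yet.
            state = {"id": conv_dir.name, "synthesized": True}
        results.append({"meta": meta, "state": state})
    return results
```

Because every call walks the directory again, anything a sub-container has written is visible on the next request without any invalidation logic.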

Other changes asked for in the review

?cid= for global routers (bash/git/file/vscode/desktop/hooks/mcp/skills/tools/llm): registered one specific route per prefix in docker_global_proxy_router so the catch-all doesn't shadow /api/conversations / /api/settings / etc. Missing ?cid= → clear 400 telling the client what they need to do.
X-Session-API-Key forwarding: outer and inner share the same OH_SESSION_API_KEYS_0 via conversation_container_forward_env (now includes that key in the default list). The proxy passes through whatever header the client sent; for cookie-authed workspace static files it synthesizes the header from workspace.api_key (read out of the outer's env) so the inner static file server is happy.
No outer-side services touched for the simpler approach. The outer still runs tmux / vscode / desktop / sockets / settings / profiles in-process — those just aren't conversation-scoped.
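
The required `?cid=` check for the global routers could be sketched like this. `resolve_cid` and the error text are hypothetical; only the rule (missing `?cid=` gives a 400 that tells the client what to send) comes from the description above.

```python
from urllib.parse import parse_qs, urlsplit


def resolve_cid(url: str) -> tuple[int, str]:
    """Return (200, cid) when ?cid= is present, else (400, explanation)."""
    query = parse_qs(urlsplit(url).query)  # blank values are dropped by default
    values = query.get("cid")
    if not values:
        return 400, "missing required ?cid=<conversation_id> query parameter"
    return 200, values[0]
```

An empty `?cid=` is treated the same as a missing one, since `parse_qs` discards blank values unless told otherwise.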

Verification

The new architecture still answers every API the user asked about:

| Endpoint group | Where it runs in docker mode |
| --- | --- |
| `POST /api/conversations` | proxy → sub-container (spawns it first) |
| `GET /api/conversations`, `/count`, `/search` | outer server, read from shared disk (`read_only_metadata` mode) |
| Per-cid mutations (`/run`, `/pause`, `/events`, …) | proxy → sub-container |
| Workspace static files (`/conversations/{cid}/workspace/…`) | proxy → sub-container, cookie-auth preserved |
| Global routers (`/bash`, `/git`, `/file`, …) | proxy → sub-container, requires `?cid=` |
| `WS /sockets/events/{cid}` | outer authenticates, then bridges to sub-container |

Stats

13 files changed, 1083 insertions(+), 1215 deletions(-)

Net -132 LoC even though new tests were added. Locally:

tests/agent_server/test_conversation_service.py        80 passed (4 new read-only-mode tests)
tests/agent_server/test_conversation_router.py         69 passed (no changes)
tests/agent_server/docker_runtime/test_docker_routers  17 passed (rewritten for new registry)

ruff check, ruff format, and pyright all clean on the changed files.

This comment was created by an AI agent (OpenHands) on behalf of the PR author.

all-hands-bot reviewed May 27, 2026

Comment thread openhands-agent-server/openhands/agent_server/docker_runtime/routers.py Outdated

Comment thread openhands-agent-server/openhands/agent_server/docker_runtime/routers.py Outdated

all-hands-bot reviewed May 27, 2026

rbren marked this pull request as draft May 27, 2026 17:06

rbren wants to merge 3 commits into main from feat/agent-server-docker-runtime
