fix(agent-server): route sync-only agents through sync run() instead of threaded astep (B5)#3404
Conversation
…of threaded astep
Python API breakage checks — ✅ PASSEDResult: ✅ PASSED |
REST API breakage checks (OpenAPI) — ✅ PASSEDResult: ✅ PASSED |
|
✅ Review complete. This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here. |
Coverage Report •
|
||||||||||||||||||||
all-hands-bot
left a comment
There was a problem hiding this comment.
🟢 Taste Rating: Good taste — the capability guard is simple and the regression test covers the intended sync-only-agent dispatch path.
No important code issues found. Minor note: the code/PR text still uses ACPAgent as a sync-only example, but current ACPAgent overrides astep; consider updating that wording to avoid future confusion.
[RISK ASSESSMENT]
- [Overall PR]
⚠️ Risk Assessment: 🟡 MEDIUM — this changes agent-server run dispatch/threading behavior for custom agents. Per repo review policy, I’m leaving a COMMENT rather than approval so a human maintainer can decide after any desired lightweight eval validation.
Verified locally: uv run pytest tests/agent_server/test_event_service.py -q -k 'run_uses_executor_for_sync_only_agent or run_uses_executor_for_sync_only_conversation' passed (2 tests).
This review was generated by an AI agent (OpenHands) on behalf of the requester.
Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26516965175
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
The EventService dispatch behavior now matches the PR goal: sync-only agents are routed through sync run() while native async agents still use arun().
Does this PR achieve its stated goal?
Yes. I reproduced the old behavior on origin/main: a LocalConversation-like object wrapping a sync-only AgentBase subclass was dispatched through arun() on the main event-loop thread. On the PR branch, the same functional exercise dispatched through sync run() in the executor, while a native async agent still dispatched through arun().
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed successfully on the PR branch; baseline worktree synced with uv sync --dev |
| CI Status | ⏳ 19 successful, 0 failing, 7 pending, 6 skipped at review time |
| Functional Verification | ✅ Real EventService.run() + Conversation execution showed the intended before/after dispatch change |
Functional Verification
Test 1: Sync-only agent dispatch changes from arun() to sync run()
Step 1 — Reproduce baseline on origin/main:
Ran a Python script through uv run python -u - that created a real EventService, real SDK Conversation, and a sync-only AgentBase subclass whose step() pauses after one step. A recording wrapper around the conversation captured which entrypoint EventService selected.
Relevant output:
event_service_dispatch_calls= [('arun', 'MainThread')]
agent_step_threads= ['asyncio_0']
This confirms the reported baseline: EventService selected arun() for a conversation that had an async arun() method even though the agent only inherited the default async shim.
Step 2 — Apply the PR's changes:
Used the checked-out PR branch at commit 23c3ea96276ad7d5b9c4cf8836b1264782811cc7.
Step 3 — Re-run with the fix in place:
Ran the same functional script on the PR branch.
Relevant output:
event_service_dispatch_calls= [('run', 'asyncio_0')]
agent_step_threads= ['asyncio_0']
This confirms the fix: for a sync-only agent, EventService now dispatches through sync run() in the executor instead of calling arun() on the event loop.
Test 2: Native async agents still use arun()
Step 1 — Establish baseline on origin/main:
Ran the same EventService/Conversation exercise with an AgentBase subclass that overrides astep() and pauses from that async method.
Relevant output:
event_service_dispatch_calls= [('arun', 'MainThread')]
agent_astep_threads= ['MainThread']
This shows native async agents used the async path before the PR.
Step 2 — Apply the PR's changes:
Used the checked-out PR branch at commit 23c3ea96276ad7d5b9c4cf8836b1264782811cc7.
Step 3 — Re-run with the fix in place:
Ran the same native-async-agent probe on the PR branch.
Relevant output:
event_service_dispatch_calls= [('arun', 'MainThread')]
agent_astep_threads= ['MainThread']
This confirms the new guard does not regress the intended native async path.
Issues Found
None.
This review was created by an AI agent (OpenHands) on behalf of the user.
all-hands-bot
left a comment
There was a problem hiding this comment.
Code Review — fix(agent-server): route sync-only agents through sync run() instead of threaded astep (B5)
This is a well-targeted fix for a genuine correctness bug. The problem is clearly understood and the solution is minimal and precise: LocalConversation always overrides arun(), so the existing conversation-level guard was insufficient — it would route sync-only agents (e.g. ACPAgent) through the native async path, where arun() holds the state lock on the event-loop thread while step() runs in a worker thread. The new agent-level guard detects agents that rely on the default AgentBase.astep (which itself delegates to run_in_executor) and redirects them through the sync run() thread-pool path, avoiding the lock contention.
The change is correct, comments are thorough, and the test mirrors the established pattern of the companion test_run_uses_executor_for_sync_only_conv. Two minor suggestions below — neither blocks merging.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation
|
✅ Review complete. This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here. |
all-hands-bot
left a comment
There was a problem hiding this comment.
🟢 Taste Rating: Good taste — the dispatch guard is minimal, and the regression test covers the sync-only-agent routing behavior.
No code issues found. Because this changes agent-server execution/threading behavior for agent runs, repo guidance treats it as eval/benchmark-risk; I’m leaving a COMMENT rather than approving so a human maintainer can decide after any desired lightweight eval validation.
[RISK ASSESSMENT]
- [Overall PR]
⚠️ Risk Assessment: 🟡 MEDIUM — targeted dispatch/threading behavior change for custom/sync-only agents, with focused regression coverage.
This review was generated by an AI agent (OpenHands) on behalf of the requester.
Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26517471983
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
Exercised the EventService run path before and after the fix; the PR routes sync-only agents through sync run() again while preserving native async arun() for agents that override astep().
Does this PR achieve its stated goal?
Yes. On origin/main, a conversation with an async arun() and a sync-only AgentBase subclass was dispatched through arun() on MainThread, reproducing the threading/ordering behavior the PR aims to fix. On PR commit 23c3ea96, the same scenario dispatched through sync run() on executor thread asyncio_0, while an agent with a real astep() override still used arun() on MainThread.
| Phase | Result |
|---|---|
| Environment Setup | ✅ uv run created/used the project env and imported EventService/AgentBase successfully. |
| CI Status | 🟡 Most checks green; qa-changes was still in progress when checked. |
| Functional Verification | ✅ Before/after runtime exercise confirmed the intended dispatch change and async-agent preservation. |
Functional Verification
Test 1: Sync-only agents use sync run() instead of async arun()
Step 1 — Reproduce / establish baseline (without the fix):
Ran git switch --detach origin/main && PYTHONUNBUFFERED=1 OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_eventservice_dispatch.py.
Relevant output:
BASE_HEAD=481b7e01
sync_only_agent: caller=MainThread run_called=False run_thread=None arun_called=True arun_thread=MainThread agent_astep_is_default=True
native_async_agent: caller=MainThread run_called=False run_thread=None arun_called=True arun_thread=MainThread agent_astep_is_default=False
BASE_SCRIPT_STATUS=0
This confirms the bug exists on the base branch: even though the agent inherits the default AgentBase.astep (agent_astep_is_default=True), EventService.run() selected conversation arun() on MainThread instead of sync run().
Step 2 — Apply the PR's changes:
Checked out exact PR commit 23c3ea96276ad7d5b9c4cf8836b1264782811cc7.
Step 3 — Re-run with the fix in place:
Ran PYTHONUNBUFFERED=1 OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_eventservice_dispatch.py.
Relevant output:
PR_HEAD=23c3ea96
sync_only_agent: caller=MainThread run_called=True run_thread=asyncio_0 arun_called=False arun_thread=None agent_astep_is_default=True
native_async_agent: caller=MainThread run_called=False run_thread=None arun_called=True arun_thread=MainThread agent_astep_is_default=False
PR_SCRIPT_STATUS=0
This shows the fix works: the sync-only agent now goes through sync run() in the executor (asyncio_0) and no longer calls arun(). The related async-capable path is preserved: an agent with a real astep() override still uses arun() on MainThread.
Issues Found
None.
This QA review was created by an AI agent (OpenHands) on behalf of the user.
all-hands-bot
left a comment
There was a problem hiding this comment.
Code Review — fix(agent-server): route sync-only agents through sync run() instead of threaded astep (B5)
🟢 Good taste — minimal, surgical fix for a real threading correctness issue, with a focused regression test.
Summary
The bug is real and the diagnosis is accurate: LocalConversation unconditionally overrides arun(), so the existing type(conversation).arun is not BaseConversation.arun guard was always True for real conversations, routing every agent — including sync-only custom AgentBase subclasses — through the native async path. For agents that inherit the default AgentBase.astep (which just runs step() in a run_in_executor thread), this means arun() held the state lock on the event-loop thread while step() and its callbacks executed on a worker thread — a threading ordering change invisible to the agent author.
The fix is correct:
and type(conversation.agent).astep is not AgentBase.astepThis cleanly separates the two concerns: (1) does the conversation have a real async run path, and (2) does the agent have a genuinely async step implementation? Both must be true to justify running on the event loop. Short-circuit evaluation ensures conversation.agent is never accessed for conversations without a real arun() override.
Issues
Minor — comment example is stale
The inline comment at line 786 cites ACPAgent as an example of a sync-only agent that will be routed through run(). However, ACPAgent currently has a native async def astep() implementation (acp_agent.py, line 1703), so type(acpagent).astep is not AgentBase.astep evaluates to True — meaning it would still take the native async path. The comment example should be updated to an agent that actually only implements sync step(), or generalised to "a custom AgentBase subclass with no astep() override."
Risk Assessment
[RISK ASSESSMENT]
- [Overall PR]
⚠️ Risk Assessment: 🟡 MEDIUM — changes the execution/threading path for all agent-server conversations. The regression test covers the sync-only routing case and the CI suite passes, but dispatch behavior changes warrant human sign-off before merge.
VERDICT:
✅ Worth merging — fix is correct and well-tested. The stale ACPAgent example in the comment is a nit, not a blocker.
KEY INSIGHT:
The guard correctly decomposes the routing decision into two orthogonal capabilities (conversation async path + agent async step), where previously only the first was checked.
Improve this review? If any feedback above seems incorrect or irrelevant to this repository, you can teach the reviewer to do better:
- Add a
.agents/skills/custom-codereview-guide.mdfile to your branch (or edit it if one already exists) with the/codereviewtrigger and the context the reviewer is missing. See the customization docs for the required frontmatter format.- Re-request a review — the reviewer reads guidelines from the PR branch, so your changes take effect immediately.
Was this review helpful? React with 👍 or 👎 to give feedback.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation
all-hands-bot
left a comment
There was a problem hiding this comment.
Code Review — fix(agent-server): route sync-only agents through sync run() instead of threaded astep (B5)
🟢 Good taste — minimal, surgical fix for a real threading correctness issue, with a focused regression test.
Summary
The bug is real and the diagnosis is accurate: LocalConversation unconditionally overrides arun(), so the existing type(conversation).arun is not BaseConversation.arun guard was always True for real conversations, routing every agent — including sync-only custom AgentBase subclasses — through the native async path. For agents that inherit the default AgentBase.astep (which just runs step() in a run_in_executor thread), this means arun() held the state lock on the event-loop thread while step() and its callbacks executed on a worker thread — a threading ordering change invisible to the agent author.
The fix is correct:
and type(conversation.agent).astep is not AgentBase.astepThis cleanly separates the two concerns: (1) does the conversation have a real async run path, and (2) does the agent have a genuinely async step implementation? Both must be true to justify running on the event loop. Short-circuit evaluation ensures conversation.agent is never accessed for conversations without a real arun() override.
Issues
Minor — comment example is stale
The inline comment at line 786 cites ACPAgent as an example of a sync-only agent that will be routed through run(). However, ACPAgent currently has a native async def astep() implementation (acp_agent.py, line 1703), so type(acpagent).astep is not AgentBase.astep evaluates to True — meaning it would still take the native async path. The comment example should be updated to an agent that actually only implements sync step(), or generalised to "a custom AgentBase subclass with no astep() override."
Risk Assessment
[RISK ASSESSMENT]
- [Overall PR]
⚠️ Risk Assessment: 🟡 MEDIUM — changes the execution/threading path for all agent-server conversations. The regression test covers the sync-only routing case and the CI suite passes, but dispatch behavior changes warrant human sign-off before merge.
VERDICT:
✅ Worth merging — fix is correct and well-tested. The stale ACPAgent example in the comment is a nit, not a blocker.
KEY INSIGHT:
The guard correctly decomposes the routing decision into two orthogonal capabilities (conversation async path + agent async step), where previously only the first was checked.
Improve this review? If any feedback above seems incorrect or irrelevant to this repository, you can teach the reviewer to do better:
- Add a
.agents/skills/custom-codereview-guide.mdfile to your branch (or edit it if one already exists) with the/codereviewtrigger and the context the reviewer is missing. See the customization docs for the required frontmatter format.- Re-request a review — the reviewer reads guidelines from the PR branch, so your changes take effect immediately.
Was this review helpful? React with 👍 or 👎 to give feedback.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation
|
@OpenHands address the comments of all-hands-bot |
|
I'm on it! VascoSch92 can track my progress at all-hands.dev |
Co-authored-by: openhands <openhands@all-hands.dev>
|
OpenHands encountered an error: Request timeout after 30 seconds to https://wytudswhovefpekw.prod-runtime.all-hands.dev/api/conversations/6ba3cba6-2270-4707-8e74-cf7e35ae55ec/ask_agent See the conversation for more information. |
Why
EventService.run() selected the native async path (arun()) based solely on whether the conversation overrides BaseConversation.arun. Since LocalConversation always overrides arun(), every server run took the async path, including agents that only implement sync step() and inherit the default AgentBase.astep, which ships step() to a worker thread via run_in_executor.
The result: for ACPAgent and other custom AgentBase subclasses, sync step() (and its callbacks) ran on a worker thread while arun() held the state lock on the event-loop thread — silently changing the threading/ordering model those agents were written and tested against. This is item B5 from the async-LLM follow-up tracker (#3341).
This is not a deadlock (step() never re-acquires the state lock), but it is a behavioral change that can surface as hard-to-reproduce ordering/threading issues for ACP and custom agents.
The fix is to gate the async path on the agent's capability as well as the conversation's:
Agents that genuinely override astep (e.g. Agent) keep the native async path; agents that only implement sync step() fall back to running run() in the thread pool as before. The guard short-circuits before reaching conversation.agent, so the existing MagicMock-based tests are unaffected.
Issue Number
#3341
How to Test
Type
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
eclipse-temurin:17-jdknikolaik/python-nodejs:python3.13-nodejs22-slimgolang:1.21-bookwormPull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64 docker pull ghcr.io/openhands/agent-server:c643a77-pythonRun
All tags pushed for this build
About Multi-Architecture Support
c643a77-python) is a multi-arch manifest supporting both amd64 and arm64c643a77-python-amd64) are also available if needed