Skip to content

fix(acp): keep current_model_id honest on resume and unapplied overrides#3407

Merged
simonrosenberg merged 3 commits into
mainfrom
fix-acp-model-id-accuracy
May 27, 2026
Merged

fix(acp): keep current_model_id honest on resume and unapplied overrides#3407
simonrosenberg merged 3 commits into
mainfrom
fix-acp-model-id-accuracy

Conversation

@simonrosenberg
Copy link
Copy Markdown
Collaborator

@simonrosenberg simonrosenberg commented May 27, 2026

Summary

Follow-up to #3347, which surfaced current_model_id / available_models from ACP session responses. The review on #3347 raised two 🟠 Important correctness issues, but it landed ~3 minutes before the PR merged and went in unaddressed. Both let ConversationInfo.current_model_id (the picker/chip) claim a model the ACP server isn't actually running.

1. Stale acp_current_model_id on resume (asymmetric persistence gating)

init_state gated the two persisted fields differently:

  • acp_available_models — cleared when the server reported a models block (_available_models is not None), even an empty one.
  • acp_current_model_id — kept whenever the value was non-None, with no "did the server report?" check.

So on a true resume where the server reports models:{currentModelId:"", availableModels:[]} (both map to None/[] via _extract_session_models), the list was cleared but the stale id survived — rendering a chip that points at a model absent from the now-empty picker.

Fix: the clear branch now mirrors the list gating:

elif not truly_resumed or self._available_models is not None:
    new_agent_state.pop("acp_current_model_id", None)

2. Unapplied override surfaced as the current model

current_model_id = self.acp_model or reported_model_id trusted the override even on paths where it never reached the server:

  • a fresh unknown/custom provider — gets neither session _meta nor set_session_model (both provider-gated), so acp_model is set on the agent but the server runs its own default;
  • a resume whose set_session_model the server rejected_reapply_session_model_on_resume swallows the ACPRequestError and logs "the live session may run on the server default", yet the surface still claimed the override.

Fix: _maybe_set_session_model and _reapply_session_model_on_resume now return whether the override was actually applied; resolution prefers acp_model only then, otherwise surfaces the server-reported id (or None). The happy override path (claude _meta, codex/gemini set_session_model) is unchanged.

Verification

  • Ran the SDK-pinned @google/gemini-cli@0.38.0 through a real ACPAgent: happy-path surfacing is correct (current_model_id, 8 available_models); confirmed servers do report models blocks on new_session.
  • New tests cover both fixes. The four bug-path tests fail on the pre-fix code and pass after — the applied-override guard test passes on both (intentional non-regression check).
  • tests/sdk/agent/test_acp_agent.py: 238 passed. ruff check / ruff format clean; pyright 0 errors.

🤖 Generated with Claude Code


Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant Architectures Base Image Docs / Tags
java amd64, arm64 eclipse-temurin:17-jdk Link
python amd64, arm64 nikolaik/python-nodejs:python3.13-nodejs22-slim Link
golang amd64, arm64 golang:1.21-bookworm Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:df86f87-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-df86f87-python \
  ghcr.io/openhands/agent-server:df86f87-python

All tags pushed for this build

ghcr.io/openhands/agent-server:df86f87-golang-amd64
ghcr.io/openhands/agent-server:df86f87e0045fe5d051eaf4c94432e0e264778d0-golang-amd64
ghcr.io/openhands/agent-server:fix-acp-model-id-accuracy-golang-amd64
ghcr.io/openhands/agent-server:df86f87-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:df86f87-golang-arm64
ghcr.io/openhands/agent-server:df86f87e0045fe5d051eaf4c94432e0e264778d0-golang-arm64
ghcr.io/openhands/agent-server:fix-acp-model-id-accuracy-golang-arm64
ghcr.io/openhands/agent-server:df86f87-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:df86f87-java-amd64
ghcr.io/openhands/agent-server:df86f87e0045fe5d051eaf4c94432e0e264778d0-java-amd64
ghcr.io/openhands/agent-server:fix-acp-model-id-accuracy-java-amd64
ghcr.io/openhands/agent-server:df86f87-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:df86f87-java-arm64
ghcr.io/openhands/agent-server:df86f87e0045fe5d051eaf4c94432e0e264778d0-java-arm64
ghcr.io/openhands/agent-server:fix-acp-model-id-accuracy-java-arm64
ghcr.io/openhands/agent-server:df86f87-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:df86f87-python-amd64
ghcr.io/openhands/agent-server:df86f87e0045fe5d051eaf4c94432e0e264778d0-python-amd64
ghcr.io/openhands/agent-server:fix-acp-model-id-accuracy-python-amd64
ghcr.io/openhands/agent-server:df86f87-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:df86f87-python-arm64
ghcr.io/openhands/agent-server:df86f87e0045fe5d051eaf4c94432e0e264778d0-python-arm64
ghcr.io/openhands/agent-server:fix-acp-model-id-accuracy-python-arm64
ghcr.io/openhands/agent-server:df86f87-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:df86f87-golang
ghcr.io/openhands/agent-server:df86f87e0045fe5d051eaf4c94432e0e264778d0-golang
ghcr.io/openhands/agent-server:fix-acp-model-id-accuracy-golang
ghcr.io/openhands/agent-server:df86f87-golang_tag_1.21-bookworm
ghcr.io/openhands/agent-server:df86f87-java
ghcr.io/openhands/agent-server:df86f87e0045fe5d051eaf4c94432e0e264778d0-java
ghcr.io/openhands/agent-server:fix-acp-model-id-accuracy-java
ghcr.io/openhands/agent-server:df86f87-eclipse-temurin_tag_17-jdk
ghcr.io/openhands/agent-server:df86f87-python
ghcr.io/openhands/agent-server:df86f87e0045fe5d051eaf4c94432e0e264778d0-python
ghcr.io/openhands/agent-server:fix-acp-model-id-accuracy-python
ghcr.io/openhands/agent-server:df86f87-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim

About Multi-Architecture Support

  • Each variant tag (e.g., df86f87-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., df86f87-python-amd64) are also available if needed

Addresses two correctness gaps in the model-state surfacing from #3347 that
were flagged in review but went in unaddressed (review landed ~3 min before
merge). Both make `ConversationInfo.current_model_id` (the picker/chip) lie
about the model the ACP server is actually running.

1. Stale id on resume (asymmetric persistence gating). `init_state` cleared
   the persisted `acp_available_models` when the server reported a `models`
   block with no usable models, but gated `acp_current_model_id` only on the
   value being non-None — so a true resume reporting `currentModelId: ""` /
   `availableModels: []` cleared the list while keeping the old id, leaving a
   chip pointing at a model absent from the (now-empty) picker. The clear
   branch now mirrors the list gating (`_available_models is not None`).

2. Unapplied override surfaced as current. `current_model_id = self.acp_model
   or reported_model_id` trusted the override even on paths where it never
   reached the server: a fresh unknown/custom provider (no `_meta`, no
   `set_session_model`), or a resume whose `set_session_model` the server
   rejected (swallowed `ACPRequestError`). `_maybe_set_session_model` and
   `_reapply_session_model_on_resume` now return whether the override was
   actually applied, and the resolution prefers `acp_model` only then —
   otherwise it surfaces the server-reported id (or None).

Tests: extend the helper unit tests with the new bool return, strengthen the
explicit-empty-models resume test to assert the id clears too, and add
integration coverage for unknown-provider / rejected-resume / applied-override
resolution. All four bug-path tests fail on the pre-fix code.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 27, 2026

Python API breakage checks — ✅ PASSED

Result:PASSED

Action log

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 27, 2026

REST API breakage checks (OpenAPI) — ✅ PASSED

Result:PASSED

Action log

Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Acceptable overall, but I found one correctness gap that can still let an unapplied ACP override leak into ConversationInfo on resume when load_session omits models. See inline.

Risk: 🟡 Medium (ACP model metadata correctness; low security/safety risk).

This review was generated by an AI agent (OpenHands) on behalf of the user.


Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26521857355

Comment thread openhands-sdk/openhands/sdk/agent/acp_agent.py Outdated
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 27, 2026

Coverage

Coverage Report •
FileStmtsMissCoverMissing
openhands-agent-server/openhands/agent_server
   conversation_service.py60411580%144–145, 154, 180–181, 185–186, 191, 359–360, 391, 394, 401–407, 434, 440, 536, 542, 547, 553, 561–562, 571–574, 583, 595, 603, 626–627, 666, 695–699, 701–702, 705–706, 716, 728–733, 831, 838–842, 845–846, 850–854, 857–858, 862–866, 869–870, 892–893, 897–898, 900–902, 904, 907, 915–919, 922, 929–934, 936–937, 951, 961, 965, 967–968, 973–974, 980–981, 989, 1004–1005, 1035, 1050, 1078, 1372, 1375
openhands-sdk/openhands/sdk/agent
   acp_agent.py7836292%458–460, 590–591, 624, 626, 630, 634, 642, 705–706, 711, 778, 962, 965–966, 983–984, 1013, 1018, 1094, 1104, 1137–1140, 1435–1438, 1442–1444, 1447–1451, 1453, 1682, 2035–2037, 2074, 2078–2079, 2082, 2090–2092, 2094, 2096, 2100, 2103, 2112–2114, 2116, 2135, 2336–2337
TOTAL28750653377% 

Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ QA Report: PASS

The ACPAgent model picker/chip now stays honest across stale resume state and unapplied/rejected overrides, while the applied override path still works.

Does this PR achieve its stated goal?

Yes. I exercised the SDK as a user would by starting an ACPAgent against a real stdio ACP server shim and comparing base main against this PR. On main, the stale persisted model id survived an explicit empty models resume response, unknown-provider fresh overrides were shown even though no override reached the server, and rejected resume overrides still appeared as current. On the PR branch, those same flows surface the server-reported model or None, and the known-provider override path still reports the applied override.

Phase Result
Environment Setup ✅ Existing uv project environment was usable; SDK/ACP runtime paths executed successfully.
CI Status ✅ 32 checks successful, 4 skipped; only this qa-changes run was still in progress when checked.
Functional Verification ✅ Reproduced the base bugs and verified the PR fixes them with real ACP stdio runtime execution.
Functional Verification

I used temporary scripts outside the repo (/tmp/acp_fake_server.py, /tmp/acp_qa_client.py) to launch ACPAgent with acp_command=[sys.executable] and acp_args=[SERVER], then drove agent.init_state(...) against an actual ACP stdio subprocess. The fake server recorded real ACP calls (initialize, new_session, load_session, set_session_model) and returned controlled models blocks.

Test 1: Resume with explicit empty models clears stale current id

Step 1 — Reproduce on base main:
Ran git checkout main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/acp_qa_client.py stale_resume_empty_models:

base-stale_resume_empty_models.json
scenario: stale_resume_empty_models
agent: some-custom-acp
current_model_id: None
available_model_ids: []
persisted acp_current_model_id: stale-model
persisted acp_available_model_ids: []
server calls: [('load_session', None)]

This confirms the bug: the server reported an explicit empty model state on resume, but persisted acp_current_model_id remained stale-model while the picker list was cleared.

Step 2 — Apply PR changes:
Checked out fix-acp-model-id-accuracy at 5d0d6313.

Step 3 — Re-run with the fix:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/acp_qa_client.py stale_resume_empty_models:

pr-stale_resume_empty_models.json
scenario: stale_resume_empty_models
agent: some-custom-acp
current_model_id: None
available_model_ids: []
persisted acp_current_model_id: <missing>
persisted acp_available_model_ids: []
server calls: [('load_session', None)]

This shows the stale persisted id is now removed in lock-step with the explicit empty model list.

Test 2: Fresh unknown-provider override is not surfaced unless applied

Step 1 — Reproduce on base main:
Ran git checkout main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/acp_qa_client.py unknown_fresh_override:

base-unknown_fresh_override.json
scenario: unknown_fresh_override
agent: some-custom-acp
current_model_id: caller-model
available_model_ids: ['server-model']
persisted acp_current_model_id: caller-model
persisted acp_available_model_ids: ['server-model']
server calls: [('new_session', None)]

This confirms the bug: no set_session_model call occurred for the unknown provider, but the SDK still claimed the caller override was current.

Step 2 — Apply PR changes:
Checked out fix-acp-model-id-accuracy at 5d0d6313.

Step 3 — Re-run with the fix:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/acp_qa_client.py unknown_fresh_override:

pr-unknown_fresh_override.json
scenario: unknown_fresh_override
agent: some-custom-acp
current_model_id: server-model
available_model_ids: ['server-model']
persisted acp_current_model_id: server-model
persisted acp_available_model_ids: ['server-model']
server calls: [('new_session', None)]

This shows the SDK now surfaces the server-reported model when the caller override never reached the server.

Test 3: Rejected resume override falls back to server model

Step 1 — Reproduce on base main:
Ran git checkout main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/acp_qa_client.py resume_rejected_override:

base-resume_rejected_override.json
scenario: resume_rejected_override
agent: codex-acp
current_model_id: caller-model
available_model_ids: ['server-resumed']
persisted acp_current_model_id: caller-model
persisted acp_available_model_ids: ['server-resumed']
server calls: [('load_session', None), ('set_session_model', 'caller-model')]

The fake server rejected set_session_model with method not found; base still reported caller-model, confirming the misleading current-model behavior.

Step 2 — Apply PR changes:
Checked out fix-acp-model-id-accuracy at 5d0d6313.

Step 3 — Re-run with the fix:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/acp_qa_client.py resume_rejected_override:

pr-resume_rejected_override.json
scenario: resume_rejected_override
agent: codex-acp
current_model_id: server-resumed
available_model_ids: ['server-resumed']
persisted acp_current_model_id: server-resumed
persisted acp_available_model_ids: ['server-resumed']
server calls: [('load_session', None), ('set_session_model', 'caller-model')]

This confirms that after a rejected resume reapply, the SDK no longer claims the unapplied override as current.

Test 4: Applied known-provider override still wins

Step 1 — Establish baseline on base main:
Ran git checkout main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/acp_qa_client.py known_fresh_override:

base-known_fresh_override.json
scenario: known_fresh_override
agent: codex-acp
current_model_id: caller-model
server calls: [('new_session', None), ('set_session_model', 'caller-model')]

This establishes the intended happy path: the known provider receives set_session_model, and the override is shown as current.

Step 2 — Apply PR changes:
Checked out fix-acp-model-id-accuracy at 5d0d6313.

Step 3 — Re-run with the fix:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/acp_qa_client.py known_fresh_override:

pr-known_fresh_override.json
scenario: known_fresh_override
agent: codex-acp
current_model_id: caller-model
server calls: [('new_session', None), ('set_session_model', 'caller-model')]

This confirms the fix did not regress the applied override path.

Issues Found

None.

Final verdict: PASS.

This QA review was created by an AI agent (OpenHands) on behalf of the user.

…lock

Addresses review feedback on this PR: the first commit still let an unapplied
override leak through on one path — a true resume where `load_session` omits
the (UNSTABLE) `models` block AND the server rejects `set_session_model`. There
`_reapply_session_model_on_resume` returns False, `_current_model_id` is None,
but `truly_resumed` is true and `_available_models` is None, so the persisted
`acp_current_model_id` (which named the rejected override) was preserved — and
even after clearing it, `ConversationInfo`'s final `acp_model` fallback would
re-assert the override for a live agent.

Two coordinated changes:
- ACPAgent now carries an explicit `_model_override_applied` signal out of
  `_init`. `init_state` clears the stale persisted id when an override was
  attempted but not applied (`acp_model` set yet `_model_override_applied`
  False), in addition to the existing replaced-session / reported-empty cases.
- `_compose_conversation_info` gates the `acp_model` fallback on the agent not
  being live+initialized. Once `init_state` has fired, `current_model_id` is
  authoritative (including None); the `acp_model` fallback is only for cold
  reads. The persisted-id fallback is unchanged (kept honest by `init_state`),
  so the legitimate preserve-on-resume case still surfaces the last-known model.

Tests: regression for rejected `set_session_model` + absent `models` (id
cleared, override-not-applied), plus live-agent-no-fallback and cold-read-
fallback coverage for `_compose_conversation_info`. Both bug-path tests fail on
the pre-fix code. 255 passed; ruff + pyright clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Collaborator

🔍 Review in progress…

We are performing the review through OpenHands Cloud Automation. You can log in and view the conversation here.

Copy link
Copy Markdown
Collaborator

all-hands-bot commented May 27, 2026

Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

@simonrosenberg simonrosenberg self-assigned this May 27, 2026
@simonrosenberg simonrosenberg added the acp About ACP label May 27, 2026
Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ QA Report: PASS

Verified the ACP model-id accuracy fixes with a real ACP stdio subprocess and the agent-server ConversationInfo composition path; the PR corrects the stale/unapplied model reporting cases I reproduced on main.

Does this PR achieve its stated goal?

Yes. On main, an unknown/custom ACP provider with acp_model=caller-model reported agent_current_model_id: caller-model even though no set_session_model call was made and the server reported server-model; with this PR the same run reports server-model. On resume, the PR also clears a stale persisted current id when the server reports an empty models block, and falls back to the server-reported model when reapplying an override is rejected.

Phase Result
Environment Setup make build completed successfully and installed the uv environment.
CI Status 🟡 Most checks were green when observed; a few Build & Push arm64 jobs plus automation review/QA checks were still in progress.
Functional Verification ✅ Reproduced the bug on origin/main, then verified the fixed behavior at aa8f24541f42a2c5e51b37628509e24159592fbc.
Functional Verification

Test 1: ACPAgent model state via real ACP stdio subprocess

Step 1 — Reproduce / establish baseline (without the fix):
Ran git checkout --detach origin/main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_acp_driver.py.

Relevant output:

[
  {
    "scenario": "unknown_fresh",
    "agent_current_model_id": "caller-model",
    "agent_available_models": ["server-model"],
    "persisted_current_model_id": "caller-model",
    "set_session_model_calls": []
  },
  {
    "scenario": "resume_empty_models",
    "agent_current_model_id": null,
    "agent_available_models": [],
    "persisted_current_model_id": "stale-model",
    "persisted_available_models": []
  },
  {
    "scenario": "resume_reject_with_reported_model",
    "agent_current_model_id": "caller-model",
    "agent_available_models": ["server-resumed"],
    "persisted_current_model_id": "caller-model",
    "set_session_model_calls": [{"model_id": "caller-model", "session_id": "stored-session"}]
  }
]

This confirms both reported bugs on the base branch: the fresh unknown provider advertised an override that never reached the server, and resume paths retained stale/incorrect current ids despite the server reporting different/no model state.

Step 2 — Apply the PR's changes:
Checked out aa8f24541f42a2c5e51b37628509e24159592fbc.

Step 3 — Re-run with the fix in place:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_acp_driver.py.

Relevant output:

[
  {
    "scenario": "unknown_fresh",
    "agent_current_model_id": "server-model",
    "agent_available_models": ["server-model"],
    "persisted_current_model_id": "server-model",
    "set_session_model_calls": []
  },
  {
    "scenario": "resume_empty_models",
    "agent_current_model_id": null,
    "agent_available_models": [],
    "persisted_current_model_id": null,
    "persisted_available_models": []
  },
  {
    "scenario": "resume_reject_with_reported_model",
    "agent_current_model_id": "server-resumed",
    "agent_available_models": ["server-resumed"],
    "persisted_current_model_id": "server-resumed",
    "set_session_model_calls": [{"model_id": "caller-model", "session_id": "stored-session"}]
  }
]

This shows the ACPAgent now reports the model the live server actually claims, clears the stale persisted id alongside an explicitly empty models block, and stops surfacing an override after rejected reapply.

Test 2: Agent-server ConversationInfo current_model_id fallback

Step 1 — Reproduce / establish baseline (without the fix):
Ran git checkout --detach origin/main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_conversation_info_driver.py.

Output:

{
  "cold_read_current_model_id": "caller-model",
  "live_initialized_unapplied_override_current_model_id": "caller-model"
}

This shows the server-side ConversationInfo composition still fell back to acp_model for a live initialized ACPAgent whose resolved current model was None.

Step 2 — Apply the PR's changes:
Checked out aa8f24541f42a2c5e51b37628509e24159592fbc.

Step 3 — Re-run with the fix in place:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_conversation_info_driver.py.

Output:

{
  "cold_read_current_model_id": "caller-model",
  "live_initialized_unapplied_override_current_model_id": null
}

This confirms ConversationInfo no longer reasserts an unapplied live override, while preserving the intended cold-read fallback.

Issues Found

None.

This review was created by an AI agent (OpenHands) on behalf of the user.

Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: fix(acp): keep current_model_id honest on resume and unapplied overrides

🟢 Good taste — This is a well-scoped, correctness-critical fix. The two bugs described in the PR are real, the root cause analysis is sharp, and the solution is minimal.


Summary

This PR addresses two distinct paths where ConversationInfo.current_model_id could advertise a model the live ACP session wasn't actually running:

  1. Stale id on resumeinit_state now treats acp_current_model_id and acp_available_models with symmetric clearing logic, adding override_attempted_not_applied as a third clear condition. Previously only the list was cleared when the server reported empty models:{}, leaving the id to silently survive.
  2. Unapplied override surfaced_maybe_set_session_model and _reapply_session_model_on_resume now return bool, threading _model_override_applied back through _start_acp_server so current_model_id resolution can prefer the override only when it actually reached the server.

Both fixes are defense-in-depth: the init_state persistence cleanup and the conversation_service.py agent_initialized gate together ensure neither the live attr nor the persisted hint can misrepresent the live model.


Observations

bool(session_meta) is safe as an "applied via _meta" signal (acp_agent.py, line 1528). build_session_model_meta returns {} when acp_model is None, when the provider is unknown, or when the provider uses the set_session_model protocol instead of _meta. So bool(session_meta) == True exclusively means "this is a _meta-based provider (e.g. claude) AND a model override was requested AND it was embedded in the session kwargs" — the semantics are tight.

_initialized is reliably set at acp_agent.py:1151 after _start_acp_server completes and before the agent_state update, so getattr(agent, "_initialized", False) in conversation_service.py correctly distinguishes live agents from cold-read deserialized ones.

Asymmetry between the two module functions (pre-existing, not introduced here): _maybe_set_session_model guards unknown providers with an early return, so no protocol call is made on a fresh session. _reapply_session_model_on_resume falls through to the try block for unknown providers, making an opportunistic set_session_model attempt on resume. This is intentional — resumes give unknown servers a chance to accept the protocol — but with both functions now returning bool and feeding _model_override_applied, the difference is more prominent. The return-value semantics are correct for each path; just worth keeping in mind when adding new providers.

override_attempted_not_applied placement (acp_agent.py, lines 1216-1218): computed after _start_acp_server assigns all PrivateAttrs (_model_override_applied, _current_model_id, _available_models), so it reads fresh state. No ordering issue.


Testing

The four new test cases are solid real-path tests (not mock-only assertions):

  • test_resume_loads_models_clears_stale_current_id — server reports empty currentModelId, stale id must be dropped.
  • test_resume_rejected_override_with_absent_models_clears_stale_id — the case that would slip through both the not truly_resumed and _available_models is not None guards without the new override_attempted_not_applied condition. Validates the exact gap being fixed.
  • test_unknown_provider_surfaces_server_model_not_unapplied_override — guards the fresh-unknown-provider path.
  • test_resume_rejected_override_surfaces_server_model — guards the resume-rejection path.

The non-regression test_known_provider_surfaces_applied_override (passes before and after) is a nice explicit guard that the happy path is not disturbed.

The 238-passed baseline plus the ruff/pyright clean run in the PR description closes the verification loop.


[RISK ASSESSMENT]

  • [Overall PR] ⚠️ Risk Assessment: 🟢 LOW — targeted internal state-resolution fix, no public API surface changes, purely corrects what ConversationInfo.current_model_id reports. The happy override path (claude _meta, codex/gemini set_session_model) is explicitly tested and unchanged.

VERDICT:
Worth merging — Core logic is sound, test coverage is thorough, no blocking issues.

KEY INSIGHT:
Making both helper functions return bool is the cleanest possible interface: it converts a "did the server actually get this?" question from an implicit assumption into an explicit, testable contract.

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation

Comment thread openhands-sdk/openhands/sdk/agent/acp_agent.py
@simonrosenberg simonrosenberg enabled auto-merge (squash) May 27, 2026 16:18
Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Taste Rating: Acceptable — the main direction is good, but I found two correctness holes where current_model_id can still claim an unverified/cleared override.

Risk: 🟡 Medium (ACP model metadata correctness; no security impact).

This review was generated by an AI agent (OpenHands) on behalf of the user.


Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26522977005

getattr(agent, "current_model_id", None)
or agent_state.get("acp_current_model_id")
or getattr(agent, "acp_model", None)
or (None if agent_initialized else getattr(agent, "acp_model", None))
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 Important: This fallback can undo the new stale-id clearing after an agent-server restart. If ACPAgent.init_state cleared acp_current_model_id because an override was rejected/unapplied, the persisted state still has acp_session_id and the serialized agent still has acp_model; after restart the PrivateAttrs reset and _initialized is false, so this cold path falls back to acp_model and re-advertises the model that was deliberately cleared. Please gate the acp_model fallback to pre-session/no persisted ACP session, or persist an explicit applied/tombstone flag so a cleared current id stays cleared across cold reads.

# reported in ``models.currentModelId`` (``None`` for older agents
# that don't surface the field). Trusting ``acp_model`` on paths
# where it never reached the server would mislabel the chip/picker.
current_model_id = (
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 Important: For the session _meta path (Claude), new_session has already returned models.currentModelId after processing the override, but this still reports self.acp_model whenever metadata was sent. If the server normalizes the requested id, falls back without failing, or reports an opaque id that matches availableModels, the chip becomes less honest than the server response. Prefer reported_model_id when present for the _meta path, falling back to self.acp_model only when the server omitted model state; the protocol-call path can keep using self.acp_model because set_session_model happens after the new_session response.

@simonrosenberg simonrosenberg merged commit 27f9462 into main May 27, 2026
31 checks passed
@simonrosenberg simonrosenberg deleted the fix-acp-model-id-accuracy branch May 27, 2026 16:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

acp About ACP

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants