Skip to content

feat(agent): AgentRunner runtime dispatcher (Agent SDK migration, phase 2)#30

Draft
PalmPalm7 wants to merge 1 commit into
rhpds:mainfrom
PalmPalm7:migration/agent-runner
Draft

feat(agent): AgentRunner runtime dispatcher (Agent SDK migration, phase 2)#30
PalmPalm7 wants to merge 1 commit into
rhpds:mainfrom
PalmPalm7:migration/agent-runner

Conversation

@PalmPalm7

Copy link
Copy Markdown
Contributor

What

Adds src/agent/runner.py — an AgentRunner that routes a sub-agent task to either the legacy Anthropic tool-use loop (src.agent.agents.run_sub_agent) or the Claude Agent SDK adapter (src/llm/AgentSdkClient, #24), based on the existing agent.runtime flag (legacy|sdk, default legacy). Both paths return the same structured result dict.

Why

Phase-2 foundation. The orchestrator currently calls run_sub_agent directly; this introduces a single dispatch seam so a sub-agent can be moved onto the SDK behind the flag without touching the request path. The SDK branch surfaces token/cost/cache usage under data.usage, which the Phase-2 cost benchmark compares against legacy.

Scope / safety

  • Additive and dormant — nothing in the request path imports it yet (mirrors how feat(llm): add Claude Agent SDK adapter behind agent.runtime flag (migration, phase 1) #24 shipped behind the flag). A follow-up PR wires the Icinga sub-agent to dispatch through it.
  • With the default legacy runtime, run_sub_agent is a transparent pass-through → zero behavior change.
  • The SDK branch is intentionally minimal here (agent system prompt → complete() → normalized result); per-agent skill + tool wiring lands in the Icinga PR.

How to test

pytest tests/test_agent_runner.py -q     # 11 tests

Result (local gate)

  • black --check ✓ · ruff check ✓ · mypy ✓ (clean on both files)
  • pytest tests/test_agent_runner.py11 passed
  • Full suite → 98 passed, no regressions

Part of the Agent SDK migration (Phase 1: #23 skills loader, #24 SDK adapter, #25 Skills UI — all merged). Plan: artifacts/parsec-agent-sdk-migration-plan.md.

Route a sub-agent task to the legacy Anthropic loop or the Claude Agent
SDK adapter based on agent.runtime (legacy|sdk, default legacy), returning
the same structured result dict either way.

Additive and dormant: nothing in the request path imports it yet (mirrors
how the rhpds#24 adapter shipped behind the flag). A follow-up PR wires the
Icinga sub-agent to dispatch through it. With the default legacy runtime
it is a transparent pass-through to src.agent.agents.run_sub_agent — zero
behavior change. The SDK branch surfaces token/cost/cache usage under
data.usage, which the Phase-2 cost benchmark compares against legacy.

11 tests: runtime resolution, legacy pass-through, SDK result
normalization, and SDK-unavailable error handling.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@PalmPalm7

Copy link
Copy Markdown
Contributor Author

Test results.

  • Local gate: black ✓ · ruff ✓ · mypy ✓ · pytest tests/test_agent_runner.py11 passed; full suite 98 passed, no regressions.
  • Upstream CI: quality-gates + docker-build + ci-status all green.
  • The runner's SDK branch is exercised end-to-end by the Icinga cost A/B on a personal cluster; that run is pending (the NERC cluster is currently at pod/disk capacity) and will be linked from feat(agent): Icinga sub-agent on the Agent SDK — skill + profile (migration, phase 2 pilot) #32.

@rut31337

rut31337 commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

Code Review — PR #30 (Draft)

Scope: 2 files, +402 lines — AgentRunner runtime dispatcher
Effort: High

Findings (4)

1. SDK path omits today's date from system prompt (src/agent/runner.py:108)
Legacy path appends "Today's date is {today}." to the system prompt. SDK path passes the raw prompt without it. Date-sensitive queries ("costs this month") will produce different results between runtimes, undermining the Phase-2 cost benchmark.

2. SDK path does not forward conversation_history (src/agent/runner.py:108)
Legacy path calls _extract_user_context(conversation_history) and appends it to the user message. SDK path ignores conversation_history entirely. Multi-turn follow-ups fail silently.

3. Missing catch-all exception handler in SDK path (src/agent/runner.py:106)
_run_via_sdk only catches AgentSdkUnavailableError. Any other exception from complete() (network errors, SDK bugs) propagates as an unstructured 500 instead of returning the graceful error dict the orchestrator expects.

4. Result dict shape mismatch between runtimes (src/agent/runner.py:153)
SDK results always include error, tool_errors, rounds_used keys. Legacy omits them in some return paths. Callers checking 'error' in result get inconsistent behavior. Consider normalizing both paths to the same schema.

Cross-PR note

The conversation_history gap also affects PR #32 (Icinga SDK pilot). Fixing it here would resolve it for all downstream PRs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants