feat(llm): add Claude Agent SDK adapter behind agent.runtime feature flag by PalmPalm7 · Pull Request #2 · PalmPalm7/parsec

PalmPalm7 · 2026-05-19T09:27:21Z

Summary

Adds src.llm — a thin async adapter around claude_agent_sdk.query() behind a new agent.runtime: legacy|sdk feature flag. Defaults to legacy so existing deploys are unchanged. The claude_agent_sdk import is lazy — module is importable without the SDK installed; only .complete() triggers the import and raises AgentSdkUnavailableError with an install hint if missing.

Behavior change: none. Adapter is unused until a future PR wires it into the orchestrator. This PR is purely scaffolding so we can iterate on the SDK surface in isolation.

Sibling PR #1 (migration/skills-loader) is independent — they can land in either order. See artifacts/parsec-agent-sdk-migration-plan.md for the migration plan.

What's in the box

src/llm/__init__.py — public API
src/llm/agent_sdk_client.py — AgentSdkClient.complete() aggregates the SDK's streamed AssistantMessage / UserMessage / ResultMessage flow into a single SdkResult (text + tool invocations + usage)
src/llm/runtime.py — get_runtime(config) helper for the new flag
src/llm/types.py — SdkResult, SdkUsage dataclasses
config/config.yaml — new agent: section with runtime flag and optional SDK overrides
tests/test_agent_sdk_client.py — 18 passing tests (fake SDK injected via sys.modules)

Why lazy import

Keeping claude-agent-sdk out of required deps means:

Existing deploys don't pull Node.js + Claude Code CLI binary unless the operator opts in
CI quality-gates run without the SDK
Phase 1 of the migration stays purely additive

When Phase 2 wires the adapter into the orchestrator, we'll add claude-agent-sdk to an optional [sdk] dep set.

Usage shape (for reviewers)

from src.llm import AgentSdkClient, get_runtime
from src.config import get_config

cfg = get_config()
if get_runtime(cfg) == \"sdk\":
    client = AgentSdkClient.from_config(cfg)
    result = await client.complete(
        prompt=\"Investigate cost spike for account 12345\",
        system=\"You are Parsec, a cloud cost investigator.\",
        skills=[\"cost-anomaly-triage\"],   # whitelist from PR #1's loader
        allowed_tools=[\"Read\", \"Bash\", \"mcp__reporting__*\"],
    )
    print(result.text, result.usage.total_cost_usd)

Test plan

pytest tests/test_agent_sdk_client.py — 18/18 pass
ruff check src/llm tests/test_agent_sdk_client.py — clean
black --check src/llm tests/test_agent_sdk_client.py — clean
Module imports cleanly with claude_agent_sdk NOT installed
complete() raises AgentSdkUnavailableError (not a vanilla ImportError) when SDK is missing
Smoke test against a real Anthropic API key once pip install claude-agent-sdk is run in a dev venv

🤖 Generated with Claude Code

…flag Introduces src.llm package with: - AgentSdkClient: thin async wrapper around claude_agent_sdk.query() that aggregates the streamed AssistantMessage/UserMessage/ResultMessage flow into a single SdkResult (text + tool invocations + usage) - get_runtime() helper reading the new agent.runtime config flag (legacy | sdk), defaulting to legacy so existing deploys are unchanged - Lazy import of claude_agent_sdk: the adapter file is importable without the package installed; calls raise AgentSdkUnavailableError with install hint instead Tests inject a fake sdk module via sys.modules so coverage exercises the message-stream aggregation, options pass-through, error capture, and config defaults without requiring the real SDK in CI. No orchestrator changes — additive only. Phase 2 of the migration will wire AgentSdkClient into a new AgentRunner dispatcher that reads agent.runtime. See artifacts/parsec-agent-sdk-migration-plan.md.

Add admin ability to view all users' saved conversations: - All Users toggle in sidebar (admin only) - Download all conversations as JSON export - GET /api/conversations?all_users=true with admin guard - GET /api/conversations/export endpoint for bulk download - Admins can read (not delete) other users' conversations Add OpenShift group-based admin access: - is_admin_user_async() checks email list + OpenShift groups - learnings.admin_groups config (ace-octo-team) - ace-octo-team added to auth.allowed_groups Incorporate production learnings into agent prompts: - shared_context.md: mandatory schema-first rule, column name gotchas - aap2_agent.md: get_job_log tip for specific job IDs - icinga_agent.md: downtime check-first, OCP Nodes Ready pattern

Real-SDK regression caught during live smoke test against api.anthropic.com on 2026-05-21. The adapter was reading ResultMessage.model, but the actual claude_agent_sdk ResultMessage (>=0.2.x) has no model field at all — model lives on AssistantMessage. Result: SdkResult.model came back as None for every real call. Fix: - Capture model + session_id from AssistantMessage during the stream (last value wins, in case the conversation switches models). - ResultMessage still provides session_id as fallback + the authoritative usage/cost/num_turns. - Refactor the message loop into _ingest_assistant / _ingest_user / _ingest_result helpers to keep complete() under the mccabe complexity limit after the added branches. Added regression test test_complete_captures_model_from_assistant_even_when_result_has_none to lock in the new contract. 19 tests passing (was 18). All quality gates (ruff/black/mypy) clean.

PalmPalm7 · 2026-05-28T13:02:56Z

Local test report — green after one real-world bug catch

Ran the full local test suite plus a live smoke test against api.anthropic.com. All layers pass. The live test surfaced a contract bug in the adapter that's been fixed and pushed (commit af6d243, now part of this PR).

Setup

git fetch origin migration/agent-sdk-adapter
git checkout migration/agent-sdk-adapter

python3 -m venv .venv && source .venv/bin/activate
pip install pyyaml pytest pytest-asyncio fastapi dynaconf ruff black mypy

claude-agent-sdk is intentionally not installed yet — verifying the lazy-import contract first.

Layer 1 — Unit tests with fake SDK

pytest tests/test_agent_sdk_client.py -v

Result: ✅ 19 passed in 0.05s (18 original + 1 regression test added during this session)

Coverage spans:

Runtime flag — default, valid values, unknown-coerces-to-legacy, Dynaconf-style objects
from_config — anthropic.* defaults, agent.sdk.* overrides, empty-config fallback
complete() — error path when SDK not installed, text aggregation across messages, tool_use ↔ tool_result pairing, error flag, exception capture, options pass-through, optional-arg omission, per-call max_turns override
SdkResult.succeeded property

Layer 2 — Quality gates

ruff check src/llm tests/test_agent_sdk_client.py
black --check src/llm tests/test_agent_sdk_client.py
mypy src/llm

Results:

✅ ruff: All checks passed
✅ black: 5 files would be left unchanged
✅ mypy: Success — no issues found in 4 source files

Layer 3 — Lazy import contract

Verified that src.llm is importable in environments where claude-agent-sdk is not installed (which is the default — the package is intentionally not in pyproject.toml deps for this PR).

from src.llm import AgentSdkClient, get_runtime, AgentSdkUnavailableError, SdkResult, SdkUsage
# imports OK
print(get_runtime({}))                                  # → 'legacy'
print(get_runtime({'agent': {'runtime': 'sdk'}}))       # → 'sdk'
print(get_runtime({'agent': {'runtime': 'skd'}}))       # → 'legacy' (logged warning)

Layer 4 — Error path when SDK is absent

import asyncio
from src.llm import AgentSdkClient, AgentSdkUnavailableError
from src.llm.agent_sdk_client import AgentSdkConfig

async def main():
    c = AgentSdkClient(AgentSdkConfig(model='claude-sonnet-4-5'))
    try:
        await c.complete(prompt='hi')
    except AgentSdkUnavailableError as e:
        print(f'Got expected error: {e}')

asyncio.run(main())

Result: ✅ Raises AgentSdkUnavailableError("claude_agent_sdk is not installed. Install with 'pip install claude-agent-sdk' to enable the SDK runtime.") rather than a cryptic ModuleNotFoundError.

Layer 5 — Live smoke test against `api.anthropic.com`

Installed the real SDK, set a personal ANTHROPIC_API_KEY (since rotated), and ran a basic completion:

client = AgentSdkClient(AgentSdkConfig(model='claude-sonnet-4-5', max_turns=3))
result = await client.complete(
    prompt='What is 2+2? Reply in one short sentence.',
    system='You are a concise calculator.',
)

Result (after the fix):

succeeded:      True
model:          claude-sonnet-4-5-20250929
session_id:     6cde09dd-084b-40bb-84de-19a1ccfdc518
text:           '2+2 equals 4.'
tool calls:     0
input_tokens:   10
output_tokens:  98
cache_create:   5894
cache_read:     9948
total_cost_usd: $0.027096
num_turns:      1

A follow-up second call demonstrated cache reuse working — cache_read: 9948 again on the warm path.

🐛 Real bug caught and fixed

The first smoke run failed an assertion: result.model came back as None. Root cause:

I'd written the adapter assuming ResultMessage.model existed, mirroring how the test fakes were structured. The real claude_agent_sdk types module (>= 0.2.x) shows ResultMessage has subtype, usage, total_cost_usd, num_turns, session_id, etc. — but no model field at all. Model lives on AssistantMessage.

The two test fakes ≠ one production SDK. Classic.

Fix in af6d243:

Capture model + session_id from AssistantMessage during the stream (last value wins, in case the conversation switches models mid-flight).
ResultMessage still provides session_id as fallback + the authoritative usage/cost/num_turns.
Refactored the message-loop body into _ingest_assistant / _ingest_user / _ingest_result helpers to keep complete() under the project's mccabe complexity limit (was 21, max 20).
Added regression test test_complete_captures_model_from_assistant_even_when_result_has_none to lock the corrected contract.

💰 Cost data worth flagging

The live smoke test produced concrete evidence for Risk rhpds#4 in the migration plan ("Cost regression from SDK prompt overhead").

This call: $0.027 for "What is 2+2?"
Same prompt via raw anthropic.messages.create(): would be ~$0.0001
~270× per-call overhead at the trivial end

The overhead is the SDK's system prompt + bundled tool definitions, mostly recoverable via prompt caching after the first call (the second call hit cache_read: 9948). But it's real, and worth raising with @yuan when the Phase 2 parity benchmark conversation happens — this is the data shape that benchmark needs to produce at scale.

What this proves

The adapter correctly wires through the SDK's full message stream — text, tool invocations, usage, cost, session ID, and (now) model.
The lazy-import contract holds: src.llm is shippable without claude-agent-sdk installed; only .complete() triggers the import.
All CI quality gates pass — ready for rhpds/parsec:main review.
Live tested against real Anthropic API — not just mocks. The bug we caught is exactly the kind of thing pure-mock testing would have missed.

Ready for upstream review

Suggesting we mark this ready and retarget the base to rhpds/parsec:main for Patrick's review. The adapter is now provably correct against the real SDK, and the cost-overhead finding is concrete input for the Phase 2 benchmark discussion.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: af6d243a3c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-28T13:05:50Z

+        if allowed_tools is not None:
+            options_kwargs["allowed_tools"] = allowed_tools


Deny tools outside allowed_tools

When callers pass allowed_tools as the advertised whitelist for an SDK task, this option only auto-approves those tools; it does not restrict the agent to them. The Claude Agent SDK docs for ClaudeAgentOptions.allowed_tools state that unlisted tools fall through to permission_mode/can_use_tool, so without also setting a denying mode such as permission_mode="dontAsk" (or an equivalent permission callback), an SDK-enabled Parsec run can attempt tools outside the intended sandbox or stall/fail on permission prompts instead of reliably enforcing the supplied tool set.

Useful? React with 👍 / 👎.

PalmPalm7 · 2026-06-01T09:28:59Z

Staging environment test (real OpenShift cluster, live Vertex)

The adapter was exercised end-to-end on a real OpenShift cluster (NERC ocp-beta-test, namespace parsec-palmpalm7) against company Vertex AI — not just unit tests with a fake SDK.

Why a special test path

The standard Parsec image can't exercise AgentSdkClient: claude-agent-sdk shells out to the claude CLI (Node), which the image doesn't bundle, and nothing in the orchestrator calls the adapter yet (that's a later phase). So a purpose-built test image + Job was used.

Procedure

# 1. Provision the Vertex credential secret once (mounts a GCP ADC):
LLM_BACKEND=vertex scripts/deploy.sh

# 2. Build a test image (Parsec branch + Node + claude CLI + claude-agent-sdk)
#    and run a Job that calls AgentSdkClient.complete() against Vertex:
NAMESPACE=parsec-palmpalm7 pr2-test/run-pr2-test.sh

The Job sets CLAUDE_CODE_USE_VERTEX=1 + project/region + the mounted ADC, then runs:

client = AgentSdkClient(AgentSdkConfig(model="claude-sonnet-4-5@20250929"))
result = await client.complete(prompt="Reply with exactly: SDK ADAPTER OK ...")

Result — PASS

succeeded     : True
text          : 'SDK ADAPTER OK'
model         : claude-sonnet-4-5-20250929
usage         : in=10 out=68 cost=$0.093987 turns=1
RESULT: PASS

The SDK spawned the claude CLI, authenticated to Vertex (itpc-gcp-octo-eng-claude / global), and the adapter correctly aggregated the streamed AssistantMessage/ResultMessage flow into SdkResult. Note the returned model claude-sonnet-4-5-20250929 confirms the model-capture fix in commit af6d243 works through the real SDK path (the SDK normalizes the @date form).

Worth flagging for the migration discussion

That trivial call cost $0.094 — consistent with the migration plan's Risk rhpds#4 (SDK prompt overhead vs. a bare messages.create()). Concrete input for the Phase-2 cost/parity benchmark.

What this validates for review

src/llm/ imports cleanly and complete() works against a live model in a real container.
The lazy-import contract holds (the module ships without the SDK; only .complete() needs it).
No change to existing behavior — the adapter is additive and unused by the orchestrator in this PR.

PalmPalm7 · 2026-06-01T09:30:54Z

Moved upstream for review: rhpds#24 (requesting @rut31337). This fork PR can be closed once the upstream one is reviewed.

Adds a "Skills" tab to the sidebar that calls GET /api/skills (added in the skills-loader change) and renders the discovered skills: name, source badge, parsec-native badge, description, version/domain/tool-count, and any loader warnings. Mirrors the existing Examples/Debug tab pattern. Read-only and additive — no behavior change to the agent loop; this is an operator-visibility surface for the skills the loader finds. Skill metadata is rendered via textContent (never innerHTML), so arbitrary on-disk SKILL.md content cannot inject markup. Companion to the skills-loader PR (depends on its /api/skills endpoint).

…e path - .skills-panel: flex:1 + min-height:0 + overflow-y:auto so a tall list scrolls inside the sidebar instead of overflowing the viewport - render each skill's skill_path (where the loader discovered it from)

…to-hide)

docs: updated contributing.md for local testing

feat(ui): read-only Skills sidebar tab (visualizes GET /api/skills)

…onfig fixes Addresses review on rhpds#24: - No timeout/cancellation: a hung sdk.query() leaked the coroutine and the child CLI process. complete() now takes a `timeout` (seconds), defaulting to the new `agent.sdk.timeout` config (300s; null/0 disables). The query loop is wrapped in asyncio.timeout(), and on expiry the result is marked an error instead of hanging. asyncio.timeout(None) is a no-op, so one code path covers bounded + unbounded. - Fragile model capture: warn when AssistantMessage.model changes mid-conversation instead of silently letting "last wins". - Incorrect session_id fallback: the `or state[...]` form read as falling back to a stale id; replaced with an explicit truthy guard that takes the ResultMessage id when present without clobbering the AssistantMessage id. - Config access mismatch: _get_section now coerces to a real dict (mirrors skills loader's _get_skills_section), giving a concrete dict[str, Any] return type. Documented why attribute access (cfg.agent) isn't used — from_config must also accept the plain-dict configs the tests pass. Adds tests: per-call + config-default timeout, model-change warning, session_id preservation, and the timeout config knob. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0f5390b694

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-03T16:14:14Z

+    for block in getattr(message, "content", []) or []:
+        if isinstance(block, sdk.TextBlock):
+            state["text_parts"].append(getattr(block, "text", ""))
+        elif isinstance(block, sdk.ToolUseBlock):


Preserve tool results from assistant content blocks

When the Agent SDK emits a ToolResultBlock inside AssistantMessage.content (the Python SDK reference's progress-monitoring example handles ToolResultBlock while iterating assistant content), this loop drops it because only TextBlock and ToolUseBlock are handled here and _ingest_user() is the only place that pairs results. In SDK runs that use tools, SdkResult.tool_invocations can therefore contain the tool call but no result/is_error, so callers lose the execution outcome even though the tool completed.

Useful? React with 👍 / 👎.

…ion, phase 1) (rhpds#23) * feat(skills): add SKILL.md loader compatible with rhdp-rca-plugin Discovers SKILL.md directories from project, plugin, and user roots, parses YAML frontmatter + Markdown body, and validates against the Anthropic Agent Skills Spec v1.0. Layers a `parsec:` extension block for version, domain, MCP deps, and permissions — unknown keys warn rather than fail so plain Claude Code skills still load. Includes GET /api/skills diagnostic endpoint for operators to verify discovery, and a tests/skills_fixtures/ tree exercising all validation paths. Foundational for upcoming Agent SDK migration (see artifacts/parsec-agent-sdk-migration-plan.md, phase 1 PR #1). No behavior change — additive only, default config registers ./skills which is empty. * fix(skills): reject symlinked skill dirs and cap SKILL.md size Addresses review on rhpds#23: - Symlink path traversal: iterdir() follows symlinks, so a symlinked child of a source root could resolve outside it (relevant because plugin_paths can point at external mounts). Skip symlinked skill directories and validate that each child resolves within the root. Safe for ConfigMap-mounted skills — K8s symlinks the files inside a skill dir (SKILL.md), not the skill dir itself. - Unbounded read: read_text() had no size limit, so an oversized SKILL.md from an untrusted mount could OOM the pod. Stat and reject files over MAX_SKILL_SIZE_BYTES (1 MiB) before reading. Adds tests for both: symlinked-dir skip and oversized-file rejection. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

Brings the SDK-adapter branch up to date with main (now includes the merged skills loader rhpds#23, Skills UI rhpds#25, contributing docs rhpds#22, and the admin/group-based-access changes). Conflict: config/config.yaml — main added the `skills:` section (via rhpds#23) and this branch added the `agent:`/`agent.sdk` section. Resolved by keeping both; they are independent and complementary (agent.runtime selects the runtime, skills.* configures SKILL.md discovery). Local gate green on the merged tree: black, ruff, mypy (src/) + pytest (87 passed). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6835602197

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-03T17:26:32Z

+@router.get("/skills")
+async def list_skills():


Require configured auth before listing skills

When Parsec relies on its in-app _check_user_allowed enforcement (as the query, conversations, learnings, and share APIs do), this new handler is reachable without any user headers or group/user whitelist check. A direct request to /api/skills can therefore enumerate configured skill names, allowed tools, metadata, and filesystem paths even for callers who would be rejected by the normal auth path; add the same Request/header parameters and _check_user_allowed gate before loading and serializing manifests.

Useful? React with 👍 / 👎.

PalmPalm7 and others added 2 commits May 19, 2026 05:14

PalmPalm7 mentioned this pull request May 19, 2026

feat(skills): add SKILL.md loader compatible with rhdp-rca-plugin #1

Open

5 tasks

PalmPalm7 marked this pull request as ready for review May 28, 2026 13:03

chatgpt-codex-connector Bot reviewed May 28, 2026

View reviewed changes

docs: updated contributing.md for local testing

2dc84b9

PalmPalm7 and others added 6 commits June 1, 2026 06:12

fix(ui): force persistent scrollbar on skills panel (macOS overlay au…

8eff84a

…to-hide)

Merge pull request rhpds#22 from jerry-ng2/docs-update-contributing

a1da401

docs: updated contributing.md for local testing

Merge pull request rhpds#25 from PalmPalm7/migration/skills-ui-pr

0da179c

feat(ui): read-only Skills sidebar tab (visualizes GET /api/skills)

chatgpt-codex-connector Bot reviewed Jun 3, 2026

View reviewed changes

PalmPalm7 and others added 2 commits June 3, 2026 11:53

chatgpt-codex-connector Bot reviewed Jun 3, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(llm): add Claude Agent SDK adapter behind agent.runtime feature flag#2

feat(llm): add Claude Agent SDK adapter behind agent.runtime feature flag#2
PalmPalm7 wants to merge 12 commits into
mainfrom
migration/agent-sdk-adapter

PalmPalm7 commented May 19, 2026

Uh oh!

PalmPalm7 commented May 28, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot May 28, 2026

Uh oh!

PalmPalm7 commented Jun 1, 2026

Uh oh!

PalmPalm7 commented Jun 1, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Jun 3, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Jun 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

		if allowed_tools is not None:
		options_kwargs["allowed_tools"] = allowed_tools

Conversation

PalmPalm7 commented May 19, 2026

Summary

What's in the box

Why lazy import

Usage shape (for reviewers)

Test plan

Uh oh!

PalmPalm7 commented May 28, 2026

Local test report — green after one real-world bug catch

Setup

Layer 1 — Unit tests with fake SDK

Layer 2 — Quality gates

Layer 3 — Lazy import contract

Layer 4 — Error path when SDK is absent

Layer 5 — Live smoke test against api.anthropic.com

🐛 Real bug caught and fixed

💰 Cost data worth flagging

What this proves

Ready for upstream review

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

PalmPalm7 commented Jun 1, 2026

Staging environment test (real OpenShift cluster, live Vertex)

Why a special test path

Procedure

Result — PASS

Worth flagging for the migration discussion

What this validates for review

Uh oh!

PalmPalm7 commented Jun 1, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 3, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 3, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Layer 5 — Live smoke test against `api.anthropic.com`