Skip to content

feat(agent): add ArtifactHandle store + provider tool-call adapters (W1a)#61

Merged
Hongjiseung-ROK merged 1 commit into
mainfrom
session/cs-61
May 10, 2026
Merged

feat(agent): add ArtifactHandle store + provider tool-call adapters (W1a)#61
Hongjiseung-ROK merged 1 commit into
mainfrom
session/cs-61

Conversation

@Hongjiseung-ROK

Copy link
Copy Markdown
Owner

Implements Wave 1a from docs/research/multi_tool_agent_design.md (research PR #60), following sections 1, 2, 3, and 7.

Scope in this PR:

  • add HandleStore with typed artifact ids and handles.jsonl persistence
  • add provider-normalization adapters for Anthropic/OpenAI tool calls and tool results
  • wire handle resolution/storage into AgentSession._execute_step() without changing the existing planner/critic loop
  • add focused tests for handle persistence, provider parsing, and _execute_step() handle behavior

Non-goals intentionally left for later waves:

  • no tool-loop orchestrator yet (W1b)
  • no public AgentSession.run() return-shape changes
  • no new DecisionLog kinds
  • no TUI/permission/driving mode work

Validation:

  • AI_PROVIDER=openai python -m pytest -v tests/agent/test_handles.py tests/agent/test_provider_adapter.py tests/agent/test_handles_in_execute_step.py
  • black -l 79 .
  • isort --gitignore .
  • ruff check . --fix

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a session-scoped artifact handle system via a new HandleStore class and a provider_adapter module to normalize tool interactions with Anthropic and OpenAI. The review identifies several critical improvements: ensuring HandleStore.put accepts non-dictionary summaries to avoid TypeErrors, adding support for dot notation in handle resolution for consistency with legacy references, and removing fallback ID generation in the provider adapters which would lead to API failures.

Comment on lines +53 to +54
if not isinstance(summary, dict):
raise TypeError("Handle summaries must be dict objects")

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The put method strictly requires the summary to be a dict. However, the _json_safe utility (and the _preview_value function in core.py) can return other JSON-serializable types like strings, lists, or numbers. If a tool returns a simple scalar or a list and we attempt to store a handle for it, this will raise a TypeError. Consider allowing any JSON-safe type for the summary to improve robustness.

Comment thread chemsmart/agent/core.py
Comment on lines +1631 to +1635
if handle_store is not None and is_handle_id(value):
try:
return handle_store.get(value)
except KeyError:
return _restore_json_result(handle_store.get_summary(value))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The handle resolution logic currently only supports exact handle ID matches and does not allow property access via dot notation (e.g., mol_abcd.symbols). In contrast, the $stepN reference logic (lines 1622-1628) fully supports paths. This inconsistency limits the LLM's ability to interact with specific attributes of handle-referenced objects. Supporting paths for handles would provide a more consistent and powerful interface for the agent.

continue
if block_type != "tool_use":
continue
provider_call_id = str(block.get("id") or f"anthropic-{index}")

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Inventing a provider_call_id when it's missing from the Anthropic response is problematic. Anthropic requires the tool_use_id in the result message to exactly match the one it provided in the tool_use block. If a fallback ID is used, the subsequent API call to submit tool results will fail with a 400 error. It is safer to validate that the ID exists and raise a ValueError if it doesn't, as this indicates a malformed response from the provider.

Suggested change
provider_call_id = str(block.get("id") or f"anthropic-{index}")
provider_call_id = block.get("id")
if not isinstance(provider_call_id, str) or not provider_call_id:
raise ValueError(f"Anthropic tool_use block at index {index} is missing 'id'")

raise ValueError(
"OpenAI function arguments must decode to a JSON object"
)
provider_call_id = str(call.get("id") or f"openai-{index}")

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Similar to the Anthropic adapter, using a fallback provider_call_id for OpenAI tool calls when the id is missing will lead to API errors when submitting results. OpenAI expects the tool_call_id to match an existing call. Validating the presence of the ID and raising an error if it's missing is preferred for early detection of provider communication issues.

        provider_call_id = call.get("id")
        if not isinstance(provider_call_id, str) or not provider_call_id:
            raise ValueError(f"OpenAI tool_call at index {index} is missing 'id'")

@Hongjiseung-ROK Hongjiseung-ROK marked this pull request as ready for review May 10, 2026 03:08
@Hongjiseung-ROK Hongjiseung-ROK merged commit 32bc92d into main May 10, 2026
@Hongjiseung-ROK Hongjiseung-ROK deleted the session/cs-61 branch May 10, 2026 03:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant