feat(agent): add ArtifactHandle store + provider tool-call adapters (W1a)#61
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces a session-scoped artifact handle system via a new HandleStore class and a provider_adapter module to normalize tool interactions with Anthropic and OpenAI. The review identifies several critical improvements: ensuring HandleStore.put accepts non-dictionary summaries to avoid TypeErrors, adding support for dot notation in handle resolution for consistency with legacy references, and removing fallback ID generation in the provider adapters which would lead to API failures.
| if not isinstance(summary, dict): | ||
| raise TypeError("Handle summaries must be dict objects") |
There was a problem hiding this comment.
The put method strictly requires the summary to be a dict. However, the _json_safe utility (and the _preview_value function in core.py) can return other JSON-serializable types like strings, lists, or numbers. If a tool returns a simple scalar or a list and we attempt to store a handle for it, this will raise a TypeError. Consider allowing any JSON-safe type for the summary to improve robustness.
| if handle_store is not None and is_handle_id(value): | ||
| try: | ||
| return handle_store.get(value) | ||
| except KeyError: | ||
| return _restore_json_result(handle_store.get_summary(value)) |
There was a problem hiding this comment.
The handle resolution logic currently only supports exact handle ID matches and does not allow property access via dot notation (e.g., mol_abcd.symbols). In contrast, the $stepN reference logic (lines 1622-1628) fully supports paths. This inconsistency limits the LLM's ability to interact with specific attributes of handle-referenced objects. Supporting paths for handles would provide a more consistent and powerful interface for the agent.
| continue | ||
| if block_type != "tool_use": | ||
| continue | ||
| provider_call_id = str(block.get("id") or f"anthropic-{index}") |
There was a problem hiding this comment.
Inventing a provider_call_id when it's missing from the Anthropic response is problematic. Anthropic requires the tool_use_id in the result message to exactly match the one it provided in the tool_use block. If a fallback ID is used, the subsequent API call to submit tool results will fail with a 400 error. It is safer to validate that the ID exists and raise a ValueError if it doesn't, as this indicates a malformed response from the provider.
| provider_call_id = str(block.get("id") or f"anthropic-{index}") | |
| provider_call_id = block.get("id") | |
| if not isinstance(provider_call_id, str) or not provider_call_id: | |
| raise ValueError(f"Anthropic tool_use block at index {index} is missing 'id'") |
| raise ValueError( | ||
| "OpenAI function arguments must decode to a JSON object" | ||
| ) | ||
| provider_call_id = str(call.get("id") or f"openai-{index}") |
There was a problem hiding this comment.
Similar to the Anthropic adapter, using a fallback provider_call_id for OpenAI tool calls when the id is missing will lead to API errors when submitting results. OpenAI expects the tool_call_id to match an existing call. Validating the presence of the ID and raising an error if it's missing is preferred for early detection of provider communication issues.
provider_call_id = call.get("id")
if not isinstance(provider_call_id, str) or not provider_call_id:
raise ValueError(f"OpenAI tool_call at index {index} is missing 'id'")
Implements Wave 1a from
docs/research/multi_tool_agent_design.md(research PR #60), following sections 1, 2, 3, and 7.Scope in this PR:
HandleStorewith typed artifact ids andhandles.jsonlpersistenceAgentSession._execute_step()without changing the existing planner/critic loop_execute_step()handle behaviorNon-goals intentionally left for later waves:
AgentSession.run()return-shape changesDecisionLogkindsValidation:
AI_PROVIDER=openai python -m pytest -v tests/agent/test_handles.py tests/agent/test_provider_adapter.py tests/agent/test_handles_in_execute_step.pyblack -l 79 .isort --gitignore .ruff check . --fix