
docs: multi-tool agent loop + permission modes design #60

Draft
Hongjiseung-ROK wants to merge 1 commit into main from session/cs-60



@Hongjiseung-ROK

Owner

Summary

  • add a research-only design document for converting chemsmart's planner/critic/execute pipeline into a true multi-tool agent loop
  • cover provider-native tool-loop orchestration, permission mode, driving mode, critic redesign, backward compatibility, failure modes, and staged migration
  • cite verified local code references and required web research sources

Notes

  • research-only change; no source code or test changes
  • this is not a Wave implementation PR
  • bin/plan.md is intentionally not linked: the task explicitly stated that the file does not exist for this research-only request

Validation

  • not run (docs-only research task)

@gemini-code-assist (bot) left a comment


Code Review

This pull request introduces a comprehensive design document for transitioning the chemsmart agent from a static planning model to a dynamic tool-loop architecture. Key features include a provider-neutral orchestrator, a handle-based system for managing complex chemistry objects, and distinct Permission and Driving modes for varying levels of autonomy. Feedback focuses on resolving inconsistencies in turn budget definitions across sections and refining the truncation strategy to prevent context window overflows when multiple tool results are returned simultaneously.

Comment on lines +722 to +723
- `max_model_steps_per_turn = 10`
- `max_total_tool_calls_per_turn = 24`


medium

There is a discrepancy between the turn budgets defined in Section 3 (lines 465-466) and Section 5 (lines 722-723). Section 3 recommends a limit of 12 model steps and 32 tool calls per turn, whereas Section 5 specifies 10 steps and 24 tool calls for Driving Mode. Please clarify if Driving Mode is intended to have more restrictive limits or if these values should be synchronized.
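One way to make the relationship between the two sets of limits explicit is to define the Section 3 values once and express Driving Mode as a named override. The sketch below is illustrative only: `TurnBudget`, `DEFAULT_BUDGET`, `DRIVING_MODE_BUDGET`, and `is_within` are hypothetical names, not part of chemsmart, and it assumes Driving Mode is meant to be the stricter of the two.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class TurnBudget:
    """Hypothetical container for per-turn limits (field names follow the design doc)."""
    max_model_steps_per_turn: int
    max_total_tool_calls_per_turn: int


# Section 3 recommendation: 12 model steps, 32 tool calls per turn.
DEFAULT_BUDGET = TurnBudget(
    max_model_steps_per_turn=12,
    max_total_tool_calls_per_turn=32,
)

# Section 5 values, written as an explicit override of the default
# (assumption: Driving Mode is intentionally more restrictive).
DRIVING_MODE_BUDGET = replace(
    DEFAULT_BUDGET,
    max_model_steps_per_turn=10,
    max_total_tool_calls_per_turn=24,
)


def is_within(override: TurnBudget, base: TurnBudget) -> bool:
    """Return True if every limit in `override` is no larger than in `base`."""
    return all(
        getattr(override, f.name) <= getattr(base, f.name)
        for f in fields(base)
    )
```

With this layout, a check such as `is_within(DRIVING_MODE_BUDGET, DEFAULT_BUDGET)` documents the intended ordering, and a future edit to either section cannot silently reintroduce the mismatch.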

- `max_total_tool_calls_per_turn = 24`
- `max_prompt_tokens_from_history = 4_000`
- `max_model_output_tokens_per_step = 2_048`
- `max_total_tool_result_chars_in_context = 32_000`


medium

The proposed per-tool truncation limit of 12,000 characters (line 489) may conflict with the total context limit of 32,000 characters (line 726) when multiple tool calls occur in a single turn. Given that the design allows up to 4 parallel tool calls (line 469), the aggregate length could reach 48,000 characters, exceeding the turn budget. The design should specify a strategy for prioritizing or further truncating results when the combined length of multiple tool outputs exceeds the total context allowance.
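As a rough illustration of one possible strategy, the sketch below applies the per-tool cap first and then, only if the combined length still exceeds the total allowance, shares the remaining budget across results, smallest first, so that short results are kept whole. The function name `fit_results_to_budget` and its parameters are hypothetical; the defaults mirror the 12,000 and 32,000 character limits quoted above. Other policies (for example, prioritizing the most recent result) would be equally valid.

```python
def fit_results_to_budget(results, per_tool_limit=12_000, total_limit=32_000):
    """Truncate tool results so their combined length fits `total_limit`.

    Hypothetical helper: not part of chemsmart or of the design doc.
    """
    # Step 1: apply the per-tool cap to every result.
    capped = [r[:per_tool_limit] for r in results]
    if sum(len(r) for r in capped) <= total_limit:
        return capped

    # Step 2: share the total budget, visiting the shortest results first
    # so that any budget they leave unused flows to the longer ones.
    order = sorted(range(len(capped)), key=lambda i: len(capped[i]))
    remaining = total_limit
    fitted = [""] * len(capped)
    for position, index in enumerate(order):
        share = remaining // (len(order) - position)
        take = min(len(capped[index]), share)
        fitted[index] = capped[index][:take]
        remaining -= take
    return fitted
```

For the worst case described above (4 parallel results, each at or over the per-tool cap), every result is cut to 8,000 characters, so the total is exactly 32,000 rather than 48,000.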

