ReAct: parallel tool calls per turn #83

@VarunGitGood

Why

The ReAct loop in repi/investigation/react_loop.py runs strictly one tool call per LLM turn. Investigations that need to inspect several services or time windows pay the latency and token cost of N full ReAct iterations for what is conceptually a single fan-out. The major providers (OpenAI, Anthropic, Mistral, Gemini) support parallel function / tool calls in a single assistant turn: the model emits multiple tool calls and the runtime dispatches them concurrently.

Scope (in)

  • Update the ReAct loop to recognise multiple tool calls in a single assistant message (the providers' adapters already return tool-call arrays where supported).
  • Dispatch them concurrently via asyncio.gather, respecting the configured rate limit (see R4: Make LLM rate limit configurable #13).
  • Persist each tool call + observation as its own step row in investigation_steps so the audit trail stays granular.
  • Update the system prompt to tell the model it may batch independent calls in one turn (e.g. inspect three services for the same window).
  • Keep single-call behaviour as the default for providers / models that don't support batching.
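
A minimal sketch of the concurrent dispatch described in the bullets above, assuming a hypothetical `ToolCall` shape and a hypothetical async `run_tool` executor (neither is taken from the repi codebase). It uses `asyncio.gather` with a semaphore standing in for whatever limit #13 ends up exposing; observations come back in call order, and a failing tool becomes an error observation instead of cancelling its siblings. A single-call turn is simply a list of length one, so the default behaviour falls out of the same path.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable


@dataclass
class ToolCall:
    """One tool call from an assistant turn (hypothetical shape)."""
    call_id: str
    name: str
    arguments: dict[str, Any]


# Hypothetical executor signature: takes a call, returns an observation string.
ToolRunner = Callable[[ToolCall], Awaitable[str]]


async def dispatch_tool_calls(
    calls: list[ToolCall],
    run_tool: ToolRunner,
    max_concurrency: int = 4,  # assumption: stands in for the configured limit
) -> list[str]:
    """Run every call from one turn concurrently, bounded by max_concurrency.

    Results are returned in the same order as `calls`.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(call: ToolCall) -> str:
        async with semaphore:
            try:
                return await run_tool(call)
            except Exception as exc:
                # Surface the failure as an observation so the model can react.
                return f"error: {type(exc).__name__}: {exc}"

    return await asyncio.gather(*(run_one(call) for call in calls))


if __name__ == "__main__":
    async def fake_error_rate(call: ToolCall) -> str:
        await asyncio.sleep(0.01)  # simulate I/O
        return f"{call.arguments['service']}: ok"

    services = ["api-gateway", "auth-service", "payments"]
    batch = [
        ToolCall(f"call-{i}", "error_rate", {"service": s})
        for i, s in enumerate(services)
    ]
    print(asyncio.run(dispatch_tool_calls(batch, fake_error_rate)))
```

Whether a failed tool should produce an error observation or abort the turn is a design choice this sketch makes for illustration; the issue does not specify it.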

Scope (out)

  • Cross-turn dependency resolution (the model still serialises calls when one depends on another's output — that's a model decision, not loop behaviour).
  • Provider adapter changes beyond surfacing the tool-call array.

Acceptance

  • A query like "compare error rates across api-gateway, auth-service, payments in the last hour" finishes in fewer ReAct iterations than today: the three lookups can share one turn instead of taking three sequential iterations.
  • The persisted investigation_steps table still has one row per tool call (not per turn), so the UI's step view is unchanged.
  • Existing single-call behaviour passes the eval harness without regression.
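
The one-row-per-tool-call criterion above can be sketched as follows. `StepRow` is a hypothetical in-memory stand-in for a row of investigation_steps, and the `turn_index` column is an assumption about how rows from the same batched turn might be grouped; the real schema in repi/investigation/store.py may differ.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class StepRow:
    """Hypothetical stand-in for one row of investigation_steps."""
    step_index: int           # position in the overall audit trail
    turn_index: int           # assumption: the ReAct turn that issued the call
    tool_name: str
    arguments: dict[str, Any]
    observation: str


def rows_for_turn(
    turn_index: int,
    next_step_index: int,
    calls: list[tuple[str, dict[str, Any]]],
    observations: list[str],
) -> list[StepRow]:
    """Build one step row per tool call in a turn, never one row per turn."""
    if len(calls) != len(observations):
        raise ValueError("each tool call needs exactly one observation")
    return [
        StepRow(next_step_index + offset, turn_index, name, args, obs)
        for offset, ((name, args), obs) in enumerate(zip(calls, observations))
    ]


if __name__ == "__main__":
    batch = [
        ("error_rate", {"service": "api-gateway"}),
        ("error_rate", {"service": "auth-service"}),
        ("error_rate", {"service": "payments"}),
    ]
    for row in rows_for_turn(2, 5, batch, ["1.2%", "0.4%", "0.9%"]):
        print(row)
```

A test in tests/investigation/test_react_loop.py could assert along these lines that a three-call turn yields three rows with consecutive step indices.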

Files

  • repi/investigation/react_loop.py
  • repi/llm/adapters.py — confirm each provider surfaces the tool-call array.
  • repi/investigation/store.py — per-step persistence (likely no change).
  • tests/investigation/test_react_loop.py

Depends on

#13 (rate-limit configurability) — parallel calls amplify rate-limit pressure; do that one first or together.

Metadata

Labels

enhancement (New feature or request), react-quality (ReAct loop reasoning improvements)
