feat: optional batch submission to LLM providers #20

@codeninja

Summary

The walk's Extract stage submits files to the LLM one at a time. Some providers expose a batch endpoint (e.g. Anthropic Message Batches, OpenAI Batch API) that processes many requests asynchronously at lower cost and higher throughput. wikifi should optionally use such an endpoint when the configured provider supports one.

Scope

  • Add an optional batch capability to the provider abstraction — providers that support batching declare it; those that don't (e.g. local Ollama) keep using sequential per-file calls
  • Extract stage groups in-scope files into a batch submission when batching is available
  • Poll / await batch completion, then map results back to their source files
  • Configurable: opt-in flag (e.g. wikifi walk --batch) or config setting; default remains sequential so local-LLM-by-default behavior is unchanged
  • Stay within the provider boundary — no batch logic leaks into the walk stages beyond the submit/collect seam
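To make the submit/collect seam concrete, here is a minimal sketch, assuming the codebase is Python. None of these names come from wikifi: `BatchCapable`, `ExtractRequest`, `submit_batch`, `collect_batch`, `extract`, and both toy providers are hypothetical. The request IDs are synthetic (`file-0`, `file-1`, ...) because real batch APIs may restrict the ID format, so file paths are mapped back after collection.

```python
import time
from dataclasses import dataclass
from typing import Dict, Optional, Protocol, Sequence, runtime_checkable


@dataclass(frozen=True)
class ExtractRequest:
    custom_id: str  # hypothetical key used to map a result back to its file
    prompt: str


@runtime_checkable
class BatchCapable(Protocol):
    """Hypothetical opt-in capability; providers without it stay sequential."""

    def submit_batch(self, requests: Sequence[ExtractRequest]) -> str: ...

    def collect_batch(self, batch_id: str) -> Optional[Dict[str, str]]: ...


def extract(provider, files: Dict[str, str], use_batch: bool = False,
            poll_interval: float = 0.0, max_polls: int = 100) -> Dict[str, str]:
    """Return {path: extraction}. Batches only when opted in AND supported."""
    if not (use_batch and isinstance(provider, BatchCapable)):
        # Existing sequential path: one call per file, unchanged.
        return {path: provider.complete(text) for path, text in files.items()}

    id_to_path = {f"file-{i}": path for i, path in enumerate(files)}
    requests = [ExtractRequest(cid, files[path]) for cid, path in id_to_path.items()]
    batch_id = provider.submit_batch(requests)

    for _ in range(max_polls):
        results = provider.collect_batch(batch_id)  # None means "still running"
        if results is not None:
            return {id_to_path[cid]: text for cid, text in results.items()}
        time.sleep(poll_interval)
    raise TimeoutError(f"batch {batch_id} did not complete")


class SequentialProvider:
    """Toy stand-in for a provider with no batch endpoint (e.g. a local LLM)."""

    def complete(self, prompt: str) -> str:
        return prompt.upper()


class FakeBatchProvider(SequentialProvider):
    """Toy batch provider: results become available on the second poll."""

    def __init__(self) -> None:
        self._batches: Dict[str, Dict[str, str]] = {}
        self.polls = 0

    def submit_batch(self, requests: Sequence[ExtractRequest]) -> str:
        batch_id = f"batch-{len(self._batches)}"
        self._batches[batch_id] = {r.custom_id: self.complete(r.prompt) for r in requests}
        return batch_id

    def collect_batch(self, batch_id: str) -> Optional[Dict[str, str]]:
        self.polls += 1
        return self._batches[batch_id] if self.polls >= 2 else None
```

In this sketch the Extract stage only ever calls `extract`; whether a batch endpoint is used is decided by the flag plus the capability check, so a provider that lacks the two batch methods silently takes the sequential path even when `use_batch=True`.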

Notes

  • Local LLM (default) has no batch endpoint — non-batch providers must continue working unchanged via the existing sequential path; batching is strictly additive
  • Per CLAUDE.md, if/when the hosted-Anthropic backend gains batch support, invoke the claude-api skill while implementing it (it covers Message Batches)
  • Interacts with #19 (feat: progress bars for CLI walk stages): batched extraction reports progress as phases (submitted → in progress → completed) rather than as a per-file count, so the progress surface needs to be coordinated between the two issues
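One possible shape for a shared progress surface is sketched below, assuming Python. `BatchPhase` and `describe_progress` are hypothetical names, and the message formats are illustrative only; the actual surface would be settled together with #19.

```python
from enum import Enum
from typing import Optional


class BatchPhase(Enum):
    """The three phases named in the note above (hypothetical enum)."""
    SUBMITTED = "submitted"
    IN_PROGRESS = "in progress"
    COMPLETED = "completed"


def describe_progress(total: int, done: int = 0,
                      phase: Optional[BatchPhase] = None) -> str:
    """Render one status line for either extraction path."""
    if phase is None:
        # Sequential path: a per-file counter is meaningful.
        return f"extract: {done}/{total} files"
    # Batched path: only the phase of the whole batch is known.
    return f"extract: batch {phase.value} ({total} files)"
```

For example, `describe_progress(10, 3)` gives `extract: 3/10 files`, while `describe_progress(10, phase=BatchPhase.SUBMITTED)` gives `extract: batch submitted (10 files)`.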

Acceptance criteria

  • Provider abstraction exposes a batch capability that providers opt into
  • A batch-capable provider processes the Extract stage via the batch endpoint when enabled
  • Non-batch providers (local default) continue via the sequential path with no behavior change
  • Batching is opt-in; default path unchanged
  • Tests cover batch submit, result mapping, and the non-batch path (≥85% coverage maintained)
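The opt-in criterion could be resolved along these lines. This is a sketch under assumptions: it uses Python's standard `argparse`, takes the `--batch` flag from the example in Scope, and invents the `batch` config key and the `batching_enabled` helper for illustration.

```python
import argparse
from typing import Mapping, Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="wikifi")
    sub = parser.add_subparsers(dest="command", required=True)
    walk = sub.add_parser("walk")
    # default=None distinguishes "flag not given" from an explicit opt-in,
    # so a config setting can still apply when the flag is absent.
    walk.add_argument("--batch", action="store_true", default=None,
                      help="submit Extract requests via the provider's batch endpoint")
    return parser


def batching_enabled(cli_flag: Optional[bool], config: Mapping[str, object],
                     provider_supports_batch: bool) -> bool:
    """Batch only when requested (flag or config) and the provider supports it."""
    requested = cli_flag if cli_flag is not None else bool(config.get("batch", False))
    return requested and provider_supports_batch
```

With no flag and no config entry, `batching_enabled` returns `False`, which keeps the sequential default; requesting batching against a provider without the capability also returns `False` rather than failing.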
