Multi-LLM backend for /api/ai/triage (Anthropic / OpenAI / Doubao / Grok)#9

Merged: lai3d merged 1 commit into main from claude/multi-llm-provider on May 18, 2026

lai3d commented May 18, 2026

Summary

Makes the AI triage endpoint provider-agnostic. The LLM backend is now selected at startup via the LLM_PROVIDER env var; four providers ship out of the box:

| LLM_PROVIDER | Endpoint | Default model |
|---|---|---|
| anthropic (default) | api.anthropic.com/v1/messages | claude-sonnet-4-6 |
| openai | api.openai.com/v1/chat/completions | gpt-4o-mini |
| doubao | ark.cn-beijing.volces.com/api/v3/chat/completions | doubao-pro-32k |
| grok | api.x.ai/v1/chat/completions | grok-3 |

All four providers receive the same prompt and return the same response schema. Operators pick whichever fits their stack: Doubao for Volcengine-hosted deployments, OpenAI or Grok for users with existing keys, Anthropic for the prompt-cache savings.
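
As a rough illustration of the provider table, the mapping might be expressed as methods on the enum. This is a sketch, not the code from this PR: `as_str`, `default_model`, and the `Default` impl are named in the PR, while the `endpoint` method and the `https://` scheme are assumptions added here (the table lists hosts and paths only).

```rust
/// Hypothetical sketch of the provider enum; the real definition lives in
/// routes/ai_triage.rs and may differ in shape.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum LlmProvider {
    /// Anthropic is the default when LLM_PROVIDER is unset.
    #[default]
    Anthropic,
    OpenAI,
    Doubao,
    Grok,
}

impl LlmProvider {
    /// Canonical label, matching the LLM_PROVIDER values in the table.
    pub fn as_str(self) -> &'static str {
        match self {
            Self::Anthropic => "anthropic",
            Self::OpenAI => "openai",
            Self::Doubao => "doubao",
            Self::Grok => "grok",
        }
    }

    /// Request URL per provider. Assumption: `endpoint` is not named in the
    /// PR, and the https:// prefix is added here for illustration.
    pub fn endpoint(self) -> &'static str {
        match self {
            Self::Anthropic => "https://api.anthropic.com/v1/messages",
            Self::OpenAI => "https://api.openai.com/v1/chat/completions",
            Self::Doubao => "https://ark.cn-beijing.volces.com/api/v3/chat/completions",
            Self::Grok => "https://api.x.ai/v1/chat/completions",
        }
    }

    /// Default model per provider, as listed in the table.
    pub fn default_model(self) -> &'static str {
        match self {
            Self::Anthropic => "claude-sonnet-4-6",
            Self::OpenAI => "gpt-4o-mini",
            Self::Doubao => "doubao-pro-32k",
            Self::Grok => "grok-3",
        }
    }
}
```

Keeping the three lookups as exhaustive `match` expressions means the compiler flags every place that needs updating when a fifth provider is added.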

Why

The endpoint was previously hard-coded to Anthropic. Adding a provider abstraction:

  1. Removes vendor lock-in. Locking the AI brain to one LLM was the wrong call for a platform meant to deploy in different operator environments.
  2. Natural fit for Volcengine deployments. Doubao is Volcengine's own LLM and speaks the OpenAI-compatible chat-completions protocol. The key still goes in LLM_API_KEY; only LLM_PROVIDER changes.
  3. No Anthropic signup overhead. Operators who already have OpenAI / Grok keys can use them directly.

Implementation notes

  • New LlmProvider enum in routes/ai_triage.rs. Anthropic keeps its own messages endpoint and cache_control: ephemeral prompt caching; OpenAI / Doubao / Grok share one call_openai_compatible implementation that speaks the standard chat-completions wire format with Authorization: Bearer auth.
  • TriageResponse gains a provider field so callers can see which backend produced a response — useful when rotating providers during testing.
  • Config: switched to LLM_PROVIDER + LLM_API_KEY. Back-compat: existing ANTHROPIC_API_KEY still works when LLM_PROVIDER is anthropic or unset, so existing deployments don't need env-var rotation.
  • Aliases: claude → anthropic, gpt → openai, volcengine/ark → doubao, xai → grok.
  • Zero new crate dependencies — still just reqwest + serde_json for HTTP and JSON.
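
The key back-compat rule and the two auth styles described in these notes could be sketched roughly as follows. The function names `resolve_api_key` and `auth_headers` are hypothetical and do not come from the PR. Two behaviours are assumptions: LLM_API_KEY takes precedence over ANTHROPIC_API_KEY when both are set, and empty values count as unset. The `x-api-key` and `anthropic-version` header names follow Anthropic's public Messages API documentation rather than anything stated in this PR.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LlmProvider {
    Anthropic,
    OpenAI,
    Doubao,
    Grok,
}

/// Resolve the API key for the selected provider.
/// `lookup` stands in for `|name| std::env::var(name).ok()` so the rule can
/// be exercised without touching the process environment.
/// `provider` is expected to already be Anthropic when LLM_PROVIDER is unset.
pub fn resolve_api_key<F>(provider: LlmProvider, lookup: F) -> Option<String>
where
    F: Fn(&str) -> Option<String>,
{
    // Assumption: blank values are treated the same as missing ones.
    let non_empty = |name: &str| lookup(name).filter(|v| !v.trim().is_empty());

    // Assumption: the new variable wins; the legacy one is only a fallback.
    non_empty("LLM_API_KEY").or_else(|| {
        if provider == LlmProvider::Anthropic {
            non_empty("ANTHROPIC_API_KEY")
        } else {
            None
        }
    })
}

/// Header name/value pairs that authenticate a request for each provider.
pub fn auth_headers(provider: LlmProvider, api_key: &str) -> Vec<(&'static str, String)> {
    match provider {
        // Anthropic's Messages API uses its own key header plus a version header.
        LlmProvider::Anthropic => vec![
            ("x-api-key", api_key.to_string()),
            ("anthropic-version", "2023-06-01".to_string()),
        ],
        // OpenAI, Doubao, and Grok share the bearer-token scheme.
        LlmProvider::OpenAI | LlmProvider::Doubao | LlmProvider::Grok => {
            vec![("Authorization", format!("Bearer {api_key}"))]
        }
    }
}
```

In a real handler the lookup would be `resolve_api_key(provider, |name| std::env::var(name).ok())`, and the returned pairs would be applied to the `reqwest` request builder.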

Test plan

  • cargo check clean
  • cargo test routes::ai_triage — 7/7 pass (4 original parse tests + 3 new: provider label propagation, LlmProvider::parse alias matrix, default-model + Default impl)
  • Live test against each provider (separate step — each needs a real API key)
  • Verify Anthropic prompt-cache hits via cache_read_input_tokens in logs (provider-specific signal)
  • Verify OpenAI-compatible cached-tokens reporting via prompt_tokens_details.cached_tokens for providers that expose it
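
The alias-matrix test mentioned in the plan might be table-driven along these lines. This is a sketch under assumptions, not the test from the PR: the PR names `LlmProvider::parse` and the aliases, but the `Option` return type, the case-insensitive and whitespace-trimming behaviour, and the names `ALIAS_MATRIX` and `alias_matrix_holds` are all hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LlmProvider {
    Anthropic,
    OpenAI,
    Doubao,
    Grok,
}

impl LlmProvider {
    /// Map an LLM_PROVIDER value (canonical name or alias) to a provider.
    /// Assumption: unknown values yield None, and matching ignores ASCII
    /// case and surrounding whitespace.
    pub fn parse(raw: &str) -> Option<Self> {
        match raw.trim().to_ascii_lowercase().as_str() {
            "anthropic" | "claude" => Some(Self::Anthropic),
            "openai" | "gpt" => Some(Self::OpenAI),
            "doubao" | "volcengine" | "ark" => Some(Self::Doubao),
            "grok" | "xai" => Some(Self::Grok),
            _ => None,
        }
    }
}

/// Every accepted spelling paired with the provider it should resolve to.
pub const ALIAS_MATRIX: &[(&str, LlmProvider)] = &[
    ("anthropic", LlmProvider::Anthropic),
    ("claude", LlmProvider::Anthropic),
    ("openai", LlmProvider::OpenAI),
    ("gpt", LlmProvider::OpenAI),
    ("doubao", LlmProvider::Doubao),
    ("volcengine", LlmProvider::Doubao),
    ("ark", LlmProvider::Doubao),
    ("grok", LlmProvider::Grok),
    ("xai", LlmProvider::Grok),
];

/// True when every row of the matrix parses to its expected provider.
pub fn alias_matrix_holds() -> bool {
    ALIAS_MATRIX
        .iter()
        .all(|(raw, want)| LlmProvider::parse(raw) == Some(*want))
}
```

Listing the aliases as data keeps the test body to a single loop, so adding an alias means adding one row rather than one more assertion.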

…Grok

Adds an `LlmProvider` enum and a single dispatcher so the triage
endpoint can run against any of four backends, selected at startup via
the `LLM_PROVIDER` env var. The same prompt and response schema work
across all four; operators pick the provider that fits their stack.

Why this matters:
  - The endpoint was previously hard-coded to Anthropic. For
    Volcengine-hosted deployments, operators want Doubao; for users
    with existing OpenAI/Grok keys, no Anthropic signup overhead;
    for the rest, Anthropic remains the default with its prompt-cache
    advantage.
  - Adds a clean talking point: the AI brain is provider-agnostic by
    design. Locking the platform to one LLM vendor would have been a
    bad call.

Implementation:
  - `LlmProvider` enum (Anthropic / OpenAI / Doubao / Grok) with
    `parse`, `as_str`, `default_model`, `Default = Anthropic`. Aliases
    accepted (claude, gpt, volcengine, ark, xai).
  - `call_llm` dispatches to `call_anthropic` (existing, preserves
    `cache_control: ephemeral`) or `call_openai_compatible` (new,
    shared by OpenAI/Doubao/Grok — same chat/completions wire format,
    same `Authorization: Bearer` auth).
  - `TriageResponse` gains a `provider` field so the caller can see
    which backend produced a response (useful when rotating providers
    during testing).
  - Config switches from `ANTHROPIC_API_KEY` to `LLM_PROVIDER` +
    `LLM_API_KEY`. Back-compat: `ANTHROPIC_API_KEY` still honored
    when `LLM_PROVIDER` is anthropic or unset — existing deployments
    don't need to rotate env vars.

Tests:
  - 4 existing parse tests still pass.
  - 3 new tests: provider label propagation across all parse paths,
    `LlmProvider::parse` alias matrix, default-model + Default impl.

Zero new crate dependencies — still just reqwest + serde_json.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lai3d merged commit e9e4741 into main on May 18, 2026
1 check passed
lai3d deleted the claude/multi-llm-provider branch on May 18, 2026 18:27