Multi-LLM backend for /api/ai/triage (Anthropic / OpenAI / Doubao / Grok) by lai3d · Pull Request #9 · lai3d/sigma

lai3d · 2026-05-18T18:25:58Z

Summary

Makes the AI triage endpoint provider-agnostic. The LLM backend is now selected at startup via the `LLM_PROVIDER` env var; four providers ship out of the box:

| `LLM_PROVIDER` | Endpoint | Default model |
| --- | --- | --- |
| `anthropic` (default) | `api.anthropic.com/v1/messages` | `claude-sonnet-4-6` |
| `openai` | `api.openai.com/v1/chat/completions` | `gpt-4o-mini` |
| `doubao` | `ark.cn-beijing.volces.com/api/v3/chat/completions` | `doubao-pro-32k` |
| `grok` | `api.x.ai/v1/chat/completions` | `grok-3` |

Same prompt, same response schema. Operators pick whichever fits their stack — Doubao for Volcengine-hosted deployments, OpenAI/Grok for users with existing keys, Anthropic for the prompt-cache savings.
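
The table above could be encoded roughly as follows. This is a minimal sketch, not the code from this PR: the `LlmProvider` name and the `as_str` / `default_model` methods are mentioned in the PR, while the `endpoint` helper, the `https://` scheme prefix, and the variant spelling `OpenAi` are assumptions made for illustration.

```rust
/// Backends selectable through `LLM_PROVIDER`. Anthropic is the default.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum LlmProvider {
    #[default]
    Anthropic,
    OpenAi,
    Doubao,
    Grok,
}

impl LlmProvider {
    /// Canonical label, as written in `LLM_PROVIDER`.
    pub fn as_str(self) -> &'static str {
        match self {
            LlmProvider::Anthropic => "anthropic",
            LlmProvider::OpenAi => "openai",
            LlmProvider::Doubao => "doubao",
            LlmProvider::Grok => "grok",
        }
    }

    /// Request URL (hypothetical helper; host and path are taken from the
    /// table, the `https://` prefix is assumed).
    pub fn endpoint(self) -> &'static str {
        match self {
            LlmProvider::Anthropic => "https://api.anthropic.com/v1/messages",
            LlmProvider::OpenAi => "https://api.openai.com/v1/chat/completions",
            LlmProvider::Doubao => {
                "https://ark.cn-beijing.volces.com/api/v3/chat/completions"
            }
            LlmProvider::Grok => "https://api.x.ai/v1/chat/completions",
        }
    }

    /// Model used when the operator does not override it.
    pub fn default_model(self) -> &'static str {
        match self {
            LlmProvider::Anthropic => "claude-sonnet-4-6",
            LlmProvider::OpenAi => "gpt-4o-mini",
            LlmProvider::Doubao => "doubao-pro-32k",
            LlmProvider::Grok => "grok-3",
        }
    }
}
```

Keeping all three lookups as exhaustive `match` expressions means the compiler flags every table that needs a new row when a fifth provider is added.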

Why

The endpoint was previously hard-coded to Anthropic. Adding a provider abstraction:

- Removes vendor lock-in. Locking the AI brain to one LLM was the wrong call for a platform meant to deploy in different operator environments.
- Natural fit for Volcengine deployments. Doubao is Volcengine's own LLM and speaks the OpenAI-compatible chat-completions protocol. Same `LLM_API_KEY`, different `LLM_PROVIDER`.
- No Anthropic signup overhead. Operators who already have OpenAI / Grok keys can use them directly.

Implementation notes

- New `LlmProvider` enum in `routes/ai_triage.rs`. Anthropic keeps its own messages endpoint and `cache_control: ephemeral` prompt caching; OpenAI / Doubao / Grok share one `call_openai_compatible` implementation that speaks the standard chat-completions wire format with `Authorization: Bearer` auth.
- `TriageResponse` gains a `provider` field so callers can see which backend produced a response, which is useful when rotating providers during testing.
- Config: switched to `LLM_PROVIDER` + `LLM_API_KEY`. Back-compat: existing `ANTHROPIC_API_KEY` still works when `LLM_PROVIDER` is `anthropic` or unset, so existing deployments don't need env-var rotation.
- Aliases: `claude` → `anthropic`, `gpt` → `openai`, `volcengine`/`ark` → `doubao`, `xai` → `grok`.
- Zero new crate dependencies: still just reqwest + serde_json for HTTP and JSON.
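
The alias handling and the `ANTHROPIC_API_KEY` fallback described above might look something like the sketch below. Treat it as an illustration under stated assumptions rather than the merged code: `resolve_llm_config` is a hypothetical helper, and the case-insensitive matching, the precedence of `LLM_API_KEY` over `ANTHROPIC_API_KEY`, and the error on an unknown provider name are all guesses that the PR text does not confirm.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LlmProvider {
    Anthropic,
    OpenAi,
    Doubao,
    Grok,
}

impl LlmProvider {
    /// Parse a canonical name or one of the aliases listed in the PR.
    /// Trimming and case-insensitivity are assumptions.
    pub fn parse(raw: &str) -> Option<Self> {
        match raw.trim().to_ascii_lowercase().as_str() {
            "anthropic" | "claude" => Some(Self::Anthropic),
            "openai" | "gpt" => Some(Self::OpenAi),
            "doubao" | "volcengine" | "ark" => Some(Self::Doubao),
            "grok" | "xai" => Some(Self::Grok),
            _ => None,
        }
    }
}

/// Hypothetical helper: resolve the provider and API key from an
/// environment lookup. The lookup is passed in so it can be faked in tests.
pub fn resolve_llm_config<F>(get: F) -> Result<(LlmProvider, String), String>
where
    F: Fn(&str) -> Option<String>,
{
    // Unset LLM_PROVIDER means Anthropic (stated in the PR).
    let provider = match get("LLM_PROVIDER") {
        None => LlmProvider::Anthropic,
        Some(raw) => LlmProvider::parse(&raw)
            .ok_or_else(|| format!("unknown LLM_PROVIDER: {raw}"))?,
    };

    // LLM_API_KEY first (assumed precedence); the legacy variable is only
    // consulted when the provider is Anthropic.
    let key = get("LLM_API_KEY")
        .or_else(|| {
            if provider == LlmProvider::Anthropic {
                get("ANTHROPIC_API_KEY")
            } else {
                None
            }
        })
        .ok_or_else(|| "no API key configured".to_string())?;

    Ok((provider, key))
}
```

At startup this could be called as `resolve_llm_config(|name| std::env::var(name).ok())`; in tests, a closure over a fixed set of values stands in for the process environment.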

Test plan

- `cargo check` clean
- `cargo test routes::ai_triage`: 7/7 pass (4 original parse tests + 3 new: provider label propagation, `LlmProvider::parse` alias matrix, default-model + `Default` impl)
- Live test against each provider (separate step; each needs a real API key)
- Verify Anthropic prompt-cache hits via `cache_read_input_tokens` in logs (provider-specific signal)
- Verify OpenAI-compatible cached-tokens reporting via `prompt_tokens_details.cached_tokens` for providers that expose it
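
The two cache signals in the last two items live in different places in each provider's `usage` object. A rough sketch of normalizing them into one number follows. All type and function names here are hypothetical, and in real code the structs would presumably be deserialized with serde rather than built by hand; only the field names `cache_read_input_tokens` and `prompt_tokens_details.cached_tokens` come from the test plan.

```rust
/// Subset of an Anthropic `usage` object (hypothetical type).
#[derive(Debug, Default)]
pub struct AnthropicUsage {
    pub cache_read_input_tokens: Option<u64>,
}

/// Subset of an OpenAI-compatible `usage.prompt_tokens_details` object.
#[derive(Debug, Default)]
pub struct PromptTokensDetails {
    pub cached_tokens: Option<u64>,
}

/// Subset of an OpenAI-compatible `usage` object (hypothetical type).
#[derive(Debug, Default)]
pub struct OpenAiUsage {
    pub prompt_tokens_details: Option<PromptTokensDetails>,
}

/// Usage as reported by either wire format.
#[derive(Debug)]
pub enum Usage {
    Anthropic(AnthropicUsage),
    OpenAiCompatible(OpenAiUsage),
}

/// Prompt tokens served from cache, or `None` when the provider did not
/// report the field at all.
pub fn cached_prompt_tokens(usage: &Usage) -> Option<u64> {
    match usage {
        Usage::Anthropic(u) => u.cache_read_input_tokens,
        Usage::OpenAiCompatible(u) => u
            .prompt_tokens_details
            .as_ref()
            .and_then(|details| details.cached_tokens),
    }
}
```

Returning `Option<u64>` keeps "provider reported zero cached tokens" distinct from "provider does not expose the field", which matters for the last test-plan item.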

…Grok

Adds an `LlmProvider` enum and a single dispatcher so the triage endpoint can run against any of four backends, selected at startup via the `LLM_PROVIDER` env var. The same prompt and response schema work across all four; operators pick the provider that fits their stack.

Why this matters:
- The endpoint was previously hard-coded to Anthropic. For Volcengine-hosted deployments, operators want Doubao; for users with existing OpenAI/Grok keys, no Anthropic signup overhead; for the rest, Anthropic remains the default with its prompt-cache advantage.
- Adds a clean talking point: the AI brain is provider-agnostic by design. Locking the platform to one LLM vendor would have been a bad call.

Implementation:
- `LlmProvider` enum (Anthropic / OpenAI / Doubao / Grok) with `parse`, `as_str`, `default_model`, `Default = Anthropic`. Aliases accepted (claude, gpt, volcengine, ark, xai).
- `call_llm` dispatches to `call_anthropic` (existing, preserves `cache_control: ephemeral`) or `call_openai_compatible` (new, shared by OpenAI/Doubao/Grok: same chat/completions wire format, same `Authorization: Bearer` auth).
- `TriageResponse` gains a `provider` field so the caller can see which backend produced a response (useful when rotating providers during testing).
- Config switches from `ANTHROPIC_API_KEY` to `LLM_PROVIDER` + `LLM_API_KEY`. Back-compat: `ANTHROPIC_API_KEY` still honored when `LLM_PROVIDER` is anthropic or unset, so existing deployments don't need to rotate env vars.

Tests:
- 4 existing parse tests still pass.
- 3 new tests: provider label propagation across all parse paths, `LlmProvider::parse` alias matrix, default-model + Default impl.

Zero new crate dependencies: still just reqwest + serde_json.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
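
To make the two-way dispatch in the commit message concrete, here is a hedged sketch of how a provider might map to a wire format and its auth headers. `WireFormat`, `wire_format`, and `auth_headers` are hypothetical names that do not appear in the PR. The `x-api-key` and `anthropic-version` headers are what the Anthropic Messages API expects, but whether the merged code sets them this way, and which version string it pins, is an assumption.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LlmProvider {
    Anthropic,
    OpenAi,
    Doubao,
    Grok,
}

/// The two request paths the dispatcher chooses between (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WireFormat {
    /// Handled by `call_anthropic` in the PR.
    AnthropicMessages,
    /// Handled by `call_openai_compatible` in the PR.
    OpenAiChatCompletions,
}

/// Three of the four providers share the chat-completions path.
pub fn wire_format(provider: LlmProvider) -> WireFormat {
    match provider {
        LlmProvider::Anthropic => WireFormat::AnthropicMessages,
        LlmProvider::OpenAi | LlmProvider::Doubao | LlmProvider::Grok => {
            WireFormat::OpenAiChatCompletions
        }
    }
}

/// Auth-related request headers for a provider, as (name, value) pairs.
pub fn auth_headers(provider: LlmProvider, api_key: &str) -> Vec<(&'static str, String)> {
    match wire_format(provider) {
        WireFormat::AnthropicMessages => vec![
            ("x-api-key", api_key.to_string()),
            // Assumed version pin; the PR does not state which one it uses.
            ("anthropic-version", "2023-06-01".to_string()),
        ],
        WireFormat::OpenAiChatCompletions => {
            vec![("authorization", format!("Bearer {api_key}"))]
        }
    }
}
```

With reqwest, each pair could be applied via `RequestBuilder::header(name, value)` before sending the request.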

lai3d merged commit e9e4741 into main May 18, 2026
1 check passed

lai3d deleted the claude/multi-llm-provider branch May 18, 2026 18:27
