feat: add Concentrate AI as an optional model router with ZDR-aware routing and live catalogs #167

Open
concentrate-ai wants to merge 17 commits into willchen96:main from concentrate-ai:pr/concentrate-router-final

@concentrate-ai

What this adds

This PR adds Concentrate AI (https://concentrate.ai) as an optional, additive model provider. Users who add a Concentrate API key get:

  1. Live model catalogs — the model picker no longer relies on a hard-coded list. On first open it fetches the catalog for each configured provider (Anthropic, Google, OpenAI, and Concentrate), filters to chat-capable models, and merges them into a single grouped list. A 5-minute in-memory cache makes repeat opens instant. Users can also force-refresh via the new Refresh model list button in Settings → Models.

  2. ZDR-aware routing: Concentrate's API exposes a per-model zdr (Zero Data Retention) flag. For models tagged as ZDR, if the user has both a direct provider key and a Concentrate key, the backend routes through Concentrate, so the provider never sees the plaintext prompt. For non-ZDR models the backend continues to prefer the direct provider key; Concentrate is only used as a fallback if no direct key is configured. The routing is surfaced in the UI via a small ZDR pill badge in the picker.

  3. Concentrate is completely optional — no behavior changes if the Concentrate key field is left blank. All four provider fields appear in Settings → Models with plain-English descriptions of when each is used for routing.

  4. Pre-push hooks: a security audit and a typecheck now run on every push. I'd love to add CI/CD too.

Schema / migration

One additive change: 'concentrate' added to the provider check constraint in user_api_keys. Fresh installs run schema.sql as usual. Existing deployments apply backend/migrations/001_add_concentrate_provider.sql.

Manual test plan

  • No keys — picker shows static fallback list; no background network calls.
  • Add a Gemini key — open picker; Google models load from live catalog.
  • Add an OpenAI key — open picker; OpenAI models load.
  • Remove a key, then re-open the picker; confirm that provider's models disappear.
  • Add a Concentrate key — ZDR models appear with the ZDR pill badge; tooltip reads "Zero Data Retention…".
  • Send a chat message — backend log shows the selected model ID, not the default fallback.
  • Settings → Models → Refresh model list — clears cache and refetches.

Known gaps (not blockers)

  • ZDR set warms lazily: routing doesn't enforce ZDR until /concentrate/models has been hit once.
  • No backend enforcement that the posted model ID is in any catalog.
  • OpenAI family regex needs manual extension when OpenAI ships new families.

If you are interested in some free tokens please email me at todd@concentrate.ai and I'll be sure to set you up with a generous grant.

concentrate-ai and others added 17 commits May 30, 2026 18:32
Adds .githooks/ with a pre-push hook that runs on every push:

- npm audit (backend + frontend) — blocks on critical/high vulnerabilities.
  A 24-hour grace period prevents supply-chain attacks where a malicious
  "fix" advisory pressures rapid upgrades. Known transitive vulnerabilities
  with no upstream fix are documented in audit-known.json with expiration
  dates so they are revisited rather than forgotten.
- TypeScript type checking (backend + frontend) — blocks on type errors.
- ESLint (frontend) — advisory only; warns but does not block until existing
  upstream lint issues are resolved.

Enable with: git config core.hooksPath .githooks
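
The waiver logic described above could look roughly like the following. This is a minimal sketch, not the hook's actual code: the `Finding` and `KnownEntry` shapes and the `blockingFindings` name are assumptions made for illustration, and the real layout of audit-known.json may differ.

```typescript
// Hypothetical shapes; the real audit-known.json layout is not shown in this PR.
type Severity = "low" | "moderate" | "high" | "critical";

interface Finding {
  id: string; // advisory identifier
  severity: Severity;
}

interface KnownEntry {
  id: string;
  expires: string; // ISO date after which the waiver no longer applies
  reason: string;
}

// Returns the findings that should block the push: high or critical severity,
// and not covered by a waiver that is still unexpired at `now`.
function blockingFindings(
  findings: Finding[],
  known: KnownEntry[],
  now: Date,
): Finding[] {
  const waived = new Set(
    known
      .filter((k) => new Date(k.expires).getTime() > now.getTime())
      .map((k) => k.id),
  );
  return findings.filter(
    (f) =>
      (f.severity === "high" || f.severity === "critical") && !waived.has(f.id),
  );
}
```

Because an expired waiver stops matching, the advisory blocks the push again once its date passes, which is what forces it to be revisited.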

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The Gemini and OpenAI model IDs were speculative version numbers that do
not exist as real API slugs (gemini-3.1-pro-preview, gpt-5.5, etc.).
Replace with current production slugs verified against each provider's
API documentation:

- gemini-2.5-pro, gemini-2.0-flash, gemini-2.0-flash-lite (Google)
- gpt-4o, gpt-4o-mini (OpenAI)

Also update the user_profiles.tabular_model default in schema.sql from
the same fabricated 'gemini-3-flash-preview' to 'gemini-2.0-flash' so
fresh installs do not write an invalid model slug into new profiles.

Claude slugs were already correct and are unchanged.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Concentrate exposes an OpenAI-Responses-compatible endpoint that proxies
to many underlying model authors (Anthropic, Google, OpenAI, …) behind a
single API key. This commit adds it as a fourth, optional provider.

How it routes (in order):

  1. Dynamic model IDs (not in the static set) → Concentrate.
  2. Static model with a direct provider key configured → use that key
     directly. A Concentrate key never silently overrides a configured
     direct key.
  3. Static model with no direct key but a Concentrate key configured →
     Concentrate acts as a fallback universal router.
  4. No key at all → routed to the native provider so it surfaces a
     clear "key not configured" error.

This means a user with both an Anthropic key and a Concentrate key
continues to use Anthropic directly for Claude models — Concentrate only
takes over when there is no direct key.
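
The four rules above can be sketched as a single function. This is an illustration under stated assumptions rather than the code in backend/src/lib/llm/index.ts: the `STATIC_MODELS` table and its two entries are placeholders, and the real `pick()` signature may differ.

```typescript
type Provider = "anthropic" | "google" | "openai" | "concentrate";
type NativeProvider = Exclude<Provider, "concentrate">;
type UserApiKeys = Partial<Record<Provider, string>>;

// Placeholder static set; the real table lives in the backend model list.
const STATIC_MODELS = new Map<string, NativeProvider>([
  ["gemini-2.5-pro", "google"],
  ["gpt-4o", "openai"],
]);

function pick(modelId: string, keys: UserApiKeys): Provider {
  const native = STATIC_MODELS.get(modelId);
  // Rule 1: dynamic IDs (not in the static set) go to Concentrate.
  if (native === undefined) return "concentrate";
  // Rule 2: a configured direct key always wins for static models.
  if (keys[native]) return native;
  // Rule 3: no direct key, but a Concentrate key: use it as the fallback router.
  if (keys.concentrate) return "concentrate";
  // Rule 4: no key at all: route natively so the provider adapter can
  // surface its own "key not configured" error.
  return native;
}
```

For example, `pick("gpt-4o", { openai: "k", concentrate: "c" })` returns `"openai"`, while the same call without the `openai` key returns `"concentrate"`.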

Backend
- backend/src/lib/llm/concentrate.ts: OpenAI-Responses adapter pointed
  at https://api.concentrate.ai/v1/responses (overridable via
  CONCENTRATE_RESPONSES_URL). After a tool turn the request loop now
  appends both the assistant function_call items and the
  function_call_output items so the model sees a complete tool round.
- backend/src/lib/llm/index.ts: new pick() encodes the routing rules.
- backend/src/lib/llm/types.ts: extend Provider + UserApiKeys with
  "concentrate".
- backend/src/lib/llm/models.ts: providerForModel() returns
  "concentrate" for unknown slugs instead of throwing; isStaticModel()
  helper added.
- backend/src/lib/userApiKeys.ts, userSettings.ts: support concentrate
  as a valid provider for the encrypted key store and the resolved
  title-model fallback.
- backend/src/routes/concentrateModels.ts: GET /concentrate/models
  returns the authorized model catalog from Concentrate's /v1/models
  with a 5-minute in-process cache.
- backend/schema.sql: extend user_api_keys provider CHECK to include
  'concentrate'.
- backend/.env.example: add CONCENTRATE_API_KEY and the optional
  CONCENTRATE_RESPONSES_URL override.

Frontend
- frontend/src/app/lib/mikeApi.ts: extend ApiKeyProvider union; export
  apiRequest so the new lib can reuse the auth wrapper.
- frontend/src/app/lib/concentrateModels.ts: client for
  /concentrate/models with a matching 5-minute cache.
- frontend/src/app/lib/modelAvailability.ts: a Concentrate key makes
  any static model "available" (since Concentrate can route it).
- frontend/src/app/(pages)/account/models/page.tsx: render the
  Concentrate API-key field at the bottom of the API-keys list with a
  short description of what it does.
- frontend/src/contexts/UserProfileContext.tsx: include concentrate
  in the empty-key initializer and the provider iteration list.

The model picker itself is unchanged in this commit — Concentrate
shows up purely as a key field and a routing fallback. Surfacing the
dynamic Concentrate catalog in the picker is a follow-up.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When a Concentrate key is configured, the model picker now lists the
authorized Concentrate catalog inline alongside the static Anthropic /
Google / OpenAI models, grouped by model author. A small shield icon
marks Zero Data Retention models so users can pick a ZDR model at a
glance.

The Concentrate catalog is filtered to ZDR-only models server-side:
backend/src/routes/concentrateModels.ts checks each model's providers
map for at least one zdr-certified backend and drops everything else.
This is enforced at the source so a misbehaving client cannot pull
non-ZDR slugs by skipping a UI filter. Users who want a non-ZDR model
should configure that provider's direct key rather than rely on the
Concentrate router.
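
A minimal sketch of that server-side filter follows. The upstream payload shape (`providers` as a map of objects carrying an optional `zdr` flag) is an assumption inferred from the description above, and `zdrOnly` is a hypothetical helper name.

```typescript
// Assumed upstream shape; the actual /v1/models payload may differ.
interface UpstreamModel {
  id: string;
  providers: Record<string, { zdr?: boolean }>;
}

interface ConcentrateModel {
  id: string;
  zdr: boolean;
}

// Keep only models with at least one ZDR-advertising provider entry.
function zdrOnly(models: UpstreamModel[]): ConcentrateModel[] {
  return models
    .filter((m) => Object.values(m.providers).some((p) => p.zdr === true))
    .map((m) => ({ id: m.id, zdr: true }));
}
```

Filtering before the response is built means the non-ZDR slugs never leave the backend, so no client-side check is needed for this guarantee.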

Backend
- backend/src/routes/concentrateModels.ts: add zdr:boolean to the
  returned model shape; filter the API response so only models with at
  least one ZDR-advertising Concentrate-side provider are included.

Frontend
- frontend/src/app/lib/concentrateModels.ts: extend the ConcentrateModel
  type with a zdr boolean.
- frontend/src/app/components/assistant/ModelToggle.tsx: relax the group
  field to string, fetch Concentrate models on Concentrate-key change,
  merge them into the picker grouped by author, render a shield icon
  next to ZDR models. The static MODELS export and ALLOWED_MODEL_IDS
  remain stable so routing-layer callers are unaffected.
- frontend/src/app/lib/modelAvailability.ts: modelGroupToProvider now
  returns null for non-static groups; isModelAvailable treats unknown
  model IDs as available iff a Concentrate key is configured.
- frontend/src/app/(pages)/account/models/page.tsx: defensively skip the
  "Add an X API key" tooltip when the model has no native provider.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace the static ALLOWED_MODEL_IDS list with live catalogs fetched from
each provider's /models endpoint (Anthropic, Google, OpenAI, Concentrate).
Every model now carries explicit capability flags (chat, tools, streaming)
that default to FALSE — the picker only surfaces models whose capabilities
have been verified, so new model classes (audio, image, embedding) stay
hidden until someone teaches Mike about them.
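
The default-to-FALSE behaviour can be illustrated as below. `ProviderCatalogModel` is the shape named in this commit, but its fields here are assumptions, and `deriveCapabilities` and `pickerModels` are hypothetical helper names.

```typescript
interface Capabilities {
  chat: boolean;
  tools: boolean;
  streaming: boolean;
}

// Field layout is assumed for illustration.
interface ProviderCatalogModel {
  id: string;
  capabilities: Capabilities;
}

// Anything not explicitly verified as true stays false.
function deriveCapabilities(verified: Partial<Capabilities>): Capabilities {
  return {
    chat: verified.chat === true,
    tools: verified.tools === true,
    streaming: verified.streaming === true,
  };
}

// The picker only surfaces models verified as chat-capable.
function pickerModels(models: ProviderCatalogModel[]): ProviderCatalogModel[] {
  return models.filter((m) => m.capabilities.chat);
}
```

A model that arrives with no capability information gets all three flags set to false and is therefore filtered out of the picker.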

When Concentrate's catalog tags a model as ZDR and the user has a
Concentrate key, routing prefers Concentrate over the direct provider key
so the privacy guarantee advertised by the ZDR badge is actually enforced
server-side. Non-ZDR models continue to prefer the direct key, so users
never lose access to brand-new frontier releases while waiting for ZDR
certification.
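
A rough sketch of how a process-local ZDR set might feed the routing decision. The names `recordCatalog` and `prefersConcentrate` are hypothetical; only the idea of a set populated on catalog fetch comes from this commit.

```typescript
// Process-local: empty until the Concentrate catalog has been fetched once.
const zdrModelIds = new Set<string>();

// Called whenever the Concentrate catalog is (re)fetched.
function recordCatalog(models: { id: string; zdr: boolean }[]): void {
  zdrModelIds.clear();
  for (const m of models) {
    if (m.zdr) zdrModelIds.add(m.id);
  }
}

// True when routing should go through Concentrate even if a direct
// provider key exists: the model is tagged ZDR and a Concentrate key is set.
function prefersConcentrate(modelId: string, hasConcentrateKey: boolean): boolean {
  return hasConcentrateKey && zdrModelIds.has(modelId);
}
```

Until `recordCatalog` has run, `prefersConcentrate` returns false for every model, which is the lazy-warm gap listed under known gaps in the PR description.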

- Add /providers/:provider/models backend routes with per-provider
  capability derivation (Anthropic metadata, OpenAI family table,
  Google supportedGenerationMethods)
- Rewrite /concentrate/models to return the unified ProviderCatalogModel
  shape with capabilities derived from Concentrate's supports payload
- Add process-local ZDR set (concentrateCatalog.ts) populated when the
  Concentrate catalog is fetched, consulted by pick() for routing
- ModelToggle picker now unions all four live catalogs filtered on
  capabilities.chat, with ZDR badge (gray pill + shield) on qualifying
  models
- useSelectedModel accepts any plausible model ID shape instead of
  gating on a build-time allowlist
- modelAvailability uses prefix-based provider inference for dynamic IDs
- Bump deprecated gemini-2.0-* defaults to gemini-2.5-* throughout
- Add o1/o3/o4 prefix recognition for OpenAI reasoning models

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Model catalogs now fetch on picker open rather than component mount,
so no background network calls happen until the user actually wants to
pick a model. The 5-minute in-memory cache makes repeat opens instant;
stale cache is served indefinitely on network failure so the picker
never goes empty.

Adds a "Refresh model list" button to Settings → Models that clears
all provider caches so the next picker open pulls fresh catalogs. The
button is only shown when at least one API key is configured.
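
The caching behaviour described here (5-minute TTL, stale entries served when a refetch fails, explicit clear) can be sketched as follows. `CatalogCache` is a hypothetical class written for illustration and is not the code in this PR; the injectable clock exists only to make the sketch testable.

```typescript
const TTL_MS = 5 * 60 * 1000;

interface CacheEntry<T> {
  value: T;
  fetchedAt: number;
}

class CatalogCache<T> {
  private entries = new Map<string, CacheEntry<T>>();
  private now: () => number;

  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  // Returns a fresh cached value if present, otherwise calls `load`.
  // If `load` fails and an older value exists, the stale value is served.
  async get(key: string, load: () => Promise<T>): Promise<T> {
    const hit = this.entries.get(key);
    if (hit && this.now() - hit.fetchedAt < TTL_MS) return hit.value;
    try {
      const value = await load();
      this.entries.set(key, { value, fetchedAt: this.now() });
      return value;
    } catch (err) {
      if (hit) return hit.value; // stale, but better than an empty picker
      throw err;
    }
  }

  // What a "Refresh model list" action could call before the next open.
  clear(): void {
    this.entries.clear();
  }
}
```

Nothing is fetched until `get` is first called, which matches the fetch-on-open behaviour: constructing the cache makes no network calls.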

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- useSelectedModel: replace useState+useEffect pattern with lazy initializer
  so the stored model ID is read synchronously on first render, before the
  catalog fetch completes
- ModelToggle: add labelFromId() fallback so stored IDs not in STATIC_FALLBACK
  (e.g. claude-opus-4-8 from a previous session) still display a readable
  label rather than "Model"
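
One way such a fallback could derive a label from a raw ID is sketched below. The formatting rule (split on hyphens or underscores, capitalize each part) is an assumption; the actual labelFromId() in ModelToggle may format differently.

```typescript
// Turns an unknown model ID into a rough human-readable label.
function labelFromId(id: string): string {
  const words = id.split(/[-_]/).filter((part) => part.length > 0);
  if (words.length === 0) return "Model"; // nothing usable: keep the generic label
  return words
    .map((w) => w.charAt(0).toUpperCase() + w.slice(1))
    .join(" ");
}
```

With this rule, `labelFromId("claude-opus-4-8")` yields `"Claude Opus 4 8"`, which is rough but more informative than the generic "Model".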

- Add date-stamped variants (gpt-5-2025-08-07, o3-2025-04-16, etc.)
- Add pro variants (gpt-5-pro, o1-pro, o3-pro)
- Add chat-latest aliases (chat-latest, gpt-5-chat-latest, etc.)
- Block codex and deep-research IDs (404 on chat completions API)
- Collapse o-series regex to single pattern covering o1-o9 with
  mini/preview/pro suffixes and date stamps
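
The family rules listed above might be expressed along these lines. The patterns are a simplified approximation written for illustration, not the regexes in this PR; `isOpenAIChatModel` is a hypothetical name, and the gpt pattern in particular is an assumption that covers only the ID shapes mentioned in this PR.

```typescript
// o1 through o9, optional mini/preview/pro suffix, optional date stamp.
const O_SERIES = /^o[1-9](?:-(?:mini|preview|pro))?(?:-\d{4}-\d{2}-\d{2})?$/;
// gpt-4o, gpt-4.1, gpt-5-pro, gpt-5-2025-08-07, and similar shapes.
const GPT_FAMILY = /^gpt-\d+(?:\.\d+)?[a-z]?(?:-(?:mini|pro))?(?:-\d{4}-\d{2}-\d{2})?$/;
// chat-latest on its own, or any prefix ending in -chat-latest.
const CHAT_LATEST = /^(?:[a-z0-9.-]+-)?chat-latest$/;
// IDs that are not served as chat models and must stay out of the picker.
const BLOCKED = /codex|deep-research/;

function isOpenAIChatModel(id: string): boolean {
  if (BLOCKED.test(id)) return false;
  return O_SERIES.test(id) || GPT_FAMILY.test(id) || CHAT_LATEST.test(id);
}
```

Checking the block list first keeps an ID such as `o3-deep-research` out even if a later pattern were loosened enough to match it.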

chat-latest and *-chat-latest IDs have no gpt-/o prefix so
providerForModel() and inferProviderFromId() fell through to
concentrate/null, showing the red unavailable badge in the picker
even with an OpenAI key configured.

resolveModel() gated on a static ALL_MODELS allowlist, causing any model
from the live catalog (gpt-4.1, claude-opus-4-8, etc.) to silently
downgrade to the default model on chat and to reject on tabularModel
save with "Unsupported tabularModel". The validation belongs at the
system boundary (looksLikeModelId on frontend, pick() key check in the
router) — not in a dispatch helper that sees IDs after they've already
been accepted by the picker.

Also align all gemini-3-flash-preview fallback strings to gemini-2.5-flash
(the real slug) so schema, tabular components, and UI error-fallbacks
agree. gemini-3-flash-preview was a fabricated ID that never existed.