From aa79ed78f0d2042ce647da280c839de0134cc092 Mon Sep 17 00:00:00 2001
From: Stephane Segning Lambou
Date: Fri, 29 May 2026 11:05:02 +0200
Subject: [PATCH] docs: cover the models-info plugin and oauth2 bearer propagation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The repo is now a two-plugin workspace, but the top-level README and docs/ only described @vymalo/opencode-oauth2.

- docs/models-info.md (new): adopter-depth guide for @vymalo/opencode-models-info — the single config hook, the auth-composition matrix (incl. the oauth2 bearer propagation), URL resolution, caching + failure modes, and the log-event reference. Links the package README for the full config reference rather than duplicating it.
- README.md: add models-info to the Workspace Layout and Documentation tables, plus a "Companion plugin: model metadata" section with a combined oauth2 + models-info config example.
- docs/architecture.md: note the second plugin up top, document the new config-time Authorization propagation (step 6 of the config hook — the sole, one-directional coupling point between the plugins), and add the two propagation events to the event-name table.

Docs-only; no code or behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 README.md            | 27 ++++++++++++-
 docs/architecture.md |  9 +++++
 docs/models-info.md  | 94 ++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 129 insertions(+), 1 deletion(-)
 create mode 100644 docs/models-info.md

diff --git a/README.md b/README.md
index dc4cb52..ad4e05d 100644
--- a/README.md
+++ b/README.md
@@ -82,11 +82,35 @@ See [packages/opencode-oauth2/README.md](packages/opencode-oauth2/README.md) for
 | Page | When you need it |
 | --- | --- |
 | [`docs/architecture.md`](docs/architecture.md) | Understand the hooks, token lifecycle per flow, cache layout, sync scheduler, logging |
+| [`docs/models-info.md`](docs/models-info.md) | The companion metadata-enrichment plugin — how it composes with any auth scheme, caching, failure modes |
 | [`docs/github-actions.md`](docs/github-actions.md) | CI without stored secrets — Keycloak/Auth0/Okta setup, reusable workflow, matrix, fork-PR limits |
 | [`docs/kubernetes.md`](docs/kubernetes.md) | `CronJob` / `Job` / `Deployment` with projected SA tokens, multi-provider pods, RBAC |
 | [`docs/local-development.md`](docs/local-development.md) | Sandbox setup, plugin re-export trick, forcing re-auth, dev-only `env` subject token |
 | [`docs/troubleshooting.md`](docs/troubleshooting.md) | Symptom-keyed fixes — `redirect_uri_mismatch`, model discovery 403, `invalid_client`, projected-token rotation |
 
+## Companion plugin: model metadata
+
+This workspace also ships [`@vymalo/opencode-models-info`](packages/opencode-models-info) — a separate, **auth-agnostic** plugin that enriches your model entries with full metadata (context length, output limit, USD/M-token cost, modalities, and `tool_call` / `reasoning` / `attachment` flags) by fetching from an OpenRouter-shaped `/models` endpoint.
+
+It doesn't depend on this plugin: it runs as a `config` hook *after* other plugins, so it composes with oauth2, static API keys, or no auth at all. When paired with `@vymalo/opencode-oauth2` ≥ 0.4.0, OAuth2-protected metadata endpoints work with zero extra config — this plugin stamps the cached bearer onto the provider's headers at config time, and the metadata fetch inherits it.
+
+```jsonc
+{
+  "plugin": ["@vymalo/opencode-oauth2", "@vymalo/opencode-models-info"],
+  "provider": {
+    "example-ai": {
+      "options": {
+        "baseURL": "https://api.example.com/v1",
+        "oauth2": { "issuer": "https://auth.example.com", "clientId": "opencode-client", "scopes": ["openid", "offline_access"] },
+        "meta": { "modelsInfoUrl": "models" }
+      }
+    }
+  }
+}
+```
+
+Full reference: [`packages/opencode-models-info/README.md`](packages/opencode-models-info/README.md). Behavior, caching, and composition details: [`docs/models-info.md`](docs/models-info.md).
+
 ## Federated identity (CI / Kubernetes)
 
 For GitHub Actions and Kubernetes workloads, use `jwt_bearer` (or `token_exchange`) with the platform's own short-lived OIDC token as the subject. The plugin re-fetches it on every access-token expiry; nothing long-lived gets cached.
@@ -109,7 +133,8 @@ This is a [pnpm](https://pnpm.io) monorepo.
 
 | Package | Purpose |
 | --- | --- |
-| [`packages/opencode-oauth2`](packages/opencode-oauth2) | The runtime plugin — published as `@vymalo/opencode-oauth2` |
+| [`packages/opencode-oauth2`](packages/opencode-oauth2) | OAuth2/OIDC auth + model discovery — published as `@vymalo/opencode-oauth2` |
+| [`packages/opencode-models-info`](packages/opencode-models-info) | Auth-agnostic model **metadata enrichment** — published as `@vymalo/opencode-models-info` |
 | [`packages/plugin-bundle`](packages/plugin-bundle) | Rolldown-based bundling for distribution |
 | [`plans/prd.md`](plans/prd.md) | Product requirements and phased roadmap |
 
diff --git a/docs/architecture.md b/docs/architecture.md
index 9e260f9..75047d5 100644
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -4,6 +4,12 @@ How `@vymalo/opencode-oauth2` actually runs inside OpenCode: what each hook does
 
 If you just want to copy YAML, jump to the [GitHub Actions](./github-actions.md) or [Kubernetes](./kubernetes.md) cookbooks. This page is for the adopter who needs to reason about failure modes.
 
+> The workspace also ships a second, independent plugin —
+> [`@vymalo/opencode-models-info`](./models-info.md) — that enriches model
+> metadata after auth is resolved. It is documented separately; this page
+> covers the oauth2 plugin only, plus the one place the two intersect (the
+> [config-time bearer propagation](#config--plugin-load) in step 6 below).
+
 ## The two hooks
 
 The plugin registers exactly two OpenCode hooks: `config` (plugin load) and `chat.headers` (per request).
@@ -17,6 +23,7 @@ Runs once when OpenCode boots the plugin. Source: [`packages/opencode-oauth2/src
 3. **Build the runtime** (`OAuth2ModelSyncPlugin`), `initialize()` (load cache), then `start({ warmup: true })`.
 4. **Warmup** iterates servers, attempts `syncServer(id, { interactive: })`, and starts a per-server scheduler (`syncIntervalMinutes`, default 60).
 5. **Merge discovered models** into each provider's `models` map. If a server has no cached models yet (cold start, non-interactive warmup, refresh-token expired), it stays empty in OpenCode — the user sees no models for that provider until a chat request triggers on-demand auth.
+6. **Propagate the cached bearer.** For each managed provider, if a still-valid cached token exists (30s expiry skew), stamp `options.headers.Authorization = " "` — unless the user already set an `Authorization` header (case-insensitive), which always wins. This makes the token visible to *subsequent* `config` hooks, most notably [`@vymalo/opencode-models-info`](./models-info.md) fetching an OAuth2-protected `meta.modelsInfoUrl`. It's the only coupling point between the two plugins, and it's one-directional and via the shared config object — neither plugin imports the other. A stale value here is harmless: `chat.headers` (below) overwrites per-request with a freshly-ensured token, so the inference call is never affected. Emits `oauth2_bearer_propagated_to_provider_headers` (or `oauth2_bearer_propagation_skipped_user_set`).
 
 The runtime is **rebuilt** if the config signature changes between hook invocations (OpenCode re-runs `config` on certain config edits). Old schedulers are stopped first.
 
@@ -258,6 +265,8 @@ Anywhere the plugin logs a URL it ran (`tokenEndpoint`, `modelsUrl`), it goes th
 | `oauth_open_browser_failed` | `xdg-open`/`open`/`start` failed | `error` (URL goes to stderr separately) |
 | `model_discovery_error_body` | `/v1/models` returned non-2xx | `modelsUrl`, `status`, `bodyPreview` |
 | `model_discovery_empty` | `/v1/models` returned 0 models | `modelsUrl` |
+| `oauth2_bearer_propagated_to_provider_headers` | cached bearer stamped onto `options.headers` (config step 6) | `providerId` |
+| `oauth2_bearer_propagation_skipped_user_set` | skipped — user already set `Authorization` | `providerId` |
 
 When OpenCode is the host, the plugin pipes everything through `client.app.log()` *in addition* to stderr (best-effort, non-blocking). Stderr is the reliable channel.
 
diff --git a/docs/models-info.md b/docs/models-info.md
new file mode 100644
index 0000000..0b420d1
--- /dev/null
+++ b/docs/models-info.md
@@ -0,0 +1,94 @@
+# Model metadata enrichment
+
+How `@vymalo/opencode-models-info` runs inside OpenCode: the single hook it registers, how it composes with any auth scheme, where it caches, and what happens when the metadata endpoint misbehaves.
+
+For the copy-paste config reference (every option, the full OpenRouter→OpenCode field-mapping table), see the package README: [`packages/opencode-models-info/README.md`](../packages/opencode-models-info/README.md). This page is for the adopter who needs to reason about composition and failure modes. The original design rationale lives in [`plans/models-info-plan.md`](../plans/models-info-plan.md).
+
+## What it does
+
+OpenCode supports rich per-model metadata — context window, output limit, USD-per-1M-token cost, and `tool_call` / `reasoning` / `attachment` capability flags — but you normally hand-write it in `opencode.json`. If your provider exposes an OpenRouter-shaped `/models` endpoint, this plugin fetches it once, merges the metadata onto your model entries, caches the result, and stays out of the way.
+
+It is **auth-agnostic** and does **not** depend on `@vymalo/opencode-oauth2`. It only mutates the already-assembled OpenCode config, so it works with static API keys, oauth2, or no auth at all.
+
+## The one hook
+
+The plugin registers a single OpenCode hook: `config` (plugin load). Source: [`packages/opencode-models-info/src/opencode.ts`](../packages/opencode-models-info/src/opencode.ts).
+
+Because the host runs every plugin's `config` hook in registration order, by the time this one fires, other plugins (oauth2, or your static config) have already populated `config.provider[*]` — including `options.headers`. The hook then, for every provider:
+
+1. **Opts in or skips.** Reads `options.meta.modelsInfoUrl`. No URL → the provider is left untouched. Safe to enable globally.
+2. **Resolves the URL** against `options.baseURL` (see [URL resolution](#url-resolution)).
+3. **Loads the catalog** — from the on-disk cache if fresh, otherwise fetches (see [Caching](#caching-and-failure-modes)).
+4. **Merges** derived metadata onto each model whose `id` (or declared `id`) matches an entry in the catalog. The merge is **upstream-wins**: any field already set on the model entry is never overwritten. Running the hook twice is a no-op.
+
+Providers run in parallel (`Promise.allSettled`); one bad endpoint never blocks another's enrichment, and any unexpected throw is surfaced as a `models_info_enrichment_failed` log event rather than silently swallowed.
+
+## Auth composition
+
+The fetch sends the union of the provider's `options.headers` and the meta-specific `meta.modelsInfoHeaders` (meta wins on conflict). That single rule covers the three common setups:
+
+| Setup | What you do |
+| --- | --- |
+| **Public metadata endpoint** (e.g. OpenRouter's `/models`) | Nothing — no auth needed. |
+| **Static API key** | Put the `Bearer` in `options.headers` once; both inference and the metadata fetch use it. |
+| **OAuth2 via `@vymalo/opencode-oauth2` ≥ 0.4.0** | Nothing — that plugin stamps the cached bearer onto `options.headers.Authorization` at config time (see [architecture.md](./architecture.md#config--plugin-load)), so this plugin inherits it automatically. |
+
+If the metadata endpoint needs a *different* credential than inference (e.g. a service-account token), set `meta.modelsInfoHeaders.Authorization` — it overrides whatever the provider carries.
+
+> **Why this works with oauth2 without coupling.** The two plugins never import each other. oauth2 writes its token into the shared, already-resolved provider config; this plugin reads whatever is there. The oauth2 `chat.headers` hook still injects a freshly-refreshed token per chat request, so a slightly-stale config-time header can only ever affect *this* plugin's metadata fetch — never the actual inference call.
+
+## URL resolution
+
+`meta.modelsInfoUrl` resolves against `options.baseURL` with standard WHATWG URL semantics:
+
+| `baseURL` | `modelsInfoUrl` | Resolves to | Use when |
+| --- | --- | --- | --- |
+| `https://x.test/v1` | `models/info` | `https://x.test/v1/models/info` | metadata sits under the inference path |
+| `https://x.test/v1` | `/models/info` | `https://x.test/models/info` | metadata sits at a different path on the same host |
+| `https://x.test/v1` | `https://o.test/m` | `https://o.test/m` | metadata lives on a different host entirely |
+
+Rule of thumb: **drop the leading `/`** to keep the metadata path under your API path; **keep the leading `/`** to escape to the host root.
+
+## Caching and failure modes
+
+The catalog is cached on disk so repeated boots don't re-hit the network.
+
+- **Location** — per-OS cache dir under the `opencode-models-info` namespace: `~/Library/Caches/opencode-models-info/` (macOS), `${XDG_CACHE_HOME:-~/.cache}/opencode-models-info/` (Linux), `%LOCALAPPDATA%\opencode-models-info\` (Windows). Files are `0o600`, written via atomic rename.
+- **Key** — `sha256(providerId :: resolvedUrl :: modelsInfoHeaders)`. The user-set `meta.modelsInfoHeaders` are part of the key (switching an `x-tenant` selector busts the cache), but the provider's other headers are **not** — a rotating OAuth2 bearer must not thrash the cache.
+- **TTL** — `meta.modelsInfoTtlSeconds`, default 24h. The current config TTL is applied on every write, including `304` revalidations, so tightening it in `opencode.json` takes effect on the next revalidation.
+- **Revalidation** — the stored `ETag` is sent as `If-None-Match`; a `304` reuses the cached models and just bumps `fetchedAt`.
+
+Failure handling is deliberately non-fatal — the plugin must never block OpenCode startup:
+
+| Situation | Behavior |
+| --- | --- |
+| Fetch fails (network, timeout, non-2xx) **with** a cached snapshot | Serve the **stale** snapshot; log `models_info_fetch_failed_using_stale`. |
+| Fetch fails **without** any cache | Skip enrichment for that provider; log `models_info_fetch_failed_no_cache`. |
+| Response is malformed (non-empty body that filters down to zero valid entries) | Treated as a parse error → falls back to stale cache, **never** overwrites good data with `[]`. |
+| Disk cache write fails (read-only `$HOME`, etc.) | Best-effort: log `models_info_cache_write_failed` and still enrich from the freshly-fetched in-memory record. |
+
+Per-fetch timeout defaults to 5s (`meta.modelsInfoTimeoutMs`).
+
+## Log events
+
+All structured, `snake_case`, emitted through both the JSON console and OpenCode's `client.app.log`:
+
+| Event | Level | Meaning |
+| --- | --- | --- |
+| `models_info_enriched` | info | A provider's models were enriched (`enrichedCount` / `totalModels` / `sourceModels`). |
+| `models_info_fetched` | info | A live fetch succeeded and the cache was written. |
+| `models_info_cache_hit` | debug | Served from a fresh cache entry; no network. |
+| `models_info_not_modified` | debug | `304` revalidation; cached models reused. |
+| `models_info_fetch_failed_using_stale` | warn | Fetch failed; stale cache served. |
+| `models_info_fetch_failed_no_cache` | warn | Fetch failed and nothing cached; provider left un-enriched. |
+| `models_info_cache_write_failed` | warn | Disk write failed; enrichment proceeded from memory. |
+| `models_info_enrichment_failed` | error | Unexpected throw while enriching a provider. |
+
+## Field mapping (summary)
+
+The exact conversions live in [`packages/opencode-models-info/src/mapping.ts`](../packages/opencode-models-info/src/mapping.ts) and the full table is in the package README. Highlights worth knowing:
+
+- OpenRouter `pricing.prompt` / `.completion` are **USD per token** strings; OpenCode `cost.input` / `.output` are **USD per 1M tokens** numbers — converted (`× 1_000_000`, rounded to 6 dp).
+- `limit` is only emitted when **both** `context` and `output` are known (OpenCode rejects a partial `limit`).
+- Modalities are filtered to OpenCode's enum (`text | audio | image | video | pdf`); a non-text input modality also sets `attachment: true`.
+- `tool_call` / `reasoning` / `temperature` are derived from `supported_parameters`.
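
The first three highlights under "Field mapping (summary)" can be illustrated with a minimal TypeScript sketch. This is not the code in `mapping.ts`: the helper names (`perTokenToPerMillion`, `deriveLimit`, `deriveInput`) and their input shapes are hypothetical, and treating blank, non-numeric, or negative prices as unknown is an assumption made only for this example.

```typescript
// Illustrative sketch only; helper names and signatures are hypothetical,
// the real conversions live in packages/opencode-models-info/src/mapping.ts.

type Modality = "text" | "audio" | "image" | "video" | "pdf";
const MODALITIES: readonly Modality[] = ["text", "audio", "image", "video", "pdf"];

/** Convert a USD-per-token string into USD per 1M tokens, rounded to 6 dp. */
function perTokenToPerMillion(perToken: string): number | undefined {
  const value = Number(perToken);
  // Assumption: blank, non-numeric, or negative prices count as "unknown".
  if (perToken.trim() === "" || !isFinite(value) || value < 0) return undefined;
  return Math.round(value * 1_000_000 * 1e6) / 1e6;
}

/** Emit `limit` only when both numbers are known; never a partial object. */
function deriveLimit(
  context?: number,
  output?: number,
): { context: number; output: number } | undefined {
  if (context === undefined || output === undefined) return undefined;
  return { context, output };
}

/** Keep only modalities in the enum; any non-text input implies attachments. */
function deriveInput(raw: readonly string[]): { input: Modality[]; attachment: boolean } {
  const input = raw.filter(
    (m): m is Modality => MODALITIES.indexOf(m as Modality) !== -1,
  );
  return { input, attachment: input.some((m) => m !== "text") };
}

console.log(perTokenToPerMillion("0.000003")); // 3 (USD per 1M tokens)
console.log(deriveLimit(128000, undefined)); // undefined, partial limit dropped
console.log(deriveInput(["text", "image", "file"])); // input: text, image; attachment: true
```

Rounding after the multiplication keeps floating-point noise out of the stored cost, and returning `undefined` rather than a partial object mirrors the rule that a `limit` with only one of its two fields is never emitted.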