From aa79ed78f0d2042ce647da280c839de0134cc092 Mon Sep 17 00:00:00 2001
From: Stephane Segning Lambou <stephane.segning-lambou@adorsys.com>
Date: Fri, 29 May 2026 11:05:02 +0200
Subject: [PATCH] docs: cover the models-info plugin and oauth2 bearer
 propagation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The repo is now a two-plugin workspace, but the top-level README and
docs/ only described @vymalo/opencode-oauth2.

- docs/models-info.md (new): adopter-depth guide for
  @vymalo/opencode-models-info — the single config hook, the
  auth-composition matrix (incl. the oauth2 bearer propagation),
  URL resolution, caching + failure modes, and the log-event reference.
  Links the package README for the full config reference rather than
  duplicating it.
- README.md: add models-info to the Workspace Layout and Documentation
  tables, plus a "Companion plugin: model metadata" section with a
  combined oauth2 + models-info config example.
- docs/architecture.md: note the second plugin up top, document the new
  config-time Authorization propagation (step 6 of the config hook —
  the sole, one-directional coupling point between the plugins), and add
  the two propagation events to the event-name table.

Docs-only; no code or behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 README.md            | 27 ++++++++++++-
 docs/architecture.md |  9 +++++
 docs/models-info.md  | 94 ++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 129 insertions(+), 1 deletion(-)
 create mode 100644 docs/models-info.md

diff --git a/README.md b/README.md
index dc4cb52..ad4e05d 100644
--- a/README.md
+++ b/README.md
@@ -82,11 +82,35 @@ See [packages/opencode-oauth2/README.md](packages/opencode-oauth2/README.md) for
 | Page | When you need it |
 | --- | --- |
 | [`docs/architecture.md`](docs/architecture.md) | Understand the hooks, token lifecycle per flow, cache layout, sync scheduler, logging |
+| [`docs/models-info.md`](docs/models-info.md) | The companion metadata-enrichment plugin — how it composes with any auth scheme, caching, failure modes |
 | [`docs/github-actions.md`](docs/github-actions.md) | CI without stored secrets — Keycloak/Auth0/Okta setup, reusable workflow, matrix, fork-PR limits |
 | [`docs/kubernetes.md`](docs/kubernetes.md) | `CronJob` / `Job` / `Deployment` with projected SA tokens, multi-provider pods, RBAC |
 | [`docs/local-development.md`](docs/local-development.md) | Sandbox setup, plugin re-export trick, forcing re-auth, dev-only `env` subject token |
 | [`docs/troubleshooting.md`](docs/troubleshooting.md) | Symptom-keyed fixes — `redirect_uri_mismatch`, model discovery 403, `invalid_client`, projected-token rotation |
 
+## Companion plugin: model metadata
+
+This workspace also ships [`@vymalo/opencode-models-info`](packages/opencode-models-info) — a separate, **auth-agnostic** plugin that enriches your model entries with full metadata (context length, output limit, USD/M-token cost, modalities, and `tool_call` / `reasoning` / `attachment` flags) by fetching from an OpenRouter-shaped `/models` endpoint.
+
+It doesn't depend on `@vymalo/opencode-oauth2`: it runs as a `config` hook *after* the plugins listed before it, so it composes with oauth2, static API keys, or no auth at all. When paired with `@vymalo/opencode-oauth2` ≥ 0.4.0, OAuth2-protected metadata endpoints work with zero extra config: the oauth2 plugin stamps its cached bearer onto the provider's headers at config time, and the metadata fetch inherits it.
+
+```jsonc
+{
+  "plugin": ["@vymalo/opencode-oauth2", "@vymalo/opencode-models-info"],
+  "provider": {
+    "example-ai": {
+      "options": {
+        "baseURL": "https://api.example.com/v1",
+        "oauth2": { "issuer": "https://auth.example.com", "clientId": "opencode-client", "scopes": ["openid", "offline_access"] },
+        "meta": { "modelsInfoUrl": "models" }
+      }
+    }
+  }
+}
+```
+
+Full reference: [`packages/opencode-models-info/README.md`](packages/opencode-models-info/README.md). Behavior, caching, and composition details: [`docs/models-info.md`](docs/models-info.md).
+
 ## Federated identity (CI / Kubernetes)
 
 For GitHub Actions and Kubernetes workloads, use `jwt_bearer` (or `token_exchange`) with the platform's own short-lived OIDC token as the subject. The plugin re-fetches it on every access-token expiry; nothing long-lived gets cached.
@@ -109,7 +133,8 @@ This is a [pnpm](https://pnpm.io) monorepo.
 
 | Package | Purpose |
 | --- | --- |
-| [`packages/opencode-oauth2`](packages/opencode-oauth2) | The runtime plugin — published as `@vymalo/opencode-oauth2` |
+| [`packages/opencode-oauth2`](packages/opencode-oauth2) | OAuth2/OIDC auth + model discovery — published as `@vymalo/opencode-oauth2` |
+| [`packages/opencode-models-info`](packages/opencode-models-info) | Auth-agnostic model **metadata enrichment** — published as `@vymalo/opencode-models-info` |
 | [`packages/plugin-bundle`](packages/plugin-bundle) | Rolldown-based bundling for distribution |
 | [`plans/prd.md`](plans/prd.md) | Product requirements and phased roadmap |
 
diff --git a/docs/architecture.md b/docs/architecture.md
index 9e260f9..75047d5 100644
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -4,6 +4,12 @@ How `@vymalo/opencode-oauth2` actually runs inside OpenCode: what each hook does
 
 If you just want to copy YAML, jump to the [GitHub Actions](./github-actions.md) or [Kubernetes](./kubernetes.md) cookbooks. This page is for the adopter who needs to reason about failure modes.
 
+> The workspace also ships a second, independent plugin —
+> [`@vymalo/opencode-models-info`](./models-info.md) — that enriches model
+> metadata after auth is resolved. It is documented separately; this page
+> covers the oauth2 plugin only, plus the one place the two intersect (the
+> [config-time bearer propagation](#config--plugin-load) in step 6 below).
+
 ## The two hooks
 
 The plugin registers exactly two OpenCode hooks: `config` (plugin load) and `chat.headers` (per request).
@@ -17,6 +23,7 @@ Runs once when OpenCode boots the plugin. Source: [`packages/opencode-oauth2/src
 3. **Build the runtime** (`OAuth2ModelSyncPlugin`), `initialize()` (load cache), then `start({ warmup: true })`.
 4. **Warmup** iterates servers, attempts `syncServer(id, { interactive: <TTY-detected> })`, and starts a per-server scheduler (`syncIntervalMinutes`, default 60).
 5. **Merge discovered models** into each provider's `models` map. If a server has no cached models yet (cold start, non-interactive warmup, refresh-token expired), it stays empty in OpenCode — the user sees no models for that provider until a chat request triggers on-demand auth.
+6. **Propagate the cached bearer.** For each managed provider, if a still-valid cached token exists (30s expiry skew), stamp `options.headers.Authorization = "<tokenType> <accessToken>"`. If the user already set an `Authorization` header (matched case-insensitively), that value always wins and nothing is stamped. This makes the token visible to *subsequent* `config` hooks, most notably [`@vymalo/opencode-models-info`](./models-info.md) fetching an OAuth2-protected `meta.modelsInfoUrl`. It's the only coupling point between the two plugins: it is one-directional and goes through the shared config object, so neither plugin imports the other. A stale value here is harmless: `chat.headers` (below) overwrites it per request with a freshly ensured token, so the inference call is never affected. Emits `oauth2_bearer_propagated_to_provider_headers` on a stamp, or `oauth2_bearer_propagation_skipped_user_set` when the user's header wins.
 
 The runtime is **rebuilt** if the config signature changes between hook invocations (OpenCode re-runs `config` on certain config edits). Old schedulers are stopped first.
 
@@ -258,6 +265,8 @@ Anywhere the plugin logs a URL it ran (`tokenEndpoint`, `modelsUrl`), it goes th
 | `oauth_open_browser_failed` | `xdg-open`/`open`/`start` failed | `error` (URL goes to stderr separately) |
 | `model_discovery_error_body` | `/v1/models` returned non-2xx | `modelsUrl`, `status`, `bodyPreview` |
 | `model_discovery_empty` | `/v1/models` returned 0 models | `modelsUrl` |
+| `oauth2_bearer_propagated_to_provider_headers` | cached bearer stamped onto `options.headers` (config step 6) | `providerId` |
+| `oauth2_bearer_propagation_skipped_user_set` | skipped — user already set `Authorization` | `providerId` |
 
 When OpenCode is the host, the plugin pipes everything through `client.app.log()` *in addition* to stderr (best-effort, non-blocking). Stderr is the reliable channel.
 
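Review note on config step 6 in the docs/architecture.md hunk above: the rule it describes (30s expiry skew, user-set `Authorization` wins, match is case-insensitive) can be sketched roughly as follows. This is a minimal illustration, not the plugin's code; `CachedToken`, `ProviderOptions`, `propagateBearer`, and the result strings are hypothetical names, and the `expiresAt` field in epoch milliseconds is an assumption.

```typescript
// Hypothetical shapes for illustration; the plugin's real types may differ.
interface CachedToken {
  tokenType: string;
  accessToken: string;
  expiresAt: number; // assumed: epoch milliseconds
}

interface ProviderOptions {
  headers?: Record<string, string>;
}

const EXPIRY_SKEW_MS = 30_000; // the 30s skew mentioned in step 6

type PropagationResult =
  | "propagated"
  | "skipped_user_set"
  | "skipped_no_valid_token";

function propagateBearer(
  options: ProviderOptions,
  token: CachedToken | undefined,
  now: number = Date.now(),
): PropagationResult {
  // Treat the token as expired 30s early so a nearly-dead token is not stamped.
  if (!token || token.expiresAt - EXPIRY_SKEW_MS <= now) {
    return "skipped_no_valid_token";
  }

  const headers = options.headers ?? {};

  // A user-supplied Authorization header always wins, whatever its casing.
  const userSet = Object.keys(headers).some(
    (name) => name.toLowerCase() === "authorization",
  );
  if (userSet) {
    return "skipped_user_set";
  }

  options.headers = {
    ...headers,
    Authorization: `${token.tokenType} ${token.accessToken}`,
  };
  return "propagated";
}
```

In this sketch the two documented log events would map onto the `"propagated"` and `"skipped_user_set"` results; the third result is only there to make the no-token path explicit.
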
diff --git a/docs/models-info.md b/docs/models-info.md
new file mode 100644
index 0000000..0b420d1
--- /dev/null
+++ b/docs/models-info.md
@@ -0,0 +1,94 @@
+# Model metadata enrichment
+
+How `@vymalo/opencode-models-info` runs inside OpenCode: the single hook it registers, how it composes with any auth scheme, where it caches, and what happens when the metadata endpoint misbehaves.
+
+For the copy-paste config reference (every option, the full OpenRouter→OpenCode field-mapping table), see the package README: [`packages/opencode-models-info/README.md`](../packages/opencode-models-info/README.md). This page is for the adopter who needs to reason about composition and failure modes. The original design rationale lives in [`plans/models-info-plan.md`](../plans/models-info-plan.md).
+
+## What it does
+
+OpenCode supports rich per-model metadata — context window, output limit, USD-per-1M-token cost, and `tool_call` / `reasoning` / `attachment` capability flags — but you normally hand-write it in `opencode.json`. If your provider exposes an OpenRouter-shaped `/models` endpoint, this plugin fetches it once, merges the metadata onto your model entries, caches the result, and stays out of the way.
+
+It is **auth-agnostic** and does **not** depend on `@vymalo/opencode-oauth2`. It only mutates the already-assembled OpenCode config, so it works with static API keys, oauth2, or no auth at all.
+
+## The one hook
+
+The plugin registers a single OpenCode hook: `config` (plugin load). Source: [`packages/opencode-models-info/src/opencode.ts`](../packages/opencode-models-info/src/opencode.ts).
+
+Because the host runs every plugin's `config` hook in registration order, this plugin should be listed after any plugin whose output it needs. By the time it fires, those plugins (oauth2, or your static config) have already populated `config.provider[*]`, including `options.headers`. The hook then, for every provider:
+
+1. **Opts in or skips.** Reads `options.meta.modelsInfoUrl`. No URL → the provider is left untouched. Safe to enable globally.
+2. **Resolves the URL** against `options.baseURL` (see [URL resolution](#url-resolution)).
+3. **Loads the catalog** — from the on-disk cache if fresh, otherwise fetches (see [Caching](#caching-and-failure-modes)).
+4. **Merges** derived metadata onto each model whose `id` (or declared `id`) matches an entry in the catalog. The merge is **upstream-wins**: any field already set on the model entry is never overwritten. Running the hook twice is a no-op.
+
+Providers run in parallel (`Promise.allSettled`); one bad endpoint never blocks another's enrichment, and any unexpected throw is surfaced as a `models_info_enrichment_failed` log event rather than silently swallowed.
+
+## Auth composition
+
+The fetch sends the union of the provider's `options.headers` and the meta-specific `meta.modelsInfoHeaders` (meta wins on conflict). That single rule covers the three common setups:
+
+| Setup | What you do |
+| --- | --- |
+| **Public metadata endpoint** (e.g. OpenRouter's `/models`) | Nothing — no auth needed. |
+| **Static API key** | Put the `Bearer` in `options.headers` once; both inference and the metadata fetch use it. |
+| **OAuth2 via `@vymalo/opencode-oauth2` ≥ 0.4.0** | Nothing — that plugin stamps the cached bearer onto `options.headers.Authorization` at config time (see [architecture.md](./architecture.md#config--plugin-load)), so this plugin inherits it automatically. |
+
+If the metadata endpoint needs a *different* credential than inference (e.g. a service-account token), set `meta.modelsInfoHeaders.Authorization` — it overrides whatever the provider carries.
+
+> **Why this works with oauth2 without coupling.** The two plugins never import each other. oauth2 writes its token into the shared, already-resolved provider config; this plugin reads whatever is there. The oauth2 `chat.headers` hook still injects a freshly-refreshed token per chat request, so a slightly-stale config-time header can only ever affect *this* plugin's metadata fetch — never the actual inference call.
+
+## URL resolution
+
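+`meta.modelsInfoUrl` resolves against `options.baseURL` with standard WHATWG URL semantics, with the base treated as a directory (as if it ended in `/`), so a relative path is appended to the base path rather than replacing its last segment: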
+
+| `baseURL` | `modelsInfoUrl` | Resolves to | Use when |
+| --- | --- | --- | --- |
+| `https://x.test/v1` | `models/info` | `https://x.test/v1/models/info` | metadata sits under the inference path |
+| `https://x.test/v1` | `/models/info` | `https://x.test/models/info` | metadata sits at a different path on the same host |
+| `https://x.test/v1` | `https://o.test/m` | `https://o.test/m` | metadata lives on a different host entirely |
+
+Rule of thumb: **drop the leading `/`** to keep the metadata path under your API path; **keep the leading `/`** to escape to the host root.
+
+## Caching and failure modes
+
+The catalog is cached on disk so repeated boots don't re-hit the network.
+
+- **Location** — per-OS cache dir under the `opencode-models-info` namespace: `~/Library/Caches/opencode-models-info/` (macOS), `${XDG_CACHE_HOME:-~/.cache}/opencode-models-info/` (Linux), `%LOCALAPPDATA%\opencode-models-info\` (Windows). Files are `0o600`, written via atomic rename.
+- **Key** — `sha256(providerId :: resolvedUrl :: modelsInfoHeaders)`. The user-set `meta.modelsInfoHeaders` are part of the key (switching an `x-tenant` selector busts the cache), but the provider's other headers are **not** — a rotating OAuth2 bearer must not thrash the cache.
+- **TTL** — `meta.modelsInfoTtlSeconds`, default 24h. The current config TTL is applied on every write, including `304` revalidations, so tightening it in `opencode.json` takes effect on the next revalidation.
+- **Revalidation** — the stored `ETag` is sent as `If-None-Match`; a `304` reuses the cached models and just bumps `fetchedAt`.
+
+Failure handling is deliberately non-fatal — the plugin must never block OpenCode startup:
+
+| Situation | Behavior |
+| --- | --- |
+| Fetch fails (network, timeout, non-2xx) **with** a cached snapshot | Serve the **stale** snapshot; log `models_info_fetch_failed_using_stale`. |
+| Fetch fails **without** any cache | Skip enrichment for that provider; log `models_info_fetch_failed_no_cache`. |
+| Response is malformed (non-empty body that filters down to zero valid entries) | Treated as a parse error → falls back to stale cache, **never** overwrites good data with `[]`. |
+| Disk cache write fails (read-only `$HOME`, etc.) | Best-effort: log `models_info_cache_write_failed` and still enrich from the freshly-fetched in-memory record. |
+
+Per-fetch timeout defaults to 5s (`meta.modelsInfoTimeoutMs`).
+
+## Log events
+
+All events are structured, named in `snake_case`, and emitted through both the JSON console and OpenCode's `client.app.log`:
+
+| Event | Level | Meaning |
+| --- | --- | --- |
+| `models_info_enriched` | info | A provider's models were enriched (`enrichedCount` / `totalModels` / `sourceModels`). |
+| `models_info_fetched` | info | A live fetch succeeded and the cache was written. |
+| `models_info_cache_hit` | debug | Served from a fresh cache entry; no network. |
+| `models_info_not_modified` | debug | `304` revalidation; cached models reused. |
+| `models_info_fetch_failed_using_stale` | warn | Fetch failed; stale cache served. |
+| `models_info_fetch_failed_no_cache` | warn | Fetch failed and nothing cached; provider left un-enriched. |
+| `models_info_cache_write_failed` | warn | Disk write failed; enrichment proceeded from memory. |
+| `models_info_enrichment_failed` | error | Unexpected throw while enriching a provider. |
+
+## Field mapping (summary)
+
+The exact conversions live in [`packages/opencode-models-info/src/mapping.ts`](../packages/opencode-models-info/src/mapping.ts) and the full table is in the package README. Highlights worth knowing:
+
+- OpenRouter `pricing.prompt` / `.completion` are **USD per token** strings; OpenCode `cost.input` / `.output` are **USD per 1M tokens** numbers. The plugin converts by multiplying by `1_000_000` and rounding to 6 decimal places.
+- `limit` is only emitted when **both** `context` and `output` are known (OpenCode rejects a partial `limit`).
+- Modalities are filtered to OpenCode's enum (`text | audio | image | video | pdf`); a non-text input modality also sets `attachment: true`.
+- `tool_call` / `reasoning` / `temperature` are derived from `supported_parameters`.
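
Review note on the "Auth composition" and "URL resolution" sections of docs/models-info.md above: the sketch below shows one way the metadata request could be assembled so that it reproduces the URL table and the "meta wins on conflict" header rule. It is an illustration under assumptions, not the plugin's implementation: `resolveModelsInfoUrl` and `mergeFetchHeaders` are hypothetical names, the trailing-slash normalisation is inferred from the table's first row, and the case-insensitive header comparison is an assumption.

```typescript
// Hypothetical helpers for illustration; not the plugin's actual exports.

/** Resolve meta.modelsInfoUrl against options.baseURL, treating the base as a directory. */
function resolveModelsInfoUrl(baseURL: string, modelsInfoUrl: string): string {
  // Without the trailing slash, WHATWG resolution would replace the last
  // path segment of the base ("v1") when given a relative reference.
  const base = baseURL.endsWith("/") ? baseURL : `${baseURL}/`;
  return new URL(modelsInfoUrl, base).toString();
}

/** Union of provider headers and meta headers; meta wins on conflict. */
function mergeFetchHeaders(
  providerHeaders: Record<string, string> = {},
  metaHeaders: Record<string, string> = {},
): Record<string, string> {
  const merged: Record<string, string> = {};
  // Assumption: header names are compared case-insensitively, so
  // "authorization" and "Authorization" count as the same header.
  for (const source of [providerHeaders, metaHeaders]) {
    for (const [name, value] of Object.entries(source)) {
      for (const existing of Object.keys(merged)) {
        if (existing.toLowerCase() === name.toLowerCase()) {
          delete merged[existing];
        }
      }
      merged[name] = value;
    }
  }
  return merged;
}

// The three rows of the URL resolution table:
console.log(resolveModelsInfoUrl("https://x.test/v1", "models/info"));
console.log(resolveModelsInfoUrl("https://x.test/v1", "/models/info"));
console.log(resolveModelsInfoUrl("https://x.test/v1", "https://o.test/m"));
```

Applying the later source last is what makes `meta.modelsInfoHeaders.Authorization` override a bearer carried by the provider, which is the service-account case described in "Auth composition".
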