27 changes: 26 additions & 1 deletion README.md
@@ -82,11 +82,35 @@ See [packages/opencode-oauth2/README.md](packages/opencode-oauth2/README.md) for
| Page | When you need it |
| --- | --- |
| [`docs/architecture.md`](docs/architecture.md) | Understand the hooks, token lifecycle per flow, cache layout, sync scheduler, logging |
| [`docs/models-info.md`](docs/models-info.md) | The companion metadata-enrichment plugin — how it composes with any auth scheme, caching, failure modes |
| [`docs/github-actions.md`](docs/github-actions.md) | CI without stored secrets — Keycloak/Auth0/Okta setup, reusable workflow, matrix, fork-PR limits |
| [`docs/kubernetes.md`](docs/kubernetes.md) | `CronJob` / `Job` / `Deployment` with projected SA tokens, multi-provider pods, RBAC |
| [`docs/local-development.md`](docs/local-development.md) | Sandbox setup, plugin re-export trick, forcing re-auth, dev-only `env` subject token |
| [`docs/troubleshooting.md`](docs/troubleshooting.md) | Symptom-keyed fixes — `redirect_uri_mismatch`, model discovery 403, `invalid_client`, projected-token rotation |

## Companion plugin: model metadata

This workspace also ships [`@vymalo/opencode-models-info`](packages/opencode-models-info) — a separate, **auth-agnostic** plugin that enriches your model entries with full metadata (context length, output limit, USD/M-token cost, modalities, and `tool_call` / `reasoning` / `attachment` flags) by fetching from an OpenRouter-shaped `/models` endpoint.

It doesn't depend on this plugin: it runs as a `config` hook *after* other plugins, so it composes with oauth2, static API keys, or no auth at all. When paired with `@vymalo/opencode-oauth2` ≥ 0.4.0, OAuth2-protected metadata endpoints work with zero extra config — this plugin stamps the cached bearer onto the provider's headers at config time, and the metadata fetch inherits it.
medium

In this workspace-level README.md, using the phrase "this plugin" twice in a section dedicated to the companion plugin (@vymalo/opencode-models-info) is ambiguous and confusing. It can easily lead readers to believe that the companion plugin is the one performing the token stamping, whereas it is actually @vymalo/opencode-oauth2 that does so. Specifying the plugin names explicitly resolves this ambiguity.

Suggested change
It doesn't depend on this plugin: it runs as a `config` hook *after* other plugins, so it composes with oauth2, static API keys, or no auth at all. When paired with `@vymalo/opencode-oauth2` ≥ 0.4.0, OAuth2-protected metadata endpoints work with zero extra config — this plugin stamps the cached bearer onto the provider's headers at config time, and the metadata fetch inherits it.
It doesn't depend on `@vymalo/opencode-oauth2`: it runs as a `config` hook *after* other plugins, so it composes with oauth2, static API keys, or no auth at all. When paired with `@vymalo/opencode-oauth2` ≥ 0.4.0, OAuth2-protected metadata endpoints work with zero extra config — the oauth2 plugin stamps the cached bearer onto the provider's headers at config time, and the metadata fetch inherits it.


```jsonc
{
  "plugin": ["@vymalo/opencode-oauth2", "@vymalo/opencode-models-info"],
  "provider": {
    "example-ai": {
      "options": {
        "baseURL": "https://api.example.com/v1",
        "oauth2": { "issuer": "https://auth.example.com", "clientId": "opencode-client", "scopes": ["openid", "offline_access"] },
        "meta": { "modelsInfoUrl": "models" }
      }
    }
  }
}
```

Full reference: [`packages/opencode-models-info/README.md`](packages/opencode-models-info/README.md). Behavior, caching, and composition details: [`docs/models-info.md`](docs/models-info.md).

## Federated identity (CI / Kubernetes)

For GitHub Actions and Kubernetes workloads, use `jwt_bearer` (or `token_exchange`) with the platform's own short-lived OIDC token as the subject. The plugin re-fetches it on every access-token expiry; nothing long-lived gets cached.
@@ -109,7 +133,8 @@ This is a [pnpm](https://pnpm.io) monorepo.

| Package | Purpose |
| --- | --- |
| [`packages/opencode-oauth2`](packages/opencode-oauth2) | The runtime plugin — published as `@vymalo/opencode-oauth2` |
| [`packages/opencode-oauth2`](packages/opencode-oauth2) | OAuth2/OIDC auth + model discovery — published as `@vymalo/opencode-oauth2` |
| [`packages/opencode-models-info`](packages/opencode-models-info) | Auth-agnostic model **metadata enrichment** — published as `@vymalo/opencode-models-info` |
| [`packages/plugin-bundle`](packages/plugin-bundle) | Rolldown-based bundling for distribution |
| [`plans/prd.md`](plans/prd.md) | Product requirements and phased roadmap |

9 changes: 9 additions & 0 deletions docs/architecture.md
@@ -4,6 +4,12 @@ How `@vymalo/opencode-oauth2` actually runs inside OpenCode: what each hook does

If you just want to copy YAML, jump to the [GitHub Actions](./github-actions.md) or [Kubernetes](./kubernetes.md) cookbooks. This page is for the adopter who needs to reason about failure modes.

> The workspace also ships a second, independent plugin —
> [`@vymalo/opencode-models-info`](./models-info.md) — that enriches model
> metadata after auth is resolved. It is documented separately; this page
> covers the oauth2 plugin only, plus the one place the two intersect (the
> [config-time bearer propagation](#config--plugin-load) in step 6 below).

## The two hooks

The plugin registers exactly two OpenCode hooks: `config` (plugin load) and `chat.headers` (per request).
@@ -17,6 +23,7 @@ Runs once when OpenCode boots the plugin. Source: [`packages/opencode-oauth2/src
3. **Build the runtime** (`OAuth2ModelSyncPlugin`), `initialize()` (load cache), then `start({ warmup: true })`.
4. **Warmup** iterates servers, attempts `syncServer(id, { interactive: <TTY-detected> })`, and starts a per-server scheduler (`syncIntervalMinutes`, default 60).
5. **Merge discovered models** into each provider's `models` map. If a server has no cached models yet (cold start, non-interactive warmup, refresh-token expired), it stays empty in OpenCode — the user sees no models for that provider until a chat request triggers on-demand auth.
6. **Propagate the cached bearer.** For each managed provider, if a still-valid cached token exists (30s expiry skew), stamp `options.headers.Authorization = "<tokenType> <accessToken>"` — unless the user already set an `Authorization` header (case-insensitive), which always wins. This makes the token visible to *subsequent* `config` hooks, most notably [`@vymalo/opencode-models-info`](./models-info.md) fetching an OAuth2-protected `meta.modelsInfoUrl`. It's the only coupling point between the two plugins, and it's one-directional and via the shared config object — neither plugin imports the other. A stale value here is harmless: `chat.headers` (below) overwrites per-request with a freshly-ensured token, so the inference call is never affected. Emits `oauth2_bearer_propagated_to_provider_headers` (or `oauth2_bearer_propagation_skipped_user_set`).
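
Step 6 can be read as a small pure function. The sketch below is only an illustration of the rule as described here, not the plugin's actual code: the `CachedToken` and `ProviderOptions` shapes, the `propagateBearer` name, and the choice to return `null` when there is no usable token are all assumptions.

```typescript
// Hypothetical shapes; the plugin's real types are not shown on this page.
interface CachedToken {
  tokenType: string;
  accessToken: string;
  expiresAt: number; // epoch milliseconds (assumed unit)
}

interface ProviderOptions {
  headers?: Record<string, string>;
}

const EXPIRY_SKEW_MS = 30_000; // the 30s expiry skew from step 6

type PropagationEvent =
  | "oauth2_bearer_propagated_to_provider_headers"
  | "oauth2_bearer_propagation_skipped_user_set"
  | null; // assumed: nothing is emitted when no valid token is cached

function propagateBearer(
  options: ProviderOptions,
  token: CachedToken | undefined,
  now: number = Date.now(),
): PropagationEvent {
  // Only a token that is still valid after the skew is worth stamping.
  if (!token || token.expiresAt - EXPIRY_SKEW_MS <= now) return null;

  const headers = options.headers ?? {};
  // A user-set Authorization header always wins, whatever its casing.
  const userSet = Object.keys(headers).some(
    (name) => name.toLowerCase() === "authorization",
  );
  if (userSet) return "oauth2_bearer_propagation_skipped_user_set";

  options.headers = {
    ...headers,
    Authorization: `${token.tokenType} ${token.accessToken}`,
  };
  return "oauth2_bearer_propagated_to_provider_headers";
}
```

Because the function mutates the shared `options` object, any `config` hook that runs later sees the stamped header without importing anything from the oauth2 package.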

The runtime is **rebuilt** if the config signature changes between hook invocations (OpenCode re-runs `config` on certain config edits). Old schedulers are stopped first.

@@ -258,6 +265,8 @@ Anywhere the plugin logs a URL it ran (`tokenEndpoint`, `modelsUrl`), it goes th
| `oauth_open_browser_failed` | `xdg-open`/`open`/`start` failed | `error` (URL goes to stderr separately) |
| `model_discovery_error_body` | `/v1/models` returned non-2xx | `modelsUrl`, `status`, `bodyPreview` |
| `model_discovery_empty` | `/v1/models` returned 0 models | `modelsUrl` |
| `oauth2_bearer_propagated_to_provider_headers` | cached bearer stamped onto `options.headers` (config step 6) | `providerId` |
| `oauth2_bearer_propagation_skipped_user_set` | skipped — user already set `Authorization` | `providerId` |

When OpenCode is the host, the plugin pipes everything through `client.app.log()` *in addition* to stderr (best-effort, non-blocking). Stderr is the reliable channel.

94 changes: 94 additions & 0 deletions docs/models-info.md
@@ -0,0 +1,94 @@
# Model metadata enrichment

How `@vymalo/opencode-models-info` runs inside OpenCode: the single hook it registers, how it composes with any auth scheme, where it caches, and what happens when the metadata endpoint misbehaves.

For the copy-paste config reference (every option, the full OpenRouter→OpenCode field-mapping table), see the package README: [`packages/opencode-models-info/README.md`](../packages/opencode-models-info/README.md). This page is for the adopter who needs to reason about composition and failure modes. The original design rationale lives in [`plans/models-info-plan.md`](../plans/models-info-plan.md).

## What it does

OpenCode supports rich per-model metadata — context window, output limit, USD-per-1M-token cost, and `tool_call` / `reasoning` / `attachment` capability flags — but you normally hand-write it in `opencode.json`. If your provider exposes an OpenRouter-shaped `/models` endpoint, this plugin fetches it once, merges the metadata onto your model entries, caches the result, and stays out of the way.

It is **auth-agnostic** and does **not** depend on `@vymalo/opencode-oauth2`. It only mutates the already-assembled OpenCode config, so it works with static API keys, oauth2, or no auth at all.

## The one hook

The plugin registers a single OpenCode hook: `config` (plugin load). Source: [`packages/opencode-models-info/src/opencode.ts`](../packages/opencode-models-info/src/opencode.ts).

Because the host runs every plugin's `config` hook in registration order, by the time this one fires, other plugins (oauth2, or your static config) have already populated `config.provider[*]` — including `options.headers`. The hook then, for every provider:

1. **Opts in or skips.** Reads `options.meta.modelsInfoUrl`. No URL → the provider is left untouched. Safe to enable globally.
2. **Resolves the URL** against `options.baseURL` (see [URL resolution](#url-resolution)).
3. **Loads the catalog** — from the on-disk cache if fresh, otherwise fetches (see [Caching](#caching-and-failure-modes)).
4. **Merges** derived metadata onto each model whose `id` (or declared `id`) matches an entry in the catalog. The merge is **upstream-wins**: any field already set on the model entry is never overwritten. Running the hook twice is a no-op.
medium

The term upstream-wins is contradictory here. In network and API contexts, "upstream" refers to the remote server/API. Since the behavior described is that "any field already set on the model entry is never overwritten" (meaning the local/existing configuration takes precedence over the fetched remote metadata), this is actually an existing-wins or local-wins merge strategy. Using "upstream-wins" will confuse readers into thinking the remote API's values will overwrite their local overrides.

Suggested change
4. **Merges** derived metadata onto each model whose `id` (or declared `id`) matches an entry in the catalog. The merge is **upstream-wins**: any field already set on the model entry is never overwritten. Running the hook twice is a no-op.
4. **Merges** derived metadata onto each model whose `id` (or declared `id`) matches an entry in the catalog. The merge is **existing-wins** (or **local-wins**): any field already set on the model entry is never overwritten. Running the hook twice is a no-op.


Providers run in parallel (`Promise.allSettled`); one bad endpoint never blocks another's enrichment, and any unexpected throw is surfaced as a `models_info_enrichment_failed` log event rather than silently swallowed.
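
The merge rule in step 4 and the parallel run can be sketched as follows. This is a minimal illustration under assumptions: the merge is shown as shallow (top-level fields only), and `mergeExistingWins`, `enrichAll`, and the `providerId` / `error` log fields are hypothetical names, not taken from the package source.

```typescript
type ModelEntry = Record<string, unknown>;

// Copy a derived field only when the entry does not already define it,
// so hand-written values in the config are never overwritten.
function mergeExistingWins(entry: ModelEntry, derived: ModelEntry): ModelEntry {
  const merged: ModelEntry = { ...entry };
  for (const [key, value] of Object.entries(derived)) {
    if (merged[key] === undefined) merged[key] = value;
  }
  return merged;
}

// Enrich every provider in parallel; a rejection for one provider is logged
// and does not prevent the others from completing.
async function enrichAll(
  providerIds: string[],
  enrichOne: (providerId: string) => Promise<void>,
  log: (event: string, fields: Record<string, unknown>) => void,
): Promise<void> {
  const results = await Promise.allSettled(
    // The async wrapper turns a synchronous throw into a rejection as well.
    providerIds.map(async (providerId) => enrichOne(providerId)),
  );
  results.forEach((result, index) => {
    if (result.status === "rejected") {
      log("models_info_enrichment_failed", {
        providerId: providerIds[index],
        error: String(result.reason),
      });
    }
  });
}
```

Applying `mergeExistingWins` a second time with the same derived metadata changes nothing, which is the idempotence property mentioned in step 4.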

## Auth composition

The fetch sends the union of the provider's `options.headers` and the meta-specific `meta.modelsInfoHeaders` (meta wins on conflict). That single rule covers the three common setups:

| Setup | What you do |
| --- | --- |
| **Public metadata endpoint** (e.g. OpenRouter's `/models`) | Nothing — no auth needed. |
| **Static API key** | Put the `Bearer` in `options.headers` once; both inference and the metadata fetch use it. |
| **OAuth2 via `@vymalo/opencode-oauth2` ≥ 0.4.0** | Nothing — that plugin stamps the cached bearer onto `options.headers.Authorization` at config time (see [architecture.md](./architecture.md#config--plugin-load)), so this plugin inherits it automatically. |

If the metadata endpoint needs a *different* credential than inference (e.g. a service-account token), set `meta.modelsInfoHeaders.Authorization` — it overrides whatever the provider carries.
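
As a rough sketch of that rule, the header set for the metadata fetch could be built like this. The `buildMetadataHeaders` name is hypothetical, and treating header names case-insensitively (by lower-casing them) is an assumption about how conflicts are detected.

```typescript
// Union of provider headers and meta-specific headers; meta wins on conflict.
// Names are lower-cased so "Authorization" and "authorization" collide.
function buildMetadataHeaders(
  providerHeaders: Record<string, string> = {},
  metaHeaders: Record<string, string> = {},
): Record<string, string> {
  const merged: Record<string, string> = {};
  for (const [name, value] of Object.entries(providerHeaders)) {
    merged[name.toLowerCase()] = value;
  }
  for (const [name, value] of Object.entries(metaHeaders)) {
    merged[name.toLowerCase()] = value; // written last, so meta overrides
  }
  return merged;
}

// Usage: inference uses one credential, the metadata endpoint another.
const example = buildMetadataHeaders(
  { Authorization: "Bearer inference-token", "x-org": "acme" },
  { authorization: "Bearer service-account-token" },
);
```

In the usage example the result carries the service-account bearer and keeps `x-org`, which matches the override behavior described above.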

> **Why this works with oauth2 without coupling.** The two plugins never import each other. oauth2 writes its token into the shared, already-resolved provider config; this plugin reads whatever is there. The oauth2 `chat.headers` hook still injects a freshly-refreshed token per chat request, so a slightly-stale config-time header can only ever affect *this* plugin's metadata fetch — never the actual inference call.

## URL resolution

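`meta.modelsInfoUrl` resolves against `options.baseURL` with standard WHATWG URL semantics, with `baseURL` treated as a directory (as if it ended in `/`), which is why the first row keeps the `/v1` prefix: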

| `baseURL` | `modelsInfoUrl` | Resolves to | Use when |
| --- | --- | --- | --- |
| `https://x.test/v1` | `models/info` | `https://x.test/v1/models/info` | metadata sits under the inference path |
| `https://x.test/v1` | `/models/info` | `https://x.test/models/info` | metadata sits at a different path on the same host |
| `https://x.test/v1` | `https://o.test/m` | `https://o.test/m` | metadata lives on a different host entirely |

Rule of thumb: **drop the leading `/`** to keep the metadata path under your API path; **keep the leading `/`** to escape to the host root.
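
The three rows can be reproduced with the standard `URL` constructor. One caveat: plain `new URL("models/info", "https://x.test/v1")` yields `https://x.test/models/info`, because a base without a trailing slash loses its last segment. The sketch below therefore appends a `/` to the base first; that normalization is an assumption inferred from the first table row, and `resolveModelsInfoUrl` is a hypothetical name.

```typescript
import { URL } from "node:url";

function resolveModelsInfoUrl(baseURL: string, modelsInfoUrl: string): string {
  // Treat the base as a directory so relative paths stay under it.
  const base = baseURL.endsWith("/") ? baseURL : `${baseURL}/`;
  return new URL(modelsInfoUrl, base).toString();
}

const underApiPath = resolveModelsInfoUrl("https://x.test/v1", "models/info");
const atHostRoot = resolveModelsInfoUrl("https://x.test/v1", "/models/info");
const otherHost = resolveModelsInfoUrl("https://x.test/v1", "https://o.test/m");
```

With these inputs, `underApiPath` is `https://x.test/v1/models/info`, `atHostRoot` is `https://x.test/models/info`, and `otherHost` is `https://o.test/m`.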

## Caching and failure modes

The catalog is cached on disk so repeated boots don't re-hit the network.

- **Location** — per-OS cache dir under the `opencode-models-info` namespace: `~/Library/Caches/opencode-models-info/` (macOS), `${XDG_CACHE_HOME:-~/.cache}/opencode-models-info/` (Linux), `%LOCALAPPDATA%\opencode-models-info\` (Windows). Files are `0o600`, written via atomic rename.
- **Key** — `sha256(providerId :: resolvedUrl :: modelsInfoHeaders)`. The user-set `meta.modelsInfoHeaders` are part of the key (switching an `x-tenant` selector busts the cache), but the provider's other headers are **not** — a rotating OAuth2 bearer must not thrash the cache.
- **TTL** — `meta.modelsInfoTtlSeconds`, default 24h. The current config TTL is applied on every write, including `304` revalidations, so tightening it in `opencode.json` takes effect on the next revalidation.
- **Revalidation** — the stored `ETag` is sent as `If-None-Match`; a `304` reuses the cached models and just bumps `fetchedAt`.
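
A minimal sketch of the key and TTL rules above, assuming Node's built-in `crypto` module. The exact serialization of the headers inside the hash, the `ttlSeconds` field, and the function names are assumptions for illustration; only `fetchedAt` and the three key components come from this page.

```typescript
import { createHash } from "node:crypto";

// Key = sha256(providerId :: resolvedUrl :: modelsInfoHeaders).
// Only the user-set meta headers take part; the provider's own headers
// (for example a rotating bearer) are deliberately left out.
function cacheKey(
  providerId: string,
  resolvedUrl: string,
  modelsInfoHeaders: Record<string, string> = {},
): string {
  // Assumed serialization: lower-cased names, sorted for a stable key.
  const headerPart = Object.entries(modelsInfoHeaders)
    .map(([name, value]) => `${name.toLowerCase()}=${value}`)
    .sort()
    .join("&");
  return createHash("sha256")
    .update([providerId, resolvedUrl, headerPart].join("::"))
    .digest("hex");
}

// Hypothetical record shape for the freshness check.
interface CacheRecord {
  fetchedAt: number; // epoch milliseconds of the last fetch or 304
  ttlSeconds: number; // TTL stamped at write time
}

function isFresh(record: CacheRecord, now: number = Date.now()): boolean {
  return now - record.fetchedAt < record.ttlSeconds * 1000;
}
```

Since the bearer never enters the hash, a token refresh maps to the same cache file, while changing a selector such as `x-tenant` produces a different key.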

Failure handling is deliberately non-fatal — the plugin must never block OpenCode startup:

| Situation | Behavior |
| --- | --- |
| Fetch fails (network, timeout, non-2xx) **with** a cached snapshot | Serve the **stale** snapshot; log `models_info_fetch_failed_using_stale`. |
| Fetch fails **without** any cache | Skip enrichment for that provider; log `models_info_fetch_failed_no_cache`. |
| Response is malformed (non-empty body that filters down to zero valid entries) | Treated as a parse error → falls back to stale cache, **never** overwrites good data with `[]`. |
| Disk cache write fails (read-only `$HOME`, etc.) | Best-effort: log `models_info_cache_write_failed` and still enrich from the freshly-fetched in-memory record. |

Per-fetch timeout defaults to 5s (`meta.modelsInfoTimeoutMs`).
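
The table can be condensed into a single decision function. This is a sketch of the documented behavior, not the package's code: the `FetchOutcome` and `Decision` types and the `decide` name are hypothetical, and logging the same `using_stale` / `no_cache` events for a malformed response is an assumption, since the table only says it is treated as a parse error.

```typescript
type FetchOutcome =
  | { kind: "ok"; models: unknown[] }
  | { kind: "not_modified" } // 304 revalidation
  | { kind: "failed" } // network error, timeout, or non-2xx
  | { kind: "malformed" }; // non-empty body with zero valid entries

interface Decision {
  models: unknown[] | null; // null means: skip enrichment for this provider
  event: string; // log event to emit
}

function decide(outcome: FetchOutcome, cached: unknown[] | null): Decision {
  if (outcome.kind === "ok") {
    return { models: outcome.models, event: "models_info_fetched" };
  }
  if (outcome.kind === "not_modified" && cached !== null) {
    return { models: cached, event: "models_info_not_modified" };
  }
  // Every remaining case is a failure: serve stale data if there is any,
  // and never replace a good snapshot with an empty list.
  return cached !== null
    ? { models: cached, event: "models_info_fetch_failed_using_stale" }
    : { models: null, event: "models_info_fetch_failed_no_cache" };
}
```

None of the branches throws, which reflects the requirement that a misbehaving metadata endpoint must not block startup.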

## Log events

All structured, `snake_case`, emitted through both the JSON console and OpenCode's `client.app.log`:

| Event | Level | Meaning |
| --- | --- | --- |
| `models_info_enriched` | info | A provider's models were enriched (`enrichedCount` / `totalModels` / `sourceModels`). |
| `models_info_fetched` | info | A live fetch succeeded and the cache was written. |
| `models_info_cache_hit` | debug | Served from a fresh cache entry; no network. |
| `models_info_not_modified` | debug | `304` revalidation; cached models reused. |
| `models_info_fetch_failed_using_stale` | warn | Fetch failed; stale cache served. |
| `models_info_fetch_failed_no_cache` | warn | Fetch failed and nothing cached; provider left un-enriched. |
| `models_info_cache_write_failed` | warn | Disk write failed; enrichment proceeded from memory. |
| `models_info_enrichment_failed` | error | Unexpected throw while enriching a provider. |

## Field mapping (summary)

The exact conversions live in [`packages/opencode-models-info/src/mapping.ts`](../packages/opencode-models-info/src/mapping.ts) and the full table is in the package README. Highlights worth knowing:

- OpenRouter `pricing.prompt` / `.completion` are **USD per token** strings; OpenCode `cost.input` / `.output` are **USD per 1M tokens** numbers — converted (`× 1_000_000`, rounded to 6 dp).
- `limit` is only emitted when **both** `context` and `output` are known (OpenCode rejects a partial `limit`).
- Modalities are filtered to OpenCode's enum (`text | audio | image | video | pdf`); a non-text input modality also sets `attachment: true`.
- `tool_call` / `reasoning` / `temperature` are derived from `supported_parameters`.
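
The first three highlights can be illustrated with a few small helpers. These are sketches written from the descriptions above, not excerpts from `mapping.ts`: the function names are hypothetical, and deciding `attachment` from the already-filtered modality list is an assumption.

```typescript
// USD per token (string) -> USD per 1M tokens (number), rounded to 6 dp.
function perTokenToPerMillion(perToken: string): number | undefined {
  if (perToken.trim() === "") return undefined;
  const value = Number(perToken);
  if (!Number.isFinite(value)) return undefined;
  return Number((value * 1_000_000).toFixed(6));
}

// `limit` is emitted only when both halves are known.
function deriveLimit(
  context?: number,
  output?: number,
): { context: number; output: number } | undefined {
  return context !== undefined && output !== undefined
    ? { context, output }
    : undefined;
}

const ALLOWED_MODALITIES = ["text", "audio", "image", "video", "pdf"] as const;
type Modality = (typeof ALLOWED_MODALITIES)[number];

// Keep only modalities from the enum; any non-text input implies attachments.
function deriveInputModalities(raw: string[]): {
  input: Modality[];
  attachment: boolean;
} {
  const input = raw.filter((m): m is Modality =>
    (ALLOWED_MODALITIES as readonly string[]).includes(m),
  );
  return { input, attachment: input.some((m) => m !== "text") };
}
```

For example, a `pricing.prompt` of `"0.000003"` becomes a `cost.input` of `3`, and a model with a known context window but no known output limit gets no `limit` at all.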