Enhancement: Use models.dev as LLM model registry source #2904

@yanurag-dev

Summary

Adopt models.dev (https://models.dev) as the primary source for LLM model metadata instead of maintaining a static provider.json file. This would keep the model registry automatically up-to-date with pricing, capabilities, and context limits.

Context

Currently, forgecode maintains a static provider.json file (~3200 lines) with hardcoded model definitions. This requires manual updates when new models are released.

I looked at how Dexto (another AI coding assistant) handles this: it syncs its model registry from models.dev at build time.

Current Approach (forgecode)

  • Static crates/forge_repo/src/provider/provider.json embedded at compile time
  • Some providers dynamically fetch models from their APIs at runtime
  • Manual updates required for new models
  • No pricing information

Proposed Approach (use models.dev)

  • Fetch model catalog from https://models.dev/api.json at build time
  • Generate a Rust file similar to Dexto's models.generated.ts
  • Automatically includes:
    • Model IDs and names
    • Context limits (input/output tokens)
    • Pricing (input/output per million tokens, cache pricing, reasoning pricing)
    • Modalities (text, image, audio, video, pdf)
    • Capabilities (reasoning, tool_call, temperature support)
    • Release dates and status
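
As a rough illustration of the metadata listed above, one registry entry could be modeled as below. The field names (`limit`, `cost`, `modalities`, `tool_call`, and so on) are assumptions based on the categories in this list and should be checked against the live https://models.dev/api.json before relying on them. `estimateCostUsd` is a hypothetical helper that only shows how per-million-token pricing would be applied.

```typescript
// Assumed shape of one model entry. Field names are unverified and mirror
// the categories listed in this issue, not a confirmed models.dev schema.
interface ModelCost {
  input: number; // USD per million input tokens
  output: number; // USD per million output tokens
  cache_read?: number; // USD per million cached tokens read (assumed name)
  cache_write?: number; // USD per million cached tokens written (assumed name)
}

interface ModelEntry {
  id: string;
  name: string;
  limit: { context: number; output: number }; // token limits
  cost?: ModelCost; // may be absent for some models
  modalities: { input: string[]; output: string[] }; // e.g. "text", "image"
  reasoning: boolean;
  tool_call: boolean;
  temperature: boolean;
  release_date?: string;
}

// Hypothetical helper: estimates the price of one request in USD.
// Returns undefined when the entry carries no pricing.
function estimateCostUsd(
  model: ModelEntry,
  inputTokens: number,
  outputTokens: number,
): number | undefined {
  if (!model.cost) return undefined;
  const perMillion = 1_000_000;
  return (
    (inputTokens / perMillion) * model.cost.input +
    (outputTokens / perMillion) * model.cost.output
  );
}
```

The Rust side would need an equivalent struct; the TypeScript version is shown only because the sync script is proposed as a `.ts` file.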

Benefits

  1. Always up-to-date - New models become available as soon as the sync script is rerun
  2. Rich metadata - Pricing, modalities, and capabilities come free
  3. Less maintenance - No hand edits to the model list; updating means rerunning the script
  4. Cross-provider - Single source for all providers (OpenAI, Anthropic, Google, etc.)

Implementation Suggestion

Create a scripts/sync-llm-registry.ts (or .rs) that:

  1. Fetches https://models.dev/api.json
  2. Filters models by provider-specific rules (e.g., OpenAI → gpt-*, o*)
  3. Generates a Rust file with a MODELS_BY_PROVIDER constant
  4. Can be run in CI to verify that the committed registry is up-to-date
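
Steps 2 and 3 might look roughly like the sketch below. It is not Dexto's implementation: the catalog shape (providers keyed by ID, each with a `models` map), the glob-style rule format, and the generated `ModelInfo` struct are all assumptions made for illustration. The fetch step is left out so the functions stay pure and easy to test.

```typescript
// Assumed, trimmed-down catalog shape; the real api.json may differ.
interface CatalogModel {
  id: string;
  name: string;
  limit?: { context?: number };
}
interface CatalogProvider {
  models: Record<string, CatalogModel>;
}
type Catalog = Record<string, CatalogProvider>;

// Provider-specific allow-lists as glob patterns (hypothetical rule format).
const PROVIDER_RULES: Record<string, string[]> = {
  openai: ["gpt-*", "o*"],
};

// Converts a glob such as "gpt-*" into an anchored regular expression.
function globToRegExp(glob: string): RegExp {
  const escaped = glob
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&")
    .replace(/\*/g, ".*");
  return new RegExp("^" + escaped + "$");
}

// Step 2: keep only models whose ID matches one of the provider's globs.
// Output is sorted by ID so the generated file is deterministic.
function filterModels(
  catalog: Catalog,
  rules: Record<string, string[]>,
): Record<string, CatalogModel[]> {
  const result: Record<string, CatalogModel[]> = {};
  for (const providerId of Object.keys(rules)) {
    const provider = catalog[providerId];
    if (!provider) continue;
    const patterns = (rules[providerId] ?? []).map(globToRegExp);
    result[providerId] = Object.keys(provider.models)
      .map((key) => provider.models[key])
      .filter((m) => patterns.some((p) => p.test(m.id)))
      .sort((a, b) => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0));
  }
  return result;
}

// Step 3: render Rust source. JSON.stringify is used for string literals,
// which is adequate for plain ASCII IDs but not for every possible escape.
function renderRust(byProvider: Record<string, CatalogModel[]>): string {
  const lines: string[] = [
    "// @generated by scripts/sync-llm-registry.ts. Do not edit by hand.",
    "pub struct ModelInfo { pub id: &'static str, pub name: &'static str, pub context: u64 }",
    "",
    "pub const MODELS_BY_PROVIDER: &[(&str, &[ModelInfo])] = &[",
  ];
  for (const providerId of Object.keys(byProvider).sort()) {
    lines.push("    (" + JSON.stringify(providerId) + ", &[");
    for (const m of byProvider[providerId] ?? []) {
      const context = m.limit?.context ?? 0;
      lines.push(
        "        ModelInfo { id: " + JSON.stringify(m.id) +
          ", name: " + JSON.stringify(m.name) +
          ", context: " + context + " },",
      );
    }
    lines.push("    ]),");
  }
  lines.push("];", "");
  return lines.join("\n");
}
```

Sorting both providers and models keeps the output stable between runs, which matters if CI compares the generated file against the committed one. Note that a literal `o*` glob matches every ID starting with "o", so the real rules would probably need to be tighter.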

Reference

  • Dexto's implementation: scripts/sync-llm-registry.ts
  • Dexto's generated file: packages/core/src/llm/registry/models.generated.ts

Todo

  • Create sync script
  • Generate initial model registry
  • Update existing provider code to use new registry
  • Add CI check to verify registry is up-to-date
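
For the CI check, one option is to compare the model IDs in the committed registry against a freshly generated list and fail the job on any drift. The sketch below assumes both lists have already been extracted as plain string arrays; `diffModelIds` and `isUpToDate` are hypothetical names, not part of any existing script.

```typescript
interface RegistryDrift {
  added: string[]; // present upstream, missing from the committed registry
  removed: string[]; // present in the committed registry, gone upstream
}

// Computes which model IDs differ between the committed and fresh registries.
// Duplicates are ignored and results are sorted for stable CI output.
function diffModelIds(committed: string[], fresh: string[]): RegistryDrift {
  const committedSet = new Set(committed);
  const freshSet = new Set(fresh);
  return {
    added: Array.from(freshSet).filter((id) => !committedSet.has(id)).sort(),
    removed: Array.from(committedSet).filter((id) => !freshSet.has(id)).sort(),
  };
}

// True when the committed registry lists exactly the upstream model IDs.
function isUpToDate(drift: RegistryDrift): boolean {
  return drift.added.length === 0 && drift.removed.length === 0;
}
```

A CI step could print `added` and `removed` and exit non-zero when `isUpToDate` returns false. This only detects added or removed IDs; catching changes to pricing or limits would require comparing the full generated file.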
