Add model-router skill for cost-aware LLM selection #268
Draft
juanmichelini wants to merge 1 commit into
Adds a small skill that maps task categories (research, bug fixing, planning, frontend, testing, bulk repetitive work) to the most cost-efficient LLM according to the public OpenHands Index benchmark (https://index.openhands.dev). For each category the skill recommends a cost pick (best score-per-dollar on the Pareto frontier), a balanced pick, and a premium pick, along with usage heuristics and links to the per-category leaderboard.

Co-authored-by: openhands <openhands@all-hands.dev>
Summary
Adds a small new skill, model-router, that recommends the most cost-efficient LLM for a given task category, using benchmark data from the public OpenHands Index.

It was inspired by the user observation that "Gemini is better for research cost-wise, Opus 4.7 for planning/analysis, DeepSeek Reasoner for heavy repetitive lifting, etc." The skill encodes those tradeoffs as a reusable lookup table the agent can consult when picking a model, configuring a sub-agent, or delegating a cloud conversation.
What the skill provides
For each OpenHands Index category, it surfaces a cost pick (best score-per-dollar on the Pareto frontier), a balanced pick, and a premium pick:
Numbers are average cost per problem (USD) and aggregate score from index.openhands.dev as of the May 2026 snapshot.
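As a rough illustration of how the three picks could be derived from (cost, score) pairs, here is a minimal Python sketch. It is not the skill's actual implementation: the model names and numbers are made up, the helper names `pareto_frontier` and `pick_tiers` are hypothetical, and the rule for the balanced pick (the middle of the frontier by cost) is an assumption, since the PR does not define it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    model: str    # hypothetical model name, not taken from the index
    cost: float   # average cost per problem in USD (assumed > 0)
    score: float  # aggregate benchmark score


def pareto_frontier(entries):
    """Keep entries that no cheaper-or-equal entry matches or beats on score."""
    frontier = []
    best_score = float("-inf")
    # Walk from cheapest to most expensive; on equal cost, higher score first.
    for entry in sorted(entries, key=lambda e: (e.cost, -e.score)):
        if entry.score > best_score:
            frontier.append(entry)
            best_score = entry.score
    return frontier


def pick_tiers(entries):
    """Return cost, balanced, and premium picks for one category."""
    frontier = pareto_frontier(entries)
    # Cost pick: best score-per-dollar among frontier entries.
    cost_pick = max(frontier, key=lambda e: e.score / e.cost)
    # Premium pick: highest score regardless of price.
    premium_pick = max(frontier, key=lambda e: e.score)
    # Balanced pick: ASSUMPTION, the middle frontier entry ordered by cost.
    balanced_pick = frontier[len(frontier) // 2]
    return {
        "cost": cost_pick.model,
        "balanced": balanced_pick.model,
        "premium": premium_pick.model,
    }


# Made-up sample data for a single category.
SAMPLE = [
    Entry("model-a", cost=0.10, score=40.0),
    Entry("model-b", cost=0.50, score=60.0),
    Entry("model-c", cost=0.60, score=55.0),  # dominated by model-b
    Entry("model-d", cost=2.00, score=70.0),
]

if __name__ == "__main__":
    print(pick_tiers(SAMPLE))
```

In this sample, model-c drops out because model-b is both cheaper and higher scoring, so only the remaining three entries are candidates for a pick.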
The skill also includes:
Files
- skills/model-router/SKILL.md - skill body with progressive-disclosure description, decision table, and heuristics.
- skills/model-router/README.md - human-facing notes.
- marketplaces/openhands-extensions.json - new catalog entry under productivity.
- README.md - auto-regenerated catalog section (via scripts/sync_extensions.py).

Validation
- python scripts/sync_extensions.py --check -> All extensions in sync. ✓
- pytest tests/test_catalogs.py tests/test_skills_have_readme.py tests/test_sync_extensions.py -> 38 passed.

Triggers
Keyword triggers only (no slash command, since this is reference content rather than a workflow): which model, model selection, pick a model, model router, cost efficient model, cheapest model, best model for.

This pull request was created by an AI agent (OpenHands) on behalf of the user.