feat(routing): syftr-pattern Pareto search over debate-workflow configs by scarmani · Pull Request #8496 · synaptent/aragora

scarmani · 2026-06-17T12:36:40Z

What

A dependency-free, syftr-inspired Pareto search over debate-workflow configurations (rounds × consensus mode × reviewing model-family set), complementing the existing provider-level pareto_frontier in cost_quality_optimizer.

Why

Directly serves the predictable-cost mandate: surfaces the cost/quality/latency-optimal debate configs so the cheap path (e.g. claude + deepseek, 1 round, majority) is used when it suffices and premium configs are reserved for high-stakes decisions.

Design

No new deps — bounded enumeration of the small discrete config space + 3-objective non-domination, instead of Optuna/MOTPE (syftr's approach).
Injectable objective — evaluator: DebateConfig -> ConfigEvaluation keeps the costly, credential-bound debate execution outside the searchable core, so search/frontier/recommend logic is fully unit-testable offline.
Cost-aware — trials are bounded (max_trials, default 6); each trial is a real debate = real spend. Run once, reuse the recommended config.
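
The design points above can be illustrated with a minimal sketch. The names DebateConfig, ConfigEvaluation, dominates, pareto_optimal, search_pareto_configs and max_trials come from this PR; the field names, function signatures and the enumerate_configs helper are assumptions made for illustration and may not match the actual module.

```python
from dataclasses import dataclass
from itertools import islice, product
from typing import Callable, Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class DebateConfig:
    # Field names are assumed; the PR only names the three dimensions.
    rounds: int
    consensus: str
    families: Tuple[str, ...]


@dataclass(frozen=True)
class ConfigEvaluation:
    config: DebateConfig
    cost: float      # lower is better
    quality: float   # higher is better
    latency: float   # lower is better

    def dominates(self, other: "ConfigEvaluation") -> bool:
        """True if self is no worse on all three objectives and better on one."""
        no_worse = (
            self.cost <= other.cost
            and self.quality >= other.quality
            and self.latency <= other.latency
        )
        strictly_better = (
            self.cost < other.cost
            or self.quality > other.quality
            or self.latency < other.latency
        )
        return no_worse and strictly_better


def pareto_optimal(evals: Iterable[ConfigEvaluation]) -> List[ConfigEvaluation]:
    """Keep every evaluation that no other evaluation dominates."""
    pool = list(evals)
    return [e for e in pool if not any(o.dominates(e) for o in pool)]


def enumerate_configs(
    rounds: Iterable[int],
    consensus_modes: Iterable[str],
    family_sets: Iterable[Iterable[str]],
) -> Iterator[DebateConfig]:
    """Hypothetical helper: walk the discrete space in the order given."""
    for r, mode, fams in product(rounds, consensus_modes, family_sets):
        yield DebateConfig(r, mode, tuple(fams))


def search_pareto_configs(
    space: Iterable[DebateConfig],
    evaluator: Callable[[DebateConfig], ConfigEvaluation],
    max_trials: int = 6,
) -> List[ConfigEvaluation]:
    """Evaluate at most max_trials configs, then extract the frontier."""
    evaluated = [evaluator(cfg) for cfg in islice(space, max_trials)]
    return pareto_optimal(evaluated)
```

Because the evaluator is just a callable, a test can pass a stub that returns canned numbers, and islice guarantees the evaluator is never called more than max_trials times.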

Follow-up (separate PR)

A live evaluator that runs a real debate and reads cost from billing.cost_tracker, quality from evaluation.llm_judge, latency from wall-clock — wiring the search to production (needs credentials).
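
As a rough sketch of the shape such an evaluator might take, the factory below wraps three injected callables. run_debate, read_cost and judge_quality are hypothetical stand-ins, not the real interfaces of billing.cost_tracker or evaluation.llm_judge, and the dataclass fields are assumed.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass(frozen=True)
class DebateConfig:
    rounds: int
    consensus: str
    families: Tuple[str, ...]


@dataclass(frozen=True)
class ConfigEvaluation:
    config: DebateConfig
    cost: float
    quality: float
    latency: float


def make_live_evaluator(
    run_debate: Callable[[DebateConfig], Any],   # hypothetical: executes one debate
    read_cost: Callable[[], float],              # hypothetical: cumulative spend so far
    judge_quality: Callable[[Any], float],       # hypothetical: scores the debate output
    clock: Callable[[], float] = time.perf_counter,
) -> Callable[[DebateConfig], ConfigEvaluation]:
    """Build an evaluator: DebateConfig -> ConfigEvaluation."""

    def evaluate(config: DebateConfig) -> ConfigEvaluation:
        cost_before = read_cost()
        started = clock()
        outcome = run_debate(config)
        latency = clock() - started            # wall-clock seconds for the debate
        cost = read_cost() - cost_before       # spend attributable to this trial
        return ConfigEvaluation(config, cost, judge_quality(outcome), latency)

    return evaluate
```

Injecting the clock and the cost reader keeps even this wrapper testable without credentials; only the callables passed in at wiring time touch live services.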

Tests

6 passing: domination, tradeoff non-domination, frontier extraction, bounded trials, constraint-aware recommend.
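
For orientation, a constraint-aware recommend of the kind these tests exercise could look like the sketch below. The parameter names, the "highest quality among feasible" rule and the "cheapest on the frontier" fallback are assumptions; the PR only states that recommend is constraint-aware and always returns a usable config.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Evaluation:
    # Simplified stand-in for ConfigEvaluation; field names are assumed.
    name: str
    cost: float
    quality: float
    latency: float


def recommend(
    frontier: Sequence[Evaluation],
    max_cost: Optional[float] = None,
    min_quality: Optional[float] = None,
    max_latency: Optional[float] = None,
) -> Evaluation:
    """Pick one frontier entry, honouring constraints where possible."""
    if not frontier:
        raise ValueError("frontier must contain at least one evaluation")

    def feasible(e: Evaluation) -> bool:
        return (
            (max_cost is None or e.cost <= max_cost)
            and (min_quality is None or e.quality >= min_quality)
            and (max_latency is None or e.latency <= max_latency)
        )

    candidates = [e for e in frontier if feasible(e)]
    if candidates:
        # Assumed policy: best quality among feasible, cheapest on ties.
        return max(candidates, key=lambda e: (e.quality, -e.cost))
    # Assumed fallback: nothing satisfies the constraints, so return the
    # cheapest frontier entry rather than nothing at all.
    return min(frontier, key=lambda e: e.cost)
```

With a frontier holding a cheap and a premium entry, a tight max_cost selects the cheap one, no constraints select the premium one, and an unsatisfiable budget still yields the cheap entry instead of failing.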

🤖 Generated with Claude Code

Complements the provider-level pareto_frontier (cost_quality_optimizer) by searching the debate *workflow* space (rounds × consensus mode × reviewing family set) for the cost/quality/latency-optimal configs.

Inspired by DataRobot's syftr (Pareto-optimized agentic workflows) but dependency-free: a bounded enumeration of the small discrete config space + 3-objective non-domination, instead of Optuna/MOTPE.

The objective is injectable (evaluator: DebateConfig -> ConfigEvaluation) so the costly, credential-bound debate execution lives outside the searchable core; the search/frontier/recommend logic is fully unit-testable offline.

- DebateSearchSpace (cheapest-leaning defaults, claude+deepseek first)
- ConfigEvaluation.dominates (cost↓ quality↑ latency↓), pareto_optimal
- search_pareto_configs (bounded max_trials; each trial is a real debate = spend)
- SearchResult.recommend (constraint-aware, always returns a usable config)

Tests: domination, tradeoff non-domination, frontier extraction, bounded trials, constraint-aware recommend. 6 passing.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

scarmani · 2026-06-30T07:17:31Z

Closing this stale draft as part of the queue-drain close path.

Reason: this PR has been idle as a draft since 2026-06-17 and depends on a separate live, credential-bound follow-up before it has an autonomous merge path. Current live checks show no active non-Codex lane owner, no unread steering, no reviews, and no local worktree to preserve. Closing it reduces queue pressure while leaving the work recoverable.

No branch deletion was requested; branch feat/debate-config-pareto-search is preserved for revival if the operator wants to continue this feature outside the drain loop.

Head closed: fec3f26b47307030a1fd27bf811584f07f23c7b6.

This was referenced Jun 17, 2026

[automation] Stage-Gate Conductor Log #7162

Open

[stage-gate] PR drain at 51 open — 6.4× the ≤8 healthy bound #7467

Open

scarmani closed this Jun 30, 2026

scarmani reopened this Jun 30, 2026

scarmani wants to merge 1 commit into main from feat/debate-config-pareto-search
