[Plan 2026-04] AI quality/strength/diversity/production tracking #77

Description

@an0mium

Overview

Tracking issue for the plan at docs/planning/AI_QUALITY_STRENGTH_DIVERSITY_PLAN_2026-04-16.md.

Week 1 issues

Review follow-ups (closed)

Other work shipped along the way

  • A3 training-probe model_version regression test (b714c63)
  • Security sweep 19 → 4 LOW npm findings (30c73b6)
  • v5-heavy compatibility fixes: bootstrap (c9a4302) + runtime infer (5764c26)
  • Test-stub unblock after game-granular resume (bcbf395)
  • Evidence-artifact automation (787f4a2 by Codex)

Week 2–3 next

  • B2 v5-heavy pilot — running on gh200-11 now (iter 1 selfplay post-fix)
  • C1 Ensemble serving for D9–D10 tiers
  • C3 Varied multiplayer seating — leverages the personaIds[] array C2 shipped
  • D2 Hot reload for new checkpoints
  • B3 Seat-stratified value loss (if gh200-12 iter 26 seat_wr confirms imbalance)

Success metrics

  • At least one config above 2000 Elo within 8 weeks
  • square8_3p above 1600 Elo within 4 weeks
  • Production serves ≥ 4 distinguishable personas — ready via C2; pending flag flip
  • p95 inference latency within per-tier SLO
  • Fallback rate < 1% under normal operation (baseline observable via D5 telemetry)
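
The last two metrics can be checked mechanically once latency samples and fallback counts are available. The following is a minimal sketch, not the project's actual telemetry code: `TierStats`, `p95`, `fallback_rate`, and `meets_targets` are hypothetical names, the nearest-rank percentile definition is an assumption, and the SLO value passed in is a placeholder, since the per-tier SLOs are not stated in this issue.

```python
from dataclasses import dataclass


@dataclass
class TierStats:
    """Hypothetical per-tier sample window (not the real D5 telemetry schema)."""
    latencies_ms: list[float]
    fallbacks: int
    requests: int


def p95(values: list[float]) -> float:
    """Nearest-rank 95th percentile: the smallest sample with >= 95% of samples at or below it."""
    if not values:
        raise ValueError("no latency samples")
    ordered = sorted(values)
    # ceil(0.95 * n) using integer arithmetic to avoid float rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def fallback_rate(fallbacks: int, requests: int) -> float:
    """Fraction of requests that were served by the fallback path."""
    if requests <= 0:
        raise ValueError("requests must be positive")
    return fallbacks / requests


def meets_targets(stats: TierStats, slo_ms: float, max_fallback_rate: float = 0.01) -> bool:
    """True when p95 latency is within the SLO and the fallback rate is strictly below 1%."""
    within_slo = p95(stats.latencies_ms) <= slo_ms
    low_fallback = fallback_rate(stats.fallbacks, stats.requests) < max_fallback_rate
    return within_slo and low_fallback
```

For example, `meets_targets(TierStats([12.0, 15.0, 40.0], fallbacks=2, requests=1000), slo_ms=50.0)` would pass under these assumptions, while a tier with exactly 1% fallbacks would not, because the target is a strict inequality.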

Related

Metadata

Labels

enhancement (New feature or request)
