refactor(metrics): re-home metrics onto aragora.evaluation with shims (VAL-P4A-012) #8696

Merged
scarmani merged 4 commits into main from structex/p4a-metrics-rehome
Jun 29, 2026

Conversation

@scarmani

Collaborator

What

Re-home the aragora/metrics/ originals onto aragora/evaluation/ and leave
aragora/metrics/*.py as DeprecationWarning re-export shims so every legacy
import path keeps working for one release (VAL-P4A-012).

Moved (pure git mv, content unchanged) to aragora/evaluation/:

  • viah.py, viah_signals.py, viah_status.py
  • manifold_brier.py, manifold_brier_bridge.py
  • capability_checkpoint.py

aragora/metrics/ now contains only two-sided re-export shims:

  • each submodule re-exports its __all__ from the matching
    aragora.evaluation.* home and emits a DeprecationWarning whose message
    names the old aragora.metrics.* path
  • the package __init__ emits an "aragora.metrics is deprecated" warning and
    keeps its lazy PEP 562 __getattr__ (now resolving against
    aragora.evaluation)
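
A minimal sketch of the two shim shapes described above, using stand-in modules so it runs without the aragora package. The module names `demo_eval_viah`, `demo_metrics_viah`, and `demo_metrics`, the placeholder body of `compute_viah`, and the warning wording are all hypothetical; the shim files in this PR may be written differently.

```python
import importlib
import sys
import types
import warnings

# Stand-in for the new home. Hypothetical: the real module is aragora.evaluation.viah.
new_home = types.ModuleType("demo_eval_viah")
new_home.__all__ = ["compute_viah"]
new_home.compute_viah = lambda values: sum(values) / len(values)  # placeholder body
sys.modules["demo_eval_viah"] = new_home

# What a submodule shim at the old path might contain: re-export, then warn.
SHIM_SOURCE = """
import warnings as _warnings
from demo_eval_viah import *  # noqa: F401,F403
from demo_eval_viah import __all__  # noqa: F401

_warnings.warn(
    "demo_metrics_viah is deprecated; import demo_eval_viah instead",
    DeprecationWarning,
    stacklevel=2,
)
"""


def load_shim(name="demo_metrics_viah"):
    """Run the shim source as a fresh module, as a first import would."""
    shim = types.ModuleType(name)
    exec(SHIM_SOURCE, shim.__dict__)
    return shim


def make_lazy_package(name, redirects):
    """Package stand-in whose PEP 562 __getattr__ resolves names on demand."""
    pkg = types.ModuleType(name)

    def __getattr__(attr):
        if attr not in redirects:
            raise AttributeError(f"module {name!r} has no attribute {attr!r}")
        # Import the new home only when the attribute is first requested.
        return getattr(importlib.import_module(redirects[attr]), attr)

    pkg.__getattr__ = __getattr__
    return pkg
```

Because the shim only re-exports, the object reached through the old path is the same object as in the new home. A module-level warning of this kind fires when the module body runs, so a cached second import of the same path stays silent.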

Intra-package imports inside the two moved bridge/status modules were repointed
to their new aragora.evaluation.* homes. The only external importer,
aragora/cli/commands/agt_metrics.py, intentionally stays on the shim.

Behavior

  • All documented old import paths (from aragora.metrics import ...,
    from aragora.metrics.viah import ..., etc.) keep working and now emit a
    DeprecationWarning.
  • Re-exported objects are identical to the aragora.evaluation.* originals
    (aragora.metrics.viah.compute_viah is aragora.evaluation.viah.compute_viah).
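
For callers who want to find their remaining legacy imports during the one-release window, one option is to escalate just these warnings to errors. The sketch below uses only the standard `warnings` module; the helper names and the assumption that the warning text contains `aragora.metrics` are illustrative, and `simulate_legacy_import` merely stands in for importing a shim.

```python
import contextlib
import warnings


@contextlib.contextmanager
def fail_on_legacy_metrics_imports():
    """Turn DeprecationWarnings that mention aragora.metrics into exceptions."""
    with warnings.catch_warnings():
        warnings.filterwarnings(
            "error",
            message=r".*aragora\.metrics",  # regex matched from the start of the message
            category=DeprecationWarning,
        )
        yield


def simulate_legacy_import():
    """Stand-in for `import aragora.metrics.viah`; the wording is assumed."""
    warnings.warn(
        "aragora.metrics.viah is deprecated; use aragora.evaluation.viah",
        DeprecationWarning,
        stacklevel=2,
    )
```

Wrapping application start-up in `fail_on_legacy_metrics_imports()` would then raise at the first legacy import instead of logging a warning. The filter is restored when the block exits.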

Validation

VAL-P4A-012 two-sided shim check, run over every tracked aragora/metrics/*.py.
It verifies that each shim is code-free (only __getattr__/__dir__ allowed),
that re-exports are restricted to aragora.evaluation/aragora.metrics, that the
old import path works, that the new home is importable, and that the package
DeprecationWarning contains "metrics":

########## SUMMARY FAIL=0 ##########
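
The script behind this check is not shown in the PR, so the following is only a rough illustration of how the static half ("code-free" and "restricted re-exports") could be tested with `ast`. The function name `shim_violations` and the set of tolerated helper imports (`warnings`, `importlib`, `typing`, `__future__`) are assumptions, not the real VAL-P4A-012 rules.

```python
import ast

ALLOWED_FUNCS = {"__getattr__", "__dir__"}
ALLOWED_PREFIXES = ("aragora.evaluation", "aragora.metrics")
# Assumption: a shim may still import these helpers to warn or lazy-load.
ALLOWED_HELPERS = {"__future__", "warnings", "importlib", "typing"}


def _module_ok(name: str) -> bool:
    """True if an absolute import target is permitted inside a shim."""
    if name.split(".")[0] in ALLOWED_HELPERS:
        return True
    return any(name == p or name.startswith(p + ".") for p in ALLOWED_PREFIXES)


def shim_violations(source: str) -> list[str]:
    """Return a description of every top-level statement that breaks the rules."""
    problems = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if node.name not in ALLOWED_FUNCS:
                problems.append(f"line {node.lineno}: function {node.name!r}")
        elif isinstance(node, ast.ClassDef):
            problems.append(f"line {node.lineno}: class {node.name!r}")
        elif isinstance(node, ast.ImportFrom):
            # Relative imports (level > 0) stay inside the package and are allowed.
            if node.level == 0 and not _module_ok(node.module or ""):
                problems.append(f"line {node.lineno}: import from {node.module!r}")
        elif isinstance(node, ast.Import):
            for alias in node.names:
                if not _module_ok(alias.name):
                    problems.append(f"line {node.lineno}: import {alias.name!r}")
    return problems
```

The source is parsed, never executed, so the checker can run over every shim file without triggering the deprecation warnings. The dynamic half (old path imports, new home imports, warning text) would need a separate runtime check.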

Local gates:

  • make lint: PASS
  • ruff format --check aragora/ tests/ scripts/: PASS
  • CI changed-file typecheck (mypy --ignore-missing-imports
    --follow-imports=skip over the 13 changed files): Success: no issues found
  • tests/metrics/ + tests/cli/test_agt_metrics_status.py: 234 passed (one
    pre-existing failure, test_invalid_weeks_raises, confirmed identical on
    clean origin/main and unrelated to this move)
  • make test-smoke and the smoke tier (138 passed): PASS

Notes / risks

… (VAL-P4A-012)

Move the VIAH, Manifold Brier, and capability-checkpoint originals from
aragora/metrics/ to aragora/evaluation/ and leave aragora/metrics/*.py as
DeprecationWarning re-export shims so the legacy import paths keep working.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
@scarmani scarmani marked this pull request as ready for review June 29, 2026 18:02
@scarmani scarmani requested a review from an0mium as a code owner June 29, 2026 18:02
@github-actions

Contributor

Aragora Code Review

Advisory-only review. No issues found.

…home

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
@scarmani

Collaborator Author

Claude independent model review

Reviewer: claude (anthropic) — independent adversarial model review via the Aragora Claude reviewer, grounded on the exact PR head.
Head: 6d69a0d, committed 2026-06-29T18:12:56Z.
PR: #8696.
Model family: claude

Verdict: PASS

Symbol parity and mechanics check out: every shim re-exports the full public surface of its new aragora.evaluation.* home (viah 18 names, manifold_brier 6, manifold_brier_bridge 5, viah_status 1, capability_checkpoint 8 incl. build_default_registry), all evaluation modules are self-contained (no back-references to aragora.metrics), all 13 files parse, and the moved code is logic-identical to the originals. The package __init__ __getattr__ redirect targets were correctly repointed to aragora.evaluation.* with no change to its __all__. No filterwarnings=error in pytest config, so the added DeprecationWarnings won't break the suite.

  • [P3] aragora/cli/commands/agt_metrics.py:15,71 — first-party live consumer still imports the now-deprecated shims (from aragora.metrics.viah import ... and aragora.metrics.viah_status). Functionally fine (shims re-export correctly), but the project ends up deprecation-warning against itself on its own agt-metrics CLI path; the shims exist for legacy/external callers, so this internal caller should be repointed to aragora.evaluation.viah / aragora.evaluation.viah_status in this PR.

dogfood: yes

@scarmani

Collaborator Author

OpenAI independent model review

Reviewer: openai (openai) — independent adversarial model review via Codex CLI OpenAI harness, grounded on the exact PR head.
Head: 1e0c4c1, committed 2026-06-29T20:01:35Z.
PR: #8696.
Model family: openai

Verdict: PASS

No findings.

dogfood: yes

@scarmani

Collaborator Author

Claude independent model review

Reviewer: claude (anthropic) — independent adversarial model review via the Aragora Claude reviewer, grounded on the exact PR head.
Head: 1e0c4c1, committed 2026-06-29T20:01:35Z.
PR: #8696.
Model family: claude

Verdict: PASS

This is a clean re-homing refactor: the metric modules (viah*, manifold_brier*, capability_checkpoint) move verbatim to aragora.evaluation.*, and aragora.metrics.* becomes deprecation shims that re-export them. Verification results:

  • aragora/evaluation/__init__.py lazy __getattr__ is correct — all 9 names in __all__ are defined in llm_judge.py; from __future__ import annotations is correctly placed right after the docstring.

  • The new lazy init does what it claims: importing aragora.evaluation.viah does not pull in aragora.evaluation.llm_judge (confirmed at runtime → the new test_viah_import_does_not_load_llm_judge_stack passes).

  • All shim re-exports are complete (viah.py exports ViahTrend/persist_viah_snapshot/read_viah_snapshots/etc.; capability_checkpoint exports build_default_registry; package-level __getattr__ resolves compute_viah/ManifoldBrierScorer). Confirmed by importing each.

  • No remaining production importers of aragora.metrics outside the shims themselves, and no filterwarnings = error in pytest config — so the new module-level DeprecationWarning won't break CI. The pattern matches existing in-repo shims (type_protocols, redis_config).

  • The removed assert report.window_start.endswith("Z") line in test_viah.py was unreachable/dead (after a pytest.raises block, referencing an out-of-scope report); removing it is correct.

  • [P3] aragora/metrics/*.py (all shim files) — the legacy aragora.metrics.* import paths have no remaining test coverage in this PR (tests were migrated to aragora.evaluation), so a future typo in a shim re-export list would not be caught by CI before the "one release" back-compat window closes. I confirmed they all import/re-export correctly today; consider a tiny smoke test asserting the shim paths still resolve.

dogfood: yes

@scarmani scarmani merged commit 582e714 into main Jun 29, 2026
94 of 99 checks passed
@scarmani scarmani deleted the structex/p4a-metrics-rehome branch June 29, 2026 20:58