fix: close risk register concerns C-16, C-17, C-18 by Polichinel · Pull Request #17 · views-platform/views-evaluation

Polichinel · 2026-04-04T00:11:11Z

Summary

C-16: Wrap metric function calls in _calculate_metrics() with try/except that re-raises as ValueError naming the metric, task, and pred_type
C-17: Replace hardcoded max_allowed_step=999 with float('inf') so steps >= 1000 are not silently dropped
C-18: Add bounds validation in resolve_metric_params() for alpha, quantile, lower_quantile, upper_quantile; all must be in (0, 1). Also cross-validate the QIS quantile ordering (lower_quantile < upper_quantile)
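
The three fixes above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper names (validate_metric_params, call_metric, keep_steps) and the example metric, task, and pred_type values are hypothetical, and the real signatures of _calculate_metrics() and resolve_metric_params() are not shown on this page.

```python
# Hypothetical sketch of the C-16, C-17 and C-18 fixes (not the PR's actual code).

# Parameter names taken from the summary; each must lie strictly inside (0, 1).
UNIT_INTERVAL_PARAMS = ("alpha", "quantile", "lower_quantile", "upper_quantile")


def validate_metric_params(metric, params):
    """C-18: reject out-of-range parameters and misordered QIS quantiles."""
    for name in UNIT_INTERVAL_PARAMS:
        if name in params and not (0 < params[name] < 1):
            raise ValueError(f"{metric}: {name}={params[name]!r} must be in (0, 1)")
    lower = params.get("lower_quantile")
    upper = params.get("upper_quantile")
    if lower is not None and upper is not None and lower >= upper:
        raise ValueError(
            f"{metric}: lower_quantile={lower!r} must be < upper_quantile={upper!r}"
        )
    return params


def call_metric(metric, func, task, pred_type, *args, **kwargs):
    """C-16: re-raise any metric failure as a ValueError that names its context."""
    try:
        return func(*args, **kwargs)
    except Exception as exc:
        raise ValueError(
            f"Metric {metric!r} failed for task={task!r}, pred_type={pred_type!r}: {exc}"
        ) from exc


def keep_steps(steps, max_allowed_step=float("inf")):
    """C-17: with an infinite sentinel, no step is dropped for being too large."""
    return [step for step in steps if step <= max_allowed_step]
```

With the old sentinel, keep_steps([1, 999, 1000], max_allowed_step=999) would drop step 1000 without any warning; with the float('inf') default it is kept. Chaining with `from exc` keeps the original traceback available while the new message identifies which metric failed.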

Also: update CICs (MetricCatalog, NativeEvaluator) and ADRs (011, 014) with Known Deviations sections documenting C-02 and C-05. Close C-14 (stale editable install metadata). Upgrade C-02 from Tier 3 to Tier 2. Risk register restructured (closed entries moved to Closed table).

Test plan

9 new TDD tests (written to fail first, then made to pass)
240 total tests passing
0 lint errors (ruff)
validate_docs.sh passes
review-diff: CLEAN (0 critical, 0 warning after fixes)

🤖 Generated with Claude Code

…ntext, step sentinel

Address C-16, C-17, C-18 identified by risk register review with TDD:
- C-16: Wrap metric function calls in _calculate_metrics() with try/except that re-raises as ValueError naming the metric, task, and pred_type
- C-17: Replace hardcoded max_allowed_step=999 with float('inf') so steps >= 1000 are not silently dropped
- C-18: Add bounds validation in resolve_metric_params() for alpha, quantile, lower_quantile, upper_quantile; all must be in (0, 1). Cross-validation for QIS lower_quantile < upper_quantile

Also: update CICs (MetricCatalog, NativeEvaluator) and ADRs (011, 014) with Known Deviations sections documenting C-02 and C-05. Close C-14 (stale editable install metadata). Upgrade C-02 from Tier 3 to Tier 2.

9 new tests, 240 total passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…s-schema consistency, C-10

- Add object-dtype rejection to EvaluationFrame._validate() (ADR-011 Pure NumPy contract)
- Remove 22 lines of dead pandas/object-dtype branches from _guard_shapes (closes C-10)
- Add 5 new tests: object-dtype rejection (2), malformed report dict (1), NaN metric detectability (1), cross-schema MSE consistency (1)
- 245 tests passing, risk register: 3 open concerns remain

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

test_evaluation_report.py imported pandas at module level, causing a collection error in CI where pandas is not installed (optional dependency via [dataframe] extra). Use pytest.importorskip to skip gracefully.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
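
As a rough illustration of the pytest.importorskip fix described above: the call returns the imported module when it is available and otherwise marks the test as skipped instead of raising ImportError. It can be used at module level (skipping the whole file at collection time) or inside a single test, as in this sketch. The test name and the report rows are hypothetical and do not come from test_evaluation_report.py.

```python
import pytest


def test_report_rows_to_dataframe():
    # Skip (rather than error) when the optional pandas dependency is missing.
    pd = pytest.importorskip("pandas")

    # Hypothetical report rows; the real report schema is not shown on this page.
    rows = [{"metric": "MSE", "value": 0.25}]
    frame = pd.DataFrame(rows)

    assert list(frame.columns) == ["metric", "value"]
```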

Polichinel and others added 3 commits April 4, 2026 02:07

Polichinel merged commit cf32c76 into development Apr 4, 2026
4 checks passed

Polichinel deleted the debug/cleanup03042026 branch April 4, 2026 00:32

Polichinel merged 3 commits into development from debug/cleanup03042026
