fix: close risk register concerns C-16, C-17, C-18 #17

Merged
Polichinel merged 3 commits into development from debug/cleanup03042026
Apr 4, 2026

Conversation

@Polichinel
Collaborator

Summary

  • C-16: Wrap metric function calls in _calculate_metrics() with try/except that re-raises as ValueError naming the metric, task, and pred_type
  • C-17: Replace hardcoded max_allowed_step=999 with float('inf') so steps >= 1000 are not silently dropped
  • C-18: Add bounds validation in resolve_metric_params() for alpha, quantile, lower_quantile, and upper_quantile: each must lie in the open interval (0, 1). Cross-validate QIS quantile ordering (lower_quantile < upper_quantile)
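
As a rough illustration of the C-16 and C-17 fixes, the sketch below wraps each metric call so that a failure surfaces with its context, and uses an unbounded default for the step cap. The function names `calculate_metrics` and `filter_steps`, their signatures, and the error message format are assumptions made for this example; only `max_allowed_step`, `float('inf')`, and the re-raise as ValueError come from the description above.

```python
from typing import Callable, Dict, List, Sequence

MetricFn = Callable[[Sequence[float], Sequence[float]], float]


def calculate_metrics(
    metric_fns: Dict[str, MetricFn],
    y_true: Sequence[float],
    y_pred: Sequence[float],
    *,
    task: str,
    pred_type: str,
) -> Dict[str, float]:
    """Hypothetical stand-in for _calculate_metrics() (C-16)."""
    results: Dict[str, float] = {}
    for name, fn in metric_fns.items():
        try:
            results[name] = fn(y_true, y_pred)
        except Exception as exc:
            # Re-raise with the metric, task, and pred_type named, and keep
            # the original exception chained for debugging.
            raise ValueError(
                f"Metric '{name}' failed for task='{task}', "
                f"pred_type='{pred_type}': {exc}"
            ) from exc
    return results


def filter_steps(
    steps: Sequence[int], max_allowed_step: float = float("inf")
) -> List[int]:
    """Hypothetical step filter (C-17): the default cap drops nothing."""
    return [step for step in steps if step <= max_allowed_step]
```

With the unbounded default, `filter_steps([1, 999, 1000, 1500])` keeps all four steps, whereas passing `max_allowed_step=999` would drop the last two, which is the silent loss C-17 describes.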

Also: update CICs (MetricCatalog, NativeEvaluator) and ADRs (011, 014) with Known Deviations sections documenting C-02 and C-05. Close C-14 (stale editable install metadata). Upgrade C-02 from Tier 3 to Tier 2. Risk register restructured (closed entries moved to Closed table).

Test plan

  • 9 new TDD tests (wrote failing first, then fixed)
  • 240 total tests passing
  • 0 lint errors (ruff)
  • validate_docs.sh passes
  • review-diff: CLEAN (0 critical, 0 warning after fixes)

🤖 Generated with Claude Code

Polichinel and others added 3 commits April 4, 2026 02:07
…ntext, step sentinel

Address C-16, C-17, C-18 identified by risk register review with TDD:

- C-16: Wrap metric function calls in _calculate_metrics() with try/except
  that re-raises as ValueError naming the metric, task, and pred_type
- C-17: Replace hardcoded max_allowed_step=999 with float('inf') so steps
  >= 1000 are not silently dropped
- C-18: Add bounds validation in resolve_metric_params() for alpha, quantile,
  lower_quantile, upper_quantile — all must be in (0, 1). Cross-validation
  for QIS lower_quantile < upper_quantile
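
The C-18 check could look roughly like the following. This is a minimal sketch, not the code in resolve_metric_params(): the helper name `validate_metric_params`, its signature, and the message wording are assumptions, and it applies the ordering check whenever both quantiles are present rather than only for QIS.

```python
from typing import Any, Dict

# Parameter names taken from the commit message; each must lie in (0, 1).
UNIT_INTERVAL_PARAMS = ("alpha", "quantile", "lower_quantile", "upper_quantile")


def validate_metric_params(metric_name: str, params: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical bounds check for metric parameters."""
    for key in UNIT_INTERVAL_PARAMS:
        if key not in params:
            continue
        value = params[key]
        # Open interval: 0 and 1 are rejected, and NaN fails both comparisons.
        if not (0.0 < value < 1.0):
            raise ValueError(
                f"{metric_name}: '{key}' must be in (0, 1), got {value!r}"
            )

    lower = params.get("lower_quantile")
    upper = params.get("upper_quantile")
    # Cross-validation: the lower quantile must be strictly below the upper one.
    if lower is not None and upper is not None and not lower < upper:
        raise ValueError(
            f"{metric_name}: lower_quantile ({lower!r}) must be less than "
            f"upper_quantile ({upper!r})"
        )
    return params
```

For example, `{"lower_quantile": 0.05, "upper_quantile": 0.95}` passes, while swapping the two values or setting `alpha` to 1.0 raises a ValueError.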

Also: update CICs (MetricCatalog, NativeEvaluator) and ADRs (011, 014) with
Known Deviations sections documenting C-02 and C-05. Close C-14 (stale
editable install metadata). Upgrade C-02 from Tier 3 to Tier 2.

9 new tests, 240 total passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…s-schema consistency, C-10

- Add object-dtype rejection to EvaluationFrame._validate() (ADR-011 Pure NumPy contract)
- Remove 22 lines of dead pandas/object-dtype branches from _guard_shapes (closes C-10)
- Add 5 new tests: object-dtype rejection (2), malformed report dict (1),
  NaN metric detectability (1), cross-schema MSE consistency (1)
- 245 tests passing, risk register: 3 open concerns remain
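
An object-dtype guard of the kind added to EvaluationFrame._validate() might be sketched as follows. The helper name `reject_object_dtype`, the choice of TypeError, and the message are assumptions for illustration; the PR does not show the actual implementation.

```python
import numpy as np


def reject_object_dtype(name: str, values) -> np.ndarray:
    """Hypothetical guard: convert to an ndarray and refuse object dtype."""
    arr = np.asarray(values)
    # Object arrays hold arbitrary Python objects (for example None mixed with
    # floats), so vectorized numeric metrics cannot be trusted on them.
    if arr.dtype == object:
        raise TypeError(
            f"'{name}' has object dtype; expected a numeric NumPy array"
        )
    return arr
```

Rejecting the array at validation time means downstream shape checks no longer need branches for pandas or object-dtype inputs.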

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

test_evaluation_report.py imported pandas at module level, causing
a collection error in CI where pandas is not installed (optional
dependency via [dataframe] extra). Use pytest.importorskip to skip
gracefully.
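
A minimal sketch of the pattern, assuming a hypothetical test name and body (the real contents of test_evaluation_report.py are not shown here). `pytest.importorskip` returns the imported module, or marks the test as skipped when the import fails:

```python
import pytest


def test_metrics_fit_in_dataframe():
    # Skips this test instead of erroring when pandas is not installed.
    pd = pytest.importorskip("pandas")
    frame = pd.DataFrame({"metric": ["mse"], "value": [0.25]})
    assert frame.loc[0, "value"] == 0.25
```

The same call can be placed at the top of a test module to skip every test in that file when the optional dependency is missing.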

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@Polichinel Polichinel merged commit cf32c76 into development Apr 4, 2026
4 checks passed
@Polichinel Polichinel deleted the debug/cleanup03042026 branch April 4, 2026 00:32