fix: close risk register concerns C-16, C-17, C-18 #17
Merged
Polichinel merged 3 commits into development on Apr 4, 2026
…ntext, step sentinel
Address C-16, C-17, C-18 identified by risk register review with TDD:
- C-16: Wrap metric function calls in _calculate_metrics() with try/except
that re-raises as ValueError naming the metric, task, and pred_type
- C-17: Replace hardcoded max_allowed_step=999 with float('inf') so steps
>= 1000 are not silently dropped
- C-18: Add bounds validation in resolve_metric_params() for alpha, quantile,
lower_quantile, upper_quantile — all must be in (0, 1). Cross-validation
for QIS lower_quantile < upper_quantile
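
The C-18 checks described above could look roughly like the following. This is a minimal sketch, not the code from this PR: `validate_metric_params` and `UNIT_INTERVAL_PARAMS` are hypothetical names, and the real `resolve_metric_params()` signature is not shown here.

```python
# Hypothetical helper illustrating the C-18 bounds checks; not the PR's actual code.
UNIT_INTERVAL_PARAMS = ("alpha", "quantile", "lower_quantile", "upper_quantile")


def validate_metric_params(metric_name, params):
    """Raise ValueError if a probability-like parameter falls outside (0, 1)."""
    for key in UNIT_INTERVAL_PARAMS:
        if key not in params:
            continue
        value = params[key]
        # `not (0 < value < 1)` also rejects NaN, because comparisons with NaN are False.
        if not (0 < value < 1):
            raise ValueError(
                f"{metric_name}: {key}={value!r} must be in the open interval (0, 1)"
            )

    # Cross-validation: when both bounds are given, the lower one must be strictly smaller.
    lower = params.get("lower_quantile")
    upper = params.get("upper_quantile")
    if lower is not None and upper is not None and not lower < upper:
        raise ValueError(
            f"{metric_name}: lower_quantile={lower!r} must be < upper_quantile={upper!r}"
        )
    return params
```

Failing at parameter-resolution time means a bad `alpha` or a swapped quantile pair is reported before any metric is computed.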
Also: update CICs (MetricCatalog, NativeEvaluator) and ADRs (011, 014) with
Known Deviations sections documenting C-02 and C-05. Close C-14 (stale
editable install metadata). Upgrade C-02 from Tier 3 to Tier 2.
9 new tests, 240 total passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
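
For illustration, the C-16 pattern (wrap each metric call and re-raise with context) might be sketched as below. The function name `calculate_metrics`, its arguments, and the message format are assumptions; the PR's `_calculate_metrics()` body is not shown on this page.

```python
# Hypothetical sketch of the C-16 error-context pattern; names and message are illustrative.
def calculate_metrics(metric_fns, y_true, y_pred, task, pred_type):
    """Run each metric, re-raising any failure as a ValueError with context."""
    results = {}
    for name, fn in metric_fns.items():
        try:
            results[name] = fn(y_true, y_pred)
        except Exception as exc:
            # `from exc` keeps the original traceback available as __cause__.
            raise ValueError(
                f"Metric '{name}' failed for task='{task}', "
                f"pred_type='{pred_type}': {exc}"
            ) from exc
    return results
```

Chaining with `from exc` preserves the underlying error, so the added context does not hide the root cause.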
…s-schema consistency, C-10
- Add object-dtype rejection to EvaluationFrame._validate() (ADR-011 Pure NumPy contract)
- Remove 22 lines of dead pandas/object-dtype branches from _guard_shapes (closes C-10)
- Add 5 new tests: object-dtype rejection (2), malformed report dict (1), NaN metric detectability (1), cross-schema MSE consistency (1)
- 245 tests passing, risk register: 3 open concerns remain
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
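
An object-dtype check of the kind this commit describes might look like the sketch below. The helper name `reject_object_dtype` and the choice of `TypeError` are assumptions; the actual `EvaluationFrame._validate()` code and the exception it raises are not shown here.

```python
import numpy as np


# Hypothetical helper; the exception type is an assumption, not taken from the PR.
def reject_object_dtype(name, values):
    """Return `values` as an ndarray, refusing arrays that fell back to object dtype."""
    arr = np.asarray(values)
    # Mixed or None-containing inputs are stored as generic Python objects,
    # which breaks vectorised numeric operations later on.
    if arr.dtype == object:
        raise TypeError(f"{name} has object dtype; expected a numeric NumPy array")
    return arr
```

Rejecting object arrays at validation time is what makes downstream special-casing for them unnecessary.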
test_evaluation_report.py imported pandas at module level, causing a collection error in CI, where pandas is not installed (it is an optional dependency via the [dataframe] extra). Use pytest.importorskip to skip gracefully.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
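
As a rough illustration of the `pytest.importorskip` approach, the test below skips instead of erroring when pandas is missing. The test name and the sample rows are made up for this sketch and do not come from test_evaluation_report.py.

```python
import pytest


def test_rows_convert_to_dataframe():
    # Skips this test (rather than failing at import) when pandas is absent.
    pd = pytest.importorskip("pandas")
    rows = [{"metric": "MSE", "value": 0.25}]  # illustrative data only
    df = pd.DataFrame(rows)
    assert list(df.columns) == ["metric", "value"]
```

Calling `pytest.importorskip("pandas")` at module level instead skips every test in the file, which suits a module that depends on pandas throughout.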
Summary
- C-16: Wrap metric function calls in _calculate_metrics() with try/except that re-raises as ValueError naming the metric, task, and pred_type
- C-17: Replace hardcoded max_allowed_step=999 with float('inf') so steps >= 1000 are not silently dropped
- C-18: Add bounds validation in resolve_metric_params() for alpha, quantile, lower_quantile, upper_quantile; all must be in (0, 1). Cross-validation for QIS quantile ordering

Also: update CICs (MetricCatalog, NativeEvaluator) and ADRs (011, 014) with Known Deviations sections documenting C-02 and C-05. Close C-14 (stale editable install metadata). Upgrade C-02 from Tier 3 to Tier 2. Risk register restructured (closed entries moved to Closed table).
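
To make the C-17 change concrete, the sketch below shows how a hardcoded ceiling of 999 silently discards later steps while `float('inf')` does not. `filter_steps` is a hypothetical stand-in, and the `<=` comparison is an assumption about how `max_allowed_step` is applied.

```python
# Hypothetical stand-in for the step filter; the real comparison may differ.
def filter_steps(steps, max_allowed_step=float("inf")):
    """Keep only steps at or below the ceiling; the default ceiling never excludes a step."""
    return [step for step in steps if step <= max_allowed_step]


steps = [1, 999, 1000, 5000]
kept_with_old_limit = filter_steps(steps, max_allowed_step=999)
kept_with_new_default = filter_steps(steps)
```

Every finite step compares as less than `float('inf')`, so the default acts as "no limit" without needing a separate `None` branch.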
Test plan
- validate_docs.sh passes

🤖 Generated with Claude Code