chore: hypothesis property tests for fill-value handling #990
Open
maxrjones wants to merge 2 commits into
What this PR adds
Property-based test coverage for the `fill-sentinel-values` issue cluster, plus a zarr-spec compliance test that asserts virtualizarr's correctness independently of xarray's HDF5-style `_FillValue` encoding.

Three test modules under `virtualizarr/tests/test_parsers/`:

- `test_zarr_spec_compliance.py`: primary correctness check. Asserts `ZarrParser` produces metadata matching `zarr-python`'s view of the source, with no xarray in the loop. 7/7 passing.
- `test_hdf/test_hdf_fill_value_equivalence.py`: HDFParser interop check against `xarray` + `h5netcdf`. Hypothesis-driven dtype × fill matrix plus curated `@example`s.
- `test_zarr_fill_value_equivalence.py`: ZarrParser interop check against `xr.open_zarr`.

Plus `_fill_value_common.py`: shared infrastructure (sentinels, hypothesis profiles, two-layer assertion helpers with attribution).

Hypothesis is added to the `dev` dependency group with `>=6.100`. Module-level `pytestmark = pytest.mark.hypothesis_tests` allows skipping via `pytest -m "not hypothesis_tests"`. `VIRTUALIZARR_HYPOTHESIS_PROFILE=ci` shrinks the random-draw count for fast CI runs.

What this PR doesn't change
No production code. No parser fixes. This is test infrastructure only. Failing tests are the to-fix list, tracked in https://github.com/NASA-IMPACT/veda-odd#371.
Why it matters
Test failures are attributed by category, so reviewers and maintainers can tell what to fix and where:
- `BothEnginesFailedIdenticallyError`: upstream xarray gap (`FillValueCoder`, tracked at pydata/xarray#11332), not virtualizarr
- observed (virtualizarr) failed; reference succeeded
- `assert_identical` diff

This separates virtualizarr-side bugs from upstream xarray gaps cleanly. For example, #989's `_FillValue=0.0`-as-JSON-number case shows up as `BothEnginesFailedIdenticallyError` in the equivalence module, and the spec-compliance module passes on the same fixture, proving the gap is upstream (pydata/xarray#11332).

Current results
The 10 failures decompose as:
- `TestCompoundDtype`: parser-side `TypeError` at `hdf.py:364` (new finding; tracked in roadmap as Cluster A).
- `TestBasicEquivalence::test_equivalence_curated[*]` (HDF): string-dtype `_FillValue`. Improved attribution surfaced that this is a parser-side encode error, not the downstream decode error previously documented.
- `TestBasicEquivalence::test_equivalence_random` (HDF): random draws hitting the same string-dtype cluster.
- `TestBasicEquivalence::test_equivalence_curated[float64-nan-fill]` (Zarr): upstream xarray decode gap, pydata/xarray#11332 (also user-reported as #989, "FillValue decoding unexpected fill_value: null (on version 2.6.1)").
- `TestBasicEquivalence::test_equivalence_random` (Zarr): random draws hitting the same xarray decode gap.

Each failure points at one of the clusters in the roadmap.
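The attribution described under "Why it matters" can be pictured with a small sketch. It is not the helper in `_fill_value_common.py`: only the name `BothEnginesFailedIdenticallyError` comes from this PR, while `ObservedOnlyFailedError`, `attribute_and_compare`, and the rule for calling two failures "identical" (same exception type and message) are assumptions made for illustration.

```python
from typing import Any, Callable, Optional, Tuple


class BothEnginesFailedIdenticallyError(AssertionError):
    """Observed and reference engines raised the same error (assumed base class)."""


class ObservedOnlyFailedError(AssertionError):
    """Hypothetical name: the observed engine failed, the reference succeeded."""


def _try(open_fn: Callable[[], Any]) -> Tuple[Any, Optional[Exception]]:
    # Run one engine and capture its exception instead of propagating it.
    try:
        return open_fn(), None
    except Exception as exc:
        return None, exc


def attribute_and_compare(
    open_observed: Callable[[], Any],
    open_reference: Callable[[], Any],
    compare: Callable[[Any, Any], None],
) -> None:
    """Open a source through both engines and attribute any failure."""
    observed, obs_exc = _try(open_observed)
    reference, ref_exc = _try(open_reference)

    if obs_exc is not None and ref_exc is not None:
        # Assumption: "identical" means same exception type and same message.
        if type(obs_exc) is type(ref_exc) and str(obs_exc) == str(ref_exc):
            raise BothEnginesFailedIdenticallyError(
                f"both engines raised {type(obs_exc).__name__}: {obs_exc}"
            ) from obs_exc
        raise AssertionError(
            f"engines failed differently: observed={obs_exc!r}, reference={ref_exc!r}"
        ) from obs_exc
    if obs_exc is not None:
        raise ObservedOnlyFailedError(
            f"observed failed; reference succeeded: {obs_exc!r}"
        ) from obs_exc
    if ref_exc is not None:
        raise AssertionError(
            f"reference failed; observed succeeded: {ref_exc!r}"
        ) from ref_exc

    # Both engines opened the source: any remaining failure is a content diff.
    compare(observed, reference)
```

In a real test, `open_observed` would open the source through the virtualizarr parser, `open_reference` through xarray, and `compare` would be something like `xarray.testing.assert_identical`.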
Notable findings surfaced while building this
- [...] `fill_value`) opens cleanly through both engines. Worth a separate verification, but suggests it was silently fixed.
- [...] `_FillValue` is virtualizarr-specific, not upstream. The improved attribution showed the failure is an encode error at parser manifest construction, not a decode error in xarray.

How to run
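As a rough sketch of how the `VIRTUALIZARR_HYPOTHESIS_PROFILE` switch could be wired (the actual profiles live in `_fill_value_common.py` and may differ): the `dev` profile name, the `max_examples` values, and the helper names below are assumptions. `settings.register_profile` and `settings.load_profile` are Hypothesis's own profile API.

```python
import os
from typing import Mapping

ENV_VAR = "VIRTUALIZARR_HYPOTHESIS_PROFILE"

# Assumed values: only the "ci" name, and the fact that it draws fewer
# examples, come from the PR description.
PROFILES = {
    "dev": {"max_examples": 200},
    "ci": {"max_examples": 25},
}


def resolve_profile(environ: Mapping[str, str] = os.environ) -> str:
    """Return the profile name selected by the environment ("dev" if unset)."""
    name = environ.get(ENV_VAR, "dev")
    if name not in PROFILES:
        raise ValueError(
            f"unknown Hypothesis profile {name!r}; expected one of {sorted(PROFILES)}"
        )
    return name


def load_selected_profile() -> str:
    """Register every profile with Hypothesis and activate the selected one."""
    # Imported lazily so resolve_profile() stays usable without Hypothesis.
    from hypothesis import settings

    for name, kwargs in PROFILES.items():
        settings.register_profile(name, **kwargs)
    selected = resolve_profile()
    settings.load_profile(selected)
    return selected
```

Under these assumptions, `VIRTUALIZARR_HYPOTHESIS_PROFILE=ci pytest virtualizarr/tests/test_parsers/` would select the smaller draw count, and `pytest -m "not hypothesis_tests"` skips the marked modules entirely.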
Refs
- [...] `fix/problem_fillvalues`): partial Phase 1 fix (currently not in this branch's history; this PR sits on `main`).

Acceptance criteria:

- `docs/releases.md`
- `*.md` file under `docs/api`