refactor: unify coords-as-truth handling in add_variables/add_constraints#732
Conversation
`add_variables` had two related bugs when `lower`/`upper` were arrays: - pandas Series/DataFrame bounds missing a dimension in `coords` had the missing dimension silently dropped (#709), unlike DataArray bounds which were already broadcast. - DataArray bounds missing a dimension were expanded with `DataArray.expand_dims`, which prepends new dimensions and produces a `coords`-mismatched dimension order in the resulting variable (#706). The order depended on the type of the bounds, so scalar bounds worked but two array bounds missing the same dimension did not. Replace `_validate_dataarray_bounds` plus the downstream `as_dataarray(..., coords)` call with a single helper `_as_dataarray_in_coords`. It converts any input (pandas with named axes via `to_xarray`, otherwise via `as_dataarray`), validates the result against `coords`, expands missing dims, transposes to coords order, and reconstructs the coord variables in that order. `expand_dims` and `transpose` are no-ops when the array already matches, so scalar / full-dim DataArray bounds keep their fast path. Also fix `linopy.piecewise._broadcast_points`, which built the `expand_dims` map from a `set`, producing a hash-randomized dimension order across processes. Iterate expressions and dims in declaration order instead. Closes #706 and #709. Supersedes #710 and #719. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Restate #706/#709's fix as a single principle in the docstring, release note, and `_as_dataarray_in_coords` helper docstring: when `coords` is provided to `add_variables`, it is the source of truth for dimensions, dimension order, and coordinate values, and `lower` / `upper` are broadcast and aligned to match. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
0.7.0 already shipped "add_variables no longer ignores coords when lower / upper are DataArrays". Recast the new bullet as extending that fix to the remaining gaps (pandas bounds; dim order across bound types) so the continuity is visible from the release notes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ng gaps" Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Parametrize test_bound_broadcast_missing_dim with three additional cases: Series with MultiIndex(time, colour), DataFrame with MultiIndex columns(space, colour), and DataFrame with MultiIndex index(time, space). Exercises the `while DataFrame: unstack()` loop and the MultiIndex branch of `_named_pandas_to_dataarray`. - Add test_dataarray_coord_reorder for the same-values-different-order reindex branch (previously only the unequal-values raise was covered). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Relocate `_as_dataarray_in_coords` and its helpers (`_coords_to_dict`, `_named_pandas_to_dataarray`) from `model.py` into `common.py`, alongside the existing `as_dataarray` they parallel. Rename to `as_dataarray_in_coords` (no leading underscore) since it is no longer file-local — other modules can import the strict-coords variant when migrating call sites. Pure relocation: no behavior change, no call-site changes beyond `add_variables`'s import. Refs #723. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…anches Replace the unstack-while-loop / split named-check structure with a single up-front "all axes named" check and a single ``DataFrame.stack(level=list(range(nlevels)), future_stack=True)`` call that collapses all column levels into the row MultiIndex in one shot. Same observable behaviour, fewer moving parts, no defensive unreachable branches. Add tests covering the unnamed-axis fall-through path, the empty-coords short-circuit in ``as_dataarray_in_coords``, and the ``MultiIndex``-on-a-dim ``continue`` in the validation loop. Together with the restructure these bring the new helper code to full patch coverage. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pandas allows any hashable in ``pd.Index.names`` (tuples, ints, etc.), but only strings map cleanly to xarray dim names. Reject anything non-string up front so the pandas falls back to ``as_dataarray`` instead of producing a DataArray with an awkward non-string dim name that downstream validation would reject with a confusing "extra dimensions" error. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Inputs without their own meaningful labels — numpy arrays, polars Series, pandas with unnamed axes — fell through ``as_dataarray_in_coords`` via a short-circuit return. That meant: - The default ``dim_0`` / ``dim_1`` axis names from ``as_dataarray`` leaked into the result, so a pandas Series without an index name combined with another bound carrying a named coord produced a spurious 2-D variable. - Shape mismatches surfaced further downstream as confusing "coordinates do not match" errors against the auto-generated ``RangeIndex``. The fall-through now: (a) defaults ``dims`` to coords' keys so axes get labelled correctly; (b) runs the same validate / expand / transpose path as labelled inputs; (c) re-assigns coords from ``expected`` on the resulting DataArray so positional inputs align to coords by position. A shape mismatch surfaces as xarray's clear ``conflicting sizes`` from ``assign_coords``. MultiIndex coords are left alone (re-assigning a PandasMultiIndex emits a FutureWarning). Replaces the tautological ``test_pandas_bound_with_unnamed_axis_falls_through`` (which sneaked past by naming the coord ``"dim_0"`` to match the auto-generated dim) with ``test_positional_bound_aligns_to_coords`` that asserts actual positional alignment across numpy / Series / DataFrame, plus ``test_positional_bound_wrong_size_raises_clear_error`` for the shape-mismatch path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…o_dict ``reformulate_sos1`` / ``reformulate_sos2`` built the coords for the indicator variable as ``[var.coords[d] for d in var.dims]``, which is a list of ``xarray.DataArray`` coord objects. The rest of linopy passes ``coords`` as a list of ``pd.Index``. The mix slipped through under the old short-circuit fall-through but broke once the helper started defaulting ``dims`` from ``_coords_to_dict(coords)`` — non-``pd.Index`` entries were silently dropped, so ``len(dims) < len(coords)`` and xarray raised ``different number of dimensions on data and dims: 2 vs 1``. Use ``var.indexes[d]`` instead — it returns the actual ``pd.Index`` (regular or MultiIndex) for the dim and preserves structure that ``pd.Index(coord.values, ...)`` would flatten. Also widen ``_coords_to_dict`` to accept any entry with a ``.name`` (xarray DataArrays included) so a future caller passing mixed types doesn't silently lose coords. The reformulator fix removes the only known producer of mixed-type coords; this is belt-and-suspenders. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the permissive ``getattr(c, "name", None)`` check with an explicit allow-list: ``pd.Index`` (named or not — unnamed silently skip as before) and unnamed sequences (``list`` / ``tuple`` / ``range`` / ``numpy.ndarray``). Any other type (notably ``xarray.DataArray``, but also ``pd.Series`` and friends) now raises ``TypeError`` with a hint to pass ``variable.indexes[<dim>]`` instead. This would have caught the SOS-reformulator bug at the source instead of letting it surface as a confusing xarray error about mismatched dim counts ten frames down. Drop ``DataArray`` from the matching ``coords`` type hints in ``model.py`` / ``expressions.py`` so the documented and runtime type sets agree. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- _coords_to_dict: explicitly handle pd.MultiIndex — register under .name if set, raise TypeError with guidance if .name is missing - _named_pandas_to_dataarray: use DataArray(df) directly for single-level DataFrames; reserve stack() for MultiIndex axes - as_dataarray_in_coords: validate MultiIndex dims with .equals() instead of silently skipping them - Move MultiIndex tests into dedicated TestAddVariablesMultiIndexCoords class with shared fixture
…nts (#725) * fix(model): apply coords-as-truth rule to mask in add_variables/add_constraints Routes ``mask`` through ``as_dataarray_in_coords(mask, data.coords)`` instead of ``as_dataarray(...) + broadcast_mask(...)``, so pandas ``Series`` / ``DataFrame`` masks missing a dimension are broadcast to the variable / constraint shape (parallel to the bounds fix in the previous PR). The ``add_variables`` ``mask`` type hint widens to ``MaskLike`` to match ``add_constraints``. The deprecation announced via ``FutureWarning`` in ``broadcast_mask`` ("Missing values will be filled with False ... In a future version, this will raise an error") is now in effect: masks whose coordinates are a sparse subset of the data's coordinates raise ``ValueError`` instead of silently filling missing entries. Mask dims not in the data raise ``ValueError`` instead of ``AssertionError`` for consistency with the bounds path. ``broadcast_mask`` had no other callers and is removed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Update doc/release_notes.rst Co-authored-by: Fabian Hofmann <fab.hof@gmx.de> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Fabian Hofmann <fab.hof@gmx.de>
…on (#726) * fix(model): apply coords-as-truth rule to mask in add_variables/add_constraints Routes ``mask`` through ``as_dataarray_in_coords(mask, data.coords)`` instead of ``as_dataarray(...) + broadcast_mask(...)``, so pandas ``Series`` / ``DataFrame`` masks missing a dimension are broadcast to the variable / constraint shape (parallel to the bounds fix in the previous PR). The ``add_variables`` ``mask`` type hint widens to ``MaskLike`` to match ``add_constraints``. The deprecation announced via ``FutureWarning`` in ``broadcast_mask`` ("Missing values will be filled with False ... In a future version, this will raise an error") is now in effect: masks whose coordinates are a sparse subset of the data's coordinates raise ``ValueError`` instead of silently filling missing entries. Mask dims not in the data raise ``ValueError`` instead of ``AssertionError`` for consistency with the bounds path. ``broadcast_mask`` had no other callers and is removed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor: unify as_dataarray; split broadcasting from coords validation Closes #723. Folds the body of `as_dataarray_in_coords` into `as_dataarray` and extracts the contract checks into `assert_compatible_with_coords`, so linopy now has one broadcasting primitive and one validation companion. `as_dataarray(arr, coords)` aligns the result against `coords` for every input type: labels positional inputs (numpy / unnamed pandas / scalar) by position, reindexes same-values-different-order, expands missing dims, and transposes to coords order. Extra dims and disagreeing value sets on shared dims pass through unchanged, so xarray broadcasting in expression arithmetic keeps working. `assert_compatible_with_coords(arr, coords)` enforces the strict contract (`arr.dims ⊆ coords.dims`, plus exact coord-value equality on shared dims). `add_variables` and `add_constraints` now call it after `as_dataarray` for `lower` / `upper` / `mask`, replacing the deleted `as_dataarray_in_coords` helper. `_coords_to_dict` filters MultiIndex level coords out of `xarray.Coordinates` inputs so the new strict-by-default path treats `station` (and not its derived `letter` / `num` levels) as the dim. Test suite: 3698 passed (no regressions). Two existing tests were updated to reflect the new "coords is source of truth" semantics: `test_as_dataarray_with_ndarray_coords_dict_set_dims_not_aligned` (extra coord entries now broadcast in) and `test_dataarray_extra_dims` (now triggers the subset check rather than the value-mismatch check). Microbenchmark in dev-scripts/benchmark_as_dataarray.py shows flat timings vs the base branch on both add_variables-heavy and arithmetic- heavy workloads. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat: dims= names unnamed coords; doctest the add_variables contract Closes a silent-failure gap in the strict coords-as-truth path: when the caller passed ``coords=[[1, 2, 3]], dims=["x"]`` to ``add_variables``, ``_coords_to_dict`` returned an empty mapping (unnamed sequences carry no dim name), so the strict checks short-circuited and bounds with extra dims or mismatched values flowed through unchecked, producing variables with frankenstein outer-joined coord values. ``_coords_to_dict`` now accepts an optional ``dims`` argument that names unnamed sequence entries by position. ``as_dataarray`` and ``assert_compatible_with_coords`` plumb it through; ``add_variables`` forwards ``kwargs.get("dims")`` to the assertions for ``lower`` and ``upper``. ``coords=[[1, 2, 3]], dims=["x"]`` now enforces the same contract as ``coords={"x": [1, 2, 3]}`` or ``coords=[pd.Index([1, 2, 3], name="x")]``. Docstring of ``add_variables.coords`` documents the contract (subset-of-dims, dim order, value match with auto-reindex, missing-dim broadcast) and includes four doctests pinning it: the extra-dim raise, the value-mismatch raise, the same-values-different-order auto-reindex, and the unnamed-coords-plus-dims opt-in. Test suite: 3698 passed (parity with the previous commit on this branch). ``pytest --doctest-modules linopy/model.py -k add_variables`` also green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat: add align_to_coords with semantic validation error messages Introduce align_to_coords to wrap as_dataarray and assert_compatible_with_coords with user-facing labels (lower bound, upper bound, mask). Errors now name the argument and distinguish extra dimensions, coordinate mismatches, and conversion failures. Extend mask validation to use coords+dims= when provided. Co-authored-by: Cursor <cursoragent@cursor.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refactor(model): simplify mask align; preserve TypeError in align_to_coords Three cleanups on top of align_to_coords: - Drop the trailing ``.broadcast_like(data.labels)`` in ``add_variables`` and ``add_constraints`` mask paths. ``as_dataarray`` already expands missing dims to ``coords`` shape, so the broadcast was a no-op. - Stop overriding the caller's ``dims=`` in the ``add_variables`` mask path when ``coords is None``. The previous code stripped ``dims`` and forced ``dims=data.dims``; with ``data.coords`` being an xarray ``Coordinates`` with already-named dims, the user's ``dims`` is harmless to forward and the override was just hiding intent. Mask now goes through one ``align_to_coords`` call regardless of whether ``coords`` is supplied. - Split the exception handler in ``align_to_coords``: ``TypeError`` from unsupported input types is re-raised as ``TypeError`` (still labeled), while ``ValueError`` / ``CoordinateValidationError`` stay ``ValueError``. Preserves the original type signature for callers that want to ``except TypeError``. New test ``test_align_to_coords_preserves_type_errors`` pins the TypeError pass-through. Suite: 3703 passed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor: rename assert_compatible_with_coords to validate_alignment Per PR review: align on the project's `validate_*` naming convention and remove the implicit "AssertionError" connotation of `assert_*`. Pairs naturally with `align_to_coords`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
#729 made `.name` required on `pd.MultiIndex` sequence-form coord entries (xarray needs a single dim name for the flattened index). test_repr.py was the only remaining call site missing the assignment. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`xarray.Coordinates.dims` is typed `Hashable`, so the dict-comprehension return and the `sorted()` calls in the validation message tripped mypy. The function's other branches already accept `c.name` / `dim_names[i]` (both Hashable), so widening the return type is the honest signature. Also: drop `.data` from the add_variables doctest — use the public `v.lower` property instead. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@FabianHofmann Although the breaking changes are quite small, id like to test this against pypsa and flixopt, to make sure they dont use any of the now raising paths. |
…#733) * refactor(common): clarify coords-entry rules and tighten error labels Stacks on top of #732. Three small follow-ups from PR review: - Remove dead `broadcast_mask` (claimed removed in #732, was still present). - `as_dataarray`: normalize bare-tuple coord entries to lists so `coords=[(0, 1, 2)]` behaves identically to `coords=[[0, 1, 2]]` (xarray reads `(a, b)` as `(dim_name, values)` and would otherwise raise a confusing error). - `align_to_coords`: pre-validate coords via `_coords_to_dict` so TypeErrors from a bad `coords` argument propagate with their own message instead of being relabeled "<label> could not be aligned to coords: ...", which previously misdirected users to inspect the bound/mask. Docs: replace the prose paragraph in `_coords_to_dict`'s docstring with an explicit rules table covering every container form and sequence-entry case (named/unnamed `pd.Index`, `pd.MultiIndex`, bare sequences, with/without positional `dims=`). Tests: new `TestCoordsToDictRules` class in `test_common.py` mirrors the docstring table one-test-per-rule so the executable spec stays visibly aligned with the documented contract. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(common): allow dims= to name an unnamed pd.MultiIndex Mirrors the existing rule for unnamed pd.Index: an unnamed MultiIndex paired with a positional dims=[i] entry now gets its flat .name set to dims[i] on a shallow copy (caller's MultiIndex is not mutated). Per-level names are preserved. Removes the asymmetry between Index and MultiIndex in _coords_to_dict: both can now be named either inline (.name) or by position (dims=[i]). An unnamed MultiIndex with no positional dims still raises TypeError since xarray requires a single flat name. Adds one rule-table row and two tests (test_unnamed_multiindex_with_dims_uses_dims, test_unnamed_multiindex_without_dims_raises). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(common): scope tuple-normalize check to lists/tuples with tuple entries The previous `not isinstance(coords, Coordinates | Mapping)` form was broad and rebuilt `coords` as a fresh list on every call (even when no tuple entries were present). Switch to a positive `isinstance(coords, list | tuple)` guard with a short-circuit `any(isinstance(c, tuple) for c in coords)` check, so the comprehension only runs when there is actually a tuple to normalize. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…anges Two pre-existing Upcoming-Version bullets from master had been dropped on this branch, most likely as merge-conflict casualties: - ``LinearExpression.where`` doc + ``BaseExpression.variable_names`` entry - Mosek basic/IPM solution-inspect fix Restore both verbatim from master. Also add an explicit Breaking Changes bullet for the coord-as-truth behaviour changes that previously lived only under Bug Fixes: the mask FutureWarning -> ValueError flip, the AssertionError -> ValueError flip on extra mask dims, and the new TypeError on an unnamed pd.MultiIndex without a positional dims=[i] entry. The Bug Fixes entries still carry the migration detail; the Breaking Changes bullet points there so readers scanning by section don't miss the rename of warnings to errors. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Merge the two Bug Fixes bullets (bounds + mask) into one. The separation read as "same fix, applied twice" without adding info; one bullet covers both with the same migration detail. - Shorten the Breaking Changes bullet — it duplicated the v0.6.3 ``FutureWarning`` and ``AssertionError`` parentheticals already in Bug Fixes; keep only the FutureWarning summary and the pd.MultiIndex addition. - Collapse the Internal as_dataarray bullet from 8 wrapped lines to one, and drop the "Validation errors name the argument" UX detail — accurate but not structural enough for a release note. No facts removed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@FabianHofmann This is the most relevant PR now i think, to continue working on the others |
|
reviewing now, we should be able to pull this in today |
|
@FabianHofmann I tested this again with flixopt and im getting 0 failed tests |
for more information, see https://pre-commit.ci
|
@FabianHofmann
|
|
Per review feedback on #717, the convention now names its object scope: operations between a linopy object and any non-linopy operand behave exactly like operations against the constant-only expression holding the same values and coordinates, for every operator and operand position. The alignment group resolves the previously open question on unlabeled data: unlabeled operands (numpy arrays, lists, polars Series) pair with dimensions by size, and the pairing must be unambiguous — a length-4 array against dims (a: 4, b: 4) raises instead of silently picking the leading dim. Marked TODO: implementation builds on the coords-as-truth seam from #732 and lands after it. Slice H tests pin the substitutability that already holds: per-operator raw-vs-wrapped equivalence, distributivity/associativity across mixed operand types, identical §8 raises on either route, and the type-decided divisor exception. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
@FabianHofmann I think this surfaces a design weakness. I propose splitting the conversion logic once more. Ill do a quick PR for it. |
The function never aligns anything — it broadcasts and raises on any mismatch it cannot resolve by broadcasting alone. "Align" is also the word that invites join= proposals (aligns take joins, broadcasts do not), so the name now states what it is: the same broadcast as broadcast_to_coords with a strict failure mode (zip(strict=True) semantics). Error messages keep the "could not be aligned to coords" wording so tests in the base branch (#732) stay untouched. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…strict=...) Per review discussion: one public function instead of two, with strict as a keyword flag. - strict=True (default): any mismatch with coords raises, naming label in the error — the former strict_broadcast_to_coords. - strict=False: mismatches pass through for downstream xarray alignment — the former loose broadcast_to_coords, used by arithmetic. Strict is the default so that forgetting the flag adds safety rather than silently dropping validation. MI handling preserved exactly per mode (strict: silent partial / raise on gap; non-strict: EvolvingAPIWarning) — the scenario-B deprecation warnings land separately in #732. Call sites: model.py bounds/mask drop the long name (strict is default); arithmetic and as_expression pass strict=False explicitly. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… ladder (#737) * refactor(common): split DataArray conversion into a 3-rung strictness ladder Replace the as_dataarray + _as_dataarray_lax pair (and the enforce_level_coverage flag) with three public entry points, each including the previous one: - as_dataarray: convert only (the former _as_dataarray_lax). Used by __matmul__, where dims missing from the constant must not be broadcast in (they would be contracted away as common dims). - broadcast_to_coords: convert + broadcast against coords (the former broadcasting as_dataarray). Used by expression arithmetic. - align_to_coords: convert + broadcast + enforce the coords contract. Used by add_variables / add_constraints (unchanged signature). The broadcasting mechanics live in one shared private core (_broadcast_core) that reports MultiIndex-level projections instead of applying policy. The entry points decide what a partial projection or coverage gap means: broadcast_to_coords warns (arithmetic convention), align_to_coords raises (coords contract). This removes the enforce_level_coverage flag and keeps validation concerns out of the broadcasting layer. No behavior changes; all call sites keep their semantics. New tests pin the ladder contrasts and the matmul dim-contraction rules. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * docs: shorten release-notes bullet on conversion helpers Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * refactor(common): rename _broadcast_core to _broadcast_to_coords Private-twin convention: _broadcast_to_coords is the raw implementation of broadcast_to_coords (returns projection events instead of applying policy), shared with align_to_coords. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(expressions): as_expression converts constants with the broadcast rung The constraint lhs/rhs setters call as_expression(value, model, coords=self.coords, dims=self.coord_dims); forwarding those kwargs to the convert-only as_dataarray dropped the broadcasting these setters relied on (e.g. a MultiIndex-level-indexed rhs failed with an xarray AlignmentError instead of being projected onto the stacked dim). Use broadcast_to_coords instead. The other as_expression callers pass only dims (no coords), for which both rungs behave identically. Adds regression tests for the rhs setter: missing-dim broadcast and MultiIndex-level projection. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * refactor(common): rename align_to_coords to strict_broadcast_to_coords The function never aligns anything — it broadcasts and raises on any mismatch it cannot resolve by broadcasting alone. "Align" is also the word that invites join= proposals (aligns take joins, broadcasts do not), so the name now states what it is: the same broadcast as broadcast_to_coords with a strict failure mode (zip(strict=True) semantics). Error messages keep the "could not be aligned to coords" wording so tests in the base branch (#732) stay untouched. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * refactor(common): apply review polish to the strictness ladder - Document the one non-obvious policy in strict_broadcast_to_coords: partial-level broadcasts are silent (bounds-broadcast feature), unlike the warning on the broadcast rung. - Unify the first parameter name across the ladder (value -> arr). - Un-invert the warning-policy loop in broadcast_to_coords. - Rename the test whose name forced an awkward signature wrap to a behavior-oriented name (test_extra_dims_pass_broadcast_rung_fail_strict_rung). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * docs(common): add numpydoc Parameters/Returns to the three public rungs Parameter entries carry descriptions only — types live in the function signatures. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * refactor(common): unify the broadcast rungs into broadcast_to_coords(strict=...) Per review discussion: one public function instead of two, with strict as a keyword flag. - strict=True (default): any mismatch with coords raises, naming label in the error — the former strict_broadcast_to_coords. - strict=False: mismatches pass through for downstream xarray alignment — the former loose broadcast_to_coords, used by arithmetic. Strict is the default so that forgetting the flag adds safety rather than silently dropping validation. MI handling preserved exactly per mode (strict: silent partial / raise on gap; non-strict: EvolvingAPIWarning) — the scenario-B deprecation warnings land separately in #732. Call sites: model.py bounds/mask drop the long name (strict is default); arithmetic and as_expression pass strict=False explicitly. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(common): require label when broadcast_to_coords is strict Restores the contract align_to_coords always had: strict-mode errors must name their subject ("lower bound could not be aligned..." rather than "Value could not be aligned..."). Enforced both statically (overloads: strict=True requires label: str, strict=False forbids it) and at runtime (TypeError). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(common): deprecate implicit MI-level projection everywhere (scenario B) Per the #737 review discussion and Fabian's decision: implicit level projection is deprecated and will raise under the v1 convention, so the EvolvingAPIWarning now fires in both modes of broadcast_to_coords — the MI check is the same for every use case: - input missing a whole level: warn (strict and non-strict) - coverage gap (level combinations without a value): warn (non-strict) / raise (strict — no downstream layer to defer the NaN to) Warning emission lives in one helper, _warn_implicit_projections, with a TODO(#738) to migrate to LinopySemanticsWarning once #717 lands. Also clarifies the MultiIndex terminology everywhere: an MI dim has *levels* and *level combinations* (one tuple per position). Docstrings carry the glossary, the coverage-gap error names the missing combinations explicitly, and "entry" is gone from messages. User-facing: add_variables / add_constraints with per-period-style bounds now emit the deprecation warning (PyPSA multi-investment). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ray-coords breaking change - The implicit MultiIndex-level projection deprecation (scenario B) now has its own entry under Deprecations, where PyPSA users scanning for upcoming warnings will find it. - Breaking Changes gains the CoordsLike narrowing: DataArray entries in sequence-form coords raise TypeError (pass variable.indexes[dim] instead). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Note Claude Code review finding — for the MI follow-up, not blocking this PR (the hole pre-dates the #737 merge). Known hole: unnamed-MultiIndex-level inputs bypass strict validationA bound indexed by a MultiIndex with unnamed levels passes idx = pd.MultiIndex.from_product([[2020, 2030], ['t1', 't2']], names=('period', 'timestep'))
idx.name = 'snapshot'
coords = xr.Coordinates.from_pandas_multiindex(idx, 'snapshot')
# unnamed levels — note: no names=(...)
sparse_unnamed = pd.Series({(2020,'t1'): 1.0, (2030,'t2'): 2.0})
broadcast_to_coords(sparse_unnamed, coords, dims=['snapshot'], label='lower bound')
# → returns a 2-entry result, NO error — should raise (entries don't match the 4-entry coords)With named levels the same input correctly raises: Cause: the unnamed-level MI can't be matched to the coords' levels by name, so no projection happens; the input keeps its own 2-entry MultiIndex under the Suggested fix direction (for the MI follow-up): in Real users (PyPSA) always have named levels, so low practical impact — but "strict" shouldn't have a silent bypass. |
|
@FabianHofmann can you look into the discovered MI gap? |
|
yes, running pypsa CI now and report back |
|
CI on pypsa runs through, great. just pushing a fix on MIs |
validate_alignment unwrapped only bare pd.MultiIndex coord entries, so Coordinates-backed (DataArray) MI dims read as non-MI and skipped the equality check. Use _as_multiindex on both sides to catch mismatches regardless of level names.
|
@FBumann let's merge this. I would suggest we make a follow up PR only doing some refactoring:
|
…ts (#742) * refactor: move conversion/broadcasting/alignment code to linopy/alignment.py Pure move, no behavior change. The new module owns the seam between user input and linopy's labelled arrays: - coords parsing: _coords_to_dict, _as_index, _as_multiindex - conversion: get_from_iterable, pandas_to_dataarray, numpy_to_dataarray, _named_pandas_to_dataarray, fill_missing_coords, as_dataarray - MultiIndex projection: _LevelProjection, _project_onto_multiindex_levels, _warn_implicit_projections - broadcasting: _broadcast_to_coords, broadcast_to_coords - validation: validate_alignment - symmetric alignment: align common.py keeps the general utilities (formatting, label indexes, polars helpers, decorators). Importers (model, expressions, variables, __init__, tests) updated; no re-exports. Follow-up requested by @FabianHofmann in #732. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test: restructure alignment tests into test_alignment.py One class per concept in linopy.alignment, mirroring the module's public surface: - TestAsDataarrayFrom{Pandas,Numpy,Scalar,DataArray} + MultiIndexCoords - TestCoordsToDict (the coords-entry naming rules) - TestBroadcastToCoords (strict=False mechanics) - TestMultiIndexProjection (projection values, deprecation warnings, coverage gaps — the legacy/v1 fork point for #717) - TestStrictMode (strict=True contract) - TestValidateAlignment - TestAlign Shared fixtures (mi_index / mi_coords / by_level1) replace the repeated MultiIndex setup; the pandas dims-naming and numpy labeling tests are consolidated into parametrized tables. test_common.py keeps the utility tests. Full suite count unchanged (3202) — no coverage lost. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test: close coverage gaps in alignment.py and the MI coords serialization alignment.py: 97% -> 99%. New edge-case tests: bare-string dims, 0-d arrays, fill_missing_coords type check, partially-named MI levels, gap detection with extra dims, gap-error truncation (>5 missing combinations). The two remaining uncovered lines are defensive branches for inputs outside the DimsLike contract (non-iterable dims). common.py: 88% -> 90%. The MultiIndex round-trip through coords_to_dataset_vars / coords_from_dataset (used by CSRConstraint) had zero coverage; now pinned. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * Update test/test_alignment.py Co-authored-by: Fabian Hofmann <fab.hof@gmx.de> * Update test/test_alignment.py Co-authored-by: Fabian Hofmann <fab.hof@gmx.de> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Co-authored-by: Fabian Hofmann <fab.hof@gmx.de>
Brings in the coords-as-truth stack (#732), the alignment.py module split (#742), has_terms (#743), and the MatrixAccessor caching (#716). Conflict resolutions (consistent rule: keep this branch's v1/legacy dispatch structure, use master's conversion calls inside it): - expressions.py: _add_constant, _apply_constant_op_{v1,legacy}, and to_constraint use broadcast_to_coords(..., strict=False) instead of as_dataarray; SUPPORTED_CONSTANT_TYPES -> CONSTANT_TYPES. - variables.py: to_linexpr converts via broadcast_to_coords(strict=False), then applies the v1/legacy absence handling. - __init__.py: align from linopy.alignment + LinopySemanticsWarning export. Test adaptations for post-#732 APIs and semantics: - test_legacy_violations: name the MultiIndex coords entry (required since #732). - test_linear_expression: the masked-addend tails of test_nterm and test_variable_names pin legacy absence behavior; split into @pytest.mark.legacy / @pytest.mark.v1 pairs (section 6 divergence). Full suite under both semantics: 6446 passed, 514 skipped. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…cenario B v1) Closes the two integration points between the alignment layer (#732/#742) and the v1 semantics infrastructure: - _warn_implicit_projections -> _enforce_implicit_projections: under legacy semantics, the implicit MultiIndex-level projection deprecation now goes through warn_legacy() / LinopySemanticsWarning (#738, replacing EvolvingAPIWarning, which stays piecewise-only); under the v1 convention it raises ValueError (sections 8 and 11) — the projection must be written explicitly (scenario B of the #732/#737 discussion). - as_expression no longer swallows the underlying conversion error: "Cannot convert to LinearExpression: <original message>" so the v1 guidance reaches the user. Tests: the MI-projection deprecation tests in test_alignment, test_variable, test_constraint, and test_linear_expression are marked @pytest.mark.legacy and assert LinopySemanticsWarning; each gains a @pytest.mark.v1 counterpart asserting the v1 raise. Full suite under both semantics: 6446 passed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The convention was silent on inputs indexed by levels of a stacked MultiIndex dimension — the question resolved as scenario B in the #732/#737 discussion. Now written into section 11: - level coords are auxiliary coordinates, so a level-named operand dim is a section-11 conflict: it raises, with the explicit .sel() projection as the documented recipe; - a full reconstruction of the MultiIndex is not a conflict (same coordinate spelled differently, aligns under section 8); - legacy projects implicitly and warns; the projection is removed at 1.0 (added to legacy-removal.md). Also: the #736 TODO no longer claims #732 is unmerged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Closes #706, #709, #723.
What changes
This branch makes
coordsthe single source of truth for every shape-bearing argument ofadd_variablesandadd_constraints(lower,upper,mask), adds MultiIndex-level projection with a deprecation path, and rebuilds the conversion layer inlinopy.commonaround two public functions. The work landed over several stacked sub-PRs (#722, #725, #726, #729, #733, #737); this umbrella PR consolidates them for a single review againstmaster.1. Bounds —
add_variables0.7.0 (#614) made
coordsthe source of truth forDataArraybounds. This branch closes the remaining gaps so the rule holds for every bound type and dimension order:Series/DataFramebounds missing a dimension incoordsare now broadcast tocoordsinstead of being silently dropped (add_variables: pandas Series/DataFrame bounds with missing dimensions silently drop the dimension #709).coordsregardless of bound type (add_variables: DataArray bounds with missing dimensions yield wrong dimension order #706).DataArraybounds with an extra dim or mismatched coord values raiseValueErrorwith a labelled message ("lower bound: …").add_variables' docstring now documents the coords-as-truth contract with doctests pinning the error messages.2. Mask —
add_variablesandadd_constraintsThe same coords-as-truth rule applies to
mask. TheFutureWarningthatbroadcast_maskhad been emitting since v0.6.3 (#580) is now in effect:False+FutureWarningValueErrormask.reindex({...}, fill_value=False)AssertionErrorValueErrorpd.Series/pd.DataFramemask missing a dim was sometimes silently droppedbroadcast_maskis removed.3. MultiIndex-level projection (deprecated on arrival)
Inputs indexed by levels of a stacked-
MultiIndexcoordsdimension are projected onto that dimension: a level subset broadcasts across the others, the full set aligns element-wise. This fixes PyPSA multi-investment arithmetic (an expression over a(period, timestep)snapshotMultiIndex times aperiod-indexed weighting).Terminology: a stacked MultiIndex dim has levels (component index names, e.g.
period/timestep) and level combinations (its elements — one tuple per position, e.g.(2030, 't1')).Decided in the #737 thread (scenario B): implicit projections are deprecated everywhere — they emit
EvolvingAPIWarningin arithmetic and inadd_variables/add_constraints, and will raise under the v1 convention (#717, §8/§11). Coverage gaps (level combinations without a value) raiseValueErrorinadd_variables/add_constraints, with the missing combinations listed in the error.4. The conversion layer —
linopy.commonThe old
as_dataarray/as_dataarray_in_coordspair is replaced by two public functions (developed and reviewed in #737):strict=True(default): any mismatch withcoordsraises, naminglabelin the error.labelis required in this mode (enforced via overloads and at runtime). Used byadd_variables/add_constraints.strict=False: mismatches pass through for downstream xarray alignment (the operator'sjoin=owns reconciliation). Used by expression arithmetic andas_expression.as_dataarray(convert only) is used by__matmul__, where broadcasting would wrongly contract dims:a=(time,name) @ b=(name,location)must give(time,location), not(location).Mechanics and policy are separated: a private
_broadcast_to_coordsreports MultiIndex projections (_LevelProjection); the public function applies warn/raise policy per mode. Deprecation warnings live in one helper with aTODO(#738)to route through #717'swarn_legacy()once it lands.5. Coords-entry rules —
_coords_to_dictThe canonical contract for how each
coordsform / entry is named. Docstring of_coords_to_dictcarries the table;TestCoordsToDictRulesmirrors it one-test-per-row.pd.Indexwith.name.namedims[i]dims[i]dim_0)pd.MultiIndexwith.name.namepd.MultiIndexwithout.namedims[i]pd.MultiIndexwithout.name, nodims[i]TypeErrorDataArray)TypeError6. Piecewise, SOS, and small fixes
linopy.piecewise._broadcast_pointsiterates expressions and dims in declaration order (was hash-randomized).var.indexes[d]for reformulated bounds.Breaking changes
Compared to v0.7.0:
ValueError(wasFutureWarning+ silent fill, deprecated since v0.6.3).ValueError(wasAssertionError).MultiIndexpassed as a sequence-form coord entry must have.nameset or be paired withdims=[i](TypeErrorotherwise).DataArrays (TypeError; passvariable.indexes[dim]instead).EvolvingAPIWarning) and will raise under v1. Affects PyPSA multi-investment.Known limitations (follow-ups)
reindexonto a stacked MI) is broken upstream (Cannot reindex onto a stacked MultiIndex via indexers — only reindex_like works pydata/xarray#11368).LinopySemanticsWarningis tracked in Unify deprecation warning channels before v1: EvolvingAPIWarning vs LinopySemanticsWarning #738.Sub-PRs merged into this branch
maskas_dataarray; split broadcasting from validationdims=parity, tuple normalizationbroadcast_to_coords(strict=…, label=…), scenario-B deprecation, MI terminologyTest plan
TestCoordsToDictRules— one test per row of the coords-entry rules table.test_common.py— strictness-contrast tests, MI projection/warning/gap tests, mandatory-label contract.test_variable.py::TestAddVariablesBoundsWithCoords/TestAddVariablesMultiIndexCoords.test_constraint.py— constraintrhssetter regressions (missing-dim broadcast, MI-level projection).test_linear_expression.py— matmul dim-contraction tests.🤖 Generated with Claude Code