Skip to content

refactor: unify coords-as-truth handling in add_variables/add_constraints#732

Merged
FabianHofmann merged 30 commits into
masterfrom
fix/bounds-coords-broadcast
Jun 2, 2026
Merged

refactor: unify coords-as-truth handling in add_variables/add_constraints#732
FabianHofmann merged 30 commits into
masterfrom
fix/bounds-coords-broadcast

Conversation

@FBumann
Copy link
Copy Markdown
Collaborator

@FBumann FBumann commented May 27, 2026

Closes #706, #709, #723.

What changes

This branch makes coords the single source of truth for every shape-bearing argument of add_variables and add_constraints (lower, upper, mask), adds MultiIndex-level projection with a deprecation path, and rebuilds the conversion layer in linopy.common around two public functions. The work landed over several stacked sub-PRs (#722, #725, #726, #729, #733, #737); this umbrella PR consolidates them for a single review against master.

1. Bounds — add_variables

0.7.0 (#614) made coords the source of truth for DataArray bounds. This branch closes the remaining gaps so the rule holds for every bound type and dimension order:

add_variables' docstring now documents the coords-as-truth contract with doctests pinning the error messages.

2. Mask — add_variables and add_constraints

The same coords-as-truth rule applies to mask. The FutureWarning that broadcast_mask had been emitting since v0.6.3 (#580) is now in effect:

Before (master) After this PR Migration
Sparse-coord mask silently fills missing entries with False + FutureWarning Raises ValueError mask.reindex({...}, fill_value=False)
Mask with extra dim raises AssertionError Raises ValueError Same fix; exception type changed
pd.Series / pd.DataFrame mask missing a dim was sometimes silently dropped Broadcast like bounds None needed

broadcast_mask is removed.

3. MultiIndex-level projection (deprecated on arrival)

Inputs indexed by levels of a stacked-MultiIndex coords dimension are projected onto that dimension: a level subset broadcasts across the others, the full set aligns element-wise. This fixes PyPSA multi-investment arithmetic (an expression over a (period, timestep) snapshot MultiIndex times a period-indexed weighting).

Terminology: a stacked MultiIndex dim has levels (component index names, e.g. period/timestep) and level combinations (its elements — one tuple per position, e.g. (2030, 't1')).

Decided in the #737 thread (scenario B): implicit projections are deprecated everywhere — they emit EvolvingAPIWarning in arithmetic and in add_variables/add_constraints, and will raise under the v1 convention (#717, §8/§11). Coverage gaps (level combinations without a value) raise ValueError in add_variables/add_constraints, with the missing combinations listed in the error.

4. The conversion layer — linopy.common

The old as_dataarray/as_dataarray_in_coords pair is replaced by two public functions (developed and reviewed in #737):

as_dataarray(arr, coords, dims)                    convert: type dispatch + positional labeling
broadcast_to_coords(arr, coords, dims, *,          + broadcast: project MI levels, reorder,
                    strict=True, label=...)          expand missing dims, transpose to coords order
  • strict=True (default): any mismatch with coords raises, naming label in the error. label is required in this mode (enforced via overloads and at runtime). Used by add_variables / add_constraints.
  • strict=False: mismatches pass through for downstream xarray alignment (the operator's join= owns reconciliation). Used by expression arithmetic and as_expression.
  • as_dataarray (convert only) is used by __matmul__, where broadcasting would wrongly contract dims: a=(time,name) @ b=(name,location) must give (time,location), not (location).

Mechanics and policy are separated: a private _broadcast_to_coords reports MultiIndex projections (_LevelProjection); the public function applies warn/raise policy per mode. Deprecation warnings live in one helper with a TODO(#738) to route through #717's warn_legacy() once it lands.

5. Coords-entry rules — _coords_to_dict

The canonical contract for how each coords form / entry is named. Docstring of _coords_to_dict carries the table; TestCoordsToDictRules mirrors it one-test-per-row.

Entry Naming source Outcome
pd.Index with .name .name accepted
unlabeled entry dims[i] accepted
unlabeled entry, no dims[i] skipped (xarray assigns dim_0)
pd.MultiIndex with .name .name accepted
pd.MultiIndex without .name dims[i] accepted (named on a copy)
pd.MultiIndex without .name, no dims[i] TypeError
anything else (e.g. DataArray) TypeError

6. Piecewise, SOS, and small fixes

  • linopy.piecewise._broadcast_points iterates expressions and dims in declaration order (was hash-randomized).
  • SOS reformulation uses var.indexes[d] for reformulated bounds.
  • Positional inputs aligned to coords with clear shape errors.

Breaking changes

Compared to v0.7.0:

  • Sparse-coord mask raises ValueError (was FutureWarning + silent fill, deprecated since v0.6.3).
  • Mask with extra dims raises ValueError (was AssertionError).
  • MultiIndex passed as a sequence-form coord entry must have .name set or be paired with dims=[i] (TypeError otherwise).
  • Sequence-form coords entries can no longer be DataArrays (TypeError; pass variable.indexes[dim] instead).
  • New deprecation: implicit MultiIndex-level projection warns everywhere (EvolvingAPIWarning) and will raise under v1. Affects PyPSA multi-investment.

Known limitations (follow-ups)

Sub-PRs merged into this branch

Test plan

  • Full suite: 3201 passed, 36 skipped (excluding solver/network tests). mypy + pre-commit clean. Doctests pass.
  • TestCoordsToDictRules — one test per row of the coords-entry rules table.
  • test_common.py — strictness-contrast tests, MI projection/warning/gap tests, mandatory-label contract.
  • test_variable.py::TestAddVariablesBoundsWithCoords / TestAddVariablesMultiIndexCoords.
  • test_constraint.py — constraint rhs setter regressions (missing-dim broadcast, MI-level projection).
  • test_linear_expression.py — matmul dim-contraction tests.

🤖 Generated with Claude Code

FBumann and others added 16 commits May 24, 2026 00:38
`add_variables` had two related bugs when `lower`/`upper` were arrays:

- pandas Series/DataFrame bounds missing a dimension in `coords` had
  the missing dimension silently dropped (#709), unlike DataArray
  bounds which were already broadcast.
- DataArray bounds missing a dimension were expanded with
  `DataArray.expand_dims`, which prepends new dimensions and produces
  a `coords`-mismatched dimension order in the resulting variable
  (#706). The order depended on the type of the bounds, so scalar
  bounds worked but two array bounds missing the same dimension did
  not.

Replace `_validate_dataarray_bounds` plus the downstream
`as_dataarray(..., coords)` call with a single helper
`_as_dataarray_in_coords`. It converts any input (pandas with named
axes via `to_xarray`, otherwise via `as_dataarray`), validates the
result against `coords`, expands missing dims, transposes to coords
order, and reconstructs the coord variables in that order.
`expand_dims` and `transpose` are no-ops when the array already
matches, so scalar / full-dim DataArray bounds keep their fast path.

Also fix `linopy.piecewise._broadcast_points`, which built the
`expand_dims` map from a `set`, producing a hash-randomized dimension
order across processes. Iterate expressions and dims in declaration
order instead.

Closes #706 and #709. Supersedes #710 and #719.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Restate #706/#709's fix as a single principle in the docstring,
release note, and `_as_dataarray_in_coords` helper docstring:
when `coords` is provided to `add_variables`, it is the source of
truth for dimensions, dimension order, and coordinate values, and
`lower` / `upper` are broadcast and aligned to match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
0.7.0 already shipped "add_variables no longer ignores coords when
lower / upper are DataArrays". Recast the new bullet as extending
that fix to the remaining gaps (pandas bounds; dim order across
bound types) so the continuity is visible from the release notes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ng gaps"

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Parametrize test_bound_broadcast_missing_dim with three additional
  cases: Series with MultiIndex(time, colour), DataFrame with
  MultiIndex columns(space, colour), and DataFrame with MultiIndex
  index(time, space). Exercises the `while DataFrame: unstack()`
  loop and the MultiIndex branch of `_named_pandas_to_dataarray`.
- Add test_dataarray_coord_reorder for the same-values-different-order
  reindex branch (previously only the unequal-values raise was
  covered).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Relocate `_as_dataarray_in_coords` and its helpers
(`_coords_to_dict`, `_named_pandas_to_dataarray`) from `model.py`
into `common.py`, alongside the existing `as_dataarray` they
parallel. Rename to `as_dataarray_in_coords` (no leading underscore)
since it is no longer file-local — other modules can import the
strict-coords variant when migrating call sites.

Pure relocation: no behavior change, no call-site changes beyond
`add_variables`'s import. Refs #723.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…anches

Replace the unstack-while-loop / split named-check structure with a
single up-front "all axes named" check and a single
``DataFrame.stack(level=list(range(nlevels)), future_stack=True)``
call that collapses all column levels into the row MultiIndex in
one shot. Same observable behaviour, fewer moving parts, no
defensive unreachable branches.

Add tests covering the unnamed-axis fall-through path, the
empty-coords short-circuit in ``as_dataarray_in_coords``, and the
``MultiIndex``-on-a-dim ``continue`` in the validation loop.
Together with the restructure these bring the new helper code to
full patch coverage.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pandas allows any hashable in ``pd.Index.names`` (tuples, ints,
etc.), but only strings map cleanly to xarray dim names. Reject
anything non-string up front so the pandas falls back to
``as_dataarray`` instead of producing a DataArray with an awkward
non-string dim name that downstream validation would reject with a
confusing "extra dimensions" error.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Inputs without their own meaningful labels — numpy arrays, polars
Series, pandas with unnamed axes — fell through ``as_dataarray_in_coords``
via a short-circuit return. That meant:

- The default ``dim_0`` / ``dim_1`` axis names from ``as_dataarray``
  leaked into the result, so a pandas Series without an index name
  combined with another bound carrying a named coord produced a
  spurious 2-D variable.
- Shape mismatches surfaced further downstream as confusing
  "coordinates do not match" errors against the auto-generated
  ``RangeIndex``.

The fall-through now: (a) defaults ``dims`` to coords' keys so axes
get labelled correctly; (b) runs the same validate / expand /
transpose path as labelled inputs; (c) re-assigns coords from
``expected`` on the resulting DataArray so positional inputs align
to coords by position. A shape mismatch surfaces as xarray's clear
``conflicting sizes`` from ``assign_coords``. MultiIndex coords are
left alone (re-assigning a PandasMultiIndex emits a FutureWarning).

Replaces the tautological ``test_pandas_bound_with_unnamed_axis_falls_through``
(which sneaked past by naming the coord ``"dim_0"`` to match the
auto-generated dim) with ``test_positional_bound_aligns_to_coords``
that asserts actual positional alignment across numpy / Series /
DataFrame, plus ``test_positional_bound_wrong_size_raises_clear_error``
for the shape-mismatch path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…o_dict

``reformulate_sos1`` / ``reformulate_sos2`` built the coords for
the indicator variable as ``[var.coords[d] for d in var.dims]``,
which is a list of ``xarray.DataArray`` coord objects. The rest
of linopy passes ``coords`` as a list of ``pd.Index``. The mix
slipped through under the old short-circuit fall-through but
broke once the helper started defaulting ``dims`` from
``_coords_to_dict(coords)`` — non-``pd.Index`` entries were
silently dropped, so ``len(dims) < len(coords)`` and xarray
raised ``different number of dimensions on data and dims: 2 vs 1``.

Use ``var.indexes[d]`` instead — it returns the actual
``pd.Index`` (regular or MultiIndex) for the dim and preserves
structure that ``pd.Index(coord.values, ...)`` would flatten.

Also widen ``_coords_to_dict`` to accept any entry with a
``.name`` (xarray DataArrays included) so a future caller passing
mixed types doesn't silently lose coords. The reformulator fix
removes the only known producer of mixed-type coords; this is
belt-and-suspenders.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the permissive ``getattr(c, "name", None)`` check with an
explicit allow-list: ``pd.Index`` (named or not — unnamed silently
skip as before) and unnamed sequences (``list`` / ``tuple`` /
``range`` / ``numpy.ndarray``). Any other type (notably
``xarray.DataArray``, but also ``pd.Series`` and friends) now
raises ``TypeError`` with a hint to pass ``variable.indexes[<dim>]``
instead. This would have caught the SOS-reformulator bug at the
source instead of letting it surface as a confusing xarray error
about mismatched dim counts ten frames down.

Drop ``DataArray`` from the matching ``coords`` type hints in
``model.py`` / ``expressions.py`` so the documented and runtime
type sets agree.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- _coords_to_dict: explicitly handle pd.MultiIndex — register under
  .name if set, raise TypeError with guidance if .name is missing
- _named_pandas_to_dataarray: use DataArray(df) directly for
  single-level DataFrames; reserve stack() for MultiIndex axes
- as_dataarray_in_coords: validate MultiIndex dims with .equals()
  instead of silently skipping them
- Move MultiIndex tests into dedicated TestAddVariablesMultiIndexCoords
  class with shared fixture
…nts (#725)

* fix(model): apply coords-as-truth rule to mask in add_variables/add_constraints

Routes ``mask`` through ``as_dataarray_in_coords(mask, data.coords)``
instead of ``as_dataarray(...) + broadcast_mask(...)``, so pandas
``Series`` / ``DataFrame`` masks missing a dimension are broadcast
to the variable / constraint shape (parallel to the bounds fix in
the previous PR). The ``add_variables`` ``mask`` type hint widens
to ``MaskLike`` to match ``add_constraints``.

The deprecation announced via ``FutureWarning`` in ``broadcast_mask``
("Missing values will be filled with False ... In a future version,
this will raise an error") is now in effect: masks whose
coordinates are a sparse subset of the data's coordinates raise
``ValueError`` instead of silently filling missing entries.
Mask dims not in the data raise ``ValueError`` instead of
``AssertionError`` for consistency with the bounds path.

``broadcast_mask`` had no other callers and is removed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Update doc/release_notes.rst

Co-authored-by: Fabian Hofmann <fab.hof@gmx.de>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Fabian Hofmann <fab.hof@gmx.de>
…on (#726)

* fix(model): apply coords-as-truth rule to mask in add_variables/add_constraints

Routes ``mask`` through ``as_dataarray_in_coords(mask, data.coords)``
instead of ``as_dataarray(...) + broadcast_mask(...)``, so pandas
``Series`` / ``DataFrame`` masks missing a dimension are broadcast
to the variable / constraint shape (parallel to the bounds fix in
the previous PR). The ``add_variables`` ``mask`` type hint widens
to ``MaskLike`` to match ``add_constraints``.

The deprecation announced via ``FutureWarning`` in ``broadcast_mask``
("Missing values will be filled with False ... In a future version,
this will raise an error") is now in effect: masks whose
coordinates are a sparse subset of the data's coordinates raise
``ValueError`` instead of silently filling missing entries.
Mask dims not in the data raise ``ValueError`` instead of
``AssertionError`` for consistency with the bounds path.

``broadcast_mask`` had no other callers and is removed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor: unify as_dataarray; split broadcasting from coords validation

Closes #723. Folds the body of `as_dataarray_in_coords` into `as_dataarray`
and extracts the contract checks into `assert_compatible_with_coords`, so
linopy now has one broadcasting primitive and one validation companion.

`as_dataarray(arr, coords)` aligns the result against `coords` for every
input type: labels positional inputs (numpy / unnamed pandas / scalar) by
position, reindexes same-values-different-order, expands missing dims,
and transposes to coords order. Extra dims and disagreeing value sets on
shared dims pass through unchanged, so xarray broadcasting in expression
arithmetic keeps working.

`assert_compatible_with_coords(arr, coords)` enforces the strict contract
(`arr.dims ⊆ coords.dims`, plus exact coord-value equality on shared
dims). `add_variables` and `add_constraints` now call it after
`as_dataarray` for `lower` / `upper` / `mask`, replacing the deleted
`as_dataarray_in_coords` helper.

`_coords_to_dict` filters MultiIndex level coords out of
`xarray.Coordinates` inputs so the new strict-by-default path treats
`station` (and not its derived `letter` / `num` levels) as the dim.

Test suite: 3698 passed (no regressions). Two existing tests were
updated to reflect the new "coords is source of truth" semantics:
`test_as_dataarray_with_ndarray_coords_dict_set_dims_not_aligned`
(extra coord entries now broadcast in) and
`test_dataarray_extra_dims` (now triggers the subset check rather than
the value-mismatch check).

Microbenchmark in dev-scripts/benchmark_as_dataarray.py shows flat
timings vs the base branch on both add_variables-heavy and arithmetic-
heavy workloads.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: dims= names unnamed coords; doctest the add_variables contract

Closes a silent-failure gap in the strict coords-as-truth path: when the
caller passed ``coords=[[1, 2, 3]], dims=["x"]`` to ``add_variables``,
``_coords_to_dict`` returned an empty mapping (unnamed sequences carry
no dim name), so the strict checks short-circuited and bounds with
extra dims or mismatched values flowed through unchecked, producing
variables with frankenstein outer-joined coord values.

``_coords_to_dict`` now accepts an optional ``dims`` argument that
names unnamed sequence entries by position. ``as_dataarray`` and
``assert_compatible_with_coords`` plumb it through; ``add_variables``
forwards ``kwargs.get("dims")`` to the assertions for ``lower`` and
``upper``. ``coords=[[1, 2, 3]], dims=["x"]`` now enforces the same
contract as ``coords={"x": [1, 2, 3]}`` or
``coords=[pd.Index([1, 2, 3], name="x")]``.

Docstring of ``add_variables.coords`` documents the contract
(subset-of-dims, dim order, value match with auto-reindex, missing-dim
broadcast) and includes four doctests pinning it: the extra-dim raise,
the value-mismatch raise, the same-values-different-order auto-reindex,
and the unnamed-coords-plus-dims opt-in.

Test suite: 3698 passed (parity with the previous commit on this
branch). ``pytest --doctest-modules linopy/model.py -k add_variables``
also green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: add align_to_coords with semantic validation error messages

Introduce align_to_coords to wrap as_dataarray and assert_compatible_with_coords
with user-facing labels (lower bound, upper bound, mask). Errors now name the
argument and distinguish extra dimensions, coordinate mismatches, and conversion
failures. Extend mask validation to use coords+dims= when provided.

Co-authored-by: Cursor <cursoragent@cursor.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* refactor(model): simplify mask align; preserve TypeError in align_to_coords

Three cleanups on top of align_to_coords:

- Drop the trailing ``.broadcast_like(data.labels)`` in ``add_variables``
  and ``add_constraints`` mask paths. ``as_dataarray`` already expands
  missing dims to ``coords`` shape, so the broadcast was a no-op.
- Stop overriding the caller's ``dims=`` in the ``add_variables`` mask
  path when ``coords is None``. The previous code stripped ``dims`` and
  forced ``dims=data.dims``; with ``data.coords`` being an xarray
  ``Coordinates`` with already-named dims, the user's ``dims`` is
  harmless to forward and the override was just hiding intent. Mask
  now goes through one ``align_to_coords`` call regardless of whether
  ``coords`` is supplied.
- Split the exception handler in ``align_to_coords``: ``TypeError`` from
  unsupported input types is re-raised as ``TypeError`` (still labeled),
  while ``ValueError`` / ``CoordinateValidationError`` stay
  ``ValueError``. Preserves the original type signature for callers
  that want to ``except TypeError``.

New test ``test_align_to_coords_preserves_type_errors`` pins the
TypeError pass-through. Suite: 3703 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor: rename assert_compatible_with_coords to validate_alignment

Per PR review: align on the project's `validate_*` naming convention
and remove the implicit "AssertionError" connotation of `assert_*`.
Pairs naturally with `align_to_coords`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
FBumann and others added 2 commits May 27, 2026 12:50
#729 made `.name` required on `pd.MultiIndex` sequence-form coord entries
(xarray needs a single dim name for the flattened index). test_repr.py was
the only remaining call site missing the assignment.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`xarray.Coordinates.dims` is typed `Hashable`, so the dict-comprehension
return and the `sorted()` calls in the validation message tripped mypy.
The function's other branches already accept `c.name` / `dim_names[i]`
(both Hashable), so widening the return type is the honest signature.

Also: drop `.data` from the add_variables doctest — use the public
`v.lower` property instead.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@FBumann
Copy link
Copy Markdown
Collaborator Author

FBumann commented May 27, 2026

@FabianHofmann Although the breaking changes are quite small, id like to test this against pypsa and flixopt, to make sure they dont use any of the now raising paths.
Agreed?
Could you do pypsa?

…#733)

* refactor(common): clarify coords-entry rules and tighten error labels

Stacks on top of #732. Three small follow-ups from PR review:

- Remove dead `broadcast_mask` (claimed removed in #732, was still present).
- `as_dataarray`: normalize bare-tuple coord entries to lists so
  `coords=[(0, 1, 2)]` behaves identically to `coords=[[0, 1, 2]]`
  (xarray reads `(a, b)` as `(dim_name, values)` and would otherwise
  raise a confusing error).
- `align_to_coords`: pre-validate coords via `_coords_to_dict` so
  TypeErrors from a bad `coords` argument propagate with their own
  message instead of being relabeled "<label> could not be aligned to
  coords: ...", which previously misdirected users to inspect the
  bound/mask.

Docs: replace the prose paragraph in `_coords_to_dict`'s docstring
with an explicit rules table covering every container form and
sequence-entry case (named/unnamed `pd.Index`, `pd.MultiIndex`, bare
sequences, with/without positional `dims=`).

Tests: new `TestCoordsToDictRules` class in `test_common.py` mirrors
the docstring table one-test-per-rule so the executable spec stays
visibly aligned with the documented contract.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(common): allow dims= to name an unnamed pd.MultiIndex

Mirrors the existing rule for unnamed pd.Index: an unnamed MultiIndex
paired with a positional dims=[i] entry now gets its flat .name set
to dims[i] on a shallow copy (caller's MultiIndex is not mutated).
Per-level names are preserved.

Removes the asymmetry between Index and MultiIndex in _coords_to_dict:
both can now be named either inline (.name) or by position (dims=[i]).
An unnamed MultiIndex with no positional dims still raises TypeError
since xarray requires a single flat name.

Adds one rule-table row and two tests
(test_unnamed_multiindex_with_dims_uses_dims,
test_unnamed_multiindex_without_dims_raises).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(common): scope tuple-normalize check to lists/tuples with tuple entries

The previous `not isinstance(coords, Coordinates | Mapping)` form was
broad and rebuilt `coords` as a fresh list on every call (even when no
tuple entries were present). Switch to a positive
`isinstance(coords, list | tuple)` guard with a short-circuit
`any(isinstance(c, tuple) for c in coords)` check, so the comprehension
only runs when there is actually a tuple to normalize.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@FBumann FBumann requested a review from FabianHofmann May 27, 2026 12:32
FBumann and others added 2 commits May 27, 2026 14:55
…anges

Two pre-existing Upcoming-Version bullets from master had been
dropped on this branch, most likely as merge-conflict casualties:

- ``LinearExpression.where`` doc + ``BaseExpression.variable_names`` entry
- Mosek basic/IPM solution-inspect fix

Restore both verbatim from master.

Also add an explicit Breaking Changes bullet for the coord-as-truth
behaviour changes that previously lived only under Bug Fixes: the
mask FutureWarning -> ValueError flip, the AssertionError -> ValueError
flip on extra mask dims, and the new TypeError on an unnamed
pd.MultiIndex without a positional dims=[i] entry. The Bug Fixes
entries still carry the migration detail; the Breaking Changes bullet
points there so readers scanning by section don't miss the rename of
warnings to errors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Merge the two Bug Fixes bullets (bounds + mask) into one. The
  separation read as "same fix, applied twice" without adding info;
  one bullet covers both with the same migration detail.
- Shorten the Breaking Changes bullet — it duplicated the v0.6.3
  ``FutureWarning`` and ``AssertionError`` parentheticals already
  in Bug Fixes; keep only the FutureWarning summary and the
  pd.MultiIndex addition.
- Collapse the Internal as_dataarray bullet from 8 wrapped lines to
  one, and drop the "Validation errors name the argument" UX
  detail — accurate but not structural enough for a release note.

No facts removed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@FBumann
Copy link
Copy Markdown
Collaborator Author

FBumann commented May 27, 2026

@FabianHofmann This is the most relevant PR now i think, to continue working on the others

@FabianHofmann
Copy link
Copy Markdown
Collaborator

reviewing now, we should be able to pull this in today

@FBumann
Copy link
Copy Markdown
Collaborator Author

FBumann commented May 28, 2026

@FabianHofmann I tested this again with flixopt and im getting 0 failed tests

@FBumann
Copy link
Copy Markdown
Collaborator Author

FBumann commented Jun 1, 2026

@FabianHofmann
Two thoughts:

  1. Could some of the multiindex stuff be moved from as_dataarray() to validate_alignment() ? Only as a question, without an opinion from my side
  2. Why use _as_dataarray_lax() in matmul? I wanted to not use it internally.

@FabianHofmann
Copy link
Copy Markdown
Collaborator

FabianHofmann commented Jun 1, 2026

@FabianHofmann Two thoughts:

  1. Could some of the multiindex stuff be moved from as_dataarray() to validate_alignment() ? Only as a question, without an opinion from my side
  2. Why use _as_dataarray_lax() in matmul? I wanted to not use it internally.
  1. good question, let me take another look
  2. example: a has dims (time, name) and b has (name, location). when running a @ b, b is going through as_dataarray with coords and dims alignment to a. that is, it is broadcasted for additional dimensions it does not cover (here time) and we get a b_intermediate with (time, name, location). the resulting operation collapses all overlapping dimenions -> result only contains (location). I am happy to reconsider the general logic here. or add a flag in as_dataarray. (should have added a test too)

FBumann added a commit that referenced this pull request Jun 1, 2026
Per review feedback on #717, the convention now names its object scope:
operations between a linopy object and any non-linopy operand behave
exactly like operations against the constant-only expression holding the
same values and coordinates, for every operator and operand position.

The alignment group resolves the previously open question on unlabeled
data: unlabeled operands (numpy arrays, lists, polars Series) pair with
dimensions by size, and the pairing must be unambiguous — a length-4
array against dims (a: 4, b: 4) raises instead of silently picking the
leading dim. Marked TODO: implementation builds on the coords-as-truth
seam from #732 and lands after it.

Slice H tests pin the substitutability that already holds: per-operator
raw-vs-wrapped equivalence, distributivity/associativity across mixed
operand types, identical §8 raises on either route, and the type-decided
divisor exception.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@FBumann
Copy link
Copy Markdown
Collaborator Author

FBumann commented Jun 1, 2026

@FabianHofmann I think this surfaces a design weakness. I propose splitting the conversion logic once more. Ill do a quick PR for it.

FBumann added a commit that referenced this pull request Jun 1, 2026
The function never aligns anything — it broadcasts and raises on any
mismatch it cannot resolve by broadcasting alone. "Align" is also the
word that invites join= proposals (aligns take joins, broadcasts do
not), so the name now states what it is: the same broadcast as
broadcast_to_coords with a strict failure mode (zip(strict=True)
semantics).

Error messages keep the "could not be aligned to coords" wording so
tests in the base branch (#732) stay untouched.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
FBumann added a commit that referenced this pull request Jun 2, 2026
…strict=...)

Per review discussion: one public function instead of two, with strict as
a keyword flag.

- strict=True (default): any mismatch with coords raises, naming label in
  the error — the former strict_broadcast_to_coords.
- strict=False: mismatches pass through for downstream xarray alignment —
  the former loose broadcast_to_coords, used by arithmetic.

Strict is the default so that forgetting the flag adds safety rather than
silently dropping validation. MI handling preserved exactly per mode
(strict: silent partial / raise on gap; non-strict: EvolvingAPIWarning) —
the scenario-B deprecation warnings land separately in #732.

Call sites: model.py bounds/mask drop the long name (strict is default);
arithmetic and as_expression pass strict=False explicitly.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
FBumann and others added 2 commits June 2, 2026 12:14
… ladder (#737)

* refactor(common): split DataArray conversion into a 3-rung strictness ladder

Replace the as_dataarray + _as_dataarray_lax pair (and the
enforce_level_coverage flag) with three public entry points, each
including the previous one:

- as_dataarray: convert only (the former _as_dataarray_lax). Used by
  __matmul__, where dims missing from the constant must not be
  broadcast in (they would be contracted away as common dims).
- broadcast_to_coords: convert + broadcast against coords (the former
  broadcasting as_dataarray). Used by expression arithmetic.
- align_to_coords: convert + broadcast + enforce the coords contract.
  Used by add_variables / add_constraints (unchanged signature).

The broadcasting mechanics live in one shared private core
(_broadcast_core) that reports MultiIndex-level projections instead of
applying policy. The entry points decide what a partial projection or
coverage gap means: broadcast_to_coords warns (arithmetic convention),
align_to_coords raises (coords contract). This removes the
enforce_level_coverage flag and keeps validation concerns out of the
broadcasting layer.

No behavior changes; all call sites keep their semantics. New tests pin
the ladder contrasts and the matmul dim-contraction rules.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: shorten release-notes bullet on conversion helpers

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* refactor(common): rename _broadcast_core to _broadcast_to_coords

Private-twin convention: _broadcast_to_coords is the raw implementation
of broadcast_to_coords (returns projection events instead of applying
policy), shared with align_to_coords.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(expressions): as_expression converts constants with the broadcast rung

The constraint lhs/rhs setters call as_expression(value, model,
coords=self.coords, dims=self.coord_dims); forwarding those kwargs to
the convert-only as_dataarray dropped the broadcasting these setters
relied on (e.g. a MultiIndex-level-indexed rhs failed with an xarray
AlignmentError instead of being projected onto the stacked dim).

Use broadcast_to_coords instead. The other as_expression callers pass
only dims (no coords), for which both rungs behave identically.

Adds regression tests for the rhs setter: missing-dim broadcast and
MultiIndex-level projection.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* refactor(common): rename align_to_coords to strict_broadcast_to_coords

The function never aligns anything — it broadcasts and raises on any
mismatch it cannot resolve by broadcasting alone. "Align" is also the
word that invites join= proposals (aligns take joins, broadcasts do
not), so the name now states what it is: the same broadcast as
broadcast_to_coords with a strict failure mode (zip(strict=True)
semantics).

Error messages keep the "could not be aligned to coords" wording so
tests in the base branch (#732) stay untouched.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* refactor(common): apply review polish to the strictness ladder

- Document the one non-obvious policy in strict_broadcast_to_coords:
  partial-level broadcasts are silent (bounds-broadcast feature), unlike
  the warning on the broadcast rung.
- Unify the first parameter name across the ladder (value -> arr).
- Un-invert the warning-policy loop in broadcast_to_coords.
- Rename the test whose name forced an awkward signature wrap to a
  behavior-oriented name (test_extra_dims_pass_broadcast_rung_fail_strict_rung).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(common): add numpydoc Parameters/Returns to the three public rungs

Parameter entries carry descriptions only — types live in the function
signatures.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* refactor(common): unify the broadcast rungs into broadcast_to_coords(strict=...)

Per review discussion: one public function instead of two, with strict as
a keyword flag.

- strict=True (default): any mismatch with coords raises, naming label in
  the error — the former strict_broadcast_to_coords.
- strict=False: mismatches pass through for downstream xarray alignment —
  the former loose broadcast_to_coords, used by arithmetic.

Strict is the default so that forgetting the flag adds safety rather than
silently dropping validation. MI handling preserved exactly per mode
(strict: silent partial / raise on gap; non-strict: EvolvingAPIWarning) —
the scenario-B deprecation warnings land separately in #732.

Call sites: model.py bounds/mask drop the long name (strict is default);
arithmetic and as_expression pass strict=False explicitly.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(common): require label when broadcast_to_coords is strict

Restores the contract align_to_coords always had: strict-mode errors must
name their subject ("lower bound could not be aligned..." rather than
"Value could not be aligned..."). Enforced both statically (overloads:
strict=True requires label: str, strict=False forbids it) and at runtime
(TypeError).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(common): deprecate implicit MI-level projection everywhere (scenario B)

Per the #737 review discussion and Fabian's decision: implicit level
projection is deprecated and will raise under the v1 convention, so the
EvolvingAPIWarning now fires in both modes of broadcast_to_coords — the
MI check is the same for every use case:

- input missing a whole level: warn (strict and non-strict)
- coverage gap (level combinations without a value): warn (non-strict) /
  raise (strict — no downstream layer to defer the NaN to)

Warning emission lives in one helper, _warn_implicit_projections, with a
TODO(#738) to migrate to LinopySemanticsWarning once #717 lands.

Also clarifies the MultiIndex terminology everywhere: an MI dim has
*levels* and *level combinations* (one tuple per position). Docstrings
carry the glossary, the coverage-gap error names the missing
combinations explicitly, and "entry" is gone from messages.

User-facing: add_variables / add_constraints with per-period-style
bounds now emit the deprecation warning (PyPSA multi-investment).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ray-coords breaking change

- The implicit MultiIndex-level projection deprecation (scenario B) now
  has its own entry under Deprecations, where PyPSA users scanning for
  upcoming warnings will find it.
- Breaking Changes gains the CoordsLike narrowing: DataArray entries in
  sequence-form coords raise TypeError (pass variable.indexes[dim]
  instead).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@FBumann
Copy link
Copy Markdown
Collaborator Author

FBumann commented Jun 2, 2026

Note

Claude Code review finding — for the MI follow-up, not blocking this PR (the hole pre-dates the #737 merge).

Known hole: unnamed-MultiIndex-level inputs bypass strict validation

A bound indexed by a MultiIndex with unnamed levels passes strict=True validation silently, even when its entries don't match coords:

idx = pd.MultiIndex.from_product([[2020, 2030], ['t1', 't2']], names=('period', 'timestep'))
idx.name = 'snapshot'
coords = xr.Coordinates.from_pandas_multiindex(idx, 'snapshot')

# unnamed levels — note: no names=(...)
sparse_unnamed = pd.Series({(2020,'t1'): 1.0, (2030,'t2'): 2.0})

broadcast_to_coords(sparse_unnamed, coords, dims=['snapshot'], label='lower bound')
# → returns a 2-entry result, NO error — should raise (entries don't match the 4-entry coords)

With named levels the same input correctly raises:

ValueError: lower bound could not be aligned to coords: no value for 2 level combination(s)
of MultiIndex dimension 'snapshot': (2020, 't2'), (2030, 't1'). ...

Cause: the unnamed-level MI can't be matched to the coords' levels by name, so no projection happens; the input keeps its own 2-entry MultiIndex under the snapshot dim name, and validate_alignment's MI branch doesn't catch the size/entry mismatch.

Suggested fix direction (for the MI follow-up): in validate_alignment, when both sides are MultiIndex-backed but with different lengths or entries, raise regardless of level names — the names matter for projection, not for equality checking.

Real users (PyPSA) always have named levels, so low practical impact — but "strict" shouldn't have a silent bypass.

@FBumann
Copy link
Copy Markdown
Collaborator Author

FBumann commented Jun 2, 2026

@FabianHofmann can you look into the discovered MI gap?

@FabianHofmann
Copy link
Copy Markdown
Collaborator

yes, running pypsa CI now and report back

@FabianHofmann
Copy link
Copy Markdown
Collaborator

CI on pypsa runs through, great. just pushing a fix on MIs

validate_alignment unwrapped only bare pd.MultiIndex coord entries, so
Coordinates-backed (DataArray) MI dims read as non-MI and skipped the
equality check. Use _as_multiindex on both sides to catch mismatches
regardless of level names.
@FabianHofmann
Copy link
Copy Markdown
Collaborator

@FBumann let's merge this. I would suggest we make a follow up PR only doing some refactoring:

  • move all broadcasting and alignment code from common.py to a new module alignment.py
  • move all related tests to test_alignment.py make more use of test classes to structure them better

@FabianHofmann FabianHofmann merged commit 3f824cc into master Jun 2, 2026
19 of 20 checks passed
@FabianHofmann FabianHofmann deleted the fix/bounds-coords-broadcast branch June 2, 2026 14:00
FBumann added a commit that referenced this pull request Jun 3, 2026
…ts (#742)

* refactor: move conversion/broadcasting/alignment code to linopy/alignment.py

Pure move, no behavior change. The new module owns the seam between user
input and linopy's labelled arrays:

- coords parsing: _coords_to_dict, _as_index, _as_multiindex
- conversion: get_from_iterable, pandas_to_dataarray, numpy_to_dataarray,
  _named_pandas_to_dataarray, fill_missing_coords, as_dataarray
- MultiIndex projection: _LevelProjection, _project_onto_multiindex_levels,
  _warn_implicit_projections
- broadcasting: _broadcast_to_coords, broadcast_to_coords
- validation: validate_alignment
- symmetric alignment: align

common.py keeps the general utilities (formatting, label indexes, polars
helpers, decorators). Importers (model, expressions, variables, __init__,
tests) updated; no re-exports.

Follow-up requested by @FabianHofmann in #732.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test: restructure alignment tests into test_alignment.py

One class per concept in linopy.alignment, mirroring the module's public
surface:

- TestAsDataarrayFrom{Pandas,Numpy,Scalar,DataArray} + MultiIndexCoords
- TestCoordsToDict (the coords-entry naming rules)
- TestBroadcastToCoords (strict=False mechanics)
- TestMultiIndexProjection (projection values, deprecation warnings,
  coverage gaps — the legacy/v1 fork point for #717)
- TestStrictMode (strict=True contract)
- TestValidateAlignment
- TestAlign

Shared fixtures (mi_index / mi_coords / by_level1) replace the repeated
MultiIndex setup; the pandas dims-naming and numpy labeling tests are
consolidated into parametrized tables. test_common.py keeps the utility
tests. Full suite count unchanged (3202) — no coverage lost.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test: close coverage gaps in alignment.py and the MI coords serialization

alignment.py: 97% -> 99%. New edge-case tests: bare-string dims, 0-d
arrays, fill_missing_coords type check, partially-named MI levels,
gap detection with extra dims, gap-error truncation (>5 missing
combinations). The two remaining uncovered lines are defensive branches
for inputs outside the DimsLike contract (non-iterable dims).

common.py: 88% -> 90%. The MultiIndex round-trip through
coords_to_dataset_vars / coords_from_dataset (used by CSRConstraint)
had zero coverage; now pinned.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Update test/test_alignment.py

Co-authored-by: Fabian Hofmann <fab.hof@gmx.de>

* Update test/test_alignment.py

Co-authored-by: Fabian Hofmann <fab.hof@gmx.de>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Fabian Hofmann <fab.hof@gmx.de>
FBumann added a commit that referenced this pull request Jun 3, 2026
Brings in the coords-as-truth stack (#732), the alignment.py module split
(#742), has_terms (#743), and the MatrixAccessor caching (#716).

Conflict resolutions (consistent rule: keep this branch's v1/legacy
dispatch structure, use master's conversion calls inside it):

- expressions.py: _add_constant, _apply_constant_op_{v1,legacy}, and
  to_constraint use broadcast_to_coords(..., strict=False) instead of
  as_dataarray; SUPPORTED_CONSTANT_TYPES -> CONSTANT_TYPES.
- variables.py: to_linexpr converts via broadcast_to_coords(strict=False),
  then applies the v1/legacy absence handling.
- __init__.py: align from linopy.alignment + LinopySemanticsWarning export.

Test adaptations for post-#732 APIs and semantics:

- test_legacy_violations: name the MultiIndex coords entry (required
  since #732).
- test_linear_expression: the masked-addend tails of test_nterm and
  test_variable_names pin legacy absence behavior; split into
  @pytest.mark.legacy / @pytest.mark.v1 pairs (section 6 divergence).

Full suite under both semantics: 6446 passed, 514 skipped.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
FBumann added a commit that referenced this pull request Jun 3, 2026
…cenario B v1)

Closes the two integration points between the alignment layer (#732/#742)
and the v1 semantics infrastructure:

- _warn_implicit_projections -> _enforce_implicit_projections: under
  legacy semantics, the implicit MultiIndex-level projection deprecation
  now goes through warn_legacy() / LinopySemanticsWarning (#738,
  replacing EvolvingAPIWarning, which stays piecewise-only); under the
  v1 convention it raises ValueError (sections 8 and 11) — the
  projection must be written explicitly (scenario B of the #732/#737
  discussion).

- as_expression no longer swallows the underlying conversion error:
  "Cannot convert to LinearExpression: <original message>" so the v1
  guidance reaches the user.

Tests: the MI-projection deprecation tests in test_alignment,
test_variable, test_constraint, and test_linear_expression are marked
@pytest.mark.legacy and assert LinopySemanticsWarning; each gains a
@pytest.mark.v1 counterpart asserting the v1 raise. Full suite under
both semantics: 6446 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
FBumann added a commit that referenced this pull request Jun 3, 2026
The convention was silent on inputs indexed by levels of a stacked
MultiIndex dimension — the question resolved as scenario B in the
#732/#737 discussion. Now written into section 11:

- level coords are auxiliary coordinates, so a level-named operand dim
  is a section-11 conflict: it raises, with the explicit .sel()
  projection as the documented recipe;
- a full reconstruction of the MultiIndex is not a conflict (same
  coordinate spelled differently, aligns under section 8);
- legacy projects implicitly and warns; the projection is removed
  at 1.0 (added to legacy-removal.md).

Also: the #736 TODO no longer claims #732 is unmerged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

add_variables: DataArray bounds with missing dimensions yield wrong dimension order

2 participants