perf(matrices): cache MatrixAccessor properties with cached_property#716
Conversation
Convert the remaining `@property` methods on `MatrixAccessor` (`c`, `Q`, `sol`, `dual`) to `@cached_property`. After PyPSA#630, vlabels/clabels/A/b/sense/lb/ub are materialised once in `__init__`, but these four properties still recomputed on every access. Each `MatrixAccessor` is short-lived (rebuilt on every `model.matrices` call), so cache invalidation is moot: the instance never sees a mutated model. Also fix a latent double-build in `expressions.BaseExpression._map_solution`: `m.matrices.sol` followed by `m.matrices.vlabels` constructed two accessors (each running `_build_vars`+`_build_cons`). Bind once to `M`. Benchmark on a 360k-var / 1.2k-constraint model (Python 3.12, scipy 1.17, numpy 2.4): cold (first) warm (repeat) master: 1.0 ms 1.1 ms (recomputed) patch: 1.2 ms 0.4 us (~2500x on warm) Repeat access happens in `solvers.Highs._build_solver_model` (M.c / M.Q both read at lines 1278/1299 and again in Gurobi/Mosek paths at 1586-1587, 2860-2861) and anywhere a caller assigns `M = model.matrices` then reads multiple times.
|
@MaykThewessen thanks for the pr. if I am not mistaken we explicitly decided not using the cached property (@coroa) due to unexpected non-aligned result when underlying data changed. we would need to make sure that the cached is released properly. we can do this in #718 in the update function or alternatively make sure here we get rid of the the duplicated calls |
|
EDIT: Sorry, i was dumb |
| M = m.matrices | ||
| sol = pd.Series(M.sol, M.vlabels) |
There was a problem hiding this comment.
This is the change that we would want to apply everywhere!
|
Ups .. sorry, should have properly read the PR. I just read Fabian's comment and made assumptions. You are explicitly not trying to make matrices a cached_property, but only the objective wise components. I don't care, that is a very minor issue as your benchmarking shows 1ms -> 0.4 micros. . That's not worth optimizing for. But you are completely correct about the unrelated fix in map_solution! |
FabianHofmann
left a comment
There was a problem hiding this comment.
okay, also read again. it makes sense. thanks @MaykThewessen
Brings in the coords-as-truth stack (#732), the alignment.py module split (#742), has_terms (#743), and the MatrixAccessor caching (#716). Conflict resolutions (consistent rule: keep this branch's v1/legacy dispatch structure, use master's conversion calls inside it): - expressions.py: _add_constant, _apply_constant_op_{v1,legacy}, and to_constraint use broadcast_to_coords(..., strict=False) instead of as_dataarray; SUPPORTED_CONSTANT_TYPES -> CONSTANT_TYPES. - variables.py: to_linexpr converts via broadcast_to_coords(strict=False), then applies the v1/legacy absence handling. - __init__.py: align from linopy.alignment + LinopySemanticsWarning export. Test adaptations for post-#732 APIs and semantics: - test_legacy_violations: name the MultiIndex coords entry (required since #732). - test_linear_expression: the masked-addend tails of test_nterm and test_variable_names pin legacy absence behavior; split into @pytest.mark.legacy / @pytest.mark.v1 pairs (section 6 divergence). Full suite under both semantics: 6446 passed, 514 skipped. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summary
Convert the remaining
@propertymethods onMatrixAccessor(c,Q,sol,dual) tofunctools.cached_property. After #630,vlabels,clabels,A,b,sense,lb,ub,vtypesare already materialised once in__init__. The four remaining properties still recompute on every access, which shows up in solver paths that read the same attribute twice on oneMatrixAccessorinstance.Also fix a small adjacent issue in
expressions.BaseExpression._map_solution:m.matrices.solfollowed bym.matrices.vlabelsquietly constructed twoMatrixAccessorinstances (each running_build_vars+_build_cons). Binding to a localM = m.matricesreuses one.Why this is safe
Each
MatrixAccessoris short-lived:Model.matricesis itself a@propertythat returns a fresh instance on every access (linopy/model.py:350-352). So the cache lives at most as long as the local variable a caller holds, and the accessor never sees a mutated model. No newclean_cached_propertiesplumbing is needed.Where the gain shows up
Multiple solver builders read the same property twice within one
M = model.matrices:solvers.Highs._build_solver_model—M.cat line 1278,M.Qat 1299; both can be re-read in user code that inspects the accessor after solve.solvers.Gurobi._build_solver_model—M.Qat 1586, used again at 1587 in the QP branch.solvers.Mosek._build_solver_model—M.Qat 2860, again at 2861.Repeat reads of
M.sol/M.dualare also common in post-solve introspection.Benchmark
Model: 360,600 variables, 1,200 constraints, solved with HiGHS (Python 3.12, scipy 1.17, numpy 2.4):
Repro script:
Tests
pytest test/test_matrices.py test/test_objective.py test/test_solution_lookup.py test/test_model.py test/test_constraints.py test/test_linear_expression.py test/test_quadratic_expression.py→ 399 passedpytest test/test_optimization.py -k "not modified_model and not netcdf"→ 241 passed, 2 skipped(Pre-existing master failures unrelated to this PR:
test_modified_model[*]andtest_model_to_netcdf_*— both fail onupstream/masterwith the same scipy/numpy stack.)Files touched
linopy/matrices.py(+4@cached_property, +1 import)linopy/expressions.py(1 line: bindm.matricesto local before reading two attributes)