perf(render_points): drop the AnnData hack + fix the categorical cliff #730
Merged
render_points built a full AnnData over every point (X=xy, obs=coords) just to reuse the legacy color machinery, incurring AnnData's O(n) index-uniqueness check + dtype cast on every call, regardless of backend or color. The modern ColorSpec/resolve_color pipeline already carries coords (points df), color (get_values merge + color_spec), and the legend (color_spec), so the AnnData is vestigial.

Remove it: feed matplotlib coords from points["x"/"y"], let the existing get_values merge supply table obs/var colors, and keep the original table in sdata_filt so resolve_color still reads user uns palettes. Also drop the now-dead `adata` parameter threaded through _add_legend_and_colorbar / _decorate_axs / _render_centroids_as_points (none read it).

10M-transcript render: ~3x faster on both backends (no-color 11.2s->3.5s mpl, 8.6s->3.2s ds; continuous 9.2s->2.6s mpl, 8.2s->2.6s ds).
The per-point color vector alpha-strip used np.unique(return_inverse=True), which sorts millions of strings (argsort dominated the datashader render: ~1s at 10M). pd.factorize dedups in O(n) via hashing with no sort and produces a byte-identical per-point result. Modest win for the no-color/categorical paths.
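The equivalence claimed above can be checked with a small sketch. This is not the PR's code: the names `color_vector`, `strip_alpha_unique` and `strip_alpha_factorize` are illustrative, and stripping alpha by keeping the first seven characters assumes `#rrggbbaa` hex strings. Only `np.unique(..., return_inverse=True)` and `pd.factorize` are taken from the commit message.

```python
import numpy as np
import pandas as pd

# Illustrative per-point colour vector: a few "#rrggbbaa" strings repeated many times.
rng = np.random.default_rng(0)
palette = np.array(["#1f77b4ff", "#ff7f0eff", "#2ca02cff"])
color_vector = palette[rng.integers(0, len(palette), size=100_000)]


def strip_alpha_unique(colors):
    # Sort-based dedup: np.unique sorts the values before returning them.
    uniques, inverse = np.unique(colors, return_inverse=True)
    stripped = np.array([c[:7] for c in uniques])  # assumed "#rrggbbaa" layout
    return stripped[inverse]


def strip_alpha_factorize(colors):
    # Hash-based dedup: pd.factorize returns codes plus uniques in first-seen order.
    codes, uniques = pd.factorize(colors)
    stripped = np.array([c[:7] for c in uniques])  # assumed "#rrggbbaa" layout
    return stripped[codes]


via_unique = strip_alpha_unique(color_vector)
via_factorize = strip_alpha_factorize(color_vector)
```

The two helpers order their unique values differently (sorted versus first-seen), but indexing the stripped uniques back through the inverse or the codes gives the same per-point array either way.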
… limit

Coloring by a high-cardinality categorical (e.g. Xenium points by gene, ~3000 genes) spent ~10s building the legend: scanpy's _add_categorical_legend adds one autoscaling artist per category, so matplotlib re-autoscales O(categories^2) (sticky_edges called ~categories^2 times). Past len(default_102)=102 categories scanpy already colors every point uniform grey, so a per-entry legend carries no information anyway. Skip it with a warning above that limit (tied to scanpy's palette so the two stay in sync). 2M points x 3085 genes: 16.9s -> 6.5s.
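A minimal sketch of the skip-with-warning guard described above, not the PR's implementation. The constant and the function name are hypothetical, and the value 102 is hard-coded here on the assumption that it mirrors `len(default_102)` as stated in the commit message, so the block runs without scanpy installed.

```python
import warnings

# Assumption: mirrors len(default_102) from scanpy's palette, per the commit message.
MAX_LEGEND_CATEGORIES = 102


def should_draw_categorical_legend(n_categories, limit=MAX_LEGEND_CATEGORIES):
    """Return False and warn when a per-entry legend would exceed the limit."""
    if n_categories > limit:
        warnings.warn(
            f"Skipping the legend: {n_categories} categories exceeds the limit of {limit}.",
            UserWarning,
            stacklevel=2,
        )
        return False
    return True
```

A caller would check `should_draw_categorical_legend(len(categories))` before handing the categories to the legend builder, so the quadratic legend construction is never entered above the limit.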
When every point resolves to the same colour — notably past scanpy's 102-colour palette, where all categories become uniform grey — datashader's per-category ds.by aggregate + composite is pure waste: the output is byte-identical to a plain single-colour count render. Detect the uniform colour vector and route to the cheap count path. 2M points x 3085 genes: 6.0s -> 0.86s (~7x), byte-identical output, no spurious colorbar; low-cardinality categoricals are unaffected.
When every marker resolves to the same colour (no color / single colour / collapsed grey), _scatter_points handed ax.scatter a per-point colour array, forcing matplotlib's per-point colour-mapping machinery — the dominant cost at scale. Detect a uniform fixed-width-string colour vector (cheap vectorised compare) and pass a scalar color= instead. Visually identical (sub-tolerance edge antialiasing); numeric/continuous vectors keep the c=/cmap/norm path. 10M no-color matplotlib render: ~3.5s -> ~2.4s. Mirrors the datashader single-colour collapse.
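The uniform-colour detection and the scalar `color=` hand-off might look roughly like the sketch below. Both function names are hypothetical stand-ins (the PR's own helper is `_color_vector_is_uniform`, whose body is not shown here), and the fast path is restricted to NumPy's fixed-width unicode dtype as an assumption. The returned dict is meant to be splatted into `ax.scatter(x, y, **kwargs)`.

```python
import numpy as np


def color_vector_is_uniform(color_vector):
    """Hypothetical check: True only for a non-empty fixed-width string array of one value."""
    arr = np.asarray(color_vector)
    if arr.size == 0 or arr.dtype.kind != "U":
        # Numeric/continuous (or object) vectors stay on the per-point path.
        return False
    # One vectorised comparison against the first element; no sort, no hashing.
    return bool((arr == arr[0]).all())


def scatter_color_kwargs(color_vector, cmap=None, norm=None):
    """Build colour kwargs for ax.scatter: scalar color= if uniform, else c=/cmap/norm."""
    if color_vector_is_uniform(color_vector):
        return {"color": str(np.asarray(color_vector)[0])}
    return {"c": color_vector, "cmap": cmap, "norm": norm}
```

For example, `scatter_color_kwargs(np.array(["#d3d3d3"] * 1000))` yields a single `color` entry, while a numeric vector keeps the `c`/`cmap`/`norm` keys.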
The uniform-colour scalar `color=` path produces a sub-pixel edge-antialiasing difference vs the previous per-point `c=` array (markers identical in position, size, and colour). Two single-colour matplotlib stacking baselines exceeded TOL=15 on their few large markers; regenerated from CI. Diff is edge-only (verified), not a rendering change.
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #730 +/- ##
==========================================
- Coverage 79.38% 79.34% -0.04%
==========================================
Files 17 17
Lines 4604 4600 -4
Branches 1031 1030 -1
==========================================
- Hits 3655 3650 -5
- Misses 599 600 +1
Partials 350 350
Review cleanups for the render_points perf work:

- Unify the "is the colour vector uniform?" check: matplotlib's _scatter_points now reuses _color_vector_is_uniform instead of an inline copy, and the helper gains a fixed-width-string fast path (vectorised compare) so the datashader collapse no longer pays a full nunique hash on every categorical render.
- Skip the per-point alpha-strip when col_for_color is None (no-colour / collapsed single-colour): _ds_shade_categorical already strips color_vector[0] there, so the O(n) factorize + N-array rebuild was wasted (~720MB at 20M).
- Collapse the two near-identical ax.scatter() calls in _scatter_points into one with conditional colour kwargs.

Behaviour-preserving: collapse output still byte-identical, low-cardinality categoricals unaffected, 151 non-visual tests pass.
…alettes

The skipped-legend warning claimed points are "uniform grey" past the limit, but that only holds for scanpy's default palette; a custom cmap/palette gives distinct colors for >102 categories (verified: cmap='viridis' + 150 cats → 150 distinct colors). Reword to the palette-agnostic, true reasons (a per-entry legend that large is unreadable and O(categories^2) slow to build). The skip itself is unchanged and defensible regardless of palette.
`render_points` built a full per-point `AnnData` on every call (even no-color), incurring AnnData's O(n) index-uniqueness check + dtype cast on both backends before drawing, so datashader never paid off. This removes that hack and the high-cardinality-categorical legend cliff.

10M transcripts: ~3× faster general (no-color 11→3.5s, continuous 9→2.6s); ~20× for Xenium color-by-gene (16.9→0.9s). Output unchanged within visual-test tolerance.
Changes
`get_values` merge, legend from `ColorSpec`.
`color=` for uniform color instead of a per-point array.

Behavior notes: legend skipped >102 categories (warning); single-color categorical datashader renders as a count (byte-identical). Two single-color baselines shifted (sub-pixel antialiasing) and were regenerated.