damacy rewrite PR-2: planner consumes prefetch handles#123
Open
nclack wants to merge 7 commits into
Open
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Reviewer: read
dev/damacy_rewrite_pr2.mdfirst. Function-by-function plan for the whole PR.PR-2 collapses the duplicate metadata fetch path the planner inherited from PR-1: the prefetcher already pinned
array_meta/shard_index/chunk_layoutin the prefetch caches, but the planner was independently calling the legacy synchronouszarr_meta_cache/zarr_shard_cache. After this PR, the planner reads everything by handle from the prefetch caches, the legacy caches are deleted, and the public config + stats surfaces are reshaped to match.Glossary
{ slot, generation }pair the prefetcher uses to address a pinned cache entry. Looked up viaprefetch_cache_try_get.planner_plan. Carries(uri, aabb)plush_meta/h_shards[]/h_layouthandles. Replaces the baredamacy_sample*array.prefetch_cacheinstances inside damacy. Pinned byprefetcher_release_batchvia the batch gate.NOTFOUND.planner_plan. Deleted by this PR.Summary
Look first at
src/planner/planner.c.planner_plannow readszarr_metadataviaprefetch_cache_try_get(array_meta_cache, sample->h_meta), walkssample->h_shards[]for the per-shardshard_index_value, and pullschunk_layoutthe same way — no copies, no pin management inside the planner. The handles are threaded through a newplanner_sampletype thatdamacy_plan.c::plan_reservebuilds by stealing the relevant fields from eachprefetcher_ready.Non-obvious decisions worth a second look:
Prefetcher tolerates missing shards now.
advance_from_shardused to fail the entire sample if any shard handle errored. After this PR,DAMACY_NOTFOUNDper-shard is treated as "shard file absent — planner emits fill"; other errors still fail. The planner reads eachh_shards[i]per-shard inside the iterator loop and callsprefetch_cache_queryon the NULL-return case to distinguish fill vs. real failure. Test:tests/test_prefetcher.c::test_missing_shard_reaches_ready.chunk_layout_fetchdegrades to success/NULL. When the shard_index_cache has no usable entry (missing shard, or non-blosc codec), the chunk_layout entry resolves to success with NULL value rather thanDAMACY_DECODE. The planner readstry_get → NULLas "no blosc1 layout, decoder uses caps". The non-blosc path was already shaped that way after PR damacy rewrite PR-1: file split + prefetcher producer #121; this PR extends the same NULL-value contract to the no-shard case so layout never gates on missing shard data.prefetch_handlemoved to its own header.prefetch_cache.hcarries a_Atomic uint64_tinprefetch_gatethat nvcc rejects in CUDA TUs.planner.honly needs the PODprefetch_handle, so it gets a minimal new header (src/prefetch/prefetch_handle.h);prefetch_cache.his no longer transitively pulled into CUDA TUs through the planner.Public ABI break in
damacy_config.tuninganddamacy_stats.n_zarrs_meta_cache/n_shards_meta_cache→n_array_meta_cache/n_shard_index_cache/n_chunk_layout_cache(new).zarr_meta_hits/.../shard_idx_hits/...collapse into nestedarray_meta.{hits,misses}/shard_index.{hits,misses}/chunk_layout.{hits,misses}. Python binding, bench JSON output, and bench report column names all updated in the same commits.Three legacy tests deleted.
test_chunk_layout.c/test_zarr_meta_cache.c/test_zarr_cache_threading.c(PR-2 part 1) andtest_zarr_shard_cache.c(PR-2 part 2). Their behavior is exercised bytest_chunk_layout_cache.c/test_array_meta.c/test_shard_index.c/test_prefetcher.cagainst the prefetch implementations.Test plan
test_missing_shard_reaches_ready(new): unlinks a shard file, drains the prefetcher, asserts the sample reachesPREFETCHER_READYand the per-shard handle queries asERROR/NOTFOUND.test_missing_shard_becomes_fillandtest_missing_shard_fillscontinue to pass, exercising the planner's fill emission downstream of the now-tolerant prefetcher path.test_planner.cfixture rewritten: buildsarray_meta_cache+shard_index_cache+chunk_layout_cachewith a synchronous executor;mk_samplewarms one shard handle per shard the AABB touches and tracks them for cleanup.