fix: OME-Zarr init_stream PermissionError on SMB/network drives #12
Draft
hinderling wants to merge 1 commit into
_init_stream_direct created the multi-position store with
zarr.open_group(mode="w") and then assigned root.attrs["ome"], which
rewrites zarr.json a second time via an atomic temp-file + os.replace.
On SMB/network drives that replace-over-existing intermittently fails
with PermissionError (WinError 5): the file written microseconds
earlier is still held by an SMB oplock or an AV scan. init_stream runs
outside the writer's retry loop, so the run crashed.
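A minimal, standard-library sketch of the write pattern described above may help make the two cases concrete. It is not zarr's actual implementation: `atomic_write` and `demo` are hypothetical names, and the temp-file naming only imitates the `zarr.<uuid>.partial` scheme. The point it illustrates is that the first write renames into a path that does not exist yet, while the second replaces a file that was just written.

```python
import os
import tempfile
import uuid
from pathlib import Path


def atomic_write(target: Path, data: bytes) -> bool:
    """Write data to target via a temp file plus os.replace.

    Returns True when the final step replaced an already-existing file,
    which is the case that can fail intermittently on SMB shares.
    """
    # Temp file next to the target, e.g. zarr.<hex>.partial (illustrative naming).
    tmp = target.with_name(f"{target.stem}.{uuid.uuid4().hex}.partial")
    tmp.write_bytes(data)
    replaced_existing = target.exists()
    os.replace(tmp, target)  # atomic rename; overwrites target if present
    return replaced_existing


def demo() -> list[bool]:
    """Perform two metadata writes and report which ones replaced a file."""
    with tempfile.TemporaryDirectory() as d:
        target = Path(d) / "zarr.json"
        first = atomic_write(target, b'{"attributes": {}}')
        second = atomic_write(target, b'{"attributes": {"ome": {}}}')
        return [first, second]
```

On a local disk both writes succeed; the sketch only shows that the second call is the replace-over-existing one.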
Bake the OME metadata into the group's creation call
(open_group(..., attributes={"ome": ...})) so zarr.json is written
exactly once -- the lone write is a rename into a non-existent path,
which does not hit the replace-over-existing failure mode.
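As a rough sketch of the change described in this commit message, assuming zarr-python 3.x: the helper names `build_root_attributes` and `init_root_single_write`, the placeholder `"ome"` payload, and the array shape are all hypothetical and are not taken from the writer's real code. Only `open_group(..., mode="w", attributes=...)` and `create_array("0", ...)` come from the text above.

```python
from typing import Any


def build_root_attributes(ome_version: str = "0.5") -> dict[str, Any]:
    """Return the attributes to bake into the root group."""
    # Placeholder payload: the real writer's "ome" block is not shown here.
    return {"ome": {"version": ome_version}}


def init_root_single_write(store_path: str, attributes: dict[str, Any]):
    """Create the root group so that its zarr.json is written only once."""
    # Imported here so the sketch can be loaded without zarr installed.
    import zarr  # assumes zarr-python 3.x

    # Before (two writes to zarr.json, the second a replace-over-existing):
    #   root = zarr.open_group(store=store_path, mode="w")
    #   root.attrs["ome"] = attributes["ome"]
    #
    # After: the attributes travel with the creation write.
    root = zarr.open_group(store=store_path, mode="w", attributes=attributes)

    # The child array gets its own metadata file under "0/";
    # shape, dtype and chunks are illustrative values only.
    root.create_array("0", shape=(1, 16, 16), dtype="uint16", chunks=(1, 16, 16))
    return root
```

Calling `init_root_single_write("example.zarr", build_root_attributes())` would create the store in the current directory; the path is an example, not one used by the project.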
Problem
OmeZarrWriter._init_stream_direct (the multi-position OME-Zarr path) builds the store by calling zarr.open_group(mode="w") and then assigning root.attrs["ome"].
Assigning root.attrs rewrites the group's zarr.json a second time. zarr v3's LocalStore writes metadata atomically: it writes a zarr.<uuid>.partial temp file, then os.replace()s it over the target.
On SMB / network drives (here a Windows Z: share) that os.replace() over an existing zarr.json, one created microseconds earlier by the open_group call, intermittently fails with PermissionError (WinError 5).
The just-written zarr.json is still pinned by an SMB oplock (or an AV scan) when the replace runs. The first write succeeds precisely because its target does not exist yet: that is a plain rename, not a replace-over-existing.
init_stream is called once at run start (from Controller._run_worker), outside the writer's write() retry loop, so this surfaced as a hard crash that aborted the acquisition before the first frame.
Fix
Bake the OME metadata into the group's creation call, open_group(..., attributes={"ome": ...}), so zarr.json is written exactly once.
With a single write, the only os.replace() is a rename into a path that does not exist yet, which is not subject to the replace-over-existing failure mode. Verified on zarr 3.1.4: open_group(attributes=...) writes the attributes in the initial metadata write, and the subsequent create_array("0", ...) adds the child array without rewriting the parent group's zarr.json.
Where else this pattern appears, and why the same fix does not drop in cleanly
The label-writing paths have the same multi-write shape: _create_label_array, in both OmeZarrWriter and OmeZarrWriterPlate, makes several separate attrs[...] = assignments.
Every attrs[...] = is another atomic replace-over-existing of that group's zarr.json, so each could in principle hit the same WinError 5.
The "bake attributes at creation" fix does not transfer cleanly here:
- labels_grp is read-modify-write. It reads the existing ome attrs, appends the new label name, and writes back, so the labels container accumulates label names across calls. The final attribute set is not known at creation time, and require_group opens an already-existing group, so there is no one-shot creation write to bake into.
- label_grp is created with require_group(name), an idempotent open-or-create, not a one-shot create_group, so again there is no single creation write to attach attributes to.
So the structural single-write fix is specific to init_stream, where the group genuinely is created fresh in one call.
Why the label paths are nonetheless safe today:
_create_label_array is reached only via _write_label, which is dispatched by write(), and write() wraps every call in a retry loop (_WRITE_RETRY_ATTEMPTS, exponential backoff) that catches PermissionError / OSError. A WinError 5 from a label attr write is caught and retried, and _create_label_array is re-runnable on retry: require_group is idempotent, create_array is called with overwrite=True, and the label-name merge skips names already added.
init_stream was the one metadata-writing path with no retry net, which is exactly why it crashed while the label paths do not.
A reasonable follow-up (out of scope here) would be to collapse the label paths' 2-3 separate attrs[...] = assignments into a single update_attributes({...}) call (fewer network round-trips, fewer chances to flake into the retry loop), but it is lower priority since the retry already covers them.
Test plan
- open_group(mode="w", attributes={"ome": ...}) writes attrs in the creation write; create_array does not rewrite the parent zarr.json.
PermissionError.
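One way to exercise the retry behaviour described for write() without a real SMB share is to simulate an operation that raises PermissionError a few times before succeeding. The sketch below is an assumption-laden illustration, not the writer's code: `retry_on_oserror` and `FlakyWrite` are hypothetical names, and the attempt count and delays are arbitrary stand-ins for `_WRITE_RETRY_ATTEMPTS` and the real backoff schedule.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_on_oserror(
    op: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 0.05,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call op(), retrying on OSError with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return op()
        except OSError:  # PermissionError is a subclass of OSError
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")


class FlakyWrite:
    """Stand-in for a metadata write that fails a fixed number of times."""

    def __init__(self, failures: int) -> None:
        self.failures_left = failures
        self.calls = 0

    def __call__(self) -> str:
        self.calls += 1
        if self.failures_left > 0:
            self.failures_left -= 1
            raise PermissionError(13, "Access is denied")
        return "written"
```

Passing a recording function as `sleep` keeps such a test fast and lets it check the backoff schedule; an operation that is safe to re-run, as _create_label_array is described to be, is what makes this kind of retry valid.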