Description
Summary
Opening a NetCDF file with mode="w" while it is locked by another process:
- Raises PermissionError (expected)
- Simultaneously truncates the target file to 0 bytes (data corruption!)
I.e., data is permanently lost when trying to open a locked file, even though the operation "failed."
Expected Behavior
nc.Dataset(filename, mode="w") should either:
- Succeed completely and overwrite the file, OR
- Fail cleanly without modifying the existing file
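The second option can be expressed as a (currently failing) test sketch. This is illustrative only; it assumes test.nc exists and is held open read-only by another process, as in the reproduction below:
import os
import netCDF4 as nc
size_before = os.path.getsize("test.nc")
try:
    nc.Dataset("test.nc", "w")
except PermissionError:
    # A clean failure must leave the file untouched
    assert os.path.getsize("test.nc") == size_before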
Reproduction Steps
The bug can be reproduced when working on the same dataset from multiple sessions, e.g. with two Jupyter notebooks (A and B in the steps below).
# Setup: Jupyter notebook A
import netCDF4 as nc
import numpy as np
import os
# Create initial test file
ds = nc.Dataset("test.nc", "w")
ds.createDimension("x", 10)
var = ds.createVariable("data", "f8", ("x",))
var[:] = np.random.rand(10)
ds.close()
print(f"Original file size: {os.path.getsize('test.nc')} bytes")Output:
Original file size: 6224 bytes
Reproduce the bug:
# Step 1: Jupyter Notebook B: Open file in read mode (simulates file lock from another process)
ds_read = nc.Dataset("test.nc", "r") # Keep this open
# Step 2: BACK TO A: Try to overwrite from different process/session
try:
ds_write = nc.Dataset("test.nc", "w") # This should fail cleanly
except PermissionError as e:
print(f"Got expected error: {e}")
print(f"File size after error: {os.path.getsize('test.nc')} bytes") # Shows 0!The output shows indeed 0 bytes. The file has been truncated!
Got expected error: [Errno 13] Permission denied: 'test_ds.nc'
File size after error: 0 bytes
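Side note: a partial mitigation exists at this point. My assumption (not verified against the C sources) is that clobber=False maps to NC_NOCLOBBER, which refuses to touch an existing file, so no truncation can occur; the cost is that legitimate overwrites are refused too:
try:
    ds_write = nc.Dataset("test.nc", "w", clobber=False)  # errors if the file already exists
except OSError as e:
    print(f"Refused to clobber existing file: {e}")
print(os.path.getsize("test.nc"))  # original size preserved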
# Step 3: (Optional, the bug already happened) in B, close the read handle
ds_read.close()
# Step 4: Try to read the original file
try:
damaged = nc.Dataset("test.nc", "r")
except Exception as e:
print(f"{e}")
Output:
[Errno -51] NetCDF: Unknown file format
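For context (my understanding, not taken from the library docs): the reader identifies the format by the file's magic signature (b"CDF\x01"/b"CDF\x02" for classic netCDF, b"\x89HDF\r\n\x1a\n" for netCDF-4/HDF5). A zero-byte file has no signature at all, hence "Unknown file format":
with open("test.nc", "rb") as f:
    print(f.read(8))  # b'' after the truncation - no magic bytes left to identify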
Impact
This bug makes netcdf4-python unsafe for concurrent access scenarios and multi-user environments. In my case, it occurred in a Jupyter notebook workflow where files may be accessed from multiple kernels.
This is a nasty bug because users lose their work even though the operation reported failure. I understand that a locked file cannot be opened, obviously, but the attempt should not erase the locked file (by the way, why does this happen?).
Notice also that the Errno -51 message is confusing: it suggests a problem with the file format, so people think "what can be wrong with the format? I'm working with netCDF as usual..." (conflating the format with the file extension).
The actual problem is that the file is empty, but that emptiness is caused by the unforeseen truncation, adding to the pre-existing confusion.
Environment Details
OS: Windows 11
Python: 3.13.2
netcdf4-python version: 1.7.2
NetCDF C library version: 4.9.2
Context: Occurs in Jupyter notebook environments with concurrent access. Untested assumption: this might happen with concurrent access in general, not only in notebooks.
History
This was originally reported in xarray (pydata/xarray#10679) but traced back to netcdf4-python as the root cause.
Suggested Fix
The file should not be modified if one does not have write access. Things that come to mind:
- Check file permissions before truncation
- Use atomic write operations (write to temp file, then rename)
- Fail immediately on permission errors without modifying the target file
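As a user-side workaround in the meantime, the atomic-write idea can be sketched like this. The helper name safe_write_dataset is hypothetical, not a netCDF4 API; os.replace is atomic on POSIX and best-effort on Windows:
import os
import tempfile
import netCDF4 as nc

def safe_write_dataset(path):
    """Write to a temp file in the target directory, then swap it into place."""
    fd, tmp_path = tempfile.mkstemp(suffix=".nc", dir=os.path.dirname(path) or ".")
    os.close(fd)  # netCDF4 takes a path, not an open descriptor
    try:
        ds = nc.Dataset(tmp_path, "w")
        # ... createDimension / createVariable / write data here ...
        ds.close()
        os.replace(tmp_path, path)
    except BaseException:
        os.remove(tmp_path)
        raise
On Windows, os.replace against a locked target raises PermissionError, but the original file is left intact, which is exactly the desired failure mode.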
(Bonus) Final comment/Question
It is unclear to me why the file is truncated even though it cannot be accessed (we do get a PermissionError!). Why does this happen?
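My guess (an assumption, not a trace of the C library internals): creating with NC_CLOBBER opens the path with O_CREAT|O_TRUNC semantics, so the truncation happens at open time, before the subsequent locking step (e.g., HDF5 file locking) fails. Plain Python shows the same ordering:
import os

with open("demo.bin", "wb") as f:  # create a small file
    f.write(b"hello")
print(os.path.getsize("demo.bin"))  # 5

f = open("demo.bin", "wb")  # opening for write truncates immediately...
print(os.path.getsize("demo.bin"))  # 0 -- even if a later lock check raised
f.close()
os.remove("demo.bin")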