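import pandas as pd
import xarray as xr

# `da` is assumed to be an existing xr.DataArray with 1-D time, latitude, and longitude coordinates.
df = pd.DataFrame(
    {
        'latitude': [51.5072, 48.8566, 40.7128, 35.6762],
        'longitude': [0.1276, 2.3522, 74.0060, 139.6503],
    },
    index=[1, 2, 3, 4],
)
df.index.name = 'points'
ds_pts = df.to_xarray()  # latitude/longitude become variables along the 'points' dimension
# Note: this is probably not the best way to do the time indexing, since each point's time will be in the points dataframe.
ptime = xr.DataArray(['2023-08-09'], dims=['time'])
# Indexers that share the 'points' dimension are matched pairwise in a single .sel() call.
da.sel(time=ptime, latitude=ds_pts.latitude, longitude=ds_pts.longitude, method='nearest')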
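The snippet above relies on an existing `da`. To try the same pattern without real data, here is a minimal sketch on a synthetic grid. The 1-degree grid, the variable name `sst`, and the random values are assumptions for illustration only, and the time label is parsed to a datetime (also an assumption) so the nearest lookup compares like with like.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for `da`: a regular 1-degree grid with two daily time steps.
lat = np.arange(-90.0, 91.0, 1.0)
lon = np.arange(-180.0, 180.0, 1.0)
time = pd.date_range("2023-08-09", periods=2, freq="D")
values = np.random.default_rng(0).random((time.size, lat.size, lon.size))
da = xr.DataArray(
    values,
    coords={"time": time, "latitude": lat, "longitude": lon},
    dims=["time", "latitude", "longitude"],
    name="sst",  # hypothetical variable name
)

# Same points as in the example above, indexed by a 'points' dimension.
df = pd.DataFrame(
    {
        "latitude": [51.5072, 48.8566, 40.7128, 35.6762],
        "longitude": [0.1276, 2.3522, 74.0060, 139.6503],
    },
    index=pd.Index([1, 2, 3, 4], name="points"),
)
ds_pts = df.to_xarray()

# Both spatial indexers share the 'points' dimension, so xarray pairs them
# element-wise instead of taking the outer product of latitudes and longitudes.
ptime = xr.DataArray(pd.to_datetime(["2023-08-09"]), dims=["time"])
result = da.sel(
    time=ptime,
    latitude=ds_pts.latitude,
    longitude=ds_pts.longitude,
    method="nearest",
)
print(result.sizes)
```

The result has one value per point (and per selected time) rather than a 4 x 4 latitude/longitude block, which is what makes a single call a drop-in replacement for a per-point loop.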
Summary
Implement a new batch extraction method _extract_axis_batch() (or similar) for spatial_method="axis", replacing the per-point selection currently used for spatial_method="nearest" with fully vectorized indexing using xarray's .sel(..., method="nearest") on all points at once.

- spatial_method="axis": Vectorized, 1-D lat/lon coordinate selection (xarray native axis-based nearest; works on regular grids). Replaces spatial_method="nearest".
- spatial_method="auto": Now resolves to one of:
  - axis
  - euclidean

Detailed requirements

Vectorized extraction for axis:
- When spatial_method="axis" and a granule has one or more points, batch all points into a single .sel() call for all variables.
- coord_spec.

spatial_method="nearest" references:
- spatial_method="axis" should always use the new batch/vectorized approach.
- Replace spatial_method="nearest" with spatial_method="axis".

Additional axes:
- If additional axes are present in coord_spec (e.g., depth, wavelength), support them as additional vectorized indexers.

auto logic update:
- When spatial_method="auto", the engine should now check:
  - If lat.ndim == 1 and lon.ndim == 1 in the dataset, use axis (vectorized via 1-D axis matching).

Quality and compatibility:
Example of vectorized indexing
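The auto resolution and the batched extraction with additional axes described in the requirements could look roughly like the sketch below. The function names, their signatures, the shape of `coord_spec` (a mapping from dataset coordinate name to a key in the points table), and the `euclidean` fallback for non-1-D coordinates are all assumptions for illustration, not the engine's actual API.

```python
import numpy as np
import xarray as xr


def resolve_spatial_method(ds, lat_name="lat", lon_name="lon"):
    """Hypothetical helper: choose "axis" when lat and lon are both 1-D."""
    if ds[lat_name].ndim == 1 and ds[lon_name].ndim == 1:
        return "axis"
    return "euclidean"  # assumed fallback for 2-D (curvilinear) coordinates


def extract_axis_batch(ds, points, coord_spec):
    """Hypothetical batch extractor: one .sel() call for all points and variables.

    coord_spec maps each dataset coordinate name to a key in `points`.
    Extra axes such as depth are simply additional entries in the mapping.
    """
    indexers = {
        coord: xr.DataArray(np.asarray(points[key]), dims="points")
        for coord, key in coord_spec.items()
    }
    # All indexers share the 'points' dimension, so they are matched pairwise.
    return ds.sel(indexers, method="nearest")


# Small synthetic dataset with an extra depth axis.
ds = xr.Dataset(
    {"chl": (("depth", "lat", "lon"), np.arange(24, dtype=float).reshape(2, 3, 4))},
    coords={
        "depth": [0.0, 10.0],
        "lat": [-10.0, 0.0, 10.0],
        "lon": [0.0, 10.0, 20.0, 30.0],
    },
)
points = {"lat": [9.0, -8.0], "lon": [11.0, 29.0], "depth": [1.0, 9.0]}
coord_spec = {"lat": "lat", "lon": "lon", "depth": "depth"}

method = resolve_spatial_method(ds)
out = extract_axis_batch(ds, points, coord_spec)
print(method, out["chl"].values)
```

Passing every axis as an indexer over the same `points` dimension keeps the call count at one per granule regardless of how many points or variables are involved.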