Can't call open_mfdataset without creating chunked dask arrays #9038
Open
Description
What happened?
Passing chunks=None to xr.open_dataset/open_mfdataset is supposed to avoid using dask at all, returning lazily-indexed numpy arrays even if dask is installed. However, chunks=None doesn't currently work for xr.open_mfdataset, as it gets silently coerced internally to chunks={}, which creates dask chunks aligned with the on-disk files.
Offending line of code:
Line 1040 in 12123be
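The referenced line is not reproduced here. As a purely illustrative sketch, and not a copy of xarray's code, one common way an explicit None can end up silently replaced by {} is a truthiness-based default. The function names below are hypothetical.

```python
def normalize_chunks_truthy(chunks=None):
    # Hypothetical helper: `or` treats None and {} alike (both are falsy),
    # so an explicit chunks=None from the caller is replaced by {}.
    return chunks or {}


def normalize_chunks_passthrough(chunks=None):
    # Hypothetical alternative: forward the caller's value unchanged,
    # so None ("no dask") and {} ("dask, file-aligned chunks") stay distinct.
    return chunks


print(normalize_chunks_truthy(None))         # {}
print(normalize_chunks_passthrough(None))    # None
print(normalize_chunks_truthy({"time": 100}))
```

With the truthiness idiom, the downstream code can no longer tell whether the caller asked for no dask at all or for default dask chunking.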
What did you expect to happen?
Passing chunks=None to open_mfdataset should return lazily-indexed numpy arrays, like open_dataset does.
Minimal Complete Verifiable Example
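import xarray as xr

# Split the tutorial dataset into two files along the time dimension
ds = xr.tutorial.open_dataset("air_temperature")
ds1 = ds.isel(time=slice(None, 1000))
ds2 = ds.isel(time=slice(1000, None))
ds1.to_netcdf('air1.nc')
ds2.to_netcdf('air2.nc')

# chunks=None should avoid dask, yet the result is still dask-backed
combined = xr.open_mfdataset(['air1.nc', 'air2.nc'], chunks=None)
print(type(combined['air'].data))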
MVCE confirmation
- Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- Complete example — the example is self-contained, including all data and the text of any traceback.
- Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
- New issue — a search of GitHub Issues suggests this is not a duplicate.
- Recent environment — the issue occurs with the latest version of xarray and its dependencies.
Relevant log output
dask.array.core.Array
Anything else we need to know?
As the default is None, changing this without changing the default would be a breaking change. But the current behaviour is also not intended.
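One possible way to resolve this tension, sketched here only as an assumption and not as a fix proposed in this issue, is to change the default to a sentinel. Callers who omit chunks would keep today's dask-backed result, while an explicit chunks=None would mean "no dask". The names _DEFAULT and resolve_chunks are hypothetical.

```python
_DEFAULT = object()  # hypothetical sentinel meaning "chunks was not passed"


def resolve_chunks(chunks=_DEFAULT):
    # Argument omitted: preserve the current implicit behaviour,
    # i.e. dask chunks aligned with the on-disk files.
    if chunks is _DEFAULT:
        return {}
    # Argument given explicitly (including None): honour it as-is.
    return chunks


print(resolve_chunks())               # {}
print(resolve_chunks(None))           # None
print(resolve_chunks({"time": 500}))
```

The trade-off is that the sentinel changes the documented default, so it would still need a note in the changelog even though existing calls that omit chunks behave the same.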
Environment
main