Description
[`xarray.apply_ufunc`](https://docs.xarray.dev/en/stable/generated/xarray.apply_ufunc.html) is used to aggregate the N-dimensional data. The current API for `fss_2d` and `fss_2d_binary` accepts a similar argument, `dask` (default: `"forbidden"`).
However, this is not tested on `dask` arrays. Note that the underlying operations are performed on `numpy` arrays, so `"forbidden"` and `"parallelized"` are the only valid options.
Note
- The `"forbidden"` option will raise an error for `dask` arrays when they reach the `ufunc`. To circumvent this, we should coerce to core (in-memory) `xarray` prior to any computations if the user sets `dask="forbidden"` but feeds in a `dask` array. `dask` arrays can be coerced to core `xarray` or `numpy` using `.compute()`, `.load()`, or equivalent, so setting `dask="forbidden"` shouldn't halt the computations; they just cannot utilize dask's scheduling and memory management.
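
The coercion described above could look roughly like the following. This is only a sketch under stated assumptions: `coerce_if_forbidden`, `numpy_only_sum`, and `aggregate` are hypothetical names, the dimension names `y` and `x` are illustrative, and the summation is a stand-in for the real FSS aggregation rather than the library's implementation.

```python
import numpy as np
import xarray as xr


def coerce_if_forbidden(da: xr.DataArray, dask: str = "forbidden") -> xr.DataArray:
    """Hypothetical helper: load dask-backed data into memory when dask='forbidden'."""
    # DataArray.chunks is None for NumPy-backed data and a tuple for dask-backed data.
    if dask == "forbidden" and da.chunks is not None:
        return da.compute()  # returns a new, NumPy-backed DataArray
    return da


def numpy_only_sum(arr: np.ndarray) -> np.ndarray:
    """Stand-in for the real aggregation: sums over the trailing two (core) axes."""
    return np.asarray(arr).sum(axis=(-2, -1))


def aggregate(da: xr.DataArray, dask: str = "forbidden") -> xr.DataArray:
    """Hypothetical wrapper: coerce first, then hand over to apply_ufunc."""
    da = coerce_if_forbidden(da, dask)
    return xr.apply_ufunc(
        numpy_only_sum,
        da,
        input_core_dims=[["y", "x"]],  # core dims are moved to the last two axes
        dask=dask,
        output_dtypes=[da.dtype],  # only consulted when dask="parallelized"
    )
```

With this in place, `aggregate(lazy_da, dask="forbidden")` would load `lazy_da` into memory and run eagerly instead of raising, while `dask="parallelized"` would keep the input lazy. Note that `"parallelized"` requires each core dimension (`y`, `x` here) to sit in a single chunk.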
Test cases:
- [benchmark] test against large chunked dask arrays that exceed RAM. `"forbidden"` should most likely crash (though this is non-deterministic), while `"parallelized"` (dask) should hopefully still work.
- [benchmark] test speed of `"forbidden"` (core-numpy) vs. `"parallelized"` (dask) options.
- [unit test] test that output arrays are equivalent for both options
- [unit test] the `"allowed"` option should raise an error (the `ufunc` doing the aggregation expects a `numpy` array)
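
The two unit tests listed above might be sketched as follows. Everything here is an assumption for illustration: `numpy_only_mean` is a hypothetical stand-in that explicitly rejects non-NumPy input (the real aggregation may fail differently), `aggregate` is a hypothetical wrapper rather than `fss_2d` itself, and the tests need `dask` installed to build the chunked input.

```python
import numpy as np
import xarray as xr


def numpy_only_mean(arr):
    """Stand-in for a NumPy-only aggregation (hypothetical, not the real FSS code)."""
    if not isinstance(arr, np.ndarray):
        raise TypeError(f"expected numpy.ndarray, got {type(arr).__name__}")
    return arr.mean(axis=(-2, -1))


def aggregate(da, dask):
    """Hypothetical wrapper around apply_ufunc with a configurable dask option."""
    return xr.apply_ufunc(
        numpy_only_mean,
        da,
        input_core_dims=[["y", "x"]],
        dask=dask,
        output_dtypes=[np.float64],
    )


def make_inputs():
    """Return the same data as an in-memory array and as a dask-backed array."""
    rng = np.random.default_rng(0)
    eager = xr.DataArray(rng.random((4, 8, 8)), dims=("t", "y", "x"))
    lazy = eager.chunk({"t": 2})  # chunk only the non-core dimension
    return eager, lazy


def test_parallelized_matches_in_memory():
    eager, lazy = make_inputs()
    expected = aggregate(eager, dask="forbidden")
    result = aggregate(lazy, dask="parallelized").compute()
    np.testing.assert_allclose(result.values, expected.values)


def test_allowed_raises_on_dask_input():
    _, lazy = make_inputs()
    try:
        # With "allowed", the dask array is passed straight to the function.
        aggregate(lazy, dask="allowed")
    except TypeError:
        return
    raise AssertionError("expected TypeError for dask='allowed'")
```

The functions follow the `test_*` naming convention so a runner such as pytest would collect them, but they can also be called directly.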