FSS: test dask #505

Open
@nikeethr

Description

xarray.apply_ufunc (https://docs.xarray.dev/en/stable/generated/xarray.apply_ufunc.html) is used to aggregate the N-dimensional data. The current API for fss_2d and fss_2d_binary accepts a similar argument, dask (default: "forbidden").

However, this is not tested on dask arrays. Note that the underlying operations are performed on numpy arrays, so "forbidden" and "parallelized" are the only valid options.
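As a rough illustration of how the two valid options behave, here is a minimal sketch. The kernel `_field_mean`, the wrapper `aggregate`, and the dimension names are hypothetical stand-ins, not the actual fss_2d internals; only `xarray.apply_ufunc` and its `dask` argument come from xarray itself.

```python
import numpy as np
import xarray as xr


def _field_mean(arr):
    # Hypothetical numpy-based aggregation over the two core (spatial) axes.
    return arr.mean(axis=(-2, -1))


def aggregate(da, dask="forbidden"):
    # Hypothetical wrapper that forwards `dask` to apply_ufunc.
    return xr.apply_ufunc(
        _field_mean,
        da,
        input_core_dims=[["y", "x"]],
        dask=dask,
        output_dtypes=[float],
    )


data = xr.DataArray(np.arange(24.0).reshape(2, 3, 4), dims=["t", "y", "x"])
chunked = data.chunk({"t": 1})  # dask-backed; core dims stay in one chunk

eager = aggregate(data)                         # numpy input, default option
lazy = aggregate(chunked, dask="parallelized")  # dask input, lazy result

# "forbidden" rejects dask-backed input with a ValueError.
try:
    aggregate(chunked, dask="forbidden")
    forbidden_raised = False
except ValueError:
    forbidden_raised = True
```

With "parallelized", the core dimensions (`y` and `x` here) must each sit in a single chunk, which is why the sketch only chunks along `t`.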

Note

  • The "forbidden" option will raise an error for dask arrays when they reach the ufunc. To circumvent this, if the user sets dask = "forbidden" but feeds in a dask array, we should coerce it to a numpy-backed (core) xarray object prior to any computations.
  • dask arrays can be coerced to core xarray or numpy using .compute(), .load() or equivalent, so setting dask = "forbidden" shouldn't halt the computations; they just cannot utilize dask's scheduling and memory management.
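The coercion described in the notes above could look roughly like the following. The helper name `coerce_if_forbidden` is hypothetical and this is only a sketch of one possible approach; it relies on `DataArray.chunks` being `None` for numpy-backed data and on `.compute()` returning an in-memory copy.

```python
import numpy as np
import xarray as xr


def coerce_if_forbidden(da, dask="forbidden"):
    # Hypothetical helper: load dask-backed input into memory when the
    # caller has asked for dask="forbidden"; otherwise pass it through.
    if dask == "forbidden" and da.chunks is not None:
        return da.compute()
    return da


data = xr.DataArray(np.ones((2, 3, 4)), dims=["t", "y", "x"])
chunked = data.chunk({"t": 1})

coerced = coerce_if_forbidden(chunked, dask="forbidden")
untouched = coerce_if_forbidden(chunked, dask="parallelized")
```

Using `.compute()` rather than `.load()` leaves the caller's original object lazy, at the cost of holding a second, in-memory copy.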

Test cases:

  1. [benchmark] test against large chunked dask arrays that exceed RAM. "forbidden" should most likely crash (though this is non-deterministic), while "parallelized" (dask) should hopefully still work.
  2. [benchmark] test speed of the "forbidden" (core numpy) vs. "parallelized" (dask) options.
  3. [unit test] test that output arrays are equivalent for both options.
  4. [unit test] the "allowed" option should raise an error (the ufunc doing the aggregation expects a numpy array).
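Unit tests 3 and 4 might be sketched as below. `_numpy_only_sum` is a hypothetical stand-in for the real aggregation kernel (it explicitly rejects non-numpy input, which is an assumption about how the real ufunc would fail), and `aggregate` is a hypothetical wrapper. The functions use plain asserts so that pytest can collect them without extra imports.

```python
import numpy as np
import xarray as xr


def _numpy_only_sum(arr):
    # Stand-in kernel (assumption): rejects anything that is not numpy.
    if not isinstance(arr, np.ndarray):
        raise TypeError(f"expected numpy.ndarray, got {type(arr).__name__}")
    return arr.sum(axis=(-2, -1))


def aggregate(da, dask):
    # Hypothetical wrapper forwarding `dask` to apply_ufunc.
    return xr.apply_ufunc(
        _numpy_only_sum,
        da,
        input_core_dims=[["y", "x"]],
        dask=dask,
        output_dtypes=[da.dtype],
    )


def _make_inputs():
    rng = np.random.default_rng(0)
    data = xr.DataArray(rng.random((4, 8, 8)), dims=["t", "y", "x"])
    return data, data.chunk({"t": 2})


def test_outputs_equivalent():
    # Test case 3: both valid options give the same result.
    data, chunked = _make_inputs()
    expected = aggregate(data, dask="forbidden")
    actual = aggregate(chunked, dask="parallelized").compute()
    xr.testing.assert_allclose(actual, expected)


def test_allowed_raises():
    # Test case 4: "allowed" hands the dask array straight to the kernel.
    _, chunked = _make_inputs()
    try:
        aggregate(chunked, dask="allowed")
    except TypeError:
        return
    raise AssertionError('dask="allowed" did not raise')
```

The exception type for the "allowed" case depends on how the real kernel fails on a dask array, so the `TypeError` here should be adjusted to match the actual behaviour.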

Metadata

Labels: enhancement (New feature or request)
