Performance issues with xr.apply_ufunc() and jmd95 #9

@shanicetbailey

I was trying to compute the density equation (for sigma_2) using SOSE model data and fastjmd95, but ran into some performance issues, particularly with this line:

drhodt = xr.apply_ufunc(jmd95numba.drhodt, ds.SALT, ds.THETA, pref,
                        output_dtypes=[ds.THETA.dtype],
                        dask='parallelized').reset_coords(drop=True)

Workers (I was using at most 30) would die off for no reason I could identify. This led me to run a matrix of computations to try to isolate the underlying problem: is xr.apply_ufunc() alone having difficulty, or is it the combination of xr.apply_ufunc() and fastjmd95?

Please see my notebook for the full picture of the issue and the run times for combinations of the following options: https://nbviewer.jupyter.org/github/ocean-transport/WMT-project/blob/master/SOSE-budgets/optimization-computing-issue.ipynb

  1. xr.apply_ufunc()
  2. dsa.map_blocks()
  3. xr.map_blocks()
  4. fastjmd95
  5. dummy function (a simple .sum(), chosen to check whether fastjmd95 itself is part of the problem)
  6. SOSE model data
  7. randomized data (to check whether the problem stems from the model data)
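For anyone who wants to reproduce the comparison without the SOSE data or fastjmd95 installed, here is a minimal sketch of the same kind of test matrix. It is not the code from the notebook: `stand_in_drhodt` is a hypothetical elementwise placeholder (not the real equation of state), the array shape, chunking, and `pref` value are made up, and it runs on dask's default local scheduler rather than a 30-worker cluster.

```python
import time

import dask.array as dsa
import numpy as np
import xarray as xr


def stand_in_drhodt(salt, theta, pref):
    """Hypothetical elementwise placeholder for jmd95numba.drhodt (not real physics)."""
    return 0.01 * salt - 0.2 * theta + 1e-4 * pref


# Randomized data (assumed shape and chunking), split into several dask blocks.
rng = np.random.default_rng(0)
shape, chunks = (8, 20, 30), (2, 20, 30)
dims = ("time", "Y", "X")
salt = xr.DataArray(dsa.from_array(rng.random(shape), chunks=chunks), dims=dims, name="SALT")
theta = xr.DataArray(dsa.from_array(rng.random(shape), chunks=chunks), dims=dims, name="THETA")
pref = 2000.0  # assumed reference pressure, passed through as a plain scalar


def via_apply_ufunc():
    # Same call pattern as in the issue, with the placeholder function.
    return xr.apply_ufunc(stand_in_drhodt, salt, theta, pref,
                          output_dtypes=[theta.dtype],
                          dask="parallelized")


def via_dask_map_blocks():
    # Bypass xarray and apply the function block by block on the raw dask arrays.
    return dsa.map_blocks(stand_in_drhodt, salt.data, theta.data, pref,
                          dtype=theta.dtype)


results, timings = {}, {}
for label, build in [("xr.apply_ufunc", via_apply_ufunc),
                     ("dsa.map_blocks", via_dask_map_blocks)]:
    start = time.perf_counter()
    results[label] = np.asarray(build().compute())
    timings[label] = time.perf_counter() - start
    print(f"{label}: {timings[label]:.3f} s")
```

Both paths should return the same values, so any difference in wall time or worker behavior comes from the wrapper rather than the function. Swapping `stand_in_drhodt` for the real function, or the random arrays for model data, changes one axis of the matrix at a time.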

Run times:

  1. 4min 4s: xr.apply_ufunc()-fastjmd95-model data
  2. all tasks go to one worker and it never executes: xr.apply_ufunc()-fastjmd95-randomized data
  3. 29.7 s: xr.apply_ufunc()-sum()-model data
  4. 15.6 s: xr.apply_ufunc()-sum()-randomized data
  5. 51.2 s: dsa.map_blocks()-fastjmd95-model data
  6. 1min 53s: dsa.map_blocks()-fastjmd95-randomized data
  7. 27.7 s: dsa.map_blocks()-sum()-model data
  8. 13.3 s: dsa.map_blocks()-sum()-randomized data

Any help figuring out what's going on would be appreciated.
