v0.8.0

@dcherian dcherian released this 15 Oct 04:51
· 136 commits to main since this release
fecd9a6

What's Changed

Major performance improvements!!!

  1. Support numbagg through engine="numbagg" for many common nan-skipping reductions in #72. Using numbagg appears to be a major speedup (2x-3x in general, 6x for nanmean). Special thanks to @max-sixty for major work on numbagg's grouped aggregations! Here are timings for reducing a 2D array along the last axis with ordered group labels.

    func     engine   time
    nansum   flox     70.3±0.2ms
    nansum   numpy    122±0.2ms
    nansum   numbagg  18.4±0.04ms
    nanmean  flox     144±0.4ms
    nanmean  numpy    196±0.5ms
    nanmean  numbagg  23.7±0.2ms
    nanmax   flox     93.4±0.8ms
    nanmax   numpy    953±2ms
    nanmax   numbagg  20.3±0.2ms
    count    flox     59.8±1ms
    count    numpy    114±0.2ms
    count    numbagg  29.3±0.1ms
  2. Support engine=None in #266. This will:

    • Use numbagg if available
    • If not, use flox if the group labels are sorted
    • Fall back to numpy otherwise.
      Thanks to @mathause for kicking off this work.
  3. Significant speedup in detecting "cohorts" of groups in #272
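
The fallback order described for engine=None can be sketched in a few lines. This is only an illustration of the rule as stated in these notes, not flox's actual implementation: `choose_engine` and its `numbagg_available` parameter are hypothetical names, and "sorted" is assumed here to mean non-decreasing labels.

```python
import importlib.util

import numpy as np


def choose_engine(labels, numbagg_available=None):
    """Hypothetical helper mirroring the fallback order in the release notes."""
    if numbagg_available is None:
        # Check whether numbagg can be imported, without importing it.
        numbagg_available = importlib.util.find_spec("numbagg") is not None
    if numbagg_available:
        return "numbagg"
    labels = np.asarray(labels).ravel()
    # Assumption: "sorted" means non-decreasing group labels.
    if labels.size < 2 or bool(np.all(labels[1:] >= labels[:-1])):
        return "flox"
    return "numpy"


print(choose_engine([0, 0, 1, 2], numbagg_available=False))
print(choose_engine([2, 0, 1, 0], numbagg_available=False))
```

The returned string matches the engine names used in these notes, so it could presumably be passed on as the `engine=` argument; in practice, passing engine=None lets flox make this choice itself.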

Other Major Changes

  1. Test on and support Python 3.12 (note that numba does not support 3.12 yet).
  2. Bump minimum numpy version to 1.22.
  3. New aggregations: support quantile, median, and mode with method="blockwise", by @dcherian in #269
  4. Add multidimensional bins demo notebook, by @dcherian in #203. This is useful for prediction/forecasting problems.

Minor Changes

New Contributors

Full Changelog: v0.7.2...v0.8.0