This is the release notes of v0.7.0. See here for the complete list of solved issues and merged PRs.

This release note only covers the difference from v0.7.0rc2; for all highlights and changes, please refer to the release notes of the pre-releases:

alpha1
alpha2
alpha3
alpha4
alpha5
alpha6
alpha7
alpha8
beta1
beta2
rc1
rc2

Changes that break compatibility

v0.7.0 has unified local and distributed execution layer, local thread-based scheduling has been removed, instead, the unified runtime is based on multiprocess-based scheduling which could get rid of infamous GIL problem .

Thus, for local usage, please new a local default session via:

import mars

mars.new_session()  # create a default local session

If not doing so, it will be initialized once in the background, however, keep in mind that the initialization of multiprocess scheduling consumes more time compared to multithread one.

We tried our best to keep other compatibilities, if you find any incompatible place, please open an issue to reach out to us.

Highlights

v0.7.0 implements a unified execution layer, all deployment including bare metal, Kubernetes, Ray as well as Yarn shares the same fundamental components. This unified execution layer optimized many aspects compare to the old one including:

Better serialization based on pickle5 protocol, which is 5-7x faster than old version.
Completely rewritten execution layer which has better performance, even 20%-50% faster than the old version on a laptop.
Based on multiprocess scheduling which avoids infamous GIL issue.
Mars on Ray is way more better due to the reason that Ray actor is leveraged to build the Ray backend of Oscar which is a lightweight actor framework that is the fundamental part of the entire execution layer.
GPU can be supported more better with the new architecture.

New Features

Tensor
- Add partial support of bessel functions (#2274, thanks @JuntaoMa!)
- Implements mars.tensor.in1d (#2301)
Learn
- Implements mars.learn.utils.multiclass.unique_label (#2300)
Services
- Add get_storage_level_info api (#2242)
- Add API to fetch tileable graph as JSON (#2271, thanks @RandomY-2!)
- Enable running on GPU for oscar (#2306)
Others
- Add support for seek method in memory cases (#2264)

Enhancements

Add support for stateless actors (#2220)
Add status filters for Cluster service (#2221)
Pass logging config file name into sub pools (#2225)
Support choosing aggregation algorithm at runtime (#2226)
Add method to session to get web endpoint (#2238)
Use Kubernetes Service to discover Mars Supervisors (#2240)
Ensure range index incremental for data source op like md.read_csv (#2244)
Record mapper meta for shuffle task (#2255)
Support data dependency for run_script (#2256)
Refine oscar debugging (#2261)
Support fetch_log for web session (#2262)
Allow turning off actor killing (#2277)
Use batch method to reduce transferring cost for shuffle tasks (#2279)
Assign bands given devices of subtasks (#2278)
Add bind method to facilitate extracting batch args (#2281)
Reduce memory estimation for specific operands (#2285)

Bug fixes

Fix NoDataToSpill when multiple storage quota requests happen simultaneously (#2223)
Stop using thread local to store default session (#2243)
Fix service errors in Windows (#2247)

Documentation

Doc refinement for Oscar (#2291)
Add docs for batch methods (#2298)

Installation

Merge default & distributed requirements (#2270)

Tests

Add separate check pipeline (#2302)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

v0.7.0