v0.7.0
This is the release notes of v0.7.0. See here for the complete list of solved issues and merged PRs.
This release note only covers the difference from v0.7.0rc2; for all highlights and changes, please refer to the release notes of the pre-releases:
alpha1
alpha2
alpha3
alpha4
alpha5
alpha6
alpha7
alpha8
beta1
beta2
rc1
rc2
Changes that break compatibility
v0.7.0 has unified local and distributed execution layer, local thread-based scheduling has been removed, instead, the unified runtime is based on multiprocess-based scheduling which could get rid of infamous GIL problem .
Thus, for local usage, please new a local default session via:
import mars
mars.new_session() # create a default local session
If not doing so, it will be initialized once in the background, however, keep in mind that the initialization of multiprocess scheduling consumes more time compared to multithread one.
We tried our best to keep other compatibilities, if you find any incompatible place, please open an issue to reach out to us.
Highlights
v0.7.0 implements a unified execution layer, all deployment including bare metal, Kubernetes, Ray as well as Yarn shares the same fundamental components. This unified execution layer optimized many aspects compare to the old one including:
- Better serialization based on pickle5 protocol, which is 5-7x faster than old version.
- Completely rewritten execution layer which has better performance, even 20%-50% faster than the old version on a laptop.
- Based on multiprocess scheduling which avoids infamous GIL issue.
- Mars on Ray is way more better due to the reason that Ray actor is leveraged to build the Ray backend of Oscar which is a lightweight actor framework that is the fundamental part of the entire execution layer.
- GPU can be supported more better with the new architecture.
New Features
- Tensor
- Learn
- Implements mars.learn.utils.multiclass.unique_label (#2300)
- Services
- Add get_storage_level_info api (#2242)
- Add API to fetch tileable graph as JSON (#2271, thanks @RandomY-2!)
- Enable running on GPU for oscar (#2306)
- Others
- Add support for seek method in memory cases (#2264)
Enhancements
- Add support for stateless actors (#2220)
- Add status filters for Cluster service (#2221)
- Pass logging config file name into sub pools (#2225)
- Support choosing aggregation algorithm at runtime (#2226)
- Add method to session to get web endpoint (#2238)
- Use Kubernetes Service to discover Mars Supervisors (#2240)
- Ensure range index incremental for data source op like
md.read_csv
(#2244) - Record mapper meta for shuffle task (#2255)
- Support data dependency for run_script (#2256)
- Refine oscar debugging (#2261)
- Support fetch_log for web session (#2262)
- Allow turning off actor killing (#2277)
- Use batch method to reduce transferring cost for shuffle tasks (#2279)
- Assign bands given devices of subtasks (#2278)
- Add bind method to facilitate extracting batch args (#2281)
- Reduce memory estimation for specific operands (#2285)
Bug fixes
- Fix NoDataToSpill when multiple storage quota requests happen simultaneously (#2223)
- Stop using thread local to store default session (#2243)
- Fix service errors in Windows (#2247)
Documentation
Installation
- Merge default & distributed requirements (#2270)
Tests
- Add separate check pipeline (#2302)