Advanced Computing Topics

Dask - Parallel Computing

Dask is a Python library for parallel and distributed computing. It is easy to set up and use, and it scales Python workloads, including complex algorithms, from a single machine to a cluster.

Dask makes parallel processing accessible for a variety of common data applications, such as:

  • Multithreaded processing of large arrays and tabular data frames
  • Embarrassingly parallel execution of separate jobs
  • Parameter sweeps for machine learning workloads
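
As a rough illustration of the use cases listed above, the sketch below runs a small parameter sweep as independent jobs with `dask.delayed` and sums a chunked array with `dask.array`. The `score` function, the list of rates, and the array shape are hypothetical placeholders, not taken from this repository.

```python
import dask
import dask.array as da


def score(learning_rate):
    """Hypothetical stand-in for one independent job in a parameter sweep."""
    return (learning_rate - 0.1) ** 2


# Embarrassingly parallel jobs: wrap each call lazily, then compute them together.
rates = [0.01, 0.05, 0.1, 0.5]
tasks = [dask.delayed(score)(r) for r in rates]
scores = dask.compute(*tasks, scheduler="threads")  # returns a tuple of results
best_rate = rates[scores.index(min(scores))]

# Large array: split into chunks so the reduction can run chunk by chunk in parallel.
x = da.ones((4000, 4000), chunks=(1000, 1000))
total = x.sum().compute(scheduler="threads")

print(best_rate, total)
```

Nothing runs until `compute` is called; before that, Dask only records the task graph, which is what lets it schedule the independent pieces in parallel.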

Dask can be run either as part of a script or as a backend for a Jupyter notebook. In the notebook case, the notebook can take advantage of the parallel processing acceleration Dask provides, using a common and simple-to-deploy tool.
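
A minimal sketch of the script case, assuming the optional `dask.distributed` package is installed. It starts a local, in-process scheduler and submits a few jobs to it. The `square` and `run_jobs` names are hypothetical, and the `Client` settings are one possible choice rather than a recommended configuration.

```python
from dask.distributed import Client


def square(n):
    """Hypothetical stand-in for a unit of real work."""
    return n * n


def run_jobs(values):
    # processes=False keeps the workers as threads inside this process;
    # dashboard_address=None skips starting the diagnostics dashboard.
    with Client(processes=False, dashboard_address=None) as client:
        futures = client.map(square, values)  # one task per input value
        return client.gather(futures)         # block until all results arrive


if __name__ == "__main__":
    print(run_jobs(range(5)))
```

In a notebook, the same `Client(...)` call can be made once in an early cell, after which later cells that use Dask collections or `client.map` run against that client.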