1 file changed, +3 -60 lines changed

Dask Expressions
================

- Dask DataFrames with query optimization.
+ The Implementation is now the default and only backend for Dask DataFrames and was
+ moved to https://github.com/dask/dask.

- This is a rewrite of Dask DataFrame that includes query
- optimization and generally improved organization.
-
- More in our blog posts:
- - [Dask Expressions overview](https://blog.dask.org/2023/08/25/dask-expr-introduction)
- - [TPC-H benchmark results vs. Dask DataFrame](https://docs.coiled.io/blog/tpch.html)
-
- Example
- -------
-
- ```python
- import dask_expr as dx
-
- df = dx.datasets.timeseries()
- df.head()
-
- df.groupby("name").x.mean().compute()
- ```
-
- Query Representation
- --------------------
-
- Dask-expr encodes user code in an expression tree:
-
- ```python
- >>> df.x.mean().pprint()
-
- Mean:
-   Projection: columns='x'
-     Timeseries: seed=1896674884
- ```
-
- This expression tree will be optimized and modified before execution:
-
- ```python
- >>> df.x.mean().optimize().pprint()
-
- Div:
-   Sum:
-     Fused(375f9):
-     | Projection: columns='x'
-     |   Timeseries: dtypes={'x': <class 'float'>} seed=1896674884
-   Count:
-     Fused(375f9):
-     | Projection: columns='x'
-     |   Timeseries: dtypes={'x': <class 'float'>} seed=1896674884
- ```
-
- Stability
- ---------
-
- This is the default backend for dask.DataFrame since version 2024.3.0.
-
- API Coverage
- ------------
-
- Dask-Expr covers almost everything of the Dask DataFrame API. The only missing features are:
-
- - named GroupBy Aggregations
+ This repository is no longer maintained.