Commit 1a97fee

Many doc improvements (#60)
1 parent 74cc5eb commit 1a97fee

12 files changed: +118 -222 lines changed

README.md

+1 -1
@@ -1,4 +1,4 @@
-[![GitHub Workflow CI Status](https://img.shields.io/github/workflow/status/dcherian/flox/CI?logo=github&style=for-the-badge)](https://github.com/dcherian/flox/actions)[![GitHub Workflow Code Style Status](https://img.shields.io/github/workflow/status/dcherian/flox/code-style?label=Code%20Style&style=for-the-badge)](https://github.com/dcherian/flox/actions)[![image](https://img.shields.io/codecov/c/github/dcherian/flox.svg?style=for-the-badge)](https://codecov.io/gh/dcherian/flox)[![PyPI](https://img.shields.io/pypi/v/flox.svg?style=for-the-badge)](https://pypi.org/project/flox/)[![Conda-forge](https://img.shields.io/conda/vn/conda-forge/flox.svg?style=for-the-badge)](https://anaconda.org/conda-forge/flox)[![Documentation Status](https://readthedocs.org/projects/flox/badge/?version=latest)](https://flox.readthedocs.io/en/latest/?badge=latest)
+[![GitHub Workflow CI Status](https://img.shields.io/github/workflow/status/dcherian/flox/CI?logo=github&style=flat)](https://github.com/dcherian/flox/actions)[![GitHub Workflow Code Style Status](https://img.shields.io/github/workflow/status/dcherian/flox/code-style?label=Code%20Style&style=flat)](https://github.com/dcherian/flox/actions)[![image](https://img.shields.io/codecov/c/github/dcherian/flox.svg?style=flat)](https://codecov.io/gh/dcherian/flox)[![PyPI](https://img.shields.io/pypi/v/flox.svg?style=flat)](https://pypi.org/project/flox/)[![Conda-forge](https://img.shields.io/conda/vn/conda-forge/flox.svg?style=flat)](https://anaconda.org/conda-forge/flox)[![Documentation Status](https://readthedocs.org/projects/flox/badge/?version=latest)](https://flox.readthedocs.io/en/latest/?badge=latest)

 # flox

ci/docs.yml

+1 -1
@@ -10,6 +10,6 @@ dependencies:
   - toolz
   - myst-parser
   - sphinx
-  - sphinx-book-theme
+  - furo
   - pip:
     - git+https://github.com/dcherian/flox

docs/source/api.rst

+4 -3
@@ -9,7 +9,7 @@ Functions
 .. autosummary::
    :toctree: generated/

-   core.groupby_reduce
+   ~core.groupby_reduce
    xarray.xarray_reduce

 Rechunking
@@ -18,8 +18,8 @@ Rechunking
 .. autosummary::
    :toctree: generated/

-   core.rechunk_for_blockwise
-   core.rechunk_for_cohorts
+   ~core.rechunk_for_blockwise
+   ~core.rechunk_for_cohorts
    xarray.rechunk_for_blockwise
    xarray.rechunk_for_cohorts

@@ -31,6 +31,7 @@ Visualization

    visualize.draw_mesh
    visualize.visualize_groups
+   visualize.visualize_cohorts_2d

 Aggregation Objects
 ~~~~~~~~~~~~~~~~~~~

docs/source/conf.py

+7 -132
@@ -28,58 +28,31 @@


 # -- General configuration -----------------------------------------------------
-
-# If your documentation needs a minimal Sphinx version, state it here.
-# needs_sphinx = '1.0'
-
-# Add any Sphinx extension module names here, as strings. They can be extensions
-# coming with Sphinx (named 'sphinx.ext.*') or your custom ones.
 extensions = [
+    "myst_parser",
     "sphinx.ext.autodoc",
     "sphinx.ext.viewcode",
     "sphinx.ext.autosummary",
-    "sphinx.ext.doctest",
     "sphinx.ext.intersphinx",
     "sphinx.ext.extlinks",
     "numpydoc",
     "sphinx.ext.napoleon",
-    "myst_parser",
-    # "IPython.sphinxext.ipython_console_highlighting",
-    # "IPython.sphinxext.ipython_directive",
-    # "nbsphinx",
 ]

 extlinks = {
     "issue": ("https://github.com/dcherian/flox/issues/%s", "GH#"),
     "pr": ("https://github.com/dcherian/flox/pull/%s", "GH#"),
 }

-# Add any paths that contain templates here, relative to this directory.
 templates_path = ["_templates"]
-
-# The suffix of source filenames.
-source_suffix = ".rst"
-
-# Enable notebook execution
-# https://nbsphinx.readthedocs.io/en/0.4.2/never-execute.html
-# nbsphinx_execute = 'auto'
-# Allow errors in all notebooks by
-nbsphinx_allow_errors = True
-
-# Disable cell timeout
-nbsphinx_timeout = -1
-
-
-# The encoding of source files.
-# source_encoding = 'utf-8-sig'
-
-# The master toctree document.
+source_suffix = [".rst", ".md"]
 master_doc = "index"
+language = "en"

 # General information about the project.
 project = "flox"
 current_year = datetime.datetime.now().year
-copyright = f"2020-{current_year}, Deepak Cherian"
+copyright = f"2021-{current_year}, Deepak Cherian"
 author = "Deepak Cherian"
 # The version info for the project you're documenting, acts as replacement for
 # |version| and |release|, also used in various other places throughout the
@@ -90,10 +63,6 @@
 # The full version, including alpha/beta/rc tags.
 release = flox.__version__

-# The language for content autogenerated by Sphinx. Refer to documentation
-# for a list of supported languages.
-# language = None
-
 # There are two options for replacing |today|: either, you set today to some
 # non-false value, then it is used:
 # today = ''
@@ -121,18 +90,10 @@
 # The name of the Pygments (syntax highlighting) style to use.
 pygments_style = "sphinx"

-# A list of ignored prefixes for module index sorting.
-# modindex_common_prefix = []
-
-# If true, keep warnings as "system message" paragraphs in the built documents.
-# keep_warnings = False
-

 # -- Options for HTML output ---------------------------------------------------

-# The theme to use for HTML and HTML Help pages. See the documentation for
-# a list of builtin themes.
-html_theme = "sphinx_book_theme"
+html_theme = "furo"

 # Theme options are theme-specific and customize the look and feel of a theme
 # further. For a list of options available for each theme, see the
@@ -142,12 +103,7 @@
 # Add any paths that contain custom themes here, relative to this directory.
 # html_theme_path = []

-# The name for this set of Sphinx documents. If None, it defaults to
-# "<project> v<release> documentation".
-# html_title = None
-
-# A shorter title for the navigation bar. Default is the same as html_title.
-# html_short_title = None
+html_title = "flox"

 # The name of an image file (relative to this directory) to place at the top
 # of the sidebar.
@@ -161,7 +117,7 @@
 # Add any paths that contain custom static files (such as style sheets) here,
 # relative to this directory. They are copied after the builtin static files,
 # so a file named "default.css" will overwrite the builtin "default.css".
-html_static_path = ["_static"]
+# html_static_path = ["_static"]

 # If not '', a 'Last updated on:' timestamp is inserted at every page bottom,
 # using the given strftime format.
@@ -201,89 +157,9 @@
 # base URL from which the finished HTML is served.
 # html_use_opensearch = ''

-# This is the file name suffix for HTML files (e.g. ".xhtml").
-# html_file_suffix = None
-
 # Output file base name for HTML help builder.
 htmlhelp_basename = "floxdoc"

-
-# -- Options for LaTeX output --------------------------------------------------
-
-latex_elements = {
-    # The paper size ('letterpaper' or 'a4paper').
-    # 'papersize': 'letterpaper',
-    # The font size ('10pt', '11pt' or '12pt').
-    # 'pointsize': '10pt',
-    # Additional stuff for the LaTeX preamble.
-    # 'preamble': '',
-}
-
-# Grouping the document tree into LaTeX files. List of tuples
-# (source start file, target name, title, author, documentclass [howto/manual]).
-latex_documents = [
-    ("index", "flox.tex", "flox Documentation", "Deepak Cherian", "manual"),
-]
-
-# The name of an image file (relative to this directory) to place at the top of
-# the title page.
-# latex_logo = None
-
-# For "manual" documents, if this is true, then toplevel headings are parts,
-# not chapters.
-# latex_use_parts = False
-
-# If true, show page references after internal links.
-# latex_show_pagerefs = False
-
-# If true, show URL addresses after external links.
-# latex_show_urls = False
-
-# Documents to append as an appendix to all manuals.
-# latex_appendices = []
-
-# If false, no module index is generated.
-# latex_domain_indices = True
-
-
-# -- Options for manual page output --------------------------------------------
-
-# One entry per manual page. List of tuples
-# (source start file, name, description, authors, manual section).
-man_pages = [("index", "flox", "flox Documentation", [author], 1)]
-
-# If true, show URL addresses after external links.
-# man_show_urls = False
-
-
-# -- Options for Texinfo output ------------------------------------------------
-
-# Grouping the document tree into Texinfo files. List of tuples
-# (source start file, target name, title, author,
-# dir menu entry, description, category)
-texinfo_documents = [
-    (
-        "index",
-        "flox",
-        "flox Documentation",
-        author,
-        "flox",
-        "One line description of project.",
-        "Miscellaneous",
-    ),
-]
-
-# Documents to append as an appendix to all manuals.
-# texinfo_appendices = []
-
-# If false, no module index is generated.
-# texinfo_domain_indices = True
-
-# How to display URL addresses: 'footnote', 'no', or 'inline'.
-# texinfo_show_urls = 'footnote'
-
-# If true, do not generate a @detailmenu in the "Top" node's menu.
-# texinfo_no_detailmenu = False
 intersphinx_mapping = {
     "python": ("https://docs.python.org/3/", None),
     "pandas": ("https://pandas.pydata.org/pandas-docs/stable", None),
@@ -302,7 +178,6 @@
 napoleon_use_param = False
 napoleon_use_rtype = False
 napoleon_preprocess_types = True
-
 napoleon_type_aliases = {
     # general terms
     "sequence": ":term:`sequence`",

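For orientation, the settings touched by the hunks above can be gathered into one fragment. This is only a sketch pieced together from the added and context lines visible in this diff; everything outside those hunks (for example `release`, the intersphinx and napoleon settings) is deliberately left out, so it is not the complete `conf.py`.

```python
import datetime

# Extensions after this commit: myst_parser now comes first, and
# sphinx.ext.doctest plus the commented-out IPython/nbsphinx entries are gone.
extensions = [
    "myst_parser",
    "sphinx.ext.autodoc",
    "sphinx.ext.viewcode",
    "sphinx.ext.autosummary",
    "sphinx.ext.intersphinx",
    "sphinx.ext.extlinks",
    "numpydoc",
    "sphinx.ext.napoleon",
]

templates_path = ["_templates"]
source_suffix = [".rst", ".md"]  # Markdown sources are now accepted as well
master_doc = "index"
language = "en"

project = "flox"
current_year = datetime.datetime.now().year
copyright = f"2021-{current_year}, Deepak Cherian"
author = "Deepak Cherian"

html_theme = "furo"  # replaces sphinx_book_theme
html_title = "flox"
# html_static_path = ["_static"]  # commented out by this commit
```

The theme switch is the reason `ci/docs.yml` swaps `sphinx-book-theme` for `furo` in the same commit.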
docs/source/custom.md

+2 -2
@@ -1,10 +1,10 @@
-## Custom reductions
+# Custom reductions

 `flox` implements all common reductions provided by `numpy_groupies` in `aggregations.py`.
 It also allows you to specify a custom Aggregation (again inspired by dask.dataframe),
 though this might not be fully functional at the moment. See `aggregations.py` for examples.

-``` python
+```python
 mean = Aggregation(
     # name used for dask tasks
     name="mean",

docs/source/implementation.md

+6 -6
@@ -1,4 +1,4 @@
-## Algorithms
+# Algorithms

 `flox` outsources the core GroupBy operation to the vectorized implementations in
 [numpy_groupies](https://github.com/ml31415/numpy-groupies). Constructing
@@ -13,7 +13,7 @@ or `xarray_reduce`.

 First we describe xarray's current strategy

-### `method="split-reduce"`: Xarray's current GroupBy strategy
+## `method="split-reduce"`: Xarray's current GroupBy strategy

 Xarray's current strategy is to find all unique group labels, index out each group,
 and then apply the reduction operation. Note that this only works if we know the group
@@ -28,11 +28,11 @@ communication and is quite expensive (in dataframe terminology, this is a "shuff
 This is fundamentally why many groupby reductions don't work well right now with
 big datasets.

-### `method="map-reduce"`
+## `method="map-reduce"`

 The first idea is to use the "map-reduce" strategy (inspired by `dask.dataframe`).

-![map-reduce-strategy-schematic](/../diagrams/mapreduce.png)
+![map-reduce-strategy-schematic](/../diagrams/map-reduce.png)

 The GroupBy reduction is first applied blockwise. Those intermediate results are
 combined by concatenating to form a new array which is then reduced
@@ -47,7 +47,7 @@ till all group results are in one block. At that point the result is
 reduction at the first combine step is effective. "effective" means we actually
 reduce values and release some memory.

-### `method="blockwise"`
+## `method="blockwise"`

 One case where `"map-reduce"` doesn't work well is the case of "resampling" reductions. An
 example here is resampling from daily frequency to monthly frequency data: `da.resample(time="M").mean()`
@@ -70,7 +70,7 @@ so that all members of a group are in a single block. Then, the groupby operatio
 1. Works better when multiple groups are already in a single block; so that the intial
    rechunking only involves a small amount of communication.

-### `method="cohorts"`
+## `method="cohorts"`

 We can combine all of the above ideas for cases where members from different groups tend to occur close to each other.
 One example is the construction of "climatologies" which is a climate science term for something like `groupby("time.month")`
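
As an aside on the `method="map-reduce"` text quoted in the hunks above, the idea (reduce each block, concatenate the intermediates, reduce again) can be sketched with plain NumPy. This is an illustration under stated assumptions, not flox's implementation: the helper names `blockwise_sum_count` and `map_reduce_mean` are hypothetical, contiguous slices stand in for dask chunks, and the combine is done in a single step rather than as a tree.

```python
import numpy as np


def blockwise_sum_count(values, labels, num_groups):
    """Map step: reduce one block to per-group (sum, count) intermediates."""
    sums = np.bincount(labels, weights=values, minlength=num_groups)
    counts = np.bincount(labels, minlength=num_groups)
    return sums, counts


def map_reduce_mean(values, labels, num_groups, block_size):
    """Grouped mean computed block by block, then combined (hypothetical helper)."""
    # Contiguous slices stand in for the chunks of a dask array.
    intermediates = [
        blockwise_sum_count(
            values[start:start + block_size],
            labels[start:start + block_size],
            num_groups,
        )
        for start in range(0, len(values), block_size)
    ]
    # Combine step: stack the per-block intermediates, then reduce along the new axis.
    sums = np.stack([s for s, _ in intermediates]).sum(axis=0)
    counts = np.stack([c for _, c in intermediates]).sum(axis=0)
    # Finalize: groups with no members come out as NaN.
    with np.errstate(invalid="ignore", divide="ignore"):
        return sums / counts


values = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
labels = np.array([0, 1, 0, 1, 0, 1])
result = map_reduce_mean(values, labels, num_groups=2, block_size=4)
```

Each block shrinks to one `(sum, count)` pair per group before anything is combined, which is the sense in which the first reduction step is "effective" in the quoted text.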

docs/source/overview.md renamed to docs/source/index.md

+17 -11
@@ -1,14 +1,12 @@
-# flox: An Overview
-
-Fast & furious GroupBy reductions for `dask.array`
+# flox: fast & furious GroupBy reductions for `dask.array`

 ## Overview

-[![GitHub Workflow CI Status](https://img.shields.io/github/workflow/status/dcherian/flox/CI?logo=github&style=for-the-badge)](https://github.com/dcherian/flox/actions)
-[![GitHub Workflow Code Style Status](https://img.shields.io/github/workflow/status/dcherian/flox/code-style?label=Code%20Style&style=for-the-badge)](https://github.com/dcherian/flox/actions)
-[![image](https://img.shields.io/codecov/c/github/dcherian/flox.svg?style=for-the-badge)](https://codecov.io/gh/dcherian/flox)
-[![PyPI](https://img.shields.io/pypi/v/flox.svg?style=for-the-badge)](https://pypi.org/project/flox/)
-[![Conda-forge](https://img.shields.io/conda/vn/conda-forge/flox.svg?style=for-the-badge)](https://anaconda.org/conda-forge/flox)
+[![GitHub Workflow CI Status](https://img.shields.io/github/workflow/status/dcherian/flox/CI?logo=github&style=flat)](https://github.com/dcherian/flox/actions)
+[![GitHub Workflow Code Style Status](https://img.shields.io/github/workflow/status/dcherian/flox/code-style?label=Code%20Style&style=flat)](https://github.com/dcherian/flox/actions)
+[![image](https://img.shields.io/codecov/c/github/dcherian/flox.svg?style=flat)](https://codecov.io/gh/dcherian/flox)
+[![PyPI](https://img.shields.io/pypi/v/flox.svg?style=flat)](https://pypi.org/project/flox/)
+[![Conda-forge](https://img.shields.io/conda/vn/conda-forge/flox.svg?style=flat)](https://anaconda.org/conda-forge/flox)

 This project explores strategies for fast GroupBy reductions with dask.array. It used to be called `dask_groupby`. It was motivated by

@@ -17,9 +15,7 @@ This project explores strategies for fast GroupBy reductions with dask.array. It
 2. numpy_groupies in Xarray
    [issue](https://github.com/pydata/xarray/issues/4473)

-(See a
-[presentation](https://docs.google.com/presentation/d/1YubKrwu9zPHC_CzVBhvORuQBW-z148BvX3Ne8XcvWsQ/edit?usp=sharing)
-about this package, from the Pangeo Showcase).
+See a presentation ([video](https://discourse.pangeo.io/t/november-17-2021-flox-fast-furious-groupby-reductions-with-dask-at-pangeo-scale/2016), [slides](https://docs.google.com/presentation/d/1YubKrwu9zPHC_CzVBhvORuQBW-z148BvX3Ne8XcvWsQ/edit?usp=sharing)) about this package, from the Pangeo Showcase.

 ## Installing

@@ -45,3 +41,13 @@ There are two main functions
 This work was funded in part by NASA-ACCESS 80NSSC18M0156 "Community tools for analysis of NASA Earth Observing System
 Data in the Cloud" (PI J. Hamman), and [NCAR's Earth System Data Science Initiative](https://ncar.github.io/esds/).
 It was motivated by many discussions in the [Pangeo](https://pangeo.io) community.
+
+## Contents
+```{eval-rst}
+.. toctree::
+   :maxdepth: 1
+
+   implementation.md
+   custom.md
+   api.rst
+```

docs/source/index.rst

-10
This file was deleted.

flox/__init__.py

+1 -1
@@ -1,7 +1,7 @@
 #!/usr/bin/env python
 # flake8: noqa
 """Top-level module for flox ."""
-from .core import groupby_reduce  # noqa
+from .core import groupby_reduce, rechunk_for_blockwise, rechunk_for_cohorts  # noqa

 try:
     from importlib.metadata import version as _version
