
Commit 6897fdd

Add doc
1 parent d1b60b3 commit 6897fdd

File tree

1 file changed, +77 -2 lines


doc/source/analyzing/parallel_computation.rst

Lines changed: 77 additions & 2 deletions
@@ -24,6 +24,7 @@ Currently, yt is able to perform the following actions in parallel:
 * Halo analysis (:ref:`halo-analysis`)
 * Volume rendering (:ref:`volume_rendering`)
 * Isocontours & flux calculations (:ref:`extracting-isocontour-information`)
+* Parallelization over data containers (:ref:`Data-objects`)
 
 This list covers just about every action yt can take! Additionally, almost all
 scripts will benefit from parallelization with minimal modification. The goal
@@ -219,6 +220,7 @@ The following operations use chunk decomposition:
 * Derived Quantities (see :ref:`derived-quantities`)
 * 1-, 2-, and 3-D profiles (see :ref:`generating-profiles-and-histograms`)
 * Isocontours & flux calculations (see :ref:`surfaces`)
+* Parallelization over data containers using ``piter`` (see :ref:`Data-objects`)
 
 Parallelization over Multiple Objects and Datasets
 ++++++++++++++++++++++++++++++++++++++++++++++++++
@@ -229,8 +231,81 @@ independently to several different objects or datasets, a so-called
 task, yt can do that easily. See the sections below on
 :ref:`parallelizing-your-analysis` and :ref:`parallel-time-series-analysis`.
 
-Use of ``piter()``
-^^^^^^^^^^^^^^^^^^
+Use of ``piter()`` on a data container
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+You can parallelize I/O on any data container using the
+:func:`~yt.data_objects.selection_objects.data_selection_objects.YTSelectionContainer.piter` function.
+It dispatches the chunks that make up the data container to different processors and lets you store a result for each chunk.
+
+Take the following example:
+
+.. code-block:: python
+
+    import yt
+
+    yt.enable_parallelism()
+
+    ds = yt.load("output_00080")
+    ad = ds.all_data()
+
+    # Each processor reads a different chunk
+    my_storage = {}
+    for sto, chunk in ad.piter(storage=my_storage):
+        sto.result = {}
+        sto.result["gas", "density"] = chunk["gas", "density"]
+
+    my_storage  # now contains one entry per chunk
+    my_storage[0]["gas", "density"]  # contains the density for chunk 0
+    ...
+    my_storage[15]["gas", "density"]  # contains the density for the last chunk (here, chunk 15)
+
+Sometimes it is also useful to combine the results. This is controlled by the
+optional ``reduction`` keyword argument to the
+:func:`~yt.data_objects.selection_objects.data_selection_objects.YTSelectionContainer.piter` function.
+The following reductions are implemented:
+
+- ``None`` (default): no reduction; ``my_storage`` will contain one entry per chunk
+- ``"cat"``: concatenate the results from all chunks; ``my_storage`` will contain a flattened list of all results
+- ``"cat_on_root"``: concatenate the results from all chunks, but only on the root process; the other processes will have ``None`` in ``my_storage``
+- ``"sum"``, ``"min"``, or ``"max"``: reduce the results from all chunks using the specified operation; ``my_storage`` will contain the reduced result on all processes
+
+.. code-block:: python
+
+    # Gathering everything on root
+    my_storage = {}
+    for sto, chunk in ad.piter(storage=my_storage, reduction="cat_on_root"):
+        sto.result = {}
+        sto.result["gas", "density"] = chunk["gas", "density"]
+
+    # On root:
+    if yt.is_root():
+        my_storage["gas", "density"] == ad["gas", "density"]  # True for all entries
+    else:
+        my_storage["gas", "density"] is None
+
+    # Gather everything on all processes
+    my_storage = {}
+    for sto, chunk in ad.piter(storage=my_storage, reduction="cat"):
+        sto.result = {}
+        sto.result["gas", "density"] = chunk["gas", "density"]
+
+    my_storage["gas", "density"] == ad[
+        "gas", "density"
+    ]  # True for all entries on all processes
+
+    # Summing over all chunks
+    my_storage = {}
+    for sto, chunk in ad.piter(storage=my_storage, reduction="sum"):
+        sto.result = {}
+        sto.result["gas", "total_mass"] = chunk["gas", "cell_mass"].sum()
+        sto.result["gas", "total_volume"] = chunk["gas", "cell_volume"].sum()
+
+    my_storage["gas", "total_mass"]  # contains the total mass
+    my_storage["gas", "total_volume"]  # contains the total volume
+
+
+Use of ``piter()`` over objects and datasets
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 If you use parallelism over objects or datasets, you will encounter
 the :func:`~yt.data_objects.time_series.DatasetSeries.piter` function.
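As a minimal sketch of how that looks for a time series (the ``"DD????/DD????"`` glob and the sphere-based mass measurement below are illustrative choices, not prescribed by the API):

.. code-block:: python

    import yt

    yt.enable_parallelism()

    # Illustrative output pattern; substitute the paths of your own datasets
    ts = yt.DatasetSeries("DD????/DD????")

    my_storage = {}
    for sto, ds in ts.piter(storage=my_storage):
        # Example per-dataset measurement: total gas mass in a 100 kpc sphere
        sphere = ds.sphere("c", (100.0, "kpc"))
        sto.result_id = str(ds)
        sto.result = sphere.quantities.total_quantity([("gas", "cell_mass")])

    if yt.is_root():
        for name, mass in sorted(my_storage.items()):
            print(name, mass)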

0 commit comments
