Allow parallel I/O for hydro and particle reading #4730

cphyc · 2023-11-03T11:12:57Z

PR Summary

This allows parallel I/O for RAMSES datasets. Reading a RAMSES dataset has three steps:

1. read in the AMR structure
2. read the hydro files
3. read the particle files

The strategy I have adopted is to parallelize at the first step, so that each MPI task is in charge of a few domains and reads only those (including their hydro and particle files).
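
As a rough illustration of the domain-level split described above, the sketch below assigns domain ids to MPI ranks round-robin. The helper name `domains_for_rank` and the round-robin policy are assumptions made for this example; the PR itself may distribute domains differently.

```python
def domains_for_rank(n_domains: int, rank: int, size: int) -> list[int]:
    """Return the domain ids handled by `rank` (hypothetical helper).

    Assumes domains are numbered from 1 and are dealt out round-robin.
    """
    if not 0 <= rank < size:
        raise ValueError("rank must satisfy 0 <= rank < size")
    # Start at this rank's first domain and stride by the number of ranks.
    return list(range(rank + 1, n_domains + 1, size))


if __name__ == "__main__":
    # With 8 domains and 3 ranks, every domain is read by exactly one rank.
    for rank in range(3):
        print(rank, domains_for_rank(8, rank, 3))
```

Each rank would then read the AMR, hydro and particle files of its own domains only, so no file is opened by more than one task.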

cphyc · 2023-11-03T14:59:33Z

yt/utilities/parallel_tools/parallel_analysis_interface.py

        roff = [off * dtr for off in offsets]
        rsize = [siz * dtr for siz in sizes]
-        tmp_recv = recv.view(self.__tocast)
+        tmp_recv = recv.view(transport_dtype)


The motivation for getting rid of the cast to CHAR is to avoid hitting MPI's limit of 2**31 elements per message as quickly: viewing an array as bytes multiplies its element count by the item size.
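
To make the arithmetic concrete, here is a small sketch that compares the element count MPI would see for the same buffer under a byte-sized transport type and under a transport type matching the array's item size. The function `transport_count` is a hypothetical helper written for this example, and the limit is taken to be the largest value of a C `int`, which is how counts are passed in the classic MPI interface.

```python
INT_MAX = 2**31 - 1  # largest count a C int can hold


def transport_count(n_elements: int, itemsize: int, transport_itemsize: int) -> int:
    """Number of items in a message when the buffer is viewed with a
    transport type of `transport_itemsize` bytes (hypothetical helper)."""
    nbytes = n_elements * itemsize
    if nbytes % transport_itemsize:
        raise ValueError("buffer size is not a multiple of the transport item size")
    return nbytes // transport_itemsize


if __name__ == "__main__":
    n = 2**28  # number of float64 values, i.e. 2 GiB of data
    as_char = transport_count(n, itemsize=8, transport_itemsize=1)
    as_double = transport_count(n, itemsize=8, transport_itemsize=8)
    print(as_char > INT_MAX)    # byte view exceeds the limit
    print(as_double > INT_MAX)  # native view stays below it
```

With an 8-byte item, the native view can carry eight times as many values before reaching the same limit.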

cphyc · 2023-11-03T15:01:12Z

yt/utilities/parallel_tools/parallel_analysis_interface.py

+        self.comm.send(
+            (arr.dtype.str, arr.shape, transport_dtype) + unit_metadata,
+            dest=dest,
+            tag=tag,
+        )
+        self.comm.Send([arr, mpi_type], dest=dest, tag=tag)


Isn't there a bug here? I think this should be

Suggested change

-        self.comm.send(
-            (arr.dtype.str, arr.shape, transport_dtype) + unit_metadata,
-            dest=dest,
-            tag=tag,
-        )
-        self.comm.Send([arr, mpi_type], dest=dest, tag=tag)
+        self.comm.send(
+            (arr.dtype.str, arr.shape, transport_dtype) + unit_metadata,
+            dest=dest,
+            tag=tag,
+        )
+        self.comm.Send([tmp, mpi_type], dest=dest, tag=tag)

I think it should be tmp, yeah, but since it has been working I think it's possible that mpi4py was fixing it implicitly.

cphyc · 2023-11-03T15:23:39Z

@yt-fido test this please

cphyc · 2025-07-18T13:04:38Z

After discussion with @matthewturk and @chrishavlin, we're proposing to implement this:

import matplotlib.pyplot as plt
import numpy as np
import yt

ds = yt.load(...)
yt.enable_parallelism()
ad = ds.all_data()

def expensive_function(chunk):
    import time
    time.sleep(np.random.rand() * 3600)
    return 42

sto = {}
for chunk in ad.piter(storage=sto, reduction="min/max/sum/cat/cat_on_root"):
    sto.result["gas", "density"]     = chunk["gas", "density"]
    sto.result["gas", "temperature"] = chunk["gas", "temperature"]
    sto.result["expensive_stuff"]    = expensive_function(chunk)

if yt.is_root():
    mean_expensive_stuff = np.mean(sto["expensive_stuff"])
    plt.hist2d(sto["gas", "density"], sto["gas", "temperature"], bins=...)
    plt.title(f"{mean_expensive_stuff=}")
    plt.savefig("...")
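
The `reduction` argument above lists several options without spelling out their semantics. The sketch below is one possible reading, written in plain Python with no MPI involved: it takes one list of values per rank and combines them. The function `reduce_results`, its arguments, and the interpretation of `cat_on_root` (concatenate, but only return the result on the root rank) are all assumptions for illustration, not the proposed yt API.

```python
def reduce_results(per_rank, reduction, rank=0, root=0):
    """Combine per-rank value lists according to `reduction`.

    Hypothetical semantics: `per_rank` holds one list of values per rank,
    and `rank` is the rank asking for the combined result.
    """
    # Flatten in rank order so that concatenation is deterministic.
    flat = [value for values in per_rank for value in values]
    if reduction == "min":
        return min(flat)
    if reduction == "max":
        return max(flat)
    if reduction == "sum":
        return sum(flat)
    if reduction == "cat":
        return flat
    if reduction == "cat_on_root":
        # Only the root rank receives the concatenated data.
        return flat if rank == root else None
    raise ValueError(f"unknown reduction: {reduction!r}")


if __name__ == "__main__":
    results = [[1, 2], [3], [4, 5]]
    print(reduce_results(results, "sum"))
    print(reduce_results(results, "cat_on_root", rank=1))
```

Under this reading, `cat_on_root` would suit the plotting step guarded by `yt.is_root()`, since only the root rank needs the full arrays.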

cphyc · 2025-07-18T16:00:45Z

Closing since #5218 is much better.

cphyc force-pushed the parallelize-ramses-io branch 2 times, most recently from ce2aa15 to a8d7767 Compare November 3, 2023 11:30

cphyc force-pushed the parallelize-ramses-io branch from 53f24bc to c9fc44d Compare November 3, 2023 15:28

cphyc mentioned this pull request Feb 19, 2024

Two-level indexing for parallelization of I/O yt-project/yt_experiments#6

Open

cphyc added 4 commits March 2, 2025 11:27

Allow parallel I/O for hydro and particle reading

f6a0895

Parallelize at the level of the domains

4104abd

Gather npart/levels

2fe564b

Transport with as close-a-type as possible

3097fde

cphyc force-pushed the parallelize-ramses-io branch from c9fc44d to 3097fde Compare March 2, 2025 10:32

cphyc added 3 commits March 2, 2025 11:38

pre-commit updates

1ae20a0

Do not assume pos is C-contiguous (it is not)

ae8071e

Use integer ratio between send/recv dtypes

7fd624f

jzuhone marked this pull request as ready for review March 2, 2025 14:02

Do not // combine

64814bf

cphyc mentioned this pull request Jul 18, 2025

Allow parallel iteration over data containers #5218

Open

2 tasks

cphyc closed this Jul 18, 2025
