Added the ability to read xdmf dumps from Idefix#336
Conversation
This is awesome, thanks a lot for doing this! In your test script, note that `import yt_idefix` can be omitted if you have […]. Otherwise, regarding the PR itself, keep in mind we'll need to add at least one formal test for it, but as it'll require using […]
quick update: I have a lot going on at the moment and I don't think I'll be able to prioritize this review before next week, but I'm not forgetting about it. Thanks for your patience!
No hurry at all.
I just pushed a couple changes to the class hierarchy and xdmf tests so that parametrized tests still pass with your dataset included (but I didn't commit the data itself yet).
(force-pushed from 702e6ec to 991580c)
neutrinoceros left a comment:
Sorry this took me so long to get back to you. I pushed a couple simplifications myself and added some additional comments and suggestions. I think we need to try and deduplicate as much as possible before I can merge.
```python
key_entry = ""
for key in base_groups:
    if "Timestep_" in key:
        key_entry = key
        break
if key_entry == "":
    fileh.close()
    return False
```
Is it possible that `key == ""`? If not, we should simplify the logic here as:
```diff
-key_entry = ""
-for key in base_groups:
-    if "Timestep_" in key:
-        key_entry = key
-        break
-if key_entry == "":
-    fileh.close()
-    return False
+for key in base_groups:
+    if "Timestep_" in key:
+        break
+else:
+    fileh.close()
+    return False
```
(in case you want to read more about this uncommon pattern: https://docs.python.org/3/tutorial/controlflow.html#else-clauses-on-loops)
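To spell out the pattern: the `else` block of a `for` loop runs only when the loop completes without hitting `break`. A self-contained illustration (the function name is made up for this sketch, not part of the PR):

```python
def find_timestep_key(groups):
    """Return the first group name containing 'Timestep_', or None."""
    for key in groups:
        if "Timestep_" in key:
            break
    else:  # no break: no matching group name was found
        return None
    return key


print(find_timestep_key(["meta", "Timestep_0001"]))  # Timestep_0001
print(find_timestep_key(["meta"]))  # None
```

Note that the `else` clause also covers the empty-input case, which the `key_entry == ""` sentinel handled in the original version.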
```python
attributes = list(fileh[f"/{key_entry}"].attrs.keys())
if "version" not in attributes:
    fileh.close()
    return False
```
We can probably make this slightly more efficient by reading keys lazily instead of eagerly:
```diff
-attributes = list(fileh[f"/{key_entry}"].attrs.keys())
-if "version" not in attributes:
-    fileh.close()
-    return False
+for key in fileh[f"/{key_entry}"].attrs.keys():
+    if key == "version":
+        break
+else:
+    fileh.close()
+    return False
```
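For context on the lazy-vs-eager point: `list(...)` copies every key up front, while iterating lets the scan stop at the first match. A generic illustration (function names are made up; this is not h5py-specific API):

```python
def has_version_eager(keys):
    # eager: materializes every key into a list before testing membership
    return "version" in list(keys)


def has_version_lazy(keys):
    # lazy: stops consuming the iterable at the first match
    return any(key == "version" for key in keys)


print(has_version_lazy(iter(["time", "version", "geometry"])))  # True
```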
```python
# parse the grid
coords = h5_io.read_grid_coordinates(
    self.filename, geometry=self.attributes["geometry"]
)
```
The only difference (it seems to me) with `PlutoXdmfDataset._parse_parameter_file` is where the geometry is read from; they can be unified by using the already-set `self.geometry` attribute instead. I'll push a commit to this effect.
```python
base_groups = list(h5f.keys())
key_entry = ""
for key in base_groups:
    if "Timestep_" in key:
        key_entry = key
        break
```
```diff
-base_groups = list(h5f.keys())
-key_entry = ""
-for key in base_groups:
-    if "Timestep_" in key:
-        key_entry = key
-        break
+for key in h5f.keys():
+    if "Timestep_" in key:
+        key_entry = key
+        break
+else:
+    key_entry = ""
```
(though this should be equivalent to what you have, I am not sure about the `else` clause here; I think we should probably raise a `RuntimeError` directly)
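A sketch of the raise-directly alternative suggested above; the group-name pattern is taken from the snippet, while the function name and error message are hypothetical:

```python
def get_timestep_key(keys):
    """Return the first group name containing 'Timestep_'.

    Fail loudly instead of silently falling back to an empty key_entry,
    which would only surface as a confusing KeyError later on.
    """
    for key in keys:
        if "Timestep_" in key:
            return key
    raise RuntimeError("no 'Timestep_' group found in HDF5 file")
```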
```python
self.current_time = float(h5f[f"/{key_entry}"].attrs["time"])
version_line = h5f[f"/{key_entry}"].attrs["version"][0].decode()
attributes = list(h5f[f"/{key_entry}"].attrs.keys())
self.attributes = {}
```
In general, we shouldn't need to add arbitrary attributes to `Dataset` instances: that's already what `self.parameters` is for. Is it somehow not appropriate here?
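A minimal sketch of what that would look like, with a stand-in class rather than the actual yt `Dataset` (class and key names here are hypothetical):

```python
class SketchDataset:
    """Stand-in for a yt Dataset, which carries a plain parameters dict."""

    def __init__(self):
        self.parameters = {}

    def _read_attributes(self, attrs):
        # in the real reader, attrs would be h5f[f"/{key_entry}"].attrs;
        # everything lands in self.parameters instead of an ad-hoc
        # self.attributes dict
        for name, value in attrs.items():
            self.parameters[name] = value


ds = SketchDataset()
ds._read_attributes({"geometry": "cartesian", "version": "2.0.03"})
print(ds.parameters["geometry"])  # cartesian
```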
```python
match = re.search(r"\d+\.\d+\.?\d*[-\w+]*", version_line)
if match is None:
    warnings.warn(
        "Could not determine code version from file HDF5 file attribute",
        stacklevel=2,
    )
    return "unknown"

return match.group()
```
I'd like to avoid warnings here because it's very hard to know what `stacklevel` value is relevant (and it may depend on the situation):
```diff
-match = re.search(r"\d+\.\d+\.?\d*[-\w+]*", version_line)
-if match is None:
-    warnings.warn(
-        "Could not determine code version from file HDF5 file attribute",
-        stacklevel=2,
-    )
-    return "unknown"
-return match.group()
+if (res := re.match(r"\d+\.\d+\.?\d*[-\w+]*", version_line)) is not None:
+    return res.group()
+else:
+    return ""
```
```python
self.geometry = Geometry(self.parameters["definitions"]["geometry"])


@override
def _set_code_unit_attributes(self):
```
This function, as well as `invalid_unit_combinations` and `default_units`, seems to be 100% identical to what's in `StaticPlutoDataset`. Is this so? It's not obvious to me how to avoid duplication here, but I think we should give it a try.
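One common way to deduplicate is to hoist the shared logic into a mixin that both dataset classes inherit. The class names and return value below are purely illustrative, not the actual yt_idefix hierarchy:

```python
class PlutoUnitsMixin:
    """Hypothetical mixin holding unit logic shared by both dataset classes."""

    def _set_code_unit_attributes(self):
        # single shared implementation; each subclass inherits it unchanged
        return "shared unit setup"


class StaticPlutoDatasetSketch(PlutoUnitsMixin):
    pass


class PlutoXdmfDatasetSketch(PlutoUnitsMixin):
    pass
```

Either class can still override the method if its behavior ever needs to diverge, which keeps the deduplication low-risk.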
```python
self.parameters["definitions"][attribute] = self.attributes[attribute]


class PlutoVtkDataset(VtkMixin, StaticPlutoDataset):
```
This file is becoming a mess, I get it, but reorganizing the code makes `git blame` less useful, so I tend to avoid moving classes around just for the sake of organizing. It should also be avoided in a PR that does literally anything else, because it inflates the diff and makes reviewing harder than it needs to be.
Yes sure. I agree with you. Let me go through this and get back to you.
@neutrinoceros A quick update. I need a few more days to get through everything you suggested and address it, as I'm traveling right now. Sorry for the delay.
Thanks, but no rush!
@neutrinoceros As discussed a few days ago, here is the first implementation of a working reader to parse Idefix xdmf dumps.
While developing the HDF5 IO module on Idefix, I have taken special care to make the files contain additional attributes that render them self-contained, so that neither `definitions.hpp` nor `idefix.ini` is needed to process the data. I am aware that my implementation is far from being the cleanest and simply does the job for the time being, but I believe it is a good head start that can be trimmed and matured into a polished version. I'm open to comments to improve this.

To test, I'm using the following code with a KHI data dump from the following PR. I'm also sharing the data used in this test here in this link.
which gives the following output:
I compared this with the following ParaView output:
which seems to agree well with each other.
The `yt` code output is the following