Conversation
…I/O handler, and fields

Key differences from RAMSES:
- Stream-based binary I/O (no Fortran record markers)
- info.txt has nfile first (not ncpu)
- File naming: amr.NNNNN, hydro.NNNNN (not amr_NNNNN.outNNNNN)
- Separate particle files per type (part, star, sink, tree, trac)
- Float32 primitive variables in hydro output
- Compact AMR format (ckey + refined per grid)
- Multi-integer Hilbert keys with different state diagrams

Co-authored-by: James11222 <49885659+James11222@users.noreply.github.com>
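For context on the first bullet: "stream-based" means the files can be read back with plain sequential `struct`/`np.fromfile` calls, with no 4-byte Fortran record markers to skip around each record. A minimal self-contained sketch on a synthetic file (the layout here is illustrative only, not the actual mini-ramses format):

```python
import os
import struct
import tempfile

import numpy as np

# Write a tiny synthetic "stream" binary file: one int32 header followed
# directly by Float32 data, with no record markers around either block.
with tempfile.NamedTemporaryFile(suffix=".bin", delete=False) as f:
    f.write(struct.pack("i", 3))             # e.g. an ndim header word
    np.arange(8, dtype="float32").tofile(f)  # e.g. one oct's worth of cells
    path = f.name

# Reading back is just sequential unpacking -- contrast with classic
# RAMSES files, where every record carries leading/trailing byte counts.
with open(path, "rb") as f:
    ndim = struct.unpack("i", f.read(4))[0]
    data = np.fromfile(f, dtype="float32", count=8)

os.unlink(path)
print(ndim, data.sum())  # 3 28.0
```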
…parameter Co-authored-by: James11222 <49885659+James11222@users.noreply.github.com>
…d-reader Add mini-ramses frontend reader
…hydro, sinks, and stars
Pull request overview
This PR introduces a new mini_ramses frontend in yt, including dataset/index/IO plumbing, field definitions, and a synthetic-output test suite, and registers the frontend in yt.frontends.
Changes:
- Added MiniRAMSESDataset/MiniRAMSESIndex/domain subset + a file sanitizer for mini-ramses outputs.
- Added an IO handler for fluid (hydro/gravity) and particle stream-binary reads, plus basic field definitions.
- Added pytest-based synthetic-output tests and registered mini_ramses in the frontend package list.
Reviewed changes
Copilot reviewed 7 out of 9 changed files in this pull request and generated 15 comments.
| File | Description |
|---|---|
| yt/frontends/mini_ramses/data_structures.py | Core dataset/index/octree construction, file validation, and fluid data extraction logic. |
| yt/frontends/mini_ramses/io.py | Fluid selection and particle IO routines for mini-ramses stream binary files. |
| yt/frontends/mini_ramses/fields.py | Declares known fluid/particle fields and a derived temperature field. |
| yt/frontends/mini_ramses/definitions.py | Filename patterns/regex and field alias groupings for mini-ramses. |
| yt/frontends/mini_ramses/api.py | Exposes the frontend's public API symbols. |
| yt/frontends/mini_ramses/tests/test_mini_ramses.py | Synthetic mini-ramses output writer and basic dataset/sanitizer tests. |
| yt/frontends/mini_ramses/tests/__init__.py | Test package marker for the new frontend. |
| yt/frontends/__init__.py | Registers mini_ramses in the frontend namespace. |
```python
import os
import struct
import tempfile
```
tempfile is imported but never used, which will fail Ruff (F401) under the repo's lint configuration. Please remove the unused import (or use it if intended).
Suggested change:
```diff
-import tempfile
```
```python
def fill(self, fn, fields, selector, field_list):
    """Read and fill field data from a mini-ramses binary file.

    Parameters
    ----------
```
This docstring block includes blank lines with trailing whitespace (e.g. immediately after the opening docstring). Ruff/pycodestyle will flag this as W293/W291. Please strip trailing whitespace throughout this file (running the repo formatter/linter should handle it).
```python
    self, base_region, domain, ds, over_refine_factor=1, num_ghost_zones=0
):
    super().__init__(
        base_region, domain, ds, over_refine_factor, num_ghost_zones
    )
```
MiniRAMSESDomainSubset.__init__ passes over_refine_factor positionally into OctreeSubset(..., num_zones=...). For RAMSES-style octs this should be num_zones=2 (2×2×2 cells per oct); passing 1 sets num_zones=1 and can break reshaping/ghost-zone logic. Consider renaming the parameter to num_zones (default 2) and/or passing by keyword to avoid this mismatch.
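The pitfall described here can be reproduced with illustrative stand-in classes (the real `OctreeSubset`/`MiniRAMSESDomainSubset` have more machinery, but the positional-argument mismatch is the same):

```python
# Illustrative stand-ins, not yt's actual classes.
class OctreeSubset:
    def __init__(self, base_region, domain, ds, num_zones=2, num_ghost_zones=0):
        self.num_zones = num_zones  # cells per oct edge (2 -> 2x2x2 cells)
        self.num_ghost_zones = num_ghost_zones


class BuggySubset(OctreeSubset):
    def __init__(self, base_region, domain, ds, over_refine_factor=1,
                 num_ghost_zones=0):
        # over_refine_factor (default 1) lands in the num_zones slot
        # positionally, silently overriding the 2x2x2-cells-per-oct default.
        super().__init__(base_region, domain, ds, over_refine_factor,
                         num_ghost_zones)


class FixedSubset(OctreeSubset):
    def __init__(self, base_region, domain, ds, num_zones=2,
                 num_ghost_zones=0):
        # Renaming the parameter and passing by keyword keeps the intent clear.
        super().__init__(base_region, domain, ds, num_zones=num_zones,
                         num_ghost_zones=num_ghost_zones)


buggy = BuggySubset(None, None, None)
fixed = FixedSubset(None, None, None)
print(buggy.num_zones, fixed.num_zones)  # 1 2
```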
```python
# Separate fields by type
ftypes = {f[0] for f in fields}

for chunk in chunks:
```
There are blank lines containing whitespace in this block (e.g. after ftypes = ...). Ruff/pycodestyle will flag this as W293/W291. Please strip trailing whitespace (running the configured formatter/linter should fix it automatically).
```python
self.base_region,
self._base_domain,
self.ds,
self._num_ghost_zones,
```
retrieve_ghost_zones passes self._num_ghost_zones as the 4th positional argument to MiniRAMSESDomainSubset(...), which maps to num_zones in OctreeSubset.__init__. This will incorrectly set num_zones to the previous ghost-zone count. Pass the existing self._num_zones/num_zones=2 and set num_ghost_zones=ngz separately.
Suggested change:
```diff
-self._num_ghost_zones,
+self._num_zones,
```
```python
self.min_level = params.get("levelmin", 1)
self.max_level = params.get("levelmax", 1)
self.domain_dimensions = np.ones(3, dtype="int64") * 2 ** (
    self.min_level
```
MiniRAMSESDataset sets min_level/max_level directly from info.txt and then uses domain_dimensions = 2**min_level. In the RAMSES frontend, ds.min_level is stored as levelmin - 1 and ds.max_level is stored relative to ds.min_level to keep octree level math consistent. Aligning to that convention will make ds.index.max_level, level stats, and cell-size calculations consistent across octree frontends.
Suggested change:
```diff
-        self.min_level = params.get("levelmin", 1)
-        self.max_level = params.get("levelmax", 1)
-        self.domain_dimensions = np.ones(3, dtype="int64") * 2 ** (
-            self.min_level
+        levelmin = params.get("levelmin", 1)
+        levelmax = params.get("levelmax", 1)
+        # Align level bookkeeping with the RAMSES frontend convention:
+        # - store min_level as levelmin - 1
+        # - store max_level relative to min_level
+        self.min_level = levelmin - 1
+        self.max_level = levelmax - self.min_level
+        # The coarse grid domain has 2**levelmin cells per dimension.
+        # Since levelmin = self.min_level + 1, use (self.min_level + 1) here
+        # to keep domain_dimensions consistent with the RAMSES frontend.
+        self.domain_dimensions = np.ones(3, dtype="int64") * 2 ** (
+            self.min_level + 1
```
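A worked example of what this level convention produces, using illustrative values levelmin=3, levelmax=10 (not taken from any real output):

```python
import numpy as np

# Illustrative header values
levelmin, levelmax = 3, 10

min_level = levelmin - 1          # RAMSES-style: stored as levelmin - 1
max_level = levelmax - min_level  # stored relative to min_level

# The coarse grid still has 2**levelmin cells per dimension, so with
# min_level = levelmin - 1 the exponent must be min_level + 1.
domain_dimensions = np.ones(3, dtype="int64") * 2 ** (min_level + 1)

print(min_level, max_level, domain_dimensions)  # 2 8 [8 8 8]
```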
```python
def test_load_dataset(self, tmp_path):
    outdir = _create_mini_ramses_output(str(tmp_path))
    ds = MiniRAMSESDataset(outdir)
    assert ds is not None
    assert ds.dimensionality == 3
```
The tests validate dataset detection and basic attributes, but they never actually access a fluid/particle field through yt data objects to exercise the IO handler + selection plumbing end-to-end. Adding at least one assertion that ds.all_data()[('mini-ramses','Density')] (and a particle field like ('io','particle_mass')) can be read and returns an array of the expected size would significantly improve coverage of the new frontend.
```python
twotondim = 2**ndim

with open(domain.amr_fn, "rb") as f:
    ndim_file = struct.unpack("i", f.read(4))[0]
```
ndim_file is unpacked from the AMR header but never used. This will trigger Ruff F841, and also misses a cheap sanity check (e.g. verify it matches ds.dimensionality). Either validate it or unpack into _ to explicitly ignore it.
Suggested change:
```diff
 ndim_file = struct.unpack("i", f.read(4))[0]
+if ndim_file != ndim:
+    mylog.warning(
+        "Dimensionality mismatch in AMR header for domain %s: "
+        "file reports %dD, dataset expects %dD.",
+        domain.domain_id,
+        ndim_file,
+        ndim,
+    )
```
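The same check can be exercised on a synthetic in-memory header; this sketch uses the standard `warnings` module in place of yt's `mylog` so it runs standalone, and mimics only the first int32 word of the real file:

```python
import io
import struct
import warnings

# Synthetic 3D AMR header: a single little-endian int32, written the same
# way the reader unpacks it. Everything else about the file is omitted.
buf = io.BytesIO(struct.pack("i", 3))

ndim = 3  # what the dataset expects, e.g. ds.dimensionality
ndim_file = struct.unpack("i", buf.read(4))[0]

# Cheap sanity check instead of silently discarding the header value;
# the real frontend would use yt's mylog.warning here.
if ndim_file != ndim:
    warnings.warn(
        f"Dimensionality mismatch: file reports {ndim_file}D, expected {ndim}D"
    )

print(ndim_file)  # 3
```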
```python
)

# Read refinement map (twotondim int32 values)
refined = np.array(
```
refined is read for every grid but never used, which will fail Ruff (F841). If refinement info is intentionally ignored, unpack into _ (or delete the variable) to satisfy lint. Otherwise, consider using it for validation (e.g. ensure refinement flags are consistent with the octs present at the next level).
Suggested change:
```diff
-refined = np.array(
+_refined = np.array(
```
```python
for icell in range(twotondim):
    raw_data[fname][cell_offset + icell] = oct_data[fidx, icell]
```
The nested Python loops copying oct_data into raw_data (for fname... / for icell...) will be very slow for large outputs. This can be vectorized by slice-assigning oct_data[fidx, :] into raw_data[fname][cell_offset:cell_offset + twotondim], and ideally you should avoid reading/copying data for fields that weren't requested by the caller.
Suggested change:
```diff
-for icell in range(twotondim):
-    raw_data[fname][cell_offset + icell] = oct_data[fidx, icell]
+# Vectorized copy: assign all cells for this field in one slice
+raw_data[fname][cell_offset:cell_offset + twotondim] = oct_data[fidx, :]
```
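A self-contained NumPy check (with made-up shapes and offsets) that the slice assignment is equivalent to the per-cell loop:

```python
import numpy as np

twotondim = 8                  # 2**3 cells per oct in 3D
# Illustrative data: 4 fields x 8 cells for one oct
oct_data = np.arange(4 * twotondim, dtype="float32").reshape(4, twotondim)
fidx, cell_offset = 2, 16      # illustrative field index / output offset

# Per-cell Python loop (slow for large outputs)
looped = np.zeros(64, dtype="float32")
for icell in range(twotondim):
    looped[cell_offset + icell] = oct_data[fidx, icell]

# Equivalent single slice assignment (vectorized)
sliced = np.zeros(64, dtype="float32")
sliced[cell_offset:cell_offset + twotondim] = oct_data[fidx, :]

print(np.array_equal(looped, sliced))  # True
```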
cphyc
left a comment
This is great! I'm on holidays but ping me if I haven't reviewed this come early March, I personally really want this included and shipped with the next release of yt.
Also, do you have a small-ish dataset (<1GiB) you could share so that we could host it on https://yt-project.org/data/?
ya definitely, I can certainly get a nice small sample dataset to be hosted for you in the next week or so.
chrishavlin
left a comment
awesome! a couple of very quick notes on the new test file you added:
- super helpful to have the test itself mock up a dataset to use in the test. thank you!
- you'll want to exclude this test from the old nose testing framework that yt still uses for some of its tests: update the `nose_ignores` file at the yt root directory as well as `tests/tests.yaml` (add the filename to the end of `tests.yaml`, where it has a list of files to ignore)
PR Summary
This pull request adds a new, fully functional frontend for mini-ramses. With the assistance of GitHub Copilot, and building on Corentin's valuable insights, I managed to get a tested, fully functional (and, I think, just as fast as RAMSES) frontend working for mini-ramses. I tested it on cosmological and non-cosmological simulations, and it appears to behave identically to the RAMSES frontend. Currently it reads all particle data (dark matter, star, sink) and AMR data (grav, hydro). I have not yet ensured compatibility with MHD or RT mini-ramses simulations.
Example plots made with fixed resolution buffers and particle data showcase it working in action.
Fixes Issue: #5215
PR Checklist