Optimize GenerateXdmf by hen-w · Pull Request #7296 · sxs-collaboration/spectre

hen-w · 2026-06-12T18:04:09Z

Proposed changes

Various optimizations of GenerateXdmf.py. Together they give a 60-fold speedup for large volume files.

Upgrade instructions

Code review checklist

- The code is documented and the documentation renders correctly. Run make doc to generate the documentation locally into BUILD_DIR/docs/html. Then open index.html.
- The code follows the stylistic and code quality guidelines listed in the code review guide.
- The PR lists upgrade instructions and is labeled bugfix or new feature if appropriate.
- If a coding agent is used, have one of "Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>", "Co-Authored-by: Codex <noreply@openai.com>", or "Co-Authored-By: GitHub Copilot CLI <noreply@microsoft.com>" as the last line of the commit, depending on the agent.

Further comments

Instead of loading and walking the mixed-topology connectivity array in Python, read the cell count from len(observation["ElementId"]), which is free HDF5 metadata.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
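For context, a minimal sketch of why the walk costs more than the length lookup. It assumes the flattened mixed-topology encoding in which each cell is stored as a type id followed by its node indices. The `NODES_PER_CELL` table and both function names are hypothetical and are not taken from GenerateXdmf.py, and plain lists stand in for the HDF5 datasets.

```python
# Hypothetical table: mixed-topology type id -> number of nodes per cell.
# Assumed ids: 5 = quadrilateral, 9 = hexahedron.
NODES_PER_CELL = {5: 4, 9: 8}


def count_cells_by_walking(connectivity):
    """Count cells by stepping through the flat [type, nodes...] array."""
    num_cells = 0
    i = 0
    while i < len(connectivity):
        # Skip the type id plus that cell's node indices.
        i += 1 + NODES_PER_CELL[connectivity[i]]
        num_cells += 1
    return num_cells


def count_cells_from_element_ids(element_ids):
    """Count cells from the length of a one-entry-per-cell dataset."""
    return len(element_ids)


# Two hexahedra and one quadrilateral, flattened into one array.
connectivity = [9, 0, 1, 2, 3, 4, 5, 6, 7,
                9, 4, 5, 6, 7, 8, 9, 10, 11,
                5, 0, 1, 2, 3]
# Stand-in for observation["ElementId"]: one entry per cell.
element_ids = [0, 0, 1]
```

The walk has to touch every cell of a fully loaded array, whereas the length of a dataset comes from its shape alone. Both counts agree here only because the stand-in `element_ids` has one entry per cell of this same connectivity.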

Avoid repeated HDF5 metadata lookups by caching the component list and dtypes. The cache is invalidated when the dataset count changes.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Cache grid metadata (num_points, number_of_cells, connectivity lengths) in _ObservationCache. In fast mode, all HDF5 metadata is read only once and reused for every timestep. This gives a large speedup on slow filesystems when the grid and variables are static.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
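A rough sketch of the read-once behavior described above, under stated assumptions: `ObservationCacheSketch` and `read_metadata` are hypothetical stand-ins, not the real `_ObservationCache`, and dictionaries of lists replace the HDF5 observation groups. The `reads` counter exists only to make the number of metadata lookups visible.

```python
class ObservationCacheSketch:
    """Hypothetical stand-in for _ObservationCache; not the real class."""

    def __init__(self, fast_mode):
        self.fast_mode = fast_mode
        self._metadata = None
        self.reads = 0  # how many times the expensive reader was called

    def get(self, observation, reader):
        # Default mode: always re-read, so the result is never stale.
        if not self.fast_mode:
            self.reads += 1
            return reader(observation)
        # Fast mode: read once, then reuse for every later timestep.
        if self._metadata is None:
            self.reads += 1
            self._metadata = reader(observation)
        return self._metadata


def read_metadata(observation):
    """Pretend-expensive lookup of grid sizes for one observation."""
    return {
        "num_points": len(observation["x"]),
        "number_of_cells": len(observation["ElementId"]),
    }


# Five timesteps of a static grid: four points, one cell.
observations = [
    {"x": [0.0, 1.0, 2.0, 3.0], "ElementId": [0]} for _ in range(5)
]

fast = ObservationCacheSketch(fast_mode=True)
default = ObservationCacheSketch(fast_mode=False)
for obs in observations:
    fast.get(obs, read_metadata)
    default.get(obs, read_metadata)
```

In this sketch the fast cache calls the reader once for five timesteps and the default path calls it five times. The saving is only valid while the grid really is static, which is what the user asserts by choosing fast mode.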

Bug 1: the cell count read once from 'ElementId' counts the *main* connectivity's cells and was then reused for the pole-filling Mixed topology, whose 'pole_connectivity' has a different, time-varying cell count. The pole topology therefore reported the main grid's NumberOfElements. Restore counting the pole connectivity's own cells (per observation, since it can change between timesteps); the main connectivity keeps the free len(ElementId) count.

Bug 2: the component/dtype cache was invalidated only when the *number* of datasets changed. If the set of observed fields changes while the count stays the same, the stale field list was reused, emitting an attribute that points at a non-existent dataset and dropping the field that is actually present. Caching is now gated entirely on --fast-mode: the default mode re-reads component names and dtypes on every observation (always correct), while --fast-mode reads them once (the user already asserts that the grid and variables are static).

Adds unit tests that trigger both bugs.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
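Bug 2 can be reproduced in a few lines. This is an illustrative sketch only: the function names and field names are made up, and dictionaries stand in for HDF5 observation groups. It contrasts a cache invalidated by dataset count with re-reading the field list on every observation.

```python
def fields_with_count_keyed_cache(observations):
    """Reuse the field list for as long as the dataset count is unchanged."""
    cached_count = None
    cached_fields = None
    result = []
    for obs in observations:
        if cached_count != len(obs):
            # Only a change in the *number* of datasets triggers a re-read.
            cached_count = len(obs)
            cached_fields = sorted(obs)
        result.append(cached_fields)
    return result


def fields_reread_every_time(observations):
    """Default-mode behavior: look up the field list per observation."""
    return [sorted(obs) for obs in observations]


# Same number of datasets in both observations, but a different set:
# "Psi" is replaced by "Shift_x" in the second one.
observations = [
    {"Lapse": [1.0], "Psi": [0.5]},
    {"Lapse": [1.0], "Shift_x": [0.0]},
]

stale = fields_with_count_keyed_cache(observations)
fresh = fields_reread_every_time(observations)
```

For the second observation the count-keyed cache still lists `Psi`, which no longer exists, and omits `Shift_x`, which does. Re-reading per observation lists the fields actually present.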

wthrowe

Looks fine. Please squash in some manner so that there are not broken commits in the middle.

hen-w added the "small" label Jun 12, 2026

hen-w force-pushed the xdmf branch from 4459efc to 24e4c62 on June 12, 2026 18:05

hen-w marked this pull request as draft June 12, 2026 20:30

hen-w force-pushed the xdmf branch 3 times, most recently from 0767ef4 to 0a7a69c, on June 12, 2026 21:03

hen-w removed the "small" label Jun 12, 2026

hen-w and others added 2 commits June 12, 2026 17:05

GenerateXdmf: read cell count from ElementId dataset

043a45c

Instead of loading and walking the mixed-topology connectivity array in Python, read the cell count from len(observation["ElementId"]), which is free HDF5 metadata.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

GenerateXdmf: cache component names and dtypes across observations

98e4df7

Avoid repeated HDF5 metadata lookups by caching the component list and dtypes. The cache is invalidated when the dataset count changes.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

hen-w force-pushed the xdmf branch from 0a7a69c to 7282793 on June 12, 2026 21:08

hen-w force-pushed the xdmf branch from 7282793 to 12a504b on June 14, 2026 19:10

hen-w marked this pull request as ready for review June 14, 2026 19:12

hen-w force-pushed the xdmf branch from 0fb0bb5 to 12861c1 on June 15, 2026 19:32

wthrowe reviewed Jun 16, 2026

hen-w wants to merge 4 commits into sxs-collaboration:develop from hen-w:xdmf

2 participants
