Skip to content

Conversation

@ax3l
Copy link
Member

@ax3l ax3l commented Nov 14, 2025

The calculation of the bounds in plotfiles (incl. checkpoints) appears fishy. It relies on the order of the boxes in the box array, which is not always pointing to a box on the lower/upper bound of the simulation when load balancing is used.

This can corrupt checkpoints (plotfiles) when writing, by a) filtering out valid particles & fields and b) remembering the wrong simulation meta-data for the simulation geometry of fields/particles.

Related to #6392

Co-authored-by: @atmyers
Co-authored-by: @RemiLehe
Co-authored-by: @titoiride

Introduced in #1085 (refined in #1241)

Follow-up

BTDiagnostics.cpp contains similar assumptions/logic it seems and needs a similar fix.

The calculation of the bounds in plotfiles (incl. checkpoints)
appears fishy. It relies on the order of the boxes in the
box array, which is not always pointing to a box on the
lower/upper bound of the simulation when load balancing is
used.

This can corrupt checkpoints (plotfiles) when writing, by
a) filtering out valid particles & fields and b) remembering
the wrong simulation meta-data for the simulation geometry
of fields/particles.

Co-authored-by: Andrew Myers <[email protected]>
Co-authored-by: Remi Lehe <[email protected]>
Co-authored-by: Davide Terzani <[email protected]>
@ax3l ax3l requested review from RemiLehe and atmyers November 14, 2025 19:56
@ax3l ax3l added bug Something isn't working bug: affects latest release Bug also exists in latest release version component: load balancing Load balancing strategies, optimization etc. component: checkpoint/restart Checkpointing & restarts labels Nov 14, 2025
@titoiride
Copy link
Member

Is this ready to be tested at full scale or do you need to make some more changes?

@ax3l
Copy link
Member Author

ax3l commented Nov 14, 2025

I tested it on my minimal reproducer and it does not fix the issue we observe. Nonetheless, I would change the logic here because it does not look safe under load balancing.

@WeiqunZhang
Copy link
Member

I agree with the change. Although the existing code does not produce incorrect results, it should be avoided. It has nothing to do with load balancing. But it depends on how amrex does BoxArray::maxSize. If the implementation in amrex changes in the future, it might break.

Co-authored-by: Weiqun Zhang <[email protected]>
@ax3l ax3l added cleaning Clean code, improve readability and removed bug Something isn't working bug: affects latest release Bug also exists in latest release version labels Nov 16, 2025
@ax3l ax3l merged commit 08750da into BLAST-WarpX:development Nov 24, 2025
50 checks passed
@ax3l ax3l deleted the fix-checkpoint-domain-calc branch November 24, 2025 18:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cleaning Clean code, improve readability component: checkpoint/restart Checkpointing & restarts component: load balancing Load balancing strategies, optimization etc.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants