use span to allocate mesh data by default #6123
Conversation
This makes it possible to aggregate data for a later flush.
* if isBTD=true, advise to use ADIOS Put() instead of PDW for better performance.
* in ADIOS, the action will be PerformDataWrite
* if isBTD=true, in ADIOS, the action will be PerformPut
* because no action is taken for the span tasks.
Is this generally true? You only add spans for fields right now; what if particles are involved?
You are right, BTD particles will be calling PerformPut. Right now particles are not the I/O performance bottleneck for BTD; that will only happen with an enormous number of particles. I will definitely revisit this in another pull request.
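For reference, here is a hedged sketch of how these two flush behaviours can be requested through openPMD-api. The per-flush JSON option `adios2.engine.preferred_flush_target` comes from the openPMD-api ADIOS2 backend documentation (>= 0.15) and is an assumption here, not something this PR adds; the function name is illustrative.

```cpp
#include <openPMD/openPMD.hpp>

// Hedged sketch, not code from this PR: select the ADIOS2 flush behaviour
// (buffered Put/PerformPuts vs. PerformDataWrite) per flush. The option name
// and values are taken from the openPMD-api ADIOS2 docs and are an assumption.
void flushForDiag(openPMD::Series &series, bool isBTD)
{
    if (isBTD) {
        // keep data in the engine's buffer (Put / PerformPuts), so several
        // BTD buffers can be aggregated before anything hits the disk
        series.flush(R"({"adios2": {"engine": {"preferred_flush_target": "buffer"}}})");
    } else {
        // move data to disk right away (PerformDataWrite)
        series.flush(R"({"adios2": {"engine": {"preferred_flush_target": "disk"}}})");
    }
}
```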
});
auto span = dynamicMemoryView.currentBuffer();
amrex::Gpu::dtoh_memcpy_async(span.data(), fab.dataPtr(icomp), local_box.numPts()*sizeof(amrex::Real));
} else
Do we need this else branch for D2H? I thought openPMD-api has a fallback implementation for HDF5...?
Turns out no. We do need it.
That might be an issue; can you please open an issue in https://github.com/openPMD/openPMD-api/ ?
@franzpoeschel do you understand why this is needed?
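For context, a hedged sketch of what the GPU branch quoted above does, written as a self-contained function; the function name and parameters are illustrative stand-ins for the WarpX variables, not the actual WarpX code.

```cpp
#include <AMReX_FArrayBox.H>
#include <AMReX_GpuDevice.H>
#include <openPMD/openPMD.hpp>

// Hedged sketch of the GPU (D2H) span path: ask the backend for a buffer and
// copy the device data straight into it. Names and parameters are illustrative.
void storeFabD2H(openPMD::MeshRecordComponent &mesh_comp,
                 amrex::FArrayBox const &fab, amrex::Box const &local_box,
                 int icomp, openPMD::Offset offset, openPMD::Extent extent)
{
    auto dynamicMemoryView = mesh_comp.storeChunk<amrex::Real>(offset, extent);
    auto span = dynamicMemoryView.currentBuffer();
    // device-to-host copy into the buffer owned by the I/O backend
    amrex::Gpu::dtoh_memcpy_async(
        span.data(), fab.dataPtr(icomp), local_box.numPts() * sizeof(amrex::Real));
    // the copy must have completed before the series is flushed; a real
    // implementation would typically synchronize once after all copies
    amrex::Gpu::streamSynchronize();
}
```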
auto span = dynamicMemoryView.currentBuffer();
std::memcpy(span.data(), fab.dataPtr(icomp), local_box.numPts()*sizeof(amrex::Real));
}
else
Same as above: I thought openPMD-api has a fallback implementation for Span for backends that do not support it?
I added this because the first round of the CPU span code failed in this pull request (it uses HDF5): the fallback does not work when collecting a few buffers (during a snapshot) and then flushing them out. The fallback works if data is flushed out immediately.
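For context, a minimal sketch of the span path with the fallback allocator under discussion; names and extents are illustrative, and the comment about the deferred flush paraphrases the explanation above rather than the openPMD-api internals.

```cpp
#include <openPMD/openPMD.hpp>
#include <algorithm>
#include <cstddef>
#include <memory>

// Minimal sketch (not the WarpX code): the optional third argument of
// storeChunk() is a fallback allocator that openPMD-api calls when the backend
// (e.g. HDF5) cannot hand out a buffer of its own. Per the discussion above,
// that fallback did not play well with aggregating several buffers before a
// flush, hence the explicit-copy branch in this PR.
void storeWithFallback(openPMD::MeshRecordComponent &mesh_comp,
                       double const *src, std::size_t n)
{
    auto dynamicMemoryView = mesh_comp.storeChunk<double>(
        openPMD::Offset{0}, openPMD::Extent{n},
        [](std::size_t size) {
            // only invoked when the backend cannot provide its own buffer
            return std::shared_ptr<double>(
                new double[size], std::default_delete<double[]>());
        });
    auto span = dynamicMemoryView.currentBuffer();
    std::copy_n(src, n, span.data());
}
```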
Thank you!
@guj thank you! As follow-ups, can you please
Thank you! 🙏
if (hasADIOS)
{
    WARPX_PROFILE("WarpXOpenPMDPlot::WriteOpenPMDFields::D2H_Span()");
    auto dynamicMemoryView = mesh_comp.storeChunk<amrex::Real>(
Oh no, when we have an unequal number of blocks over all MPI ranks, this will not work because storeChunk is collective :-o openPMD/openPMD-api#1794
To double check: is Iteration::open() called before these blocks, so the ADIOS engine is definitely open? -- Yes: https://github.com/guj/WarpX/blob/d0bbf79e64beed82b64d29f85ec0a56ed0f5f087/Source/Diagnostics/WarpXOpenPMD.cpp#L1426-L1430
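A minimal hedged sketch of the point being checked: open the iteration explicitly before any span-based storeChunk() call so that the engine already exists; the mesh name, extent, and values are illustrative, not WarpX code.

```cpp
#include <openPMD/openPMD.hpp>
#include <cstddef>

// Hedged sketch: Iteration::open() creates/opens the engine, so a subsequent
// span-based storeChunk() can safely ask the backend for a buffer.
void writeAfterOpen(openPMD::Series &series)
{
    namespace io = openPMD;
    io::Iteration it = series.iterations[0].open(); // engine is opened here
    auto rho = it.meshes["rho"][io::MeshRecordComponent::SCALAR];
    rho.resetDataset(io::Dataset(io::Datatype::DOUBLE, {8}));
    auto dynamicMemoryView = rho.storeChunk<double>({0}, {8}); // engine already open
    auto span = dynamicMemoryView.currentBuffer();
    for (std::size_t i = 0; i < span.size(); ++i) {
        span.data()[i] = 0.0; // fill the backend-owned buffer directly
    }
    it.close();
}
```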
> Oh no, when we have an unequal number of blocks over all MPI ranks, this will not work because storeChunk is collective :-o openPMD/openPMD-api#1794

Only one rank is writing when using BTD; the other ranks have no data.
So we can aggregate data (mainly for BTD) for a later flush.
This introduces an optional but related entry in the input file: <diag>.buffer_flush_limit_btd, which is 5 by default.
New implementation of #3200, but so far only for fields/meshes. Particles will be a follow-up PR.
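For illustration, the new entry would sit next to the usual BTD diagnostics options in an input deck (a hedged sketch: only buffer_flush_limit_btd is introduced by this PR; the other lines are standard WarpX diagnostics parameters and the values are illustrative).

```
# hypothetical BTD diagnostics block; only buffer_flush_limit_btd is new in this PR
diag1.diag_type = BackTransformed
diag1.format = openpmd
diag1.buffer_flush_limit_btd = 5   # number of BTD buffers aggregated before a flush (default: 5)
```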