Add memory usage info reduced diagnostic #5803

Open
wants to merge 3 commits into
base: development

n01r
Member

@n01r n01r commented Mar 27, 2025

This PR adds a new reduced diagnostic that uses AMReX's PrintUsageToFiles and produces a text file for each rank with detailed memory usage information.

A demonstration of the new diagnostic was added to the laser-ion acceleration example. It can help with debugging there because the computational load imbalance is high, and larger production cases may hit out-of-memory errors on individual ranks.

The output looks like this, e.g. in MPR.0 from the laser-ion example:

Memory usage at step 0, time = 0.00000000000000e+00
    Total GPU global memory (MB): 7966
    Free  GPU global memory (MB): 1571
    [The         Arena] space allocated (MB): 2987
    [The         Arena] space used      (MB): 53
    [The         Arena]: 1 allocs, 261 busy blocks, 2 free blocks
    [The Managed Arena] space allocated (MB): 8
    [The Managed Arena] space used      (MB): 0
    [The Managed Arena]: 1 allocs, 0 busy blocks, 1 free blocks
    [The  Pinned Arena] space allocated (MB): 24
    [The  Pinned Arena] space used      (MB): 0
    [The  Pinned Arena]: 3 allocs, 64 busy blocks, 6 free blocks
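For post-processing, the per-rank text can be turned into numbers with a small parser. The following is only a sketch, assuming the files keep exactly the line layout shown in the sample above; `parse_memory_report` and the dictionary keys are hypothetical names chosen for illustration and are not part of WarpX or AMReX.

```python
import re

# Matches lines such as "[The         Arena] space used      (MB): 53".
# The "N allocs, M busy blocks" lines do not match because "]" is followed by ":".
_SPACE = re.compile(
    r"\[(?P<arena>[^\]]+)\]\s+space\s+(?P<kind>allocated|used)\s+\(MB\):\s*(?P<mb>\d+)"
)
# Matches lines such as "Free  GPU global memory (MB): 1571".
_GPU = re.compile(r"(?P<kind>Total|Free)\s+GPU global memory \(MB\):\s*(?P<mb>\d+)")


def parse_memory_report(text):
    """Parse one memory-usage block into a nested dict of MB values.

    Hypothetical helper: the expected layout is assumed from the sample output.
    """
    report = {"gpu": {}, "arenas": {}}
    for line in text.splitlines():
        m = _GPU.search(line)
        if m:
            report["gpu"][m["kind"].lower()] = int(m["mb"])
            continue
        m = _SPACE.search(line)
        if m:
            # Collapse the padding inside names like "The         Arena".
            name = " ".join(m["arena"].split())
            report["arenas"].setdefault(name, {})[m["kind"]] = int(m["mb"])
    return report


SAMPLE = """Memory usage at step 0, time = 0.00000000000000e+00
    Total GPU global memory (MB): 7966
    Free  GPU global memory (MB): 1571
    [The         Arena] space allocated (MB): 2987
    [The         Arena] space used      (MB): 53
    [The         Arena]: 1 allocs, 261 busy blocks, 2 free blocks
    [The  Pinned Arena] space allocated (MB): 24
    [The  Pinned Arena] space used      (MB): 0
"""

if __name__ == "__main__":
    print(parse_memory_report(SAMPLE))
```

Running the parser over every rank's file and comparing the `free` GPU values would be one way to spot the rank closest to running out of memory.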

n01r and others added 2 commits March 27, 2025 16:43
It uses AMReX's PrintUsageToFiles and produces a text file for each rank with
detailed memory usage information.
@n01r n01r requested review from ax3l, RemiLehe and EZoni March 27, 2025 23:53
@n01r n01r added the component: diagnostics all types of outputs label Mar 27, 2025
Otherwise, we would get file names like `MemoryPerRankMPR.0`
@RemiLehe RemiLehe self-assigned this Mar 31, 2025
@RemiLehe RemiLehe assigned EZoni and unassigned RemiLehe Apr 15, 2025
3 participants