Add amrex::Gpu::freeAsync #4804
Conversation
/run-hpsf-gitlab-ci

GitLab CI has started at https://gitlab.spack.io/amrex/amrex/-/pipelines/1319922.

GitLab CI 1319922 finished with status: success. See details at https://gitlab.spack.io/amrex/amrex/-/pipelines/1319922.

Maybe this should be named asyncFree or freeAsync instead.

I like

For a case with very few particles, this helps to significantly improve the performance of hipace shiftSlippedParticles.

AMReX dev: 79 µs on the GPU.
Using this PR: 59 µs on the GPU.

The PR LGTM. I will merge it after the 25.12 release. @AlexanderSinn Something for the future: maybe we could use this to replace the CUDA stream-ordered memory allocator in the implementation of The_Async_Arena().

Eventually an allocAsync could be added that can use the memory in m_free_wait_list. This would have stricter usage requirements: mainly, the memory can only be accessed in stream order and not by the host. Additionally, we would need to get the capacity of an allocation from the arena and store it in m_free_wait_list. I think ultimately The_Async_Arena should use both freeAsync and allocAsync, so that loops that allocate and free a lot with no sync can reuse memory effectively. I think this might be very similar to what cudaMallocAsync does internally, just with a cross-platform implementation, without the overhead of calling the CUDA API, and using The_Arena as backing. Should I change The_Async_Arena to use freeAsync or add allocAsync first? Or do both at the same time?

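To illustrate the freeAsync/allocAsync idea from the comment above, here is a minimal single-stream sketch in plain C++. It is not AMReX code: the class and member names (StreamFreeList, allocAsync, freeAsync) are hypothetical, std::malloc stands in for The_Arena backing, and a size map stands in for querying the arena for an allocation's capacity. The key point is that a block freed earlier in stream order may be handed out again without a sync, because the new owner also only touches it in stream order, never from the host.

```cpp
#include <cstddef>
#include <cstdlib>
#include <unordered_map>
#include <vector>

// Hypothetical sketch of a stream-ordered free/alloc pair (not AMReX API).
class StreamFreeList {
public:
    void* allocAsync (std::size_t nbytes) {
        // Reuse a deferred-free block if one is large enough. On a single
        // stream, stream ordering guarantees the old work that used this
        // memory completes before any new work that reuses it.
        for (auto it = m_wait_list.begin(); it != m_wait_list.end(); ++it) {
            if (m_capacity.at(*it) >= nbytes) {
                void* p = *it;
                m_wait_list.erase(it);
                return p;
            }
        }
        void* p = std::malloc(nbytes); // stand-in for the backing arena
        m_capacity[p] = nbytes;        // stand-in for arena capacity lookup
        return p;
    }

    void freeAsync (void* mem) { m_wait_list.push_back(mem); }

    void streamSynchronize () {
        // After a sync the host may touch the memory again, so return
        // everything on the wait list to the backing allocator.
        for (void* p : m_wait_list) { m_capacity.erase(p); std::free(p); }
        m_wait_list.clear();
    }

    std::size_t pending () const { return m_wait_list.size(); }

    ~StreamFreeList () { streamSynchronize(); }

private:
    std::vector<void*> m_wait_list;
    std::unordered_map<void*, std::size_t> m_capacity;
};
```

In a loop that allocates and frees every iteration with no sync, allocAsync would keep recycling the same block from the wait list instead of going through the backing arena each time, which is the reuse behavior described above.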
Let's wait until this PR is merged. I am actually thinking of something simpler. Create a new Arena derived class for

Yes, this will be in a new PR. The_Async_Arena uses PArena, which is already separate. Should PArena be changed to use freeAsync, or kept as-is but without an interface? Do we need to keep track of the stream index at allocation? I would have just used whatever stream is active when the memory is freed. I can't think of a use case that would have different streams active between alloc and free, so I don't know which version is preferable.

I think we should leave PArena separate and create a new class. Yes, we probably don't even need a map storing the stream index during allocation; that would be even simpler. We could also consider, when CArena is about to run out of memory, trying to immediately free the memory freed by freeAsync.
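The two ideas in this comment — a new Arena-derived class that defers frees, plus an immediate drain of the wait list when the backing arena runs out of memory — can be sketched in self-contained C++. All names here (BudgetArena, DeferredFreeArena, drainWaitList) are illustrative, not AMReX API; a fixed-budget allocator stands in for CArena/The_Arena.

```cpp
#include <cstddef>
#include <cstdlib>
#include <unordered_map>
#include <vector>

// Minimal arena interface standing in for amrex::Arena.
struct Arena {
    virtual ~Arena () = default;
    virtual void* alloc (std::size_t nbytes) = 0;
    virtual void free (void* mem) = 0;
};

// Backing arena with a fixed byte budget, standing in for CArena.
class BudgetArena : public Arena {
public:
    explicit BudgetArena (std::size_t budget) : m_left(budget) {}
    void* alloc (std::size_t nbytes) override {
        if (nbytes > m_left) return nullptr; // "out of memory"
        m_left -= nbytes;
        void* p = std::malloc(nbytes);
        m_size[p] = nbytes;
        return p;
    }
    void free (void* mem) override {
        m_left += m_size.at(mem);
        m_size.erase(mem);
        std::free(mem);
    }
private:
    std::size_t m_left;
    std::unordered_map<void*, std::size_t> m_size;
};

// Hypothetical Arena-derived class: free() only queues the pointer; the
// queue is drained on stream synchronization, or immediately when the
// backing arena would otherwise run out of memory.
class DeferredFreeArena : public Arena {
public:
    explicit DeferredFreeArena (Arena* backing) : m_backing(backing) {}
    void* alloc (std::size_t nbytes) override {
        void* p = m_backing->alloc(nbytes);
        if (p == nullptr) {   // backing arena out of memory:
            drainWaitList();  // release deferred frees right away and retry
            p = m_backing->alloc(nbytes);
        }
        return p;
    }
    void free (void* mem) override { m_wait_list.push_back(mem); }
    void streamSynchronize () { drainWaitList(); }
    std::size_t pending () const { return m_wait_list.size(); }
    ~DeferredFreeArena () { drainWaitList(); }
private:
    void drainWaitList () {
        for (void* p : m_wait_list) m_backing->free(p);
        m_wait_list.clear();
    }
    Arena* m_backing;
    std::vector<void*> m_wait_list;
};
```

Note that, as discussed above, no per-allocation stream index is tracked: the sketch implicitly uses whatever stream is current when free() is called.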


Summary
This PR adds the function amrex::Gpu::streamFree(Arena* arena, void* mem), which can be used to free memory the next time the current GPU stream is synchronized. This is based on #4432, but with the OMP-related complexity much reduced.
The interface is now opt-in and always available, instead of needing to be enabled using runtime parameters.
Additional background
Checklist
The proposed changes: