Reduce memory overhead for HIP backends on MI300A GPUs #1734
```c
const CeedScalar *d_x, *d_u;
CeedScalar      *d_v;
CeedBasis_Hip   *data;
Ceed_Hip        *hip_data;
```
another stray
```c
}

CeedCallBackend(CeedBasisGetCeed(basis, &ceed));
CeedCallBackend(CeedGetData(ceed, &hip_data));
```
and here
Force-pushed from 9dd589f to 35926f8.
Force-pushed from 35926f8 to 5ad657d.
```c
CeedVector_Hip *impl;

CeedCallBackend(CeedVectorGetData(vec, &impl));
CeedCallHip(CeedVectorReturnCeed(vec), hipDeviceSynchronize());
```
Ratel seems to work fine without this line, and is faster
Does CeedVectorSyncArray mean that one could immediately start an MPI_Send? If the host doesn't know that the previous kernel (writing to the array) has completed, then it would be racy to call MPI_Send. (Might be rare to trip, but we don't want that kind of bug.)
If our sends are using a kernel for packing (on the same stream), then the host doesn't need to know when the earlier stuff completes, but we still need to sync after the packing kernel.
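The ordering concern above can be sketched as follows. This is an illustrative sketch only, not libCEED code: the function name, buffer, and stream handling are assumptions, and the point is simply that the host must synchronize with the device before handing a device-written buffer to MPI.

```c
// Sketch of the race discussed above (assumed names; not libCEED internals).
#include <hip/hip_runtime.h>
#include <mpi.h>

void send_after_kernel(double *d_buf, int n, int dest, hipStream_t stream) {
  // Suppose a kernel previously enqueued on `stream` is still writing d_buf.
  //
  // Racy: MPI may read d_buf before the kernel finishes.
  //   MPI_Send(d_buf, n, MPI_DOUBLE, dest, 0, MPI_COMM_WORLD);

  // Safe: block the host until all work queued on the stream has completed.
  // This is exactly the "hard sync" whose performance cost is debated above.
  hipStreamSynchronize(stream);
  MPI_Send(d_buf, n, MPI_DOUBLE, dest, 0, MPI_COMM_WORLD);
}
```

If the pack is itself a kernel on the same stream, the same synchronize call after the packing kernel covers both orderings, which matches the second paragraph above.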
That's a fair point. I think we need to be a bit more careful and only sync when the host needs the data; otherwise this acts as a hard sync with the GPU, which seems to have performance impacts.
FYI, generally prefer rebase to merge for dev branches. It doesn't matter for squash-merges (the commit history gets nuked anyway), but for normal merges it helps keep the git history more regular.

Yeah, generally I agree. I need to strip down this branch and rebuild it, probably; it's currently a mess due to changes at the AMD workshop.
I think we want to merge this before the review so we can have libCEED 0.13 and Ratel 0.4 with this? Is the question of the sync call the big blocker right now?
Restriction offset arrays may also want this.

I think so? To be honest, I've been more focused on MPM work in Ratel and haven't had much time to work on this. If you have more time and desire to extract the changes into a clean branch, I'd be happy to review. Otherwise, I probably won't get to it until late this week at the earliest, more likely next week.
No rush. It was just a thought that crossed my mind as we look forward to future activities for libCEED.
I'd like to get this in for the release. If you're still tight on time, I can tidy up the branch. The discussion above about syncing seems to be the real sticking point, though.

If you have time, that would be great. Ultimately, I think we need to sync when a sync is requested, for correctness.
See #1788
Transferred to #1788
Prevents double allocations for `CeedVector` when using HIP vectors with unified addressing and XNACK. Also updates more of the HIP vector operations to use hipBLAS functions rather than custom kernels.
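The second part of that description, swapping a hand-written kernel for a vendor BLAS call, might look roughly like the sketch below. This is not the PR's actual code; the function name and error handling are assumptions, and only `hipblasDscal` and the handle lifecycle calls are real hipBLAS API.

```c
// Illustrative sketch: scaling a device array via hipBLAS instead of a
// custom HIP kernel (assumed wrapper; header path may vary by ROCm version).
#include <hipblas/hipblas.h>

static int scale_device_array(double *d_x, int n, double alpha) {
  hipblasHandle_t handle;

  if (hipblasCreate(&handle) != HIPBLAS_STATUS_SUCCESS) return 1;
  // d_x[i] *= alpha for i in [0, n), computed by the vendor BLAS
  // rather than a hand-rolled kernel launch.
  hipblasStatus_t status = hipblasDscal(handle, n, &alpha, d_x, 1);
  hipblasDestroy(handle);
  return status == HIPBLAS_STATUS_SUCCESS ? 0 : 1;
}
```

Using the vendor BLAS here trades a small amount of call overhead for tuned kernels and less backend code to maintain, which seems to be the motivation stated above.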