AMD GPU Support via HIP/ROCm #1379

Draft
ptheywood wants to merge 96 commits into master from amdgpu

Conversation

@ptheywood
Member

Adds AMD GPU support via HIP / ROCm

Caution

This is still a WIP - do not review, merge or expect CI to be happy

It will be rebased many times, and may become the base branch for other AMD/HIP/ROCm/clang related PRs so they can all be merged into master in a single go.

Warning

As of 2026-04-09, offline-C++ compilation-only workflows compile, but kernels do not execute correctly.
I am leaning towards this being undefined behaviour (UB) from taking the address of __global__ functions and using it for the occupancy API and for launching kernels on HIP.
This is explicitly documented as supported by CUDA, but a similar statement does not appear in the HIP docs.
However, it works in a toy problem using the same macro and templated approach (hence UB?). I have some ideas on how to narrow this down.
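
For readers unfamiliar with the pattern in question, here is a minimal sketch of it. It is not the code from this PR: `SKETCH_GLOBAL`, `scaleKernel`, `KernelPtr` and `launchByAddress` are hypothetical names, and the host-only branch exists only so the sketch builds without a GPU toolchain. It assumes the `int*` form of the `hipOccupancyMaxPotentialBlockSize` template found in recent ROCm releases. Whether HIP guarantees that a `__global__` address taken this way is valid for both calls is exactly the open question above, so this shows the shape of the usage rather than settling it.

```cpp
#include <cassert>

#if defined(__HIPCC__)
#include <hip/hip_runtime.h>
#define SKETCH_GLOBAL __global__
#elif defined(__CUDACC__)
#include <cuda_runtime.h>
#define SKETCH_GLOBAL __global__
#else
#define SKETCH_GLOBAL  // host-only fallback: an ordinary function
#endif

// Hypothetical kernel: multiplies the first n elements of data by factor.
SKETCH_GLOBAL void scaleKernel(float *data, float factor, int n) {
#if defined(__HIPCC__) || defined(__CUDACC__)
    const int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
#else
    for (int i = 0; i < n; ++i) data[i] *= factor;
#endif
}

using KernelPtr = void (*)(float *, float, int);

// Receives the kernel by address, queries occupancy with that address, then
// launches through the same address. On GPU builds, data must be a device
// pointer. Returns true if every runtime call reported success.
bool launchByAddress(KernelPtr kernel, float *data, float factor, int n) {
    assert(kernel != nullptr);
    if (n <= 0) return true;  // nothing to do
#if defined(__HIPCC__)
    int minGrid = 0, block = 0;
    if (hipOccupancyMaxPotentialBlockSize(&minGrid, &block, kernel, 0, 0) != hipSuccess) return false;
    void *args[] = {&data, &factor, &n};  // one pointer per kernel argument
    const void *address = reinterpret_cast<const void *>(kernel);
    if (hipLaunchKernel(address, dim3((n + block - 1) / block), dim3(block),
                        args, 0, nullptr) != hipSuccess) return false;
    return hipDeviceSynchronize() == hipSuccess;
#elif defined(__CUDACC__)
    int minGrid = 0, block = 0;
    if (cudaOccupancyMaxPotentialBlockSize(&minGrid, &block, kernel, 0, 0) != cudaSuccess) return false;
    void *args[] = {&data, &factor, &n};  // one pointer per kernel argument
    const void *address = reinterpret_cast<const void *>(kernel);
    if (cudaLaunchKernel(address, dim3((n + block - 1) / block), dim3(block),
                         args, 0, nullptr) != cudaSuccess) return false;
    return cudaDeviceSynchronize() == cudaSuccess;
#else
    kernel(data, factor, n);  // plain function call in the host-only build
    return true;
#endif
}
```

A caller would pass `&scaleKernel` together with a buffer that is valid for the active backend. Comparing the HIP and CUDA branches of a small reproducer like this against the full build may help isolate where the behaviour diverges.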

ptheywood and others added 30 commits April 2, 2026 12:00
Our minimum CMake version, raised for CUDA C++20, allows us to use this.
This section of the readme could do with improving now we have a 3D support matrix
Still lots of changes to make, but at least with hipclang as the host compiler it gets to the first include <cuda_runtime.h>

Doesn't handle architectures properly, though just the func needs implementing.

Lots of author warnings for bits I skipped through

Need to test / improve what happens when a project() has CUDA but FLAMEGPU_GPU=HIP is selected. Due to order of execution, it is skipping my error condition for that?

WIP: More amd cmake
The readme should be improved once all the changes for HIP/ROCm are known, as the existing structure is not ideal with all the new complexity
…-x hip with HIP enabled.

This is not a fatal error, in case rocm/hip change their behaviour, though probably could/should be.
…rnings

As we require CMake >= 3.25 we can use the SYSTEM argument for FetchContent_Declare
This condition will always be true for CUDA builds
FLAMEGPUDeviceException.cu now compiles via hip
Some code is just hidden behind macro guards for now, which needs explicit hip versions adding later
ptheywood added 2 commits May 1, 2026 14:59
…lity which is cuda-only.

Usage is guarded out, and previously macro'd out tests are now enabled (but heterogeneous AMD systems may encounter test failures)

Development

Successfully merging this pull request may close these issues.

Initial AMD GPU Support (ROCm/HIP)

1 participant