Isolate CUDA #4499

Open · wants to merge 15 commits into main

Conversation

@michel2323 (Collaborator) commented May 12, 2025

This PR isolates CUDA into src/arch_cuda.jl. This removes all direct CUDA calls from the rest of the Oceananigans code base. That file can serve either as a template for a new GPU architecture or as the basis for a future CUDA extension. @vchuravy

src/arch_cuda.jl Outdated
Comment on lines 93 to 100
CUDA.@device_override @inline function __validindex(ctx::UT.MappedCompilerMetadata)
if __dynamic_checkbounds(ctx)
index = @inbounds UT.linear_expand(__iterspace(ctx), blockIdx().x, threadIdx().x)
return index ≤ UT.__linear_ndrange(ctx)
else
return true
end
end
Collaborator:

Uhm. what the heck is this thing?

Member:

because KA doesn't support index maps, right @simone-silvestri ?

Collaborator:

Yeah, it was this PR #3920. I am happy to try to implement it in KA if you want @vchuravy, I thought the conclusion was you didn't want it there.

Collaborator:

Ah sorry, I didn't want to stifle development in that direction. If I remember my thoughts from back then, my feeling was more that I was the wrong person to implement this, since I didn't have an active need.

But since then multiple people have built something similar, so having a sensible solution in KA would make me happy.

src/arch_cuda.jl Outdated
Comment on lines 39 to 45
AC.on_architecture(::CUDAGPU, a::Array) = CuArray(a)
AC.on_architecture(::CUDAGPU, a::CuArray) = a
AC.on_architecture(::CUDAGPU, a::BitArray) = CuArray(a)
AC.on_architecture(::CUDAGPU, a::SubArray{<:Any, <:Any, <:CuArray}) = a
AC.on_architecture(::CUDAGPU, a::SubArray{<:Any, <:Any, <:Array}) = CuArray(a)
AC.on_architecture(::AC.CPU, a::SubArray{<:Any, <:Any, <:CuArray}) = Array(a)
AC.on_architecture(::CUDAGPU, a::StepRangeLen) = a
Collaborator:

In the long run this ought to be adapt(backend, a)

Member:

Yeah, that would be a big negative diff to combine on_architecture and adapt.
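For context, a minimal sketch of what the adapt-based unification suggested above might look like (assuming Adapt.jl and CUDA.jl; the helper names are illustrative, not from this PR):

```julia
using Adapt   # provides `adapt`, which rewrites array storage recursively
using CUDA    # provides CuArray

# One generic pair of conversions could subsume most of the
# on_architecture methods above, including the SubArray wrappers:
to_gpu(a) = adapt(CuArray, a)   # Array / BitArray / SubArray{..., Array} -> GPU
to_cpu(a) = adapt(Array, a)     # CuArray / SubArray{..., CuArray} -> CPU

v  = view(rand(4, 4), :, 1:2)   # SubArray backed by an Array
gv = to_gpu(v)                  # SubArray backed by a CuArray
```

Because `adapt` recurses through wrapper types, the per-wrapper methods (SubArray, StepRangeLen, ...) would not need to be spelled out by hand.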

src/arch_cuda.jl Outdated

AC.architecture(::CuArray) = CUDAGPU()
AC.architecture(::CuSparseMatrixCSC) = AC.GPU()
AC.architecture(::SparseMatrixCSC) = AC.CPU()
Collaborator:

This should probably be elsewhere?

Member:

right, the CPU method goes in Architectures

src/arch_cuda.jl Outdated
Comment on lines 32 to 33
AC.architecture(::CuArray) = CUDAGPU()
AC.architecture(::CuSparseMatrixCSC) = AC.GPU()
Collaborator:

Eventually I would like to see this be just get_backend
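A rough sketch of what dispatching on KernelAbstractions' `get_backend` could look like (assuming CUDA.jl is loaded; this is not code from the PR):

```julia
using KernelAbstractions: get_backend
using CUDA

get_backend(rand(3))        # CPU backend for an Array
get_backend(CUDA.rand(3))   # CUDABackend() for a CuArray

# Dispatch on the backend instead of a bespoke architecture type:
is_gpu(a) = get_backend(a) isa CUDABackend
```

This would let the array type itself determine the backend, removing the need for per-array-type `architecture` methods.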

@glwagner (Member):

Possibly, we should simply implement a CUDA extension in this PR with appropriate organization of the code and get on with the breaking change!

tl;dr: after this is merged, anybody doing computations on an NVIDIA GPU has to write

using Oceananigans
using CUDA
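Such an extension would presumably use Julia's package extension mechanism (Julia ≥ 1.9). A hedged sketch of the wiring — the extension name and file layout are assumptions, not taken from this PR:

```toml
# Project.toml (sketch)
[weakdeps]
CUDA = "052768ef-5323-5732-b1bb-66c8b64840ba"

[extensions]
OceananigansCUDAExt = "CUDA"
```

```julia
# ext/OceananigansCUDAExt.jl (sketch)
module OceananigansCUDAExt

using Oceananigans, CUDA

# CUDA-specific methods (on_architecture, architecture, ...) would move here,
# so they only load once the user does `using CUDA`.

end
```

With this in place, the CUDA methods load automatically when both packages are in the environment, which is exactly the `using Oceananigans; using CUDA` workflow described above.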

@glwagner (Member):

@simone-silvestri curious to hear your thoughts

@simone-silvestri (Collaborator):

I think it's a good idea. It provides a template for adding new architectures and makes the code completely architecture agnostic. The extra using CUDA is a small price to pay.
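As a rough illustration of the template idea (all names below are hypothetical; Oceananigans' actual interface may differ), a new backend modeled on src/arch_cuda.jl would define its architecture type and implement the same small set of methods:

```julia
import Oceananigans.Architectures as AC

# Hypothetical new backend with its own array type `MyGPUArray`:
struct MyGPU end

# The methods a backend file would need to provide, mirroring arch_cuda.jl:
# AC.on_architecture(::MyGPU, a::Array)       = MyGPUArray(a)
# AC.on_architecture(::AC.CPU, a::MyGPUArray) = Array(a)
# AC.architecture(::MyGPUArray)               = MyGPU()
```

The point of the isolation is that nothing outside this one file needs to know which vendor library backs the arrays.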

@navidcy navidcy added the GPU 👾 Where Oceananigans gets its powers from label May 13, 2025
@michel2323 michel2323 force-pushed the ms/ka branch 2 times, most recently from 69eb545 to 59a441d Compare May 16, 2025 16:08
@@ -4,7 +4,7 @@ using Oceananigans.Architectures

function versioninfo_with_gpu()
s = sprint(versioninfo)
if CUDA.has_cuda()
if isdefined(Main, :CUDABackend)
gpu_name = CUDA.CuDevice(0) |> CUDA.name
Collaborator:

I suspect this will fail, since CUDA is not defined.

Solving this is tricky, but most likely you will want to define a callback that the extension registers.
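A sketch of the callback pattern suggested here — the core package owns a `Ref` and the extension fills it in at load time (all names hypothetical):

```julia
# In Oceananigans proper: a callback slot that defaults to "no GPU info".
const gpu_name_callback = Ref{Function}(() -> nothing)

function versioninfo_with_gpu()
    s = sprint(versioninfo)
    name = gpu_name_callback[]()
    name === nothing || (s *= " GPU: $name")
    return s
end

# In the CUDA extension's __init__, register the real implementation:
# gpu_name_callback[] = () -> CUDA.name(CUDA.CuDevice(0))
```

This avoids referencing `CUDA` from the core package while still letting the extension supply the device name.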

Member:

We should change how this function works; it should accept the architecture. Right now it is only used for NetCDF output writing.

Member:

Happy to fix this, let me know what to do
