Isolate CUDA #4499
base: main
Conversation
src/arch_cuda.jl
Outdated
CUDA.@device_override @inline function __validindex(ctx::UT.MappedCompilerMetadata)
    if __dynamic_checkbounds(ctx)
        index = @inbounds UT.linear_expand(__iterspace(ctx), blockIdx().x, threadIdx().x)
        return index ≤ UT.__linear_ndrange(ctx)
    else
        return true
    end
end
Uhm, what the heck is this thing?
Because KA doesn't support index maps, right @simone-silvestri?
Ah sorry, I didn't want to stifle development in that direction. If I remember my thoughts from back then, my feeling was more that I was the wrong person to implement this, since I didn't have an active need.
But since then multiple people have built something similar, so having a sensible solution in KA would make me happy.
src/arch_cuda.jl
Outdated
AC.on_architecture(::CUDAGPU, a::Array) = CuArray(a)
AC.on_architecture(::CUDAGPU, a::CuArray) = a
AC.on_architecture(::CUDAGPU, a::BitArray) = CuArray(a)
AC.on_architecture(::CUDAGPU, a::SubArray{<:Any, <:Any, <:CuArray}) = a
AC.on_architecture(::CUDAGPU, a::SubArray{<:Any, <:Any, <:Array}) = CuArray(a)
AC.on_architecture(::AC.CPU, a::SubArray{<:Any, <:Any, <:CuArray}) = Array(a)
AC.on_architecture(::CUDAGPU, a::StepRangeLen) = a
In the long run this ought to be adapt(backend, a)
Yeah, combining on_architecture and adapt would be a big negative diff.
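For reference, a minimal sketch of what that combination could look like, assuming Adapt.jl is used to pick the target array type (illustrative only, not the PR's implementation; adapt already knows how to walk wrappers such as SubArray):

using Adapt, CUDA

# Illustrative only: route on_architecture through Adapt.jl instead of
# enumerating Array/CuArray/BitArray/SubArray/StepRangeLen by hand.
AC.on_architecture(::CUDAGPU, a) = adapt(CuArray, a)
AC.on_architecture(::AC.CPU,  a) = adapt(Array, a)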
src/arch_cuda.jl
Outdated
AC.architecture(::CuArray) = CUDAGPU()
AC.architecture(::CuSparseMatrixCSC) = AC.GPU()
AC.architecture(::SparseMatrixCSC) = AC.CPU()
This should probably be elsewhere?
Right, the CPU method goes in Architectures.
src/arch_cuda.jl
Outdated
AC.architecture(::CuArray) = CUDAGPU()
AC.architecture(::CuSparseMatrixCSC) = AC.GPU()
Eventually I would like to see this be just get_backend
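As a point of reference, KernelAbstractions already exposes backend introspection for arrays; a small, hedged sketch of that direction (not part of this PR):

using KernelAbstractions, CUDA

# get_backend recovers the compute backend from an array type:
KernelAbstractions.get_backend(zeros(Float32, 4))       # KernelAbstractions.CPU()
KernelAbstractions.get_backend(CUDA.zeros(Float32, 4))  # CUDABackend()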
Possibly, we should simply implement a CUDA extension in this PR, with appropriate organization of the code, and get on with the breaking change! tl;dr: after this is merged, anybody doing computations on an NVIDIA GPU has to write

using Oceananigans
using CUDA
@simone-silvestri curious to hear your thoughts
I think it's a good idea. It provides templates to add new architectures and makes the code completely architecture-agnostic. The extra …
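A rough sketch of what such a CUDA package extension could look like (the module and file names below are hypothetical, and the weak dependency would also need [weakdeps] and [extensions] entries in Project.toml):

# ext/OceananigansCUDAExt.jl (hypothetical file)
module OceananigansCUDAExt

using Oceananigans
using CUDA

# The CUDA-specific methods currently gathered in src/arch_cuda.jl would
# move here, so they are only compiled when the user loads CUDA alongside
# Oceananigans.

end # module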
Force-pushed from 69eb545 to 59a441d
src/Utils/versioninfo.jl
Outdated
@@ -4,7 +4,7 @@ using Oceananigans.Architectures

 function versioninfo_with_gpu()
     s = sprint(versioninfo)
-    if CUDA.has_cuda()
+    if isdefined(Main, :CUDABackend)
         gpu_name = CUDA.CuDevice(0) |> CUDA.name
I suspect this will fail, since CUDA is not defined.
Solving this is tricky, but most likely you will want to define a callback that the extension registers.
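One hedged reading of that suggestion (the hook name below is hypothetical): the core package owns a Ref with a no-op default, and the CUDA extension replaces it when it is loaded.

using InteractiveUtils: versioninfo

# Hypothetical hook owned by the core package; returns extra GPU info.
const gpu_versioninfo = Ref{Function}(() -> "")

function versioninfo_with_gpu()
    s = sprint(versioninfo)
    return string(s, gpu_versioninfo[]())
end

# In the CUDA extension's __init__ (only runs when CUDA is loaded):
# gpu_versioninfo[] = () -> "\nGPU: " * CUDA.name(CUDA.CuDevice(0))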
We should change how this function works; it should accept the architecture. Right now it is only used for NetCDF output writing
Happy to fix this, let me know what to do
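Alternatively, a sketch of the architecture-accepting variant suggested above (method signatures are illustrative only):

using InteractiveUtils: versioninfo

# CPU fallback stays in the core package:
versioninfo_with_gpu(arch) = sprint(versioninfo)

# The CUDA extension would then add a GPU-aware method, e.g.:
# versioninfo_with_gpu(::CUDAGPU) =
#     string(sprint(versioninfo), "\nGPU: ", CUDA.name(CUDA.CuDevice(0)))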
* versioninfo
* dispatch fixes
* Enable tests
* Field broadcast fix
This PR isolates CUDA into src/arch_cuda.jl. This removes any direct CUDA calls from the remaining Oceananigans code base. That file can either serve as a template for a new GPU architecture or as the basis for a future CUDA extension. @vchuravy