Description
I'm wondering if it makes sense to add a way to launch kernels by calling a function (called, say, cuda) rather than using the @cuda macro. Or maybe this has already been discussed elsewhere? I can see two reasons for it:
- Being a macro, @cuda must exist at compile time, whereas a function need only exist at run time. The former can be restrictive if (like me) you write packages that lazily load CUDA-specific code via Requires.jl so that CUDA is optional for users. In this case, you cannot use @cuda outside of the lazily loaded code even if your kernel code is otherwise generic, but with a function you could.
- It could let you write some nice Julian-looking CUDA calls, e.g.
x = CuArrays.fill(1f0, 10)
cuda(x) do x
    for I in eachindex(x)
        x[I] += 1
    end
end
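To make the first point concrete, here is a minimal sketch in plain Julia that needs no GPU. The names gpu_launch, @gpu_launch_macro, launch_generic and launch_with_macro are hypothetical stand-ins I made up for illustration, not part of any CUDA package. It shows that a definition calling a not-yet-defined function is accepted, while a definition using a not-yet-defined macro fails immediately.

```julia
# Plain-Julia sketch; no GPU or CUDA package is needed.
# `gpu_launch` and `@gpu_launch_macro` are hypothetical stand-ins, not real APIs.

# Calling a function that is not defined yet is fine at definition time:
# the name `gpu_launch` is only looked up when `launch_generic` runs.
launch_generic(f, args...) = gpu_launch(f, args...)

# A macro is expanded when the surrounding definition is lowered, so an
# undefined macro makes the definition itself fail.
macro_definition_worked = try
    eval(Meta.parse("launch_with_macro(f, x) = @gpu_launch_macro f(x)"))
    true
catch
    false
end

# Simulate the optional backend being loaded later (here it just runs on the CPU).
gpu_launch(f, args...) = f(args...)

x = fill(1f0, 10)
launch_generic(x) do x
    for i in eachindex(x)
        x[i] += 1
    end
end
```

In this sketch the generic code can live outside the lazily loaded part of a package, because only the function gpu_launch has to exist by the time the call actually runs.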
Another advantage is that the implementation seems trivial, something like:
function cuda(f, args...; threads=256)
    @cuda threads=threads f(args...)
end
I could well be missing a reason why this won't work, but if not, let me know and I could take a stab at a PR.