
SVectors on GPU CuArray cannot use index from CuArray #507

Open
@AhmedSalih3d


Hi!

Using the Flux package, I want to scatter the following with NNlib:

using Flux, StaticArrays, CUDA

NNlib.scatter(+, [SVector(1,1,1),SVector(1,1,1),SVector(1,1,1)], [3,1,2])

3-element Vector{SVector{3, Int64}}:
 [1, 1, 1]
 [1, 1, 1]
 [1, 1, 1]

This works without problems. If I change the source array (the middle argument) to a CuArray, it still works, but in my tests it is slow for large arrays (60k):

NNlib.scatter(+, CuArray([SVector(1,1,1),SVector(1,1,1),SVector(1,1,1)]), [3,1,2])

3-element CuArray{SVector{3, Int64}, 1, CUDA.Mem.DeviceBuffer}:
 [1, 1, 1]
 [1, 1, 1]
 [1, 1, 1]

If I try to do everything on the GPU, with the index array as a CuArray too:

NNlib.scatter(+, CuArray([SVector(1,1,1),SVector(1,1,1),SVector(1,1,1)]), CuArray([3,1,2]))

ERROR: InvalidIRError: compiling kernel #scatter_kernel!(typeof(+), CuDeviceVector{SVector{3, Int64}, 1}, CuDeviceVector{SVector{3, Int64}, 1}, CuDeviceVector{Int64, 1}) resulted in invalid LLVM IR
Reason: unsupported dynamic function invocation (call to atomic_cas!)

I think this is a bug?
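A possible workaround, offered as a sketch and not as a confirmed fix: the error message points at atomic_cas!, and GPU atomics are generally only available for primitive element types, not for a 24-byte SVector{3, Int64}. One way around that may be to view the data as a plain 3×N matrix of Int64, scatter the matrix, and reinterpret the result back. The example below runs on the CPU with ordinary arrays so it can be checked without a GPU. The names src, idx, flat, out and result are illustrative, and it is an assumption that the same reinterpret/reshape calls behave identically on a CuArray.

```julia
using NNlib, StaticArrays

# Illustrative data: three distinct SVectors so the permutation is visible.
src = [SVector(1, 1, 1), SVector(2, 2, 2), SVector(3, 3, 3)]
idx = [3, 1, 2]

# View the vector of SVectors as a 3×N matrix of Int64 (no copy on the CPU).
# Assumption: on the GPU one would apply the same calls to CuArray(src).
flat = reshape(reinterpret(Int, src), 3, :)

# Scatter whole columns: column i of `flat` is added into column idx[i] of `out`,
# so the reduction only ever touches primitive Int64 elements.
out = NNlib.scatter(+, flat, idx)

# Turn the 3×M matrix back into a vector of SVectors.
result = reinterpret(SVector{3, Int}, vec(out))
```

If this carries over to the GPU, both flat and idx would be CuArrays and the kernel would only need atomic addition on Int64, which may avoid the atomic_cas! failure shown above.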

More info: https://discourse.julialang.org/t/how-to-reduce-an-array/92945/14

Kind regards
