Manual union splitting past limits leads to incorrect codegen #1385

Open
@maleadt

The following code fails to correctly iterate the elements of the array:

using CUDA

struct Box{T<:AbstractFloat}
    x::T
    y::T
    z::T
end
struct Sphere{T<:AbstractFloat}
    r::T
end
struct Tube{T<:AbstractFloat}
    r::T
    z::T
end
struct Cone{T<:AbstractFloat}
    r::T
    z::T
end

volume(b::Box{T}) where T = b.x * b.y * b.z
volume(s::Sphere{T}) where T = T(4)/3 * π * s.r^3
volume(t::Tube{T}) where T = T(π) * t.r^2 * t.z
volume(c::Cone{T}) where T = T(1)/3 * π * c.r^2 * c.z

function kernel(::Type{T}, shapes) where {T}
    for s in shapes
        if s isa Box{T}
            @cuprintln "Box: $(volume(s))"
        elseif s isa Sphere{T}
            @cuprintln "Sphere: $(volume(s))"
        elseif s isa Tube{T}
            @cuprintln "Tube: $(volume(s))"
        elseif s isa Cone{T}
            @cuprintln "Cone: $(volume(s))"
        else
            @cuprintln "Unknown shape"
        end
    end
    return nothing
end

function main(T=Float32)
    shapes = Vector{Union{Box{T}, Sphere{T}, Tube{T}, Cone{T}}}()
    #shapes = Vector{Union{Box{T}, Sphere{T}, Tube{T}}}()
    push!(shapes, Box{T}(1,2,3))
    push!(shapes, Sphere{T}(1))

    cu_shapes = CuVector(shapes)
    @cuda kernel(T, cu_shapes)
end

This is related to the union splitting limit of 3: switching to the commented-out shapes allocation leaves only 3 element types, and the generated code is then correct. Note that the manual splitting in the kernel is required because of that limit, but even with it the code still contains allocations, so this isn't a viable pattern for GPU programming.
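For comparison, here is a minimal CPU-only sketch of what correct iteration over the same array should yield. It repeats the struct and volume definitions so it runs on its own without CUDA.jl. The helper name cpu_reference is hypothetical and not part of the report. It uses the same isa chain as the kernel but collects (name, volume) pairs on the host instead of printing on the device.

```julia
# CPU-only reference sketch; no GPU or CUDA.jl required.
struct Box{T<:AbstractFloat}
    x::T
    y::T
    z::T
end
struct Sphere{T<:AbstractFloat}
    r::T
end
struct Tube{T<:AbstractFloat}
    r::T
    z::T
end
struct Cone{T<:AbstractFloat}
    r::T
    z::T
end

volume(b::Box{T}) where T = b.x * b.y * b.z
volume(s::Sphere{T}) where T = T(4)/3 * π * s.r^3
volume(t::Tube{T}) where T = T(π) * t.r^2 * t.z
volume(c::Cone{T}) where T = T(1)/3 * π * c.r^2 * c.z

# Hypothetical helper (an assumption, not from the report): mirrors the
# kernel's manual union splitting, but records results in a host-side vector.
function cpu_reference(::Type{T}, shapes) where {T}
    seen = Tuple{String,T}[]
    for s in shapes
        if s isa Box{T}
            push!(seen, ("Box", volume(s)))
        elseif s isa Sphere{T}
            push!(seen, ("Sphere", volume(s)))
        elseif s isa Tube{T}
            push!(seen, ("Tube", volume(s)))
        elseif s isa Cone{T}
            push!(seen, ("Cone", volume(s)))
        else
            # Should be unreachable for the 4-member union used below.
            push!(seen, ("Unknown shape", T(NaN)))
        end
    end
    return seen
end

T = Float32
shapes = Vector{Union{Box{T}, Sphere{T}, Tube{T}, Cone{T}}}()
push!(shapes, Box{T}(1, 2, 3))
push!(shapes, Sphere{T}(1))

result = cpu_reference(T, shapes)
for (name, vol) in result
    println(name, ": ", vol)
end
```

On the host this visits a Box and then a Sphere, in that order, so a correctly compiled kernel would be expected to print one Box line followed by one Sphere line.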

Labels: bug (Something isn't working), cuda kernels (Stuff about writing CUDA kernels)
