Describe the bug
I've been experiencing a subtle bug in CUDA.jl related to tuples containing sparse matrices that have a dimension of size 0. I was finally able to condense it down to the MWE below.
To reproduce
The Minimal Working Example (MWE) for this bug:
julia> using CUDA, CUDA.CUSPARSE, SparseArrays
julia> CuSparseMatrixCSR{Float32}(sparse([], [], Array{Float32}([]))), CuArray{Float32}([1.0, 1.0, 1.0])
(Error showing value of type Tuple{CuSparseMatrixCSR{Float32, Int32}, CuArray{Float32, 1, CUDA.DeviceMemory}}:
ERROR: ArgumentError: 1 == colptr[1] != 1
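For context, the message refers to the SparseArrays invariant that the first entry of the column-pointer vector is 1. The following is a minimal CPU-only sketch, assuming only the SparseArrays standard library. It does not touch CUDA.jl and does not reproduce the bug; it only shows that an empty 0×0 matrix satisfies that invariant on the CPU. The name `A` is purely illustrative.

```julia
using SparseArrays

# Build an empty sparse matrix on the CPU; with no explicit size it is 0×0.
A = sparse(Int[], Int[], Float32[])

size(A)     # (0, 0)
A.colptr    # one-element vector [1]: n + 1 entries for n = 0 columns
nnz(A)      # 0 stored entries
```

Since the REPL reports "Error showing value", the exception is raised while displaying the tuple rather than while constructing it. The invalid pointer presumably appears somewhere on the display path, but that is a guess and has not been verified here.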
Expected behavior
Using SparseArrays, we get:
julia> SparseMatrixCSC{Float32}(sparse([], [], Array{Float32}([]))), Array{Float32}([1.0, 1.0, 1.0])
(sparse(Int64[], Int64[], Float32[], 0, 0), Float32[1.0, 1.0, 1.0])
Using AMDGPU, we get:
julia> ROCSparseMatrixCSR{Float32}(sparse([], [], Array{Float32}([]))), ROCArray{Float32}([1.0, 1.0, 1.0])
(sparse(Int32[], Int32[], Float32[], 0, 0), Float32[1.0, 1.0, 1.0])
This seems like the correct behavior, and is what I was expecting.
Version info
Details on Julia:
Julia Version 1.11.1
Commit 8f5b7ca12ad (2024-10-16 10:53 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 64 × AMD Ryzen Threadripper PRO 5975WX 32-Cores
WORD_SIZE: 64
LLVM: libLLVM-16.0.6 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 64 virtual cores)
Environment:
Details on CUDA:
julia> CUDA.versioninfo()
CUDA runtime 12.6, artifact installation
CUDA driver 12.5
NVIDIA driver 555.42.6
CUDA libraries:
- CUBLAS: 12.6.4
- CURAND: 10.3.7
- CUFFT: 11.3.0
- CUSOLVER: 11.7.1
- CUSPARSE: 12.5.4
- CUPTI: 2024.3.2 (API 24.0.0)
- NVML: 12.0.0+555.42.6
Julia packages:
- CUDA: 5.5.2
- CUDA_Driver_jll: 0.10.4+0
- CUDA_Runtime_jll: 0.15.5+0
Toolchain:
- Julia: 1.11.1
- LLVM: 16.0.6
1 device:
0: NVIDIA TITAN V (sm_70, 11.432 GiB / 12.000 GiB available)