Description
Describe the bug
I am trying to use the CUSOLVER Xgesvdp! method, which computes the singular value decomposition via polar decomposition and symmetric eigenvalue decomposition of the positive factor. It works when requesting both singular values and singular vectors, but fails with CUSOLVER_STATUS_INVALID_VALUE when requesting only singular values (jobz = 'N'), regardless of the econ setting. I have not been able to work out what causes this:
To reproduce
The Minimal Working Example (MWE) for this bug:
julia> using CUDA, LinearAlgebra
julia> A = CUDA.randn(Float64, 10, 10);
julia> U, S, V, err = CUDA.CUSOLVER.Xgesvdp!('V', 0, copy(A));
julia> A ≈ U * Diagonal(S) * V' # success!
true
julia> S, err = CUDA.CUSOLVER.Xgesvdp!('N', 1, copy(A));
ERROR: CUSOLVERError: an invalid value was used as an argument (code 3, CUSOLVER_STATUS_INVALID_VALUE)
Stacktrace:
[1] throw_api_error(res::CUDA.CUSOLVER.cusolverStatus_t)
@ CUDA.CUSOLVER ~/.julia/packages/CUDA/oymHm/lib/cusolver/libcusolver.jl:14
[2] check
@ ~/.julia/packages/CUDA/oymHm/lib/cusolver/libcusolver.jl:27 [inlined]
[3] cusolverDnXgesvdp
@ ~/.julia/packages/GPUToolbox/cZlg7/src/ccalls.jl:33 [inlined]
[4] (::CUDA.CUSOLVER.var"#1476#1478"{…})(buffer_gpu::CuArray{…}, buffer_cpu::Vector{…})
@ CUDA.CUSOLVER ~/.julia/packages/CUDA/oymHm/lib/cusolver/dense_generic.jl:326
[5] with_workspaces(f::CUDA.CUSOLVER.var"#1476#1478"{…}, cache_gpu::CuArray{…}, cache_cpu::Vector{…}, size_gpu::UInt64, size_cpu::UInt64)
@ CUDA.APIUtils ~/.julia/packages/CUDA/oymHm/lib/utils/call.jl:89
[6] Xgesvdp!(jobz::Char, econ::Int64, A::CuArray{Float64, 2, CUDA.DeviceMemory})
@ CUDA.CUSOLVER ~/.julia/packages/CUDA/oymHm/lib/cusolver/dense_generic.jl:325
[7] top-level scope
@ REPL[17]:1
Some type information was truncated. Use `show(err)` to see complete types.
julia> S, err = CUDA.CUSOLVER.Xgesvdp!('N', 0, copy(A));
ERROR: CUSOLVERError: an invalid value was used as an argument (code 3, CUSOLVER_STATUS_INVALID_VALUE)
Stacktrace:
[1] throw_api_error(res::CUDA.CUSOLVER.cusolverStatus_t)
@ CUDA.CUSOLVER ~/.julia/packages/CUDA/oymHm/lib/cusolver/libcusolver.jl:14
[2] check
@ ~/.julia/packages/CUDA/oymHm/lib/cusolver/libcusolver.jl:27 [inlined]
[3] cusolverDnXgesvdp
@ ~/.julia/packages/GPUToolbox/cZlg7/src/ccalls.jl:33 [inlined]
[4] (::CUDA.CUSOLVER.var"#1476#1478"{…})(buffer_gpu::CuArray{…}, buffer_cpu::Vector{…})
@ CUDA.CUSOLVER ~/.julia/packages/CUDA/oymHm/lib/cusolver/dense_generic.jl:326
[5] with_workspaces(f::CUDA.CUSOLVER.var"#1476#1478"{…}, cache_gpu::CuArray{…}, cache_cpu::Vector{…}, size_gpu::UInt64, size_cpu::UInt64)
@ CUDA.APIUtils ~/.julia/packages/CUDA/oymHm/lib/utils/call.jl:89
[6] Xgesvdp!(jobz::Char, econ::Int64, A::CuArray{Float64, 2, CUDA.DeviceMemory})
@ CUDA.CUSOLVER ~/.julia/packages/CUDA/oymHm/lib/cusolver/dense_generic.jl:325
[7] top-level scope
@ REPL[18]:1
Some type information was truncated. Use `show(err)` to see complete types.
Expected behavior
With jobz = 'N', the call should return the vector of singular values together with the error measure intrinsic to this method, as the jobz = 'V' call does.
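Until the values-only path works, one possible workaround is to request the vectors and discard them. This is only a sketch: it assumes the jobz = 'V' call keeps behaving as in the MWE above, and the helper name `singular_values_via_gesvdp` is hypothetical, not part of CUDA.jl. It also pays the cost of computing U and V, so it does not replace a proper fix.

```julia
using CUDA, LinearAlgebra

# Hypothetical helper (not part of CUDA.jl): call Xgesvdp! with jobz = 'V',
# which succeeds in the MWE, and keep only the singular values and the
# error measure returned by the method.
function singular_values_via_gesvdp(A::CuMatrix)
    _, S, _, err = CUDA.CUSOLVER.Xgesvdp!('V', 0, copy(A))
    return S, err
end

A = CUDA.randn(Float64, 10, 10)
S, err = singular_values_via_gesvdp(A)

# Cross-check against the CPU singular values; both sides are sorted in
# descending order first so the comparison does not depend on output order.
sort(Array(S); rev=true) ≈ svdvals(Array(A))
```

The final comparison is meant as a sanity check against the CPU result rather than a guarantee about the accuracy of the GPU routine.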
Version info
Details on CUDA:
julia> CUDA.versioninfo()
CUDA runtime 12.8, artifact installation
CUDA driver 12.4
NVIDIA driver 550.120.0
CUDA libraries:
- CUBLAS: 12.8.4
- CURAND: 10.3.9
- CUFFT: 11.3.3
- CUSOLVER: 11.7.3
- CUSPARSE: 12.5.8
- CUPTI: 2025.1.1 (API 26.0.0)
- NVML: 12.0.0+550.120
Julia packages:
- CUDA: 5.7.3
- CUDA_Driver_jll: 0.12.1+1
- CUDA_Runtime_jll: 0.16.1+0
Toolchain:
- Julia: 1.11.5
- LLVM: 16.0.6
1 device:
0: NVIDIA TITAN V (sm_70, 11.351 GiB / 12.000 GiB available)
Details on Julia:
julia> versioninfo()
Julia Version 1.11.5
Commit 760b2e5b739 (2025-04-14 06:53 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 20 × Intel(R) Core(TM) i9-7900X CPU @ 3.30GHz
WORD_SIZE: 64
LLVM: libLLVM-16.0.6 (ORCJIT, skylake-avx512)
Threads: 1 default, 0 interactive, 1 GC (on 20 virtual cores)