Open
Labels
bug (Something isn't working)
Description
Describe the bug
I'm running the Arnoldi method to find a few eigenvalues of a CUDA sparse matrix. It is a simple iterative algorithm that only performs matrix-vector multiplications and the vector-vector dot products needed for Gram-Schmidt orthogonalization.
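For context, here is a minimal sketch of the kind of step described above: one Arnoldi iteration with modified Gram-Schmidt. It is not the QuantumToolbox.jl code. The name `arnoldi_step_sketch!` and the layout of `V` and `H` are assumptions for illustration, and it is written against plain Julia arrays.

```julia
using LinearAlgebra

# Hypothetical helper, NOT QuantumToolbox's arnoldi_step!.
# Assumed layout: columns 1..i of V are orthonormal Krylov vectors,
# H is the (m+1)×m Hessenberg matrix kept on the host.
function arnoldi_step_sketch!(A, V::AbstractMatrix, H::AbstractMatrix, i::Int)
    w = A * view(V, :, i)            # matrix-vector product
    for j in 1:i
        vj = view(V, :, j)
        h = dot(vj, w)               # conjugated dot product, returns a scalar
        H[j, i] = h
        axpy!(-h, vj, w)             # w -= h * vj (modified Gram-Schmidt)
    end
    β = norm(w)                      # β == 0 would mean breakdown; not handled here
    H[i + 1, i] = β
    V[:, i + 1] .= w ./ β            # store the next normalized basis vector
    return β
end
```

Each pass through the inner loop calls `dot` once and needs the scalar result on the host, so a long run issues a very large number of these calls.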
Occasionally the run fails with this error:
nested task error: CUDA error: invalid argument (code 1, ERROR_INVALID_VALUE)
Stacktrace:
[1] throw_api_error(res::CUDA.cudaError_enum)
@ CUDA ~/.julia/packages/CUDA/x8d2s/lib/cudadrv/libcuda.jl:30
[2] check
@ ~/.julia/packages/CUDA/x8d2s/lib/cudadrv/libcuda.jl:37 [inlined]
[3] cuMemcpyDtoHAsync_v2
@ ~/.julia/packages/GPUToolbox/JLBB1/src/ccalls.jl:33 [inlined]
[4] #unsafe_copyto!#497
@ ~/.julia/packages/CUDA/x8d2s/lib/cudadrv/memory.jl:414 [inlined]
[5] getindex
@ ~/.julia/packages/CUDA/x8d2s/src/refpointer.jl:84 [inlined]
[6] dotc(n::Int64, x::CuArray{ComplexF64, 1, CUDA.DeviceMemory}, y::CuArray{ComplexF64, 1, CUDA.DeviceMemory})
@ CUDA.CUBLAS ~/.julia/packages/CUDA/x8d2s/lib/cublas/wrappers.jl:195
[7] dot
@ ~/.julia/packages/CUDA/x8d2s/lib/cublas/linalg.jl:36 [inlined]
[8] arnoldi_step!(A::ArnoldiLindbladIntegratorMap{…}, V::CuArray{…}, H::Matrix{…}, i::Int64)
@ QuantumToolbox ~/.julia/packages/QuantumToolbox/FDF2J/src/arnoldi.jl:38
[9] _eigsolve(A::ArnoldiLindbladIntegratorMap{…}, b::CuArray{…}, type::SuperOperator, dimensions::Dimensions{…}, k::Int64, m::Int64; tol::Float64, maxiter::Int64, sortby::Function, rev::Bool)
@ QuantumToolbox ~/.julia/packages/QuantumToolbox/FDF2J/src/qobj/eigsolve.jl:250
[10] _eigsolve
@ ~/.julia/packages/QuantumToolbox/FDF2J/src/qobj/eigsolve.jl:183 [inlined]
To reproduce
It doesn't happen every time, only on a few occasions, but when it does it ruins hours of computation. Because of that I don't have a specific MWE. I can paste the QuantumToolbox.jl code I'm using if that helps.
The error seems to be related only to dot. Perhaps simply calling it a very large number of times is enough to trigger the error eventually.
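As a possible starting point for an MWE, a stress loop along these lines could be tried. It is only a sketch: `stress_dot` is a hypothetical helper, and running the calls from several tasks is an assumption based on the "nested task error" prefix in the trace, not something I have confirmed triggers the failure.

```julia
using LinearAlgebra

# Hypothetical stress helper: repeats dot(x, y) from several tasks and
# returns the first exception caught while waiting, or nothing if all pass.
function stress_dot(x, y; iters = 10_000, ntasks = Threads.nthreads())
    ref = dot(x, y)                  # reference value computed once up front
    tasks = map(1:ntasks) do _
        Threads.@spawn begin
            for k in 1:iters
                dot(x, y) ≈ ref || error("dot mismatch at iteration $k")
            end
        end
    end
    for t in tasks
        try
            wait(t)                  # a failed task rethrows here
        catch err
            return err
        end
    end
    return nothing
end
```

On the GPU this would be called with device vectors, for example `x = CuArray(rand(ComplexF64, n))` after `using CUDA`, so that `dot` takes the CUBLAS path shown in the stack trace. The function itself is generic and also runs on ordinary CPU vectors.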
Expected behavior
No errors.
Version info
Details on Julia:
Julia Version 1.12.3
Commit 966d0af0fdf (2025-12-15 11:20 UTC)
Build Info:
Official https://julialang.org release
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 32 × 13th Gen Intel(R) Core(TM) i9-13900KF
WORD_SIZE: 64
LLVM: libLLVM-18.1.7 (ORCJIT, alderlake)
GC: Built with stock GC
Threads: 16 default, 1 interactive, 16 GC (on 32 virtual cores)
Environment:
JULIA_NUM_THREADS = 16
Details on CUDA:
julia> CUDA.versioninfo()
CUDA toolchain:
- runtime 13.0, artifact installation
- driver 580.95.5 for 13.0
- compiler 13.1
CUDA libraries:
- CUBLAS: 13.1.0
- CURAND: 10.4.0
- CUFFT: 12.0.0
- CUSOLVER: 12.0.4
- CUSPARSE: 12.6.3
- CUPTI: 2025.3.1 (API 13.0.1)
- NVML: 13.0.0+580.95.5
Julia packages:
- CUDA: 5.9.5
- CUDA_Driver_jll: 13.1.0+0
- CUDA_Compiler_jll: 0.3.1+1
- CUDA_Runtime_jll: 0.19.2+0
Toolchain:
- Julia: 1.12.3
- LLVM: 18.1.7
1 device:
0: NVIDIA GeForce RTX 4090 (sm_89, 22.557 GiB / 23.988 GiB available)