Description
On the most recent CUDA.jl version, calling map! on a CuArray with a conditional fails when the conditional is written with the ternary operator, but works when the result is instead multiplied by the condition. The same map! call without any conditional also works. The three map! variants I tried are below. I couldn't reproduce the error with a simplified version of the code, which I've also included. Any help is appreciated, thanks.
Setup for the following functions:
All variables are CUDA variables
lnA, Q, n, R = dynamics.lnA, dynamics.E, dynamics.n, dynamics.R
dt = Δt(dynamics, uₖ)
A_diag = @view A[1:(size(A, 1)+1):end]
map! without conditional (no errors):
map!((yₖ, Tₖ) -> 1.0 / (1.0 + exp(lnA - Q / R / Tₖ - yₖ / n) * dt), A_diag, xₖ, uₖ)
map! with conditional and ternary operator:
Error: CuError(CUDA.cudaError_enum(0x000002bc))
Error: illegal memory access was encountered (code 700, ERROR_ILLEGAL_ADDRESS)
map!((yₖ, Tₖ) -> Tₖ > 1.0 ? 0.0f0 : 1.0 / (1.0 + exp(lnA - Q / R / Tₖ - yₖ / n) * dt), A_diag, xₖ, uₖ)
map! multiplying by conditional (no errors):
map!((yₖ, Tₖ) -> (Tₖ < 1.0) * 1.0 / (1.0 + exp(lnA - Q / R / Tₖ - yₖ / n) * dt), A_diag, xₖ, uₖ)
I couldn't reproduce the issue with a simplified version of this code, so the calls above are taken from my full implementation. The simplified version below works fine with both conditional map! forms:
using CUDA
A = cu(ones(50))
B = cu(1:50)
C = cu(zeros(50, 100))
D = cu(zeros(50, 100))
C_diag = @view C[1:(size(C, 1)+1):end]
D_diag = @view D[1:(size(D, 1)+1):end]
map!((a, b) -> (b < 25.0) * a, C_diag, A, B)
@show C_diag
map!((a, b) -> b > 25.0 ? 0 : a, D_diag, A, B)
@show D_diag
Details on Julia:
Julia Version 1.11.4
Commit 8561cc3d68d (2025-03-10 11:36 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 32 × AMD Ryzen Threadripper PRO 5955WX 16-Cores
WORD_SIZE: 64
LLVM: libLLVM-16.0.6 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 32 virtual cores)
Details on CUDA:
CUDA runtime 12.8, artifact installation
CUDA driver 12.6
NVIDIA driver 560.35.3
CUDA libraries:
- CUBLAS: 12.8.4
- CURAND: 10.3.9
- CUFFT: 11.3.3
- CUSOLVER: 11.7.3
- CUSPARSE: 12.5.8
- CUPTI: 2025.1.1 (API 26.0.0)
- NVML: 12.0.0+560.35.3
Julia packages:
- CUDA: 5.7.2
- CUDA_Driver_jll: 0.12.1+1
- CUDA_Runtime_jll: 0.16.1+0
Toolchain:
- Julia: 1.11.4
- LLVM: 16.0.6
Preferences:
- default_memory: device
2 devices:
0: NVIDIA GeForce RTX 4090 (sm_89, 3.061 GiB / 23.988 GiB available)
1: NVIDIA GeForce RTX 4090 (sm_89, 22.936 GiB / 23.988 GiB available)
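One difference between the failing ternary call and the two working calls is that the ternary returns a Float32 literal (0.0f0) from one branch and a Float64 result from the other. Whether that has anything to do with the illegal memory access is not established here. The CPU-only sketch below just makes the mismatch visible and shows a variant whose branches share a type. The helper make_closures, the name uniform, and the Float32 parameter values are placeholders for illustration and are not from the report; the real lnA, Q, n, R, and dt are not shown.

```julia
# Hypothetical helper (not from the report): builds both closures from scalar parameters.
function make_closures(lnA, Q, n, R, dt)
    # Expression shared by all three map! calls in the report.
    rate(yₖ, Tₖ) = 1.0 / (1.0 + exp(lnA - Q / R / Tₖ - yₖ / n) * dt)

    # Ternary as written in the report: a Float32 literal in one branch,
    # a Float64 result (because of the 1.0 literals) in the other.
    mixed(yₖ, Tₖ) = Tₖ > 1.0 ? 0.0f0 : rate(yₖ, Tₖ)

    # Variant whose zero has the same type as the computed branch.
    # ifelse evaluates both arguments, so rate is always computed.
    function uniform(yₖ, Tₖ)
        r = rate(yₖ, Tₖ)
        return ifelse(Tₖ > 1.0, zero(r), r)
    end

    return mixed, uniform
end

# Placeholder Float32 values (assumption: the device arrays hold Float32).
mixed, uniform = make_closures(1.0f0, 2.0f0, 3.0f0, 8.314f0, 0.1f0)

@show typeof(mixed(0.5f0, 2.0f0))    # Tₖ > 1.0 branch
@show typeof(mixed(0.5f0, 0.5f0))    # computed branch
@show typeof(uniform(0.5f0, 2.0f0))
@show typeof(uniform(0.5f0, 0.5f0))
```

If the types printed for mixed differ between the two branches, trying the uniform form (or replacing 0.0f0 with 0.0) in the failing map! call might help narrow down whether the mixed return type matters.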