CUDA GPUs do not natively support Int128 operations. LLVM can lower code that uses Int128 (https://reviews.llvm.org/rGb9fc48da832654a2b486adaa790ceaa6dba94455), but many operations, such as division, are lowered to calls to compiler intrinsics that are not available on the device:
julia> using CUDA
julia> x = widen.(CuArray(rand(Int64, 10)))
10-element CuArray{Int128, 1}:
...
julia> .÷(x, x)
ERROR: LLVM error: Undefined external symbol "__divti3"
With https://reviews.llvm.org/D34708, it should be possible to resolve those intrinsics in the current module, so we can just add them to our runtime library.
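
As a rough illustration of what such a runtime function could compute, here is a minimal sketch of truncating 128-bit division written with only shifts, comparisons, and subtraction, so that it would not itself need `__divti3`. The names `udiv128` and `sdiv128` are hypothetical and not part of CUDA.jl or LLVM. The sketch runs on the host and does not show how the function would be registered under the `__divti3` symbol in the runtime library.

```julia
# Hypothetical helpers for illustration only; not an existing CUDA.jl API.

# Unsigned 128-bit division by binary long division (shift and subtract).
function udiv128(n::UInt128, d::UInt128)
    d == 0 && throw(DivideError())
    q = zero(UInt128)
    r = zero(UInt128)
    for i in 127:-1:0
        # Remember the bit that is about to be shifted out of the remainder.
        carry = r >> 127
        # Bring down the next bit of the dividend.
        r = (r << 1) | ((n >> i) & one(UInt128))
        if carry != 0 || r >= d
            # Wrapping subtraction gives the right result even when carry is set.
            r -= d
            q |= one(UInt128) << i
        end
    end
    return q
end

# Signed division that truncates toward zero, built on the unsigned routine.
function sdiv128(a::Int128, b::Int128)
    negative = (a < 0) != (b < 0)
    # Magnitudes as unsigned values; negation wraps, so typemin(Int128) is handled.
    ua = a < 0 ? -(a % UInt128) : a % UInt128
    ub = b < 0 ? -(b % UInt128) : b % UInt128
    q = udiv128(ua, ub)
    return (negative ? -q : q) % Int128
end
```

On the host this can be checked against the built-in operator, for example `sdiv128(Int128(-7), Int128(2)) == Int128(-7) ÷ Int128(2)`. A loop over all 128 bits is the simplest correct formulation, not the fastest one.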