Support Julia 1.13 #3020

Open

eschnett wants to merge 7 commits into JuliaGPU:master from eschnett:eschnett/julia-1.13

Conversation

@eschnett
Contributor

Closes #3019.

@github-actions (bot) left a comment

CUDA.jl Benchmarks

| Benchmark suite | Current: 4c8e4e2 | Previous: e2eab84 | Ratio |
|---|---|---|---|
| latency/precompile | 44846190635.5 ns | 55351775865 ns | 0.81 |
| latency/ttfp | 14453150957 ns | 7675959782 ns | 1.88 |
| latency/import | 4520087896 ns | 4025008043 ns | 1.12 |
| integration/volumerhs | 9443988.5 ns | 9625265.5 ns | 0.98 |
| integration/byval/slices=1 | 145635.5 ns | 147037 ns | 0.99 |
| integration/byval/slices=3 | 422956.5 ns | 426089 ns | 0.99 |
| integration/byval/reference | 143817 ns | 144983 ns | 0.99 |
| integration/byval/slices=2 | 284679 ns | 286492 ns | 0.99 |
| integration/cudadevrt | 102555 ns | 103531 ns | 0.99 |
| kernel/indexing | 13575 ns | 14161 ns | 0.96 |
| kernel/indexing_checked | 14160 ns | 15039 ns | 0.94 |
| kernel/occupancy | 683.5454545454545 ns | 792.6237623762377 ns | 0.86 |
| kernel/launch | 2105.1 ns | 2304.777777777778 ns | 0.91 |
| kernel/rand | 14507 ns | 18605.5 ns | 0.78 |
| array/reverse/1d | 19160 ns | 19692 ns | 0.97 |
| array/reverse/2dL_inplace | 66497 ns | 66886 ns | 0.99 |
| array/reverse/1dL | 69339 ns | 69816 ns | 0.99 |
| array/reverse/2d | 21029.5 ns | 21785 ns | 0.97 |
| array/reverse/1d_inplace | 10596.666666666666 ns | 9783 ns | 1.08 |
| array/reverse/2d_inplace | 11915 ns | 13411 ns | 0.89 |
| array/reverse/2dL | 73024.5 ns | 73680 ns | 0.99 |
| array/reverse/1dL_inplace | 66420 ns | 66877 ns | 0.99 |
| array/copy | 18483 ns | 20494 ns | 0.90 |
| array/iteration/findall/int | 146058 ns | 159432 ns | 0.92 |
| array/iteration/findall/bool | 130524 ns | 141132 ns | 0.92 |
| array/iteration/findfirst/int | 84706 ns | 161404 ns | 0.52 |
| array/iteration/findfirst/bool | 82020 ns | 162399 ns | 0.51 |
| array/iteration/scalar | 65977.5 ns | 73008 ns | 0.90 |
| array/iteration/logical | 198658 ns | 220542 ns | 0.90 |
| array/iteration/findmin/1d | 82559 ns | 94388 ns | 0.87 |
| array/iteration/findmin/2d | 116988 ns | 121456 ns | 0.96 |
| array/reductions/reduce/Int64/1d | 39212 ns | 43817 ns | 0.89 |
| array/reductions/reduce/Int64/dims=1 | 42352 ns | 44722 ns | 0.95 |
| array/reductions/reduce/Int64/dims=2 | 59082 ns | 61524.5 ns | 0.96 |
| array/reductions/reduce/Int64/dims=1L | 87201 ns | 88932 ns | 0.98 |
| array/reductions/reduce/Int64/dims=2L | 84775.5 ns | 88232 ns | 0.96 |
| array/reductions/reduce/Float32/1d | 34234.5 ns | 37401.5 ns | 0.92 |
| array/reductions/reduce/Float32/dims=1 | 40281.5 ns | 51945.5 ns | 0.78 |
| array/reductions/reduce/Float32/dims=2 | 56532.5 ns | 59868 ns | 0.94 |
| array/reductions/reduce/Float32/dims=1L | 51686 ns | 52569 ns | 0.98 |
| array/reductions/reduce/Float32/dims=2L | 70130.5 ns | 72250 ns | 0.97 |
| array/reductions/mapreduce/Int64/1d | 39390 ns | 43757 ns | 0.90 |
| array/reductions/mapreduce/Int64/dims=1 | 49646.5 ns | 51077 ns | 0.97 |
| array/reductions/mapreduce/Int64/dims=2 | 59329 ns | 61716 ns | 0.96 |
| array/reductions/mapreduce/Int64/dims=1L | 87259 ns | 89026 ns | 0.98 |
| array/reductions/mapreduce/Int64/dims=2L | 84900.5 ns | 88266 ns | 0.96 |
| array/reductions/mapreduce/Float32/1d | 34051 ns | 37130 ns | 0.92 |
| array/reductions/mapreduce/Float32/dims=1 | 45578 ns | 42011 ns | 1.08 |
| array/reductions/mapreduce/Float32/dims=2 | 56455 ns | 60008 ns | 0.94 |
| array/reductions/mapreduce/Float32/dims=1L | 51769 ns | 52656 ns | 0.98 |
| array/reductions/mapreduce/Float32/dims=2L | 69321 ns | 72333.5 ns | 0.96 |
| array/broadcast | 20561 ns | 20154 ns | 1.02 |
| array/copyto!/gpu_to_gpu | 10673.166666666668 ns | 12828 ns | 0.83 |
| array/copyto!/cpu_to_gpu | 218524 ns | 216999 ns | 1.01 |
| array/copyto!/gpu_to_cpu | 284623 ns | 282452 ns | 1.01 |
| array/accumulate/Int64/1d | 118384 ns | 125219 ns | 0.95 |
| array/accumulate/Int64/dims=1 | 79371 ns | 88138 ns | 0.90 |
| array/accumulate/Int64/dims=2 | 155465 ns | 162165 ns | 0.96 |
| array/accumulate/Int64/dims=1L | 1694474 ns | 1714037 ns | 0.99 |
| array/accumulate/Int64/dims=2L | 960497 ns | 971078.5 ns | 0.99 |
| array/accumulate/Float32/1d | 100226 ns | 109882.5 ns | 0.91 |
| array/accumulate/Float32/dims=1 | 76031 ns | 84424 ns | 0.90 |
| array/accumulate/Float32/dims=2 | 144406.5 ns | 151744.5 ns | 0.95 |
| array/accumulate/Float32/dims=1L | 1585870 ns | 1622762 ns | 0.98 |
| array/accumulate/Float32/dims=2L | 656780.5 ns | 702590.5 ns | 0.93 |
| array/construct | 1265.3 ns | 1267.85 ns | 1.00 |
| array/random/randn/Float32 | 36574 ns | 48207 ns | 0.76 |
| array/random/randn!/Float32 | 30409 ns | 25000 ns | 1.22 |
| array/random/rand!/Int64 | 34484 ns | 27295 ns | 1.26 |
| array/random/rand!/Float32 | 8136.75 ns | 8737.333333333334 ns | 0.93 |
| array/random/rand/Int64 | 36946 ns | 30000 ns | 1.23 |
| array/random/rand/Float32 | 12492 ns | 13155 ns | 0.95 |
| array/permutedims/4d | 51599 ns | 55023 ns | 0.94 |
| array/permutedims/2d | 52539.5 ns | 53878 ns | 0.98 |
| array/permutedims/3d | 52918 ns | 54959 ns | 0.96 |
| array/sorting/1d | 2736673 ns | 2758315 ns | 0.99 |
| array/sorting/by | 3306442 ns | 3344753.5 ns | 0.99 |
| array/sorting/2d | 1068592 ns | 1081270 ns | 0.99 |
| cuda/synchronization/stream/auto | 974.5 ns | 1017.0833333333334 ns | 0.96 |
| cuda/synchronization/stream/nonblocking | 6809.700000000001 ns | 7295.6 ns | 0.93 |
| cuda/synchronization/stream/blocking | 816.5108695652174 ns | 798.6601941747573 ns | 1.02 |
| cuda/synchronization/context/auto | 1159.2 ns | 1158.8 ns | 1.00 |
| cuda/synchronization/context/nonblocking | 7289.8 ns | 7658.9 ns | 0.95 |
| cuda/synchronization/context/blocking | 894.8541666666666 ns | 902.5217391304348 ns | 0.99 |

This comment was automatically generated by a workflow using github-action-benchmark.

@eschnett
Contributor Author

The self-tests fail because linear algebra functions (e.g. the matrix exponential) as implemented in LinearAlgebra use scalar iteration; see e.g. exp! in https://github.com/JuliaLang/LinearAlgebra.jl/blob/f55e4736fb6dce08fee8a7ac7f0aba1f2b54838e/src/dense.jl#L784.

How should this be handled? Rewrite exp!? Find a corresponding CUDA library function to call and add a new method to exp? Fall back to the Julia 1.12 implementation? How does this work on Julia 1.12?
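
To illustrate, a minimal sketch of the failure mode and a naive host-side stopgap (the helper name is made up; this is not a proposed final fix):

```julia
using CUDA, LinearAlgebra

A = CUDA.rand(Float32, 32, 32)

# exp(A) currently fails: LinearAlgebra's generic exp! walks the matrix
# element by element, and CUDA.jl disallows such scalar indexing on
# device arrays.

# Hypothetical stopgap: round-trip through the CPU. Correct but slow.
gpu_exp_via_host(A::CuMatrix) = CuMatrix(exp(Matrix(A)))

B = gpu_exp_via_host(A)
```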

@eschnett
Contributor Author

I think it's JuliaGPU/GPUArrays.jl#679.

@eschnett
Contributor Author

eschnett commented Feb 3, 2026

The Buildkite error is:

  ptxas /tmp/jl_PALmvKnqta.ptx, line 226; error   : Modifier '.NaN' requires .target sm_80 or higher
  ptxas /tmp/jl_PALmvKnqta.ptx, line 226; error   : Feature 'min.f16 or min.f16x2' requires .target sm_80 or higher

This seems unrelated to my changes, except that I am now running CI tests on Julia 1.12 and Julia 1.13...
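
For reference, a guess at a minimal reproducer, assuming the trigger is a Float16 min (suggested by the min.f16 message, but not confirmed from the log alone):

```julia
using CUDA

# On a pre-sm_80 GPU, if the NaN-propagating min.f16 instruction is
# emitted unconditionally, this broadcast should hit the same
# "requires .target sm_80 or higher" ptxas errors as above.
a = CUDA.rand(Float16, 1024)
b = CUDA.rand(Float16, 1024)
c = min.(a, b)
synchronize()
```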

@maleadt
Member

maleadt commented Feb 4, 2026

I guess #3025 needs to be active for all LLVM versions.

@eschnett
Contributor Author

eschnett commented Feb 4, 2026

Good news: CUDA.jl now works for Julia 1.12.
Bad news: There's an LLVM segfault for Julia 1.13.

  From worker 5:	[271397] signal 11 (1): Segmentation fault
  From worker 5:	in expression starting at /var/lib/buildkite-agent/builds/gpuci-9/julialang/cuda-dot-jl/test/base/texture.jl:41
  From worker 5:	_ZN12_GLOBAL__N_124NVPTXReplaceImageHandles18findIndexForHandleERN4llvm14MachineOperandERNS1_15MachineFunctionERj.isra.0 at /root/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.13/julia-1.13-latest-linux-x86_64/bin/../lib/julia/libLLVM.so.20.1jl (unknown line)
  From worker 5:	_ZN12_GLOBAL__N_124NVPTXReplaceImageHandles18findIndexForHandleERN4llvm14MachineOperandERNS1_15MachineFunctionERj.isra.0 at /root/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.13/julia-1.13-latest-linux-x86_64/bin/../lib/julia/libLLVM.so.20.1jl (unknown line)
  From worker 5:	_ZN12_GLOBAL__N_124NVPTXReplaceImageHandles20runOnMachineFunctionERN4llvm15MachineFunctionE at /root/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.13/julia-1.13-latest-linux-x86_64/bin/../lib/julia/libLLVM.so.20.1jl (unknown line)
  From worker 5:	_ZN4llvm19MachineFunctionPass13runOnFunctionERNS_8FunctionE.part.0 at /root/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.13/julia-1.13-latest-linux-x86_64/bin/../lib/julia/libLLVM.so.20.1jl (unknown line)
  From worker 5:	_ZN4llvm13FPPassManager13runOnFunctionERNS_8FunctionE at /root/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.13/julia-1.13-latest-linux-x86_64/bin/../lib/julia/libLLVM.so.20.1jl (unknown line)
  From worker 5:	_ZN4llvm13FPPassManager11runOnModuleERNS_6ModuleE at /root/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.13/julia-1.13-latest-linux-x86_64/bin/../lib/julia/libLLVM.so.20.1jl (unknown line)
  From worker 5:	_ZN4llvm6legacy15PassManagerImpl3runERNS_6ModuleE at /root/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.13/julia-1.13-latest-linux-x86_64/bin/../lib/julia/libLLVM.so.20.1jl (unknown line)
  From worker 5:	_ZL21LLVMTargetMachineEmitP23LLVMOpaqueTargetMachineP16LLVMOpaqueModuleRN4llvm17raw_pwrite_streamE19LLVMCodeGenFileTypePPc at /root/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.13/julia-1.13-latest-linux-x86_64/bin/../lib/julia/libLLVM.so.20.1jl (unknown line)

@eschnett
Contributor Author

eschnett commented Feb 4, 2026

I think it's texture interpolation that is broken on 1.13. This line segfaults LLVM:

dst[i] = texture[u]

in test/base/texture.jl (function kernel_texture_warp_native).
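
For reference, a boiled-down sketch of the failing pattern (the actual setup in texture.jl differs in its details):

```julia
using CUDA

function fetch_kernel(dst, texture)
    i = threadIdx().x
    # Fractional indexing performs an interpolated texture fetch.
    @inbounds dst[i] = texture[Float32(i)]
    return
end

n = 32
texarr = CuTextureArray(CUDA.rand(Float32, n))
tex = CuTexture(texarr; interpolation=CUDA.LinearInterpolation())
dst = CUDA.zeros(Float32, n)

# Compiling this launch is what appears to crash LLVM 20's
# NVPTXReplaceImageHandles pass on Julia 1.13.
@cuda threads=n fetch_kernel(dst, tex)
```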

@eschnett
Contributor Author

eschnett commented Feb 4, 2026

We will need to update KernelAbstractions.jl as well; see JuliaGPU/KernelAbstractions.jl#679.



Development

Successfully merging this pull request may close this issue:

Cannot load CUDA.jl with Julia 1.13
