CUDA.jl Benchmarks
| Benchmark suite | Current: 4c8e4e2 | Previous: e2eab84 | Ratio |
|---|---|---|---|
| latency/precompile | 44846190635.5 ns | 55351775865 ns | 0.81 |
| latency/ttfp | 14453150957 ns | 7675959782 ns | 1.88 |
| latency/import | 4520087896 ns | 4025008043 ns | 1.12 |
| integration/volumerhs | 9443988.5 ns | 9625265.5 ns | 0.98 |
| integration/byval/slices=1 | 145635.5 ns | 147037 ns | 0.99 |
| integration/byval/slices=3 | 422956.5 ns | 426089 ns | 0.99 |
| integration/byval/reference | 143817 ns | 144983 ns | 0.99 |
| integration/byval/slices=2 | 284679 ns | 286492 ns | 0.99 |
| integration/cudadevrt | 102555 ns | 103531 ns | 0.99 |
| kernel/indexing | 13575 ns | 14161 ns | 0.96 |
| kernel/indexing_checked | 14160 ns | 15039 ns | 0.94 |
| kernel/occupancy | 683.5454545454545 ns | 792.6237623762377 ns | 0.86 |
| kernel/launch | 2105.1 ns | 2304.777777777778 ns | 0.91 |
| kernel/rand | 14507 ns | 18605.5 ns | 0.78 |
| array/reverse/1d | 19160 ns | 19692 ns | 0.97 |
| array/reverse/2dL_inplace | 66497 ns | 66886 ns | 0.99 |
| array/reverse/1dL | 69339 ns | 69816 ns | 0.99 |
| array/reverse/2d | 21029.5 ns | 21785 ns | 0.97 |
| array/reverse/1d_inplace | 10596.666666666666 ns | 9783 ns | 1.08 |
| array/reverse/2d_inplace | 11915 ns | 13411 ns | 0.89 |
| array/reverse/2dL | 73024.5 ns | 73680 ns | 0.99 |
| array/reverse/1dL_inplace | 66420 ns | 66877 ns | 0.99 |
| array/copy | 18483 ns | 20494 ns | 0.90 |
| array/iteration/findall/int | 146058 ns | 159432 ns | 0.92 |
| array/iteration/findall/bool | 130524 ns | 141132 ns | 0.92 |
| array/iteration/findfirst/int | 84706 ns | 161404 ns | 0.52 |
| array/iteration/findfirst/bool | 82020 ns | 162399 ns | 0.51 |
| array/iteration/scalar | 65977.5 ns | 73008 ns | 0.90 |
| array/iteration/logical | 198658 ns | 220542 ns | 0.90 |
| array/iteration/findmin/1d | 82559 ns | 94388 ns | 0.87 |
| array/iteration/findmin/2d | 116988 ns | 121456 ns | 0.96 |
| array/reductions/reduce/Int64/1d | 39212 ns | 43817 ns | 0.89 |
| array/reductions/reduce/Int64/dims=1 | 42352 ns | 44722 ns | 0.95 |
| array/reductions/reduce/Int64/dims=2 | 59082 ns | 61524.5 ns | 0.96 |
| array/reductions/reduce/Int64/dims=1L | 87201 ns | 88932 ns | 0.98 |
| array/reductions/reduce/Int64/dims=2L | 84775.5 ns | 88232 ns | 0.96 |
| array/reductions/reduce/Float32/1d | 34234.5 ns | 37401.5 ns | 0.92 |
| array/reductions/reduce/Float32/dims=1 | 40281.5 ns | 51945.5 ns | 0.78 |
| array/reductions/reduce/Float32/dims=2 | 56532.5 ns | 59868 ns | 0.94 |
| array/reductions/reduce/Float32/dims=1L | 51686 ns | 52569 ns | 0.98 |
| array/reductions/reduce/Float32/dims=2L | 70130.5 ns | 72250 ns | 0.97 |
| array/reductions/mapreduce/Int64/1d | 39390 ns | 43757 ns | 0.90 |
| array/reductions/mapreduce/Int64/dims=1 | 49646.5 ns | 51077 ns | 0.97 |
| array/reductions/mapreduce/Int64/dims=2 | 59329 ns | 61716 ns | 0.96 |
| array/reductions/mapreduce/Int64/dims=1L | 87259 ns | 89026 ns | 0.98 |
| array/reductions/mapreduce/Int64/dims=2L | 84900.5 ns | 88266 ns | 0.96 |
| array/reductions/mapreduce/Float32/1d | 34051 ns | 37130 ns | 0.92 |
| array/reductions/mapreduce/Float32/dims=1 | 45578 ns | 42011 ns | 1.08 |
| array/reductions/mapreduce/Float32/dims=2 | 56455 ns | 60008 ns | 0.94 |
| array/reductions/mapreduce/Float32/dims=1L | 51769 ns | 52656 ns | 0.98 |
| array/reductions/mapreduce/Float32/dims=2L | 69321 ns | 72333.5 ns | 0.96 |
| array/broadcast | 20561 ns | 20154 ns | 1.02 |
| array/copyto!/gpu_to_gpu | 10673.166666666668 ns | 12828 ns | 0.83 |
| array/copyto!/cpu_to_gpu | 218524 ns | 216999 ns | 1.01 |
| array/copyto!/gpu_to_cpu | 284623 ns | 282452 ns | 1.01 |
| array/accumulate/Int64/1d | 118384 ns | 125219 ns | 0.95 |
| array/accumulate/Int64/dims=1 | 79371 ns | 88138 ns | 0.90 |
| array/accumulate/Int64/dims=2 | 155465 ns | 162165 ns | 0.96 |
| array/accumulate/Int64/dims=1L | 1694474 ns | 1714037 ns | 0.99 |
| array/accumulate/Int64/dims=2L | 960497 ns | 971078.5 ns | 0.99 |
| array/accumulate/Float32/1d | 100226 ns | 109882.5 ns | 0.91 |
| array/accumulate/Float32/dims=1 | 76031 ns | 84424 ns | 0.90 |
| array/accumulate/Float32/dims=2 | 144406.5 ns | 151744.5 ns | 0.95 |
| array/accumulate/Float32/dims=1L | 1585870 ns | 1622762 ns | 0.98 |
| array/accumulate/Float32/dims=2L | 656780.5 ns | 702590.5 ns | 0.93 |
| array/construct | 1265.3 ns | 1267.85 ns | 1.00 |
| array/random/randn/Float32 | 36574 ns | 48207 ns | 0.76 |
| array/random/randn!/Float32 | 30409 ns | 25000 ns | 1.22 |
| array/random/rand!/Int64 | 34484 ns | 27295 ns | 1.26 |
| array/random/rand!/Float32 | 8136.75 ns | 8737.333333333334 ns | 0.93 |
| array/random/rand/Int64 | 36946 ns | 30000 ns | 1.23 |
| array/random/rand/Float32 | 12492 ns | 13155 ns | 0.95 |
| array/permutedims/4d | 51599 ns | 55023 ns | 0.94 |
| array/permutedims/2d | 52539.5 ns | 53878 ns | 0.98 |
| array/permutedims/3d | 52918 ns | 54959 ns | 0.96 |
| array/sorting/1d | 2736673 ns | 2758315 ns | 0.99 |
| array/sorting/by | 3306442 ns | 3344753.5 ns | 0.99 |
| array/sorting/2d | 1068592 ns | 1081270 ns | 0.99 |
| cuda/synchronization/stream/auto | 974.5 ns | 1017.0833333333334 ns | 0.96 |
| cuda/synchronization/stream/nonblocking | 6809.700000000001 ns | 7295.6 ns | 0.93 |
| cuda/synchronization/stream/blocking | 816.5108695652174 ns | 798.6601941747573 ns | 1.02 |
| cuda/synchronization/context/auto | 1159.2 ns | 1158.8 ns | 1.00 |
| cuda/synchronization/context/nonblocking | 7289.8 ns | 7658.9 ns | 0.95 |
| cuda/synchronization/context/blocking | 894.8541666666666 ns | 902.5217391304348 ns | 0.99 |

This comment was automatically generated by workflow using github-action-benchmark.

The self-tests fail because the linear algebra functions (e.g. matrix exponential) as implemented in How should this be handled? Rewrite

I think it's JuliaGPU/GPUArrays.jl#679.

The buildkite error is This seems unrelated to my changes, except that I am now running CI tests on Julia 1.12 and Julia 1.13...

I guess #3025 needs to be active for all LLVM versions.

Good news: CUDA.jl now works for Julia 1.12.

I think it's texture interpolation that is broken on 1.13. This line segfaults LLVM: in

We will need to update KernelAbstractions.jl as well: JuliaGPU/KernelAbstractions.jl#679.

Closes #3019.