Enable Ahead-of-Time Compilation by hiding the runtime functions in the `GLOBAL_METHOD_TABLE` #749

apozharski · 2025-12-22T15:34:00Z

As discussed in JuliaGPU/CUDA.jl#2998 and #611, GPUCompiler.jl currently leaks nonexistent gpu_* LLVM functions into the CPU cache, making ahead-of-time compilation impossible for any package that uses it.

I am currently fixing this by moving these runtime methods into the method table defined in the GPUCompiler module and having the CPU versions throw errors, as is done in CUDA.jl. This feels like somewhat of a hack. However, it seems to work, and without a better understanding of what this might break, it seems to be the simplest solution.
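
For readers unfamiliar with overlay method tables, the approach described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the module name `OverlaySketch`, the table `DEMO_METHOD_TABLE`, and the function `demo_box_bool` are all hypothetical stand-ins, and only `Base.Experimental.@MethodTable` and `Base.Experimental.@overlay` are real Julia APIs.

```julia
# Illustrative sketch only: every name defined here is hypothetical,
# not part of GPUCompiler.jl.
module OverlaySketch

# A standalone method table, standing in for GLOBAL_METHOD_TABLE.
Base.Experimental.@MethodTable(DEMO_METHOD_TABLE)

# CPU-visible definition: it throws instead of referring to a device-only symbol,
# so nothing GPU-specific ends up in the ordinary (CPU) method cache.
demo_box_bool(x::Bool) = error("This function is not intended for use on the CPU")

# Device-side definition, registered only in the overlay table. A compiler that
# looks methods up through DEMO_METHOD_TABLE would see this one instead.
# The body is a placeholder for whatever the real runtime function would do.
Base.Experimental.@overlay DEMO_METHOD_TABLE demo_box_bool(x::Bool) = x ? 1 : 0

end # module

# Ordinary CPU dispatch never consults the overlay table, so this call throws.
cpu_call_errors = try
    OverlaySketch.demo_box_bool(true)
    false
catch err
    err isa ErrorException
end
```

The design choice being illustrated is that the overlay definition is invisible to normal dispatch, so only a compiler pipeline configured to use the overlay table would pick it up.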

github-actions · 2025-12-22T15:34:59Z

Your PR requires formatting changes to meet the project's style guidelines.
Please consider running Runic (git runic master) to apply these changes.

Suggested changes:

diff --git a/src/utils.jl b/src/utils.jl
index 8242e5a..9f55e9c 100644
--- a/src/utils.jl
+++ b/src/utils.jl
@@ -196,12 +196,14 @@ macro device_function(ex)
         error("This function is not intended for use on the CPU")
     end
 
-    esc(quote
-        $(combinedef(def))
+    return esc(
+        quote
+            $(combinedef(def))
 
-        # NOTE: no use of `@consistent_overlay` here because the regular function errors
-        Base.Experimental.@overlay($(GPUCompiler).GLOBAL_METHOD_TABLE, $ex)
-    end)
+            # NOTE: no use of `@consistent_overlay` here because the regular function errors
+            Base.Experimental.@overlay($(GPUCompiler).GLOBAL_METHOD_TABLE, $ex)
+        end
+    )
 end

KSepetanc · 2025-12-23T00:11:04Z

I loaded both the forked CUDA.jl and this PR, tried to compile my full code, and got the error below.

The stack trace is massive, so I copied only the first several lines.

ERROR: LoadError: Invalid return type for runtime function 'box_bool': expected LLVM.PointerType(ptr addrspace(10)), got LLVM.VoidType(void)
Stacktrace:
  [1] error(s::String)
    @ Base .\error.jl:44
  [2] emit_function!(mod::LLVM.Module, config::GPUCompiler.CompilerConfig{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams}, f::Type, method::GPUCompiler.Runtime.RuntimeMethodInstance)
    @ GPUCompiler C:\Users\karlo\.julia\packages\GPUCompiler\vRm9U\src\rtlib.jl:81
  [3] build_runtime(job::GPUCompiler.CompilerJob)
    @ GPUCompiler C:\Users\karlo\.julia\packages\GPUCompiler\vRm9U\src\rtlib.jl:117
  [4] (::GPUCompiler.var"#load_runtime##0#load_runtime##1"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams}})()
    @ GPUCompiler C:\Users\karlo\.julia\packages\GPUCompiler\vRm9U\src\rtlib.jl:159
  [5] lock(f::GPUCompiler.var"#load_runtime##0#load_runtime##1"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams}}, l::ReentrantLock)
    @ Base .\lock.jl:335

Otherwise, I could compile GPUCompiler into the image.

Do you have any idea where this could come from?

apozharski · 2025-12-23T11:49:19Z

@KSepetanc yep, I caught that in the tests for the PR as well (somehow they were passing for me locally, but I suspect that was just poor environment management on my part).

Unsurprisingly, my hack seems to break things in GPUCompiler at runtime. I have some ideas about the cause, namely that I am replacing the stub LLVM call with an exception in the CPU cache, which simply returns void. It has taken me a bit to get something I can test with (since my machine at home has a quite broken CUDA installation 😅), but it seems I am able to test with OpenCL, so hopefully I can get something a bit less hacky working soon, depending on how much time I have over the holidays.

I will turn this PR to a draft until then.

KSepetanc · 2025-12-23T20:21:41Z

@apozharski are you using the CUDA 590 driver branch (it is CUDA 13.1)? I have seen that the maintainers are preparing support for it, but when I last checked a few days ago it still had not been released. Without knowing more about your system, I presume you just need to downgrade to the 580 series driver that comes with CUDA 13.0. I had this issue too.

I will soon have more questions, as it seems that more fixes are needed than just GPUCompiler.jl and CUDA.jl to AOT compile MadNLPGPU, which I need, but it is still WIP so I will wait a bit.

apozharski added 2 commits December 19, 2025 18:18

A, perhaps hacky, hiding of the runtime in the GLOBAL_METHOD_TABLE overlay

7d65cae

cleanup some debugging

743b807

fix issue caused by moving device_function

12251c1

apozharski marked this pull request as draft December 23, 2025 11:49

dummy CPU functions now seem to get us further but check_ir is kicking us out of some KernelAbstractions compilations in e.g OpenCL.jl

a3f974b
