I tried to build some Regent applications with CUDA 12.8 and saw the following error:
(legate) root@eos0143:/opt/legate/legion/language# LEGION_BACKTRACE_ON_ERROR=1 OMPI_ALLOW_RUN_AS_ROOT=1 OMPI_ALLOW_RUN_AS_ROOT_CONFIRM=1 mpirun -n 1 --bind-to none entrypoint.sh ./stencil -ll:gpu 1
LD_PRELOAD: /opt/conda/envs/legate/lib/libtcmalloc.so
error in cuModuleLoadData (CUDA_ERROR_INVALID_PTX): a PTX JIT compilation failed
Investigating further, I found that Regent's generated PTX contains this snippet:
;
.extern .func (.param .b32 func_retval0) llvm.nvvm.tanh.approx.f32
(
.param .b32 llvm.nvvm.tanh.approx.f32_param_0
)
which is invalid PTX:
(legate) root@eos0114:/opt/legate/legion/language# /usr/local/cuda/bin/ptxas stencil.ptx
ptxas stencil.ptx, line 126; fatal : Parsing error near '.nvvm': syntax error
ptxas fatal : Ptx assembly aborted due to errors
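For anyone who wants to check their own generated PTX for the same problem without running ptxas, here is a minimal sketch in Python. The function name `find_unresolved_intrinsics` is hypothetical, and the script assumes that leftover LLVM intrinsics always show up as `.extern .func` declarations whose symbol name starts with `llvm.`, as in the snippet above. It is a rough text scan, not a PTX parser.

```python
import re

# Matches the symbol name in declarations such as
#   .extern .func (.param .b32 func_retval0) llvm.nvvm.tanh.approx.f32
# Assumption: unresolved LLVM intrinsics appear as `.extern .func`
# declarations whose name starts with "llvm.". The return-parameter
# list in parentheses is optional.
_EXTERN_LLVM_FUNC = re.compile(
    r"\.extern\s+\.func\s*(?:\([^)]*\)\s*)?(llvm\.[A-Za-z0-9_.$]+)"
)


def find_unresolved_intrinsics(ptx_text):
    """Return the sorted, de-duplicated names of `.extern .func llvm.*` declarations."""
    return sorted(set(_EXTERN_LLVM_FUNC.findall(ptx_text)))
```

Reading a file such as `stencil.ptx` into a string and passing it to this function should list any such declarations, which can help tell whether a given CUDA/libdevice combination produces them.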
The issue is that this code comes from the libdevice.10.bc that ships with CUDA, so it's unclear what we're supposed to do here. Adding a dead-code-elimination pass to Terra's PTX generation (https://github.com/terralang/terra/blob/master/src/tcuda.cpp#L87) would be a workaround for this issue, as long as an application does not need to use tanh.
I can confirm that this issue does not arise on CUDA 12.5.1, and it's on my backlog to get a tighter bound for the last working CUDA version.