For nvptx-none toolchain testing, we're using nvptx-none-run to launch kernels on a 1 x 1 x 1 grid with 1 x 1 x 1 threads. We'd like to use cuCtxSetLimit(CU_LIMIT_STACK_SIZE) to increase the per-thread stack size from its tiny default value (1 KiB?).
A subsequent cuCtxGetLimit(CU_LIMIT_STACK_SIZE) call does report the value that was set. However, if that value is "too high", later cuModuleLoadData (?!) or cuLaunchKernel calls may fail with inscrutable errors (CUDA_ERROR_ILLEGAL_ADDRESS).
It is unclear how to safely maximize the per-thread stack size.
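One possible direction, sketched here under several assumptions rather than as a known-good answer: probe candidate sizes empirically and keep the largest one that works. The sketch assumes failure is monotonic (if size N fails, every larger size fails too), which the observations above do not confirm. `stack_probe_fn`, `find_max_stack_size`, and `fake_probe` are hypothetical names, not part of nvptx-none-run or the CUDA driver API. A real probe would presumably call cuCtxSetLimit(CU_LIMIT_STACK_SIZE) with the candidate size and then attempt cuModuleLoadData and cuLaunchKernel on a trivial kernel. Since the CUDA documentation describes CUDA_ERROR_ILLEGAL_ADDRESS as leaving the process in an inconsistent state, each probe would likely have to run in its own process.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical probe: returns true if a kernel can be loaded and launched
   with the given per-thread stack size.  Not an existing API.  */
typedef bool (*stack_probe_fn) (size_t stack_bytes, void *user_data);

/* Binary search for the largest size in [lo, hi] that the probe accepts.
   Assumes the probe is monotonic: once a size fails, all larger sizes fail.
   Returns 0 if even 'lo' is rejected or the range is empty.  */
size_t
find_max_stack_size (size_t lo, size_t hi, stack_probe_fn probe,
                     void *user_data)
{
  if (lo > hi || !probe (lo, user_data))
    return 0;

  /* Invariant: 'lo' is known to work; nothing above 'hi' is a candidate.  */
  while (lo < hi)
    {
      /* Round up so the loop makes progress when hi == lo + 1.  */
      size_t mid = lo + (hi - lo + 1) / 2;
      if (probe (mid, user_data))
        lo = mid;
      else
        hi = mid - 1;
    }
  return lo;
}

/* Stand-in probe for illustration only: pretends that every size up to
   '*limit' works.  A real probe would exercise the CUDA driver instead.  */
bool
fake_probe (size_t stack_bytes, void *user_data)
{
  const size_t *limit = user_data;
  return stack_bytes <= *limit;
}
```

With the stand-in probe and a pretend limit of 100000 bytes, `find_max_stack_size (1024, 1u << 20, fake_probe, &limit)` returns 100000. Whether the real failure threshold is stable across devices, driver versions, or modules is an open question, so any value found this way would probably need a safety margin.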