@amemov amemov commented Nov 5, 2025

  • Added a shared CUDA context (OnceLock) and used it across luminal_cuda and luminal_2
  • Replaced all CudaContext::new(0) calls with the shared context; fixed a u32→usize cast
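
For readers unfamiliar with the pattern, the shared-context idea can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `DeviceContext`, `shared_context`, and `INIT_COUNT` are hypothetical stand-ins, and a real version would presumably store whatever `CudaContext::new(0)` returns instead of the dummy struct used here.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, OnceLock};

/// Hypothetical stand-in for a GPU context handle.
/// In the real crates this would be the context type from cudarc.
#[derive(Debug)]
pub struct DeviceContext {
    pub ordinal: usize,
}

/// Counts how many times a context is actually constructed (for illustration only).
pub static INIT_COUNT: AtomicUsize = AtomicUsize::new(0);

/// Process-wide slot that is initialized at most once, even across threads.
static SHARED_CTX: OnceLock<Arc<DeviceContext>> = OnceLock::new();

impl DeviceContext {
    fn new(ordinal: usize) -> Arc<Self> {
        INIT_COUNT.fetch_add(1, Ordering::SeqCst);
        Arc::new(DeviceContext { ordinal })
    }
}

/// Returns a handle to the shared context, creating it on first use.
/// Every later call clones the same `Arc` instead of building a new context.
pub fn shared_context() -> Arc<DeviceContext> {
    SHARED_CTX.get_or_init(|| DeviceContext::new(0)).clone()
}

fn main() {
    let a = shared_context();
    let b = shared_context();
    // Both handles point at the same underlying context.
    assert!(Arc::ptr_eq(&a, &b));
    println!(
        "device {} constructed {} time(s)",
        a.ordinal,
        INIT_COUNT.load(Ordering::SeqCst)
    );
}
```

The point of `OnceLock` here is that call sites no longer construct their own context; they all go through one accessor, so the initialization cost is paid once and every kernel sees the same context.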

I still need to check correctness. I was going to do it yesterday, but the Lambda node was shut down just as I was about to continue working on it.

TODO: Verify correctness as noted above, and remove the comments in the code (my own notes on some elements).

amemov commented Nov 5, 2025

I tried running it locally and now get the panic below with the matmul example, which I did not see on the Lambda node with an H100. The same output is produced both with and without the changes in this PR:

thread 'main' (222371) panicked at /home/anton/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/cudarc-0.16.6/src/nvrtc/sys/mod.rs:579:18:
Expected symbol in library: DlSym { desc: "/opt/cuda/lib64/libnvrtc.so: undefined symbol: nvrtcGetNVVM" }
stack backtrace:
   0:     0x55cc31a7bf12 - <std::sys::backtrace::BacktraceLock::print::DisplayBacktrace as core::fmt::Display>::fmt::he2aba4f1d4ea1fbd
   1:     0x55cc31a8f4ef - core::fmt::write::h5b6d723e88f3973a
   2:     0x55cc31a4b151 - std::io::Write::write_fmt::hfb264c83805b0a19
   3:     0x55cc31a57442 - std::sys::backtrace::BacktraceLock::print::ha2f5782cdbcccd40
   4:     0x55cc31a5d0ec - std::panicking::default_hook::{{closure}}::ha8f0b2a22fbd9290
   5:     0x55cc31a5cf46 - std::panicking::default_hook::he3ca409e17c78e5f
   6:     0x55cc31a5d775 - std::panicking::panic_with_hook::h64da284505672a54
   7:     0x55cc31a5d60a - std::panicking::panic_handler::{{closure}}::hed6200fceb2a07a8
   8:     0x55cc31a57579 - std::sys::backtrace::__rust_end_short_backtrace::h7e403b75b11a9d15
   9:     0x55cc31a3e7dd - __rustc[3d1ee1440eab7a60]::rust_begin_unwind
  10:     0x55cc30d69770 - core::panicking::panic_fmt::h366dbf4a636b0c49
  11:     0x55cc30d69246 - core::result::unwrap_failed::h9547866b9642f875
  12:     0x55cc31a380cb - cudarc::nvrtc::sys::loaded::Lib::new::h1082bc3423ee8e35
  13:     0x55cc31a3566c - std::sync::poison::once::Once::call_once_force::{{closure}}::h7e6e354eac643a8a
  14:     0x55cc30d63617 - std::sys::sync::once::futex::Once::call::hf42d57099a0be3fe
  15:     0x55cc30d62765 - std::sync::once_lock::OnceLock<T>::initialize::hece25be20994c1b6
  16:     0x55cc31a36bb9 - cudarc::nvrtc::result::create_program::h82f243cbc1f0b495
  17:     0x55cc30de1e1d - cudarc::nvrtc::safe::compile_ptx_with_opts::hb76741ce5230d5ce
  18:     0x55cc30db0fcb - luminal_2::run::compile_kernels::hf29cc249943d02cd
  19:     0x55cc30dec1dd - luminal_2::extract::cost::h739fe2cbb9472171
  20:     0x55cc30ded095 - luminal_2::extract::search::h47d87febca593ef1
  21:     0x55cc30d6f3cc - matmul::main::hc201344469d6905e
  22:     0x55cc30d74733 - std::sys::backtrace::__rust_begin_short_backtrace::hd24bcec9c915583c
  23:     0x55cc30d75ed9 - std::rt::lang_start::{{closure}}::h21aaa901334c090d
  24:     0x55cc31a4cb50 - std::rt::lang_start_internal::h416e1497f666f6ed
  25:     0x55cc30d72685 - main
  26:     0x7fcec5027675 - <unknown>
  27:     0x7fcec5027729 - __libc_start_main
  28:     0x55cc30d69795 - _start
  29:                0x0 - <unknown>
