This repository was archived by the owner on Dec 19, 2025. It is now read-only.
Kokkos 4.0 changed many class members set with `Kokkos::initialize()` to `inline static T` types. With this change, there seems to be an issue with pybind11: the values of these members do not persist when they are set through a call from Python.
Whenever CUDA is used, the `TileSizeProperties` attribute `maxThreads` ends up set to zero, which causes an abort at the first MDRange execution.
When `Kokkos::initialize()` is called (from a Python-bound function), `cudaProp.maxThreadsPerMultiProcessor` (from here) reports 1024. However, by the time we get to the MDRange policy here, `space.impl_internal_space_instance()->m_maxThreadsPerSM` is 0. This causes an abort at this check here.
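To make the pattern concrete, here is a minimal, self-contained C++17 sketch of an `inline static` member that is zero until an initialize call sets it, together with a check that fails while it is still zero. `FakeCudaInternal`, `fake_initialize`, and `mdrange_tile_check` are hypothetical stand-ins made up for illustration; they are not the Kokkos classes or functions, and the real check may differ.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for the CUDA internal space instance (NOT the Kokkos class).
struct FakeCudaInternal {
  // C++17 inline static member: zero until an initialize call writes to it.
  inline static int m_maxThreadsPerSM = 0;
};

// Hypothetical stand-in for the part of initialization that stores the
// device property (1024 in the report above).
void fake_initialize(int maxThreadsPerMultiProcessor) {
  assert(maxThreadsPerMultiProcessor > 0);
  FakeCudaInternal::m_maxThreadsPerSM = maxThreadsPerMultiProcessor;
}

// Hypothetical stand-in for the MDRange tile-size check: returns false
// (instead of aborting) when the stored value is still zero.
bool mdrange_tile_check() {
  if (FakeCudaInternal::m_maxThreadsPerSM == 0) {
    std::fprintf(stderr, "maxThreads is 0: tile size properties are invalid\n");
    return false;
  }
  return true;
}
```

Within a single program the value written by `fake_initialize(1024)` is visible to `mdrange_tile_check()`. The behavior described in this report would be consistent with the initialize call and the later check seeing different copies of the variable across the pybind11 module boundary, but that is an assumption and is not confirmed here.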
I only see this issue with CUDA; the OpenMP and Serial backends work fine. It has been consistent with every host/device compiler I have tried.
Primarily gcc 9.4.0/intel19.04 + CUDA 11.7