We need to add an entry to the known issues for CUDA and HIP: dispatching a kernel that calls an atomic function that is not lock-free (e.g., an atomic operation on a `Kokkos::complex<double>`) will fail**. See also desul/desul#110.
I believe the same goes for SYCL when creating a queue for a device other than the one Kokkos was initialized with.
** What actually happens there? Do we know?
cc @tcclevenger @Rombur @masterleinad