Error:
The error occurs even when compiling the simple example program from the docs:
# test.py
import pykokkos as pk
import numpy as np

@pk.workunit
def work(wid, a):
    a[wid] += 1

def main():
    N = 10
    a = np.random.randint(100, size=(N))
    print(a)
    pk.parallel_for("work", pk.RangePolicy(0, N), work, a=a)
    # OR
    # pk.parallel_for("work", N, work, a=a)
    print(a)

main()
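For reference, the workunit body is just a per-element increment; a plain NumPy version of the same operation (no PyKokkos involved) works as expected, so the problem is in the dispatch, not the kernel logic:

```python
import numpy as np

# Same setup as test.py
N = 10
a = np.random.randint(100, size=(N))
expected = a + 1  # the result the workunit should produce

# The workunit body `a[wid] += 1`, applied at every index
for wid in range(N):
    a[wid] += 1

assert np.array_equal(a, expected)
```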
Output:
$ python test.py
Kokkos::Cuda::initialize WARNING: Cuda is allocating into UVMSpace by default
without setting CUDA_MANAGED_FORCE_DEVICE_ALLOC=1 or
setting CUDA_VISIBLE_DEVICES.
This could on multi GPU systems lead to severe performance
penalties.
[41 22 98 19 49 54 31 23 43 5]
Traceback (most recent call last):
File "/root/test.py", line 18, in <module>
main()
File "/root/test.py", line 13, in main
pk.parallel_for("work", pk.RangePolicy(0, N), work, a=a)
File "/root/pykokkos/pykokkos/interface/parallel_dispatch.py", line 158, in parallel_for
runtime_singleton.runtime.run_workunit(
File "/root/pykokkos/pykokkos/core/runtime.py", line 153, in run_workunit
return self.execute_workunit(name, policy, workunit, operation, parser, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/pykokkos/pykokkos/core/runtime.py", line 202, in execute_workunit
return self.execute(workunit, module_setup, members, execution_space, policy=policy, name=name, operation=operation, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/pykokkos/pykokkos/core/runtime.py", line 291, in execute
result = self.call_wrapper(entity, members, args, module)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/pykokkos/pykokkos/core/runtime.py", line 407, in call_wrapper
return func(**args)
^^^^^^^^^^^^
RuntimeError: Unable to cast Python instance of type <class 'kokkos.libpykokkos.KokkosExecutionSpace_OpenMP'> to C++ type 'Kokkos::OpenMP'
I haven't seen any issues like this before.
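The UVM warning itself suggests setting CUDA_MANAGED_FORCE_DEVICE_ALLOC=1 or CUDA_VISIBLE_DEVICES. As an untested sketch (it is only an assumption that this is related to the cast error), these can be set before Kokkos initializes:

```python
import os

# Set before importing pykokkos, so Kokkos' CUDA backend sees them at
# initialization time (assumption based on the warning text above).
os.environ["CUDA_VISIBLE_DEVICES"] = "0"          # pin to a single GPU
os.environ["CUDA_MANAGED_FORCE_DEVICE_ALLOC"] = "1"  # force UVM onto the device

# import pykokkos as pk  # then proceed as in test.py
```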
System:
Docker container: nvidia/cuda:12.4.1-devel-ubuntu22.04
Relevant Dockerfile excerpt:
FROM nvidia/cuda:12.4.1-devel-ubuntu22.04
# Install development tools
RUN apt-get -y update && DEBIAN_FRONTEND=noninteractive apt-get install -y \
build-essential \
cmake \
gcc \
micro \
nano \
git \
g++ \
openssh-server \
&& rm -rf /var/lib/apt/lists/*
# The rest of the Dockerfile is unrelated; all remaining setup
# (conda and the PyKokkos installation from the installation guide)
# was done manually inside the container.
Docker run command:
docker run -dit \
--gpus all \
--name <NAME> \
-p 2222:2222 \
-v $HOME:/host-home \
--restart unless-stopped \
<NAME>
nvidia-smi:
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.163.01 Driver Version: 550.163.01 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA RTX 5000 Ada Gene... Off | 00000000:AC:00.0 Off | Off |
| 30% 32C P8 6W / 250W | 30MiB / 32760MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 1 NVIDIA RTX 5000 Ada Gene... Off | 00000000:CA:00.0 Off | Off |
| 30% 29C P8 1W / 250W | 14MiB / 32760MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
nvcc --version:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Thu_Mar_28_02:18:24_PDT_2024
Cuda compilation tools, release 12.4, V12.4.131
Build cuda_12.4.r12.4/compiler.34097967_0