@johnwikman

This PR adds a check in misc/test-spec.mc that verifies there is at least one CUDA device on which we can compile and run programs before trying to run the accelerate tests. This is necessary for containerized build environments that have nvcc installed but no access to any GPUs.

This check is done by a function cudaGetDeviceCount that I added to a file cuda/sys.mc. This function will return None () if it cannot run CUDA programs on the system. Otherwise it will return the number of available devices wrapped in a Some. There is also a file test/examples/cuda/device_count.mc which can be used to quickly test the behavior of this function.
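As a rough illustration of the idea, and not the actual MCore code in cuda/sys.mc, the check can be sketched in Python. The sketch assumes the probe works by compiling a tiny CUDA program with nvcc and running it; the names cuda_device_count, report and PROBE_SOURCE are hypothetical, and None / int stand in for the None () / Some return values described above.

```python
import os
import shutil
import subprocess
import tempfile
from typing import Optional

# Hypothetical probe program: prints the device count, exits non-zero on failure.
PROBE_SOURCE = r"""
#include <cstdio>
#include <cuda_runtime.h>
int main() {
    int n = 0;
    if (cudaGetDeviceCount(&n) != cudaSuccess || n < 1) return 1;
    std::printf("%d\n", n);
    return 0;
}
"""


def cuda_device_count() -> Optional[int]:
    """Return the number of CUDA devices, or None if CUDA programs
    cannot be compiled and run in this environment."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None  # no compiler available (container 1 below)
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "probe.cu")
        exe = os.path.join(tmp, "probe")
        with open(src, "w") as f:
            f.write(PROBE_SOURCE)
        try:
            subprocess.run([nvcc, src, "-o", exe], check=True, capture_output=True)
            out = subprocess.run([exe], check=True, capture_output=True, text=True)
            return int(out.stdout.strip())
        except (subprocess.CalledProcessError, OSError, ValueError):
            return None  # compiled but could not run, e.g. no GPU access


def report(count: Optional[int]) -> str:
    """Format the result in the style of the messages quoted in this PR."""
    if count is None:
        return "Could not compile and run CUDA programs in your environment."
    return f"Found {count} CUDA devices on your system."


if __name__ == "__main__":
    print(report(cuda_device_count()))
```

A test runner could gate the accelerate tests on this value in the same way, running them only when the result is not None.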

I tested this under various runtime conditions on a server that has access to 4 GPUs. The runtime conditions were constrained through containerization. The containers were launched as:

  1. podman run --rm -it localhost/mikinglang/baseline:v8-debian12.6-linux-amd64 bash
  2. podman run --rm -it localhost/mikinglang/baseline:v8-cuda11.4-linux-amd64 bash
  3. podman run --rm -it --device "nvidia.com/gpu=all" localhost/mikinglang/baseline:v8-cuda11.4-linux-amd64 bash

The first container has neither GPUs nor nvcc. The second one has nvcc but no GPUs. The third one has nvcc and access to 4 GPUs.

We test this by running this install script:

git clone https://github.com/johnwikman/miking.git \
&& cd miking \
&& git checkout cudacheck2 \
&& make install

Followed by this to compile and run the check program:

mi compile src/test/examples/cuda/device_count.mc
./device_count

We get the expected output for each respective container:

  1. Could not compile and run CUDA programs in your environment.
  2. Could not compile and run CUDA programs in your environment.
  3. Found 4 CUDA devices on your system.

Also, when running CUDA_VISIBLE_DEVICES="1,2" ./device_count in the third container, we instead get the output "Found 2 CUDA devices on your system.", as expected.
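The reason two devices are reported can be sketched with a simplified Python model of how CUDA interprets an index-only CUDA_VISIBLE_DEVICES value: devices are enumerated in the listed order, and an invalid index hides itself and everything after it. The function name visible_devices is hypothetical, and the model deliberately ignores GPU UUIDs and repeated indices.

```python
from typing import List


def visible_devices(mask: str, total: int) -> List[int]:
    """Physical device indices visible under an index-only mask,
    given `total` physical devices on the machine."""
    visible: List[int] = []
    for token in mask.split(","):
        try:
            index = int(token)
        except ValueError:
            break  # non-integer entry: stop enumerating
        if not 0 <= index < total:
            break  # invalid index: later entries are not visible either
        visible.append(index)
    return visible


if __name__ == "__main__":
    # On the 4-GPU server, a mask of "1,2" leaves two devices visible.
    print(len(visible_devices("1,2", 4)))
```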

@david-broman david-broman merged commit 80cd279 into miking-lang:develop Apr 6, 2025
1 of 2 checks passed