Closed
runTheMatrix.py creates and executes jobs without any kind of GPU assignment.
On a machine with a single GPU, this is not an issue.
On a machine with more than one GPU, for example
$ rocmComputeCapabilities
0 gfx90a:sramecc+:xnack- AMD Instinct MI250X
1 gfx90a:sramecc+:xnack- AMD Instinct MI250X
2 gfx90a:sramecc+:xnack- AMD Instinct MI250X
3 gfx90a:sramecc+:xnack- AMD Instinct MI250X
4 gfx90a:sramecc+:xnack- AMD Instinct MI250X
5 gfx90a:sramecc+:xnack- AMD Instinct MI250X
6 gfx90a:sramecc+:xnack- AMD Instinct MI250X
7 gfx90a:sramecc+:xnack- AMD Instinct MI250X
or
$ cudaComputeCapabilities
0 8.9 NVIDIA L4
1 8.9 NVIDIA L4
2 8.9 NVIDIA L4
3 8.9 NVIDIA L4
the result is that all concurrent jobs try to use all GPUs at once, which is quite inefficient.
A better approach would be to assign a different GPU to each job, for example in a round-robin fashion.
If there are more concurrent jobs than GPUs, the GPUs will still be shared - but to a much lesser extent than now.
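One way to sketch this (not the actual runTheMatrix.py code; `gpu_env_for_job` is a hypothetical helper) is to pick the GPU as the job index modulo the number of GPUs, and pin each job to it through its environment. Setting both `CUDA_VISIBLE_DEVICES` and `HIP_VISIBLE_DEVICES` makes the same mechanism work on NVIDIA and AMD machines:

```python
import os

def gpu_env_for_job(job_index, num_gpus, base_env=None):
    """Build an environment that pins a job to a single GPU, round-robin.

    job_index: sequential index of the job being launched
    num_gpus:  number of GPUs available on the machine
    base_env:  environment to extend (defaults to os.environ)
    """
    env = dict(base_env if base_env is not None else os.environ)
    gpu = job_index % num_gpus  # round-robin: jobs wrap around over the GPUs
    # CUDA and HIP both honour these variables to restrict device visibility.
    env["CUDA_VISIBLE_DEVICES"] = str(gpu)
    env["HIP_VISIBLE_DEVICES"] = str(gpu)
    return env
```

The launcher would then start each job with this environment, e.g. `subprocess.Popen(cmd, env=gpu_env_for_job(i, num_gpus))`, so job 0 sees only GPU 0, job 1 only GPU 1, and with 4 GPUs job 4 wraps around to GPU 0 again.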