Slurm + MIG Configuration Guide

This document describes how to integrate Slurm with MIG-enabled NVIDIA GPUs. Be sure to read the MIG getting started guide if you haven't already.

This guide assumes that administrators have read and understood the MIG documentation and have already partitioned their NVIDIA GPUs to meet the needs of their users and applications (NVIDIA's mig-parted is highly recommended for this). Slurm treats each MIG device as a separate and distinct GPU, so multiple jobs and users can share a single physical GPU without contention.

The following steps show how to use the MIG detection program, with a single A100 system as the example.

1) Build the MIG detection program.

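Build the MIG detection program with the two commands below: the first creates the unversioned libnvidia-ml.so symlink that the linker needs to resolve -lnvidia-ml, and the second compiles mig.c. Note that CUDA and gcc need to be installed for the program to build correctly: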

sudo ln -s /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1 /usr/lib/x86_64-linux-gnu/libnvidia-ml.so
gcc -g -o mig -I/usr/lib/x86_64-linux-gnu -I/usr/include mig.c -lnvidia-ml

If nvml.h and libnvidia-ml.so are not in standard locations, adjust the symlink target and the -I paths in the commands above accordingly.
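
When the header or library lives somewhere else, a small helper can search a list of candidate directories and print an adjusted gcc command. This is only a sketch: the function names `find_dir_with` and `mig_build_command` are hypothetical, the candidate directories in the usage comment are common locations but are assumptions about your system, and the generated command adds an explicit -L flag that the original command does not use.

```shell
# Print the first directory from the argument list that contains FILE.
# Usage: find_dir_with FILE DIR...
find_dir_with() {
    file=$1
    shift
    for dir in "$@"; do
        if [ -e "$dir/$file" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Print (do not run) a gcc command for mig.c using the given
# include directory and library directory.
# Usage: mig_build_command INCLUDE_DIR LIB_DIR
mig_build_command() {
    printf 'gcc -g -o mig -I%s mig.c -L%s -lnvidia-ml\n' "$1" "$2"
}

# Example (directories are assumptions; adjust for your system):
#   inc=$(find_dir_with nvml.h /usr/include /usr/local/cuda/include)
#   lib=$(find_dir_with libnvidia-ml.so /usr/lib/x86_64-linux-gnu /usr/local/cuda/lib64/stubs)
#   mig_build_command "$inc" "$lib"
```

Printing the command instead of running it lets you review the paths before building.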

2) Run the MIG detection program once on each node with MIG devices.

The program detects all MIG devices and other NVIDIA GPUs on the node and writes a corresponding gres.yml file to the working directory. gres.yml can then be used with a Slurm Ansible role to generate a gres.conf file.

$ nvidia-smi
Thu Mar 18 08:05:25 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Graphics Device     On   | 00000000:65:00.0 Off |                   On |
| 35%   56C    P0    43W / 200W |     13MiB / 48675MiB |     N/A      Default |
|                               |                      |              Enabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| MIG devices:                                                                |
+------------------+----------------------+-----------+-----------------------+
| GPU  GI  CI  MIG |         Memory-Usage |        Vol|         Shared        |
|      ID  ID  Dev |           BAR1-Usage | SM     Unc| CE  ENC  DEC  OFA  JPG|
|                  |                      |        ECC|                       |
|==================+======================+===========+=======================|
|  0    2   0   0  |      7MiB / 24192MiB | 56      0 |  4   0    2    0    0 |
|                  |      0MiB /   127MiB |           |                       |
+------------------+----------------------+-----------+-----------------------+
|  0    7   0   1  |      1MiB /  5888MiB | 14      0 |  1   0    0    0    0 |
|                  |      0MiB /    31MiB |           |                       |
+------------------+----------------------+-----------+-----------------------+
|  0    8   0   2  |      1MiB /  5888MiB | 14      0 |  1   0    0    0    0 |
|                  |      0MiB /    31MiB |           |                       |
+------------------+----------------------+-----------+-----------------------+
|  0    9   0   3  |      1MiB /  5888MiB | 14      0 |  1   0    0    0    0 |
|                  |      0MiB /    31MiB |           |                       |
+------------------+----------------------+-----------+-----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

$ ./mig
GPU count 1
Success
$ ls
gres.yml  LICENSE  mig  mig.c  README.md

The contents of the generated files can be viewed and adjusted. For example, sites can change the Type attribute of the MIG devices to something more consistent with the rest of the system, or change the default list of devices allowed by cgroups:

$ head -n 18 gres.yml
# GPU 0 MIG 0 /proc/driver/nvidia/capabilities/gpu0/mig/gi3/access
  - File: /dev/nvidia-caps/nvidia-cap30
    Name: gpu
    NodeName: semc-gpu34
    Type: 1g.24gb

# GPU 0 MIG 1 /proc/driver/nvidia/capabilities/gpu0/mig/gi4/access
  - File: /dev/nvidia-caps/nvidia-cap39
    Name: gpu
    NodeName: semc-gpu34
    Type: 1g.24gb

# GPU 0 MIG 2 /proc/driver/nvidia/capabilities/gpu0/mig/gi5/access
  - File: /dev/nvidia-caps/nvidia-cap48
    Name: gpu
    NodeName: semc-gpu34
    Type: 1g.24gb
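
As one way to change the Type attribute in bulk, the sketch below rewrites every matching Type value in a gres.yml stream while leaving indentation and all other lines untouched. The function name `retype_gres` and the replacement name `a100-small` are hypothetical and chosen only for illustration.

```shell
# Rewrite "Type: OLD" to "Type: NEW" in gres.yml content read from stdin.
# Lines whose Type does not equal OLD are passed through unchanged.
# Usage: retype_gres OLD NEW < gres.yml > gres.new.yml
retype_gres() {
    awk -v old="$1" -v new="$2" '
        $1 == "Type:" && $2 == old { sub(/Type:.*/, "Type: " new) }
        { print }
    '
}

# Example:
#   retype_gres 1g.24gb a100-small < gres.yml > gres.new.yml
```

Whatever Type name you choose is the name users must then pass to --gres, so keep it consistent across nodes.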

3) Update Slurm configuration from newly created files.

Add the contents of gres.yml to your Ansible configuration so that the Slurm role can render them into gres.conf.
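
For sites that want to preview the result or do not use Ansible, the sketch below maps each gres.yml entry one to one onto a gres.conf line using the NodeName, Name, Type and File parameters. It is not the Ansible role's actual template; the function name `gres_yml_to_conf` is hypothetical, and the parser assumes the exact layout shown in the gres.yml listing above (one "- File:" line starting each entry).

```shell
# Convert gres.yml entries read from stdin into gres.conf lines.
# Usage: gres_yml_to_conf < gres.yml
gres_yml_to_conf() {
    awk '
        # Print the entry collected so far (if any) and reset the fields.
        function emit(   line) {
            if (file != "") {
                line = "NodeName=" node " Name=" name
                if (type != "") line = line " Type=" type
                print line " File=" file
            }
            file = node = name = type = ""
        }
        /^[ \t]*#/ || NF == 0 { next }   # skip comments and blank lines
        # A leading "-" starts a new entry; strip it so the key is $1.
        $1 == "-" { emit(); sub(/^[ \t]*-[ \t]*/, "") }
        $1 == "File:"     { file = $2 }
        $1 == "Name:"     { name = $2 }
        $1 == "NodeName:" { node = $2 }
        $1 == "Type:"     { type = $2 }
        END { emit() }
    '
}
```

For the first entry in the listing above, this prints `NodeName=semc-gpu34 Name=gpu Type=1g.24gb File=/dev/nvidia-caps/nvidia-cap30`.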

4) Enable and configure cgroups in slurm.conf and cgroup.conf.

Slurm must be configured to use cgroups in order to enforce MIG device isolation across users and jobs. Ensure the following parameters are present in slurm.conf:

ProctrackType=proctrack/cgroup
TaskPlugin=task/cgroup

In addition, ensure that Slurm constrains devices with the following entry in cgroup.conf:

ConstrainDevices=yes

See Slurm's cgroup.conf and cgroup documentation for more information.
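
A quick way to confirm these settings before restarting the daemons is to check each file for an uncommented key=value line. The sketch below is a simple text check, not a Slurm tool: the function name `conf_has` is hypothetical, the /etc/slurm paths in the usage comment are assumptions, and it only matches an exact value, so a combined setting such as a comma-separated TaskPlugin list would need a looser comparison.

```shell
# Succeed if FILE contains an uncommented "KEY=VALUE" line.
# Key and value are compared case-insensitively; trailing comments
# and surrounding whitespace are ignored.
# Usage: conf_has FILE KEY VALUE
conf_has() {
    awk -v key="$2" -v val="$3" '
        /^[ \t]*#/ { next }              # skip full-line comments
        {
            line = $0
            sub(/#.*/, "", line)         # drop trailing comments
            n = index(line, "=")
            if (n == 0) next
            k = substr(line, 1, n - 1)
            v = substr(line, n + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)
            gsub(/^[ \t]+|[ \t]+$/, "", v)
            if (tolower(k) == tolower(key) && tolower(v) == tolower(val))
                found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# Example (paths are assumptions; adjust for your installation):
#   conf_has /etc/slurm/slurm.conf  ProctrackType    proctrack/cgroup || echo "missing ProctrackType"
#   conf_has /etc/slurm/slurm.conf  TaskPlugin       task/cgroup      || echo "missing TaskPlugin"
#   conf_has /etc/slurm/cgroup.conf ConstrainDevices yes              || echo "missing ConstrainDevices"
```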

5) Start/restart slurmctld on the head node and slurmd on all the affected compute nodes.

Once all devices are configured, start or restart slurmctld on the head node and slurmd on every MIG node. In addition, whenever the MIG or GPU configuration changes, repeat steps 2, 3 and 5.
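
The restart order can be written down as a short plan before it is executed. The sketch below only prints commands for review and runs nothing. It assumes the daemons are managed by systemd units named slurmctld and slurmd, that the plan is run on the head node, and that the compute nodes are reachable over ssh; the function name `restart_plan` is hypothetical.

```shell
# Print (do not run) the restart commands for the controller on this
# host and for slurmd on each node given as an argument.
# Usage: restart_plan NODE...
restart_plan() {
    printf 'systemctl restart slurmctld\n'
    for node in "$@"; do
        printf 'ssh %s systemctl restart slurmd\n' "$node"
    done
}

# Example: review the plan, then pipe it to sh once it looks right.
#   restart_plan p1-019
#   restart_plan p1-019 | sh
```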

6) Verify correct integration through sample jobs and Slurm commands.

With Slurm configured and started, you can now verify correct operation. Check that the GPUs and MIG devices appear in the Gres field of "scontrol show nodes", then run some GPU jobs requesting the new MIG devices.

$ scontrol show nodes
NodeName=p1-019 Arch=x86_64 CoresPerSocket=8 
   CPUAlloc=0 CPUTot=16 CPULoad=0.01
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=gpu:1g.6gb:3,gpu:4g.24gb:1
   NodeAddr=p1-019 NodeHostName=p1-019 Version=20.11.4
   OS=Linux 5.4.0-42-generic #46-Ubuntu SMP Fri Jul 10 00:24:02 UTC 2020 
   RealMemory=1 AllocMem=0 FreeMem=48895 Sockets=1 Boards=1
   State=IDLE ThreadsPerCore=2 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=debug 
   BootTime=2021-03-15T06:14:57 SlurmdStartTime=2021-03-18T09:01:00
   CfgTRES=cpu=16,mem=1M,billing=16
   AllocTRES=
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
   Comment=(null)
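
On clusters with many nodes it can help to reduce this output to one line per node. The sketch below reads `scontrol show nodes` output on stdin and prints each node name followed by its Gres value; the function name `node_gres` is hypothetical, and the parser assumes the key=value layout shown above.

```shell
# Print "NODE GRES" for every node in `scontrol show nodes` output.
# Usage: scontrol show nodes | node_gres
node_gres() {
    awk '
        {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^NodeName=/)
                    node = substr($i, 10)        # text after "NodeName="
                else if ($i ~ /^Gres=/)
                    print node, substr($i, 6)    # text after "Gres="
            }
        }
    '
}
```

For the node shown above, this prints `p1-019 gpu:1g.6gb:3,gpu:4g.24gb:1`.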

$ srun --gres=gpu:4g.24gb nvidia-smi
Thu Mar 18 08:05:12 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Graphics Device     On   | 00000000:65:00.0 Off |                   On |
| 35%   56C    P0    43W / 200W |     13MiB / 48675MiB |     N/A      Default |
|                               |                      |              Enabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| MIG devices:                                                                |
+------------------+----------------------+-----------+-----------------------+
| GPU  GI  CI  MIG |         Memory-Usage |        Vol|         Shared        |
|      ID  ID  Dev |           BAR1-Usage | SM     Unc| CE  ENC  DEC  OFA  JPG|
|                  |                      |        ECC|                       |
|==================+======================+===========+=======================|
|  0    2   0   0  |      7MiB / 24192MiB | 56      0 |  4   0    2    0    0 |
|                  |      0MiB /   127MiB |           |                       |
+------------------+----------------------+-----------+-----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Note that only the requested MIG device is visible to the job.

$ srun --gres=gpu:1g.6gb:2 nvidia-smi
Thu Mar 18 08:07:55 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Graphics Device     On   | 00000000:65:00.0 Off |                   On |
| 35%   56C    P0    43W / 200W |     13MiB / 48675MiB |     N/A      Default |
|                               |                      |              Enabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| MIG devices:                                                                |
+------------------+----------------------+-----------+-----------------------+
| GPU  GI  CI  MIG |         Memory-Usage |        Vol|         Shared        |
|      ID  ID  Dev |           BAR1-Usage | SM     Unc| CE  ENC  DEC  OFA  JPG|
|                  |                      |        ECC|                       |
|==================+======================+===========+=======================|
|  0    7   0   0  |      1MiB /  5888MiB | 14      0 |  1   0    0    0    0 |
|                  |      0MiB /    31MiB |           |                       |
+------------------+----------------------+-----------+-----------------------+
|  0    8   0   1  |      1MiB /  5888MiB | 14      0 |  1   0    0    0    0 |
|                  |      0MiB /    31MiB |           |                       |
+------------------+----------------------+-----------+-----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
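
To check device isolation without reading the table by eye, the number of MIG devices a job can see can be counted and compared with the number requested. The sketch below counts data rows in the "MIG devices" table of nvidia-smi output read from stdin. The function name `count_mig_devices` is hypothetical, and the pattern assumes the table layout printed by the driver version shown in this guide; other versions may format the table differently.

```shell
# Count the devices listed in the "MIG devices" table of nvidia-smi
# output read from stdin. A device row starts with four integer
# columns (GPU, GI ID, CI ID, MIG Dev) between the first two bars.
# Usage: srun --gres=gpu:1g.6gb:2 nvidia-smi | count_mig_devices
count_mig_devices() {
    awk '
        /MIG devices:/ { in_mig = 1; next }
        /Processes:/   { in_mig = 0 }
        in_mig && /^[|][ ]+[0-9]+[ ]+[0-9]+[ ]+[0-9]+[ ]+[0-9]+[ ]+[|]/ { n++ }
        END { print n + 0 }
    '
}
```

Applied to the two job outputs above, this prints 1 for the 4g.24gb request and 2 for the request for two 1g.6gb devices, matching what each job asked for.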