|
1 | | -### Node profile description |
2 | | - |
3 | | - |
4 | | -<a id="node_list"></a> |
| 1 | +# Node profile description |
5 | 2 |
|
6 | 3 | <!-- I find it somewhat futile to keep this documentation up to date
7 | 4 | manually. Perhaps we could create scripts in this folder that generate
8 | 5 | an RST entry and that could be run on a node at Mila to apply
9 | 6 | updates. -->
| 7 | +<!-- TODO: Maybe add the tablesort feature of mkdocs: https://squidfunk.github.io/mkdocs-material/reference/data-tables/#sortable-tables --> |
| 8 | + |
| 9 | +| Name | GPU Model | Mem | # | CPUs | Sockets | Cores/Socket | Threads/Core | Memory (GB) | TmpDisk (TB) | Arch | Slurm Features | |
| 10 | +| --------------------- | --------- | --- | --- | ---- | ------- | ------------ | ------------ | ----------- | ------------ | ------ | ---------------------- | |
| 11 | +| **GPU Compute Nodes** | | | | | | | | | | | | |
| 12 | +| **cn-a[001-011]** | RTX8000 | 48 | 8 | 40 | 2 | 20 | 1 | 384 | 3.6 | x86_64 | turing,48gb | |
| 13 | +| **cn-b[001-005]** | V100 | 32 | 8 | 40 | 2 | 20 | 1 | 384 | 3.6 | x86_64 | volta,nvlink,32gb | |
| 14 | +| **cn-c[001-040]** | RTX8000 | 48 | 8 | 64 | 2 | 32 | 1 | 384 | 3 | x86_64 | turing,48gb | |
| 15 | +| **cn-g[001-029]** | A100 | 80 | 4 | 64 | 2 | 32 | 1 | 1024 | 7 | x86_64 | ampere,nvlink,80gb | |
| 16 | +| **cn-i001** | A100 | 80 | 4 | 64 | 2 | 32 | 1 | 1024 | 3.6 | x86_64 | ampere,80gb | |
| 17 | +| **cn-j001** | A6000 | 48 | 8 | 64 | 2 | 32 | 1 | 1024 | 3.6 | x86_64 | ampere,48gb | |
| 18 | +| **cn-k[001-004]** | A100 | 40 | 4 | 48 | 2 | 24 | 1 | 512 | 3.6 | x86_64 | ampere,nvlink,40gb | |
| 19 | +| **cn-l[001-091]** | L40S | 48 | 4 | 48 | 2 | 24 | 1 | 1024 | 7 | x86_64 | lovelace,48gb | |
| 20 | +| **cn-n[001-002]** | H100 | 80 | 8 | 192 | 2 | 96 | 1 | 2048 | 35 | x86_64 | hopper,nvlink,80gb | |
| 21 | +| **DGX Systems** | | | | | | | | | | | | |
| 22 | +| **cn-d[001-002]** | A100 | 40 | 8 | 128 | 2 | 64 | 1 | 1024 | 14 | x86_64 | ampere,nvlink,dgx,40gb | |
| 23 | +| **cn-d[003-004]** | A100 | 80 | 8 | 128 | 2 | 64 | 1 | 2048 | 28 | x86_64 | ampere,nvlink,dgx,80gb | |
| 24 | +| **cn-e[002-003]** | V100 | 32 | 8 | 40 | 2 | 20 | 1 | 512 | 7 | x86_64 | volta,nvlink,dgx,32gb | |
| 25 | +| **CPU Compute Nodes** | | | | | | | | | | | | |
| 26 | +| **cn-f[001-004]** | - | - | - | 32 | 1 | 32 | 1 | 256 | 10 | x86_64 | rome | |
| 27 | +| **cn-h[001-004]** | - | - | - | 64 | 2 | 32 | 1 | 768 | 7 | x86_64 | milan | |
| 28 | +| **cn-m[001-004]** | - | - | - | 96 | 2 | 48 | 1 | 1024 | 7 | x86_64 | sapphire | |
| 29 | + |
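The Name column of the table uses a bracketed range notation such as `cn-a[001-011]` to stand for several nodes. As a minimal sketch of how that notation can be read, the hypothetical helper `expand_nodes` below turns one such entry into individual node names. It is an illustration only, not part of the documented tooling, and it assumes the simple `prefix[start-end]` form with zero-padded numbers that the table uses. On a cluster, Slurm's own `scontrol show hostnames` performs this expansion and handles more general expressions.

```python
import re

# Matches the simple "prefix[start-end]" form used in the Name column,
# e.g. "cn-a[001-011]". Other hostlist forms are out of scope for this sketch.
_RANGE = re.compile(r"^(?P<prefix>[^\[\]]+)\[(?P<start>\d+)-(?P<end>\d+)\]$")


def expand_nodes(expr: str) -> list[str]:
    """Expand a table entry such as 'cn-a[001-011]' into node names.

    Entries without a bracketed range (e.g. 'cn-i001') are returned as a
    single-element list. The zero padding of the start value is preserved.
    """
    match = _RANGE.match(expr)
    if match is None:
        return [expr]
    start, end = match["start"], match["end"]
    width = len(start)  # keep the same number of digits as in the table
    return [
        f"{match['prefix']}{index:0{width}d}"
        for index in range(int(start), int(end) + 1)
    ]


# Example: count the GPUs in the cn-l group (4 GPUs per node in the table).
l40s_nodes = expand_nodes("cn-l[001-091]")
l40s_gpu_count = len(l40s_nodes) * 4
```

Combining the expanded list with the `#` column, as in the last two lines, gives the GPU count for a node group.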
| 30 | +## Special nodes and outliers |
| 31 | + |
| 32 | + |
| 33 | +### DGX A100 |
10 | 34 |
|
11 | 35 |
|
12 | | -| Name | GPU Model | Mem | # | CPUs | Sockets | Cores/Socket | Threads/Core | Memory (GB) | TmpDisk (TB) | Arch | Slurm Features | |
13 | | -|------|-----------|-----|---|------|---------|--------------|--------------|-------------|--------------|------|---------------| |
14 | | -| **GPU Compute Nodes** | | | | | | | | | | | | |
15 | | -| **cn-a[001-011]** | RTX8000 | 48 | 8 | 40 | 2 | 20 | 1 | 384 | 3.6 | x86_64 | turing,48gb | |
16 | | -| **cn-b[001-005]** | V100 | 32 | 8 | 40 | 2 | 20 | 1 | 384 | 3.6 | x86_64 | volta,nvlink,32gb | |
17 | | -| **cn-c[001-040]** | RTX8000 | 48 | 8 | 64 | 2 | 32 | 1 | 384 | 3 | x86_64 | turing,48gb | |
18 | | -| **cn-g[001-029]** | A100 | 80 | 4 | 64 | 2 | 32 | 1 | 1024 | 7 | x86_64 | ampere,nvlink,80gb | |
19 | | -| **cn-i001** | A100 | 80 | 4 | 64 | 2 | 32 | 1 | 1024 | 3.6 | x86_64 | ampere,80gb | |
20 | | -| **cn-j001** | A6000 | 48 | 8 | 64 | 2 | 32 | 1 | 1024 | 3.6 | x86_64 | ampere,48gb | |
21 | | -| **cn-k[001-004]** | A100 | 40 | 4 | 48 | 2 | 24 | 1 | 512 | 3.6 | x86_64 | ampere,nvlink,40gb | |
22 | | -| **cn-l[001-091]** | L40S | 48 | 4 | 48 | 2 | 24 | 1 | 1024 | 7 | x86_64 | lovelace,48gb | |
23 | | -| **cn-n[001-002]** | H100 | 80 | 8 | 192 | 2 | 96 | 1 | 2048 | 35 | x86_64 | hopper,nvlink,80gb | |
24 | | -| **DGX Systems** | | | | | | | | | | | | |
25 | | -| **cn-d[001-002]** | A100 | 40 | 8 | 128 | 2 | 64 | 1 | 1024 | 14 | x86_64 | ampere,nvlink,dgx,40gb | |
26 | | -| **cn-d[003-004]** | A100 | 80 | 8 | 128 | 2 | 64 | 1 | 2048 | 28 | x86_64 | ampere,nvlink,dgx,80gb | |
27 | | -| **cn-e[002-003]** | V100 | 32 | 8 | 40 | 2 | 20 | 1 | 512 | 7 | x86_64 | volta,nvlink,dgx,32gb | |
28 | | -| **CPU Compute Nodes** | | | | | | | | | | | | |
29 | | -| **cn-f[001-004]** | - | - | - | 32 | 1 | 32 | 1 | 256 | 10 | x86_64 | rome | |
30 | | -| **cn-h[001-004]** | - | - | - | 64 | 2 | 32 | 1 | 768 | 7 | x86_64 | milan | |
31 | | -| **cn-m[001-004]** | - | - | - | 96 | 2 | 48 | 1 | 1024 | 7 | x86_64 | sapphire | |
32 | | - |
33 | | -#### Special nodes and outliers |
34 | | - |
35 | | - |
36 | | -##### DGX A100 |
37 | | - |
38 | | - |
39 | | -<a id="dgx_a100_nodes"></a> |
40 | | - |
41 | 36 | DGX A100 nodes are NVIDIA appliances with 8 NVIDIA A100 Tensor Core GPUs. Each |
42 | 37 | GPU has either 40 GB or 80 GB of memory, for a total of 320 GB or 640 GB per |
43 | 38 | appliance. The GPUs are interconnected via 6 NVSwitches which allow for 600 GB/s |
|