Commit b164364

fix(nvml-mock): use realistic per-link NVLink speeds in profiles
The a100/h100/b200 profiles previously used the bidirectional-per-link marketing figure (50/50/100 GB/s) rather than the per-link unidirectional line rate that `nvidia-smi nvlink -s` actually reports. Align them with the gb200/gb300 convention:

- a100 (NVLink3): 25781 Mbps (25.781 GB/s/link)
- h100 (NVLink4): 26562 Mbps (26.562 GB/s/link)
- b200 (NVLink5): 53125 Mbps (53.125 GB/s/link, same silicon as gb200)

Per-link x 2 x links still reproduces the marketed bidirectional aggregates (~600 GB/s, ~900 GB/s, ~1.8 TB/s).

Update the docs example to the NVLink4 rate and regenerate Helm snapshots.

Signed-off-by: Giulio Calzolari <gcalzolari@nvidia.com>
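The "per-link x 2 x links" arithmetic from the commit message can be checked with a short sketch. The rates and link counts below are copied from the profiles in this commit. The helper name `aggregates_gb_s` is hypothetical, and dividing `bandwidth_per_link_mbps` by 1000 to get GB/s is an assumption based on the inline profile comments (for example, 25781 is annotated as 25.781 GB/s/link).

```python
# Per-link unidirectional rates and link counts, copied from the
# a100/h100/b200 profiles touched by this commit.
PROFILES = {
    "a100": {"bandwidth_per_link_mbps": 25781, "links_per_gpu": 12},
    "h100": {"bandwidth_per_link_mbps": 26562, "links_per_gpu": 18},
    "b200": {"bandwidth_per_link_mbps": 53125, "links_per_gpu": 18},
}


def aggregates_gb_s(bandwidth_per_link_mbps, links_per_gpu):
    """Return (unidirectional, bidirectional) per-GPU aggregates in GB/s.

    Assumption: value / 1000 gives GB/s, as the profile comments imply.
    """
    unidirectional = bandwidth_per_link_mbps * links_per_gpu / 1000
    bidirectional = 2 * unidirectional
    return unidirectional, bidirectional


if __name__ == "__main__":
    for name, profile in PROFILES.items():
        uni, bidir = aggregates_gb_s(**profile)
        print(f"{name}: {uni:.3f} GB/s unidirectional, {bidir:.3f} GB/s bidirectional")
```

Under these assumptions the bidirectional results come out a few percent above the marketed 600 GB/s, 900 GB/s, and 1.8 TB/s figures, which is consistent with the "~" in the commit message.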
1 parent 52fc4b2 commit b164364

9 files changed

Lines changed: 13 additions & 13 deletions

deployments/nvml-mock/helm/nvml-mock/profiles/a100.yaml

Lines changed: 1 addition & 1 deletion
@@ -373,7 +373,7 @@ devices:
  nvlink:
  version: 3
  links_per_gpu: 12
- bandwidth_per_link_mbps: 50000 # 50 GB/s per link = 600 GB/s total per GPU
+ bandwidth_per_link_mbps: 25781 # NVLink3 25.781 GB/s/link unidirectional (`nvlink -s`); 12 links ~= 309 GB/s/GPU (600 GB/s bidir)
  c2c_enabled: false # no Grace C2C on DGX A100
  # DGX A100: 6x NVSwitch -> NV12 all-to-all. Switch-link auto-expansion
  # (switches declared + links_per_gpu > 0, and no explicit per-device links)

deployments/nvml-mock/helm/nvml-mock/profiles/b200.yaml

Lines changed: 1 addition & 1 deletion
@@ -380,7 +380,7 @@ devices:
  nvlink:
  version: 5
  links_per_gpu: 18
- bandwidth_per_link_mbps: 100000 # 100 GB/s per link = 1.8 TB/s total per GPU
+ bandwidth_per_link_mbps: 53125 # NVLink5 53.125 GB/s/link unidirectional (`nvlink -s`); 18 links ~= 0.95 TB/s/GPU (1.8 TB/s bidir)
  c2c_enabled: false # no Grace C2C
  # Standalone B200: negative-control profile -- no NVSwitch, no fabricmanager.
  # topo -m shows PCIe paths only (no NV#); nvmlDeviceGetGpuFabricInfo returns

deployments/nvml-mock/helm/nvml-mock/profiles/h100.yaml

Lines changed: 1 addition & 1 deletion
@@ -389,7 +389,7 @@ devices:
  nvlink:
  version: 4
  links_per_gpu: 18
- bandwidth_per_link_mbps: 50000 # 50 GB/s per link = 900 GB/s total per GPU
+ bandwidth_per_link_mbps: 26562 # NVLink4 26.562 GB/s/link unidirectional (`nvlink -s`); 18 links ~= 478 GB/s/GPU (900 GB/s bidir)
  c2c_enabled: false # no Grace C2C on HGX H100
  # Switch-link auto-expansion: switches declared + links_per_gpu > 0 (and no
  # explicit per-device links) makes the engine synthesize 18 active links per

deployments/nvml-mock/helm/nvml-mock/tests/__snapshot__/configmap_test.yaml.snap

Lines changed: 3 additions & 3 deletions
@@ -385,7 +385,7 @@ should match snapshot with b200 profile:
  nvlink:
  version: 5
  links_per_gpu: 18
- bandwidth_per_link_mbps: 100000 # 100 GB/s per link = 1.8 TB/s total per GPU
+ bandwidth_per_link_mbps: 53125 # NVLink5 53.125 GB/s/link unidirectional (`nvlink -s`); 18 links ~= 0.95 TB/s/GPU (1.8 TB/s bidir)
  c2c_enabled: false # no Grace C2C
  # Standalone B200: negative-control profile -- no NVSwitch, no fabricmanager.
  # topo -m shows PCIe paths only (no NV#); nvmlDeviceGetGpuFabricInfo returns
@@ -819,7 +819,7 @@ should match snapshot with default a100 profile:
  nvlink:
  version: 3
  links_per_gpu: 12
- bandwidth_per_link_mbps: 50000 # 50 GB/s per link = 600 GB/s total per GPU
+ bandwidth_per_link_mbps: 25781 # NVLink3 25.781 GB/s/link unidirectional (`nvlink -s`); 12 links ~= 309 GB/s/GPU (600 GB/s bidir)
  c2c_enabled: false # no Grace C2C on DGX A100
  # DGX A100: 6x NVSwitch -> NV12 all-to-all. Switch-link auto-expansion
  # (switches declared + links_per_gpu > 0, and no explicit per-device links)
@@ -2275,7 +2275,7 @@ should match snapshot with h100 profile:
  nvlink:
  version: 4
  links_per_gpu: 18
- bandwidth_per_link_mbps: 50000 # 50 GB/s per link = 900 GB/s total per GPU
+ bandwidth_per_link_mbps: 26562 # NVLink4 26.562 GB/s/link unidirectional (`nvlink -s`); 18 links ~= 478 GB/s/GPU (900 GB/s bidir)
  c2c_enabled: false # no Grace C2C on HGX H100
  # Switch-link auto-expansion: switches declared + links_per_gpu > 0 (and no
  # explicit per-device links) makes the engine synthesize 18 active links per

deployments/nvml-mock/helm/nvml-mock/tests/__snapshot__/daemonset_test.yaml.snap

Lines changed: 3 additions & 3 deletions
@@ -18,7 +18,7 @@ should match snapshot with all overrides:
  template:
  metadata:
  annotations:
- checksum/config: 08de8a7b2e29b1c08e048f7ccdec437e023ba73a8f05ab3b915406fb3bcb1588
+ checksum/config: 55bb2f5a4208bf44b344cc0227800acef90b155558045d62d3f2bbf1c55b2184
  labels:
  app.kubernetes.io/instance: custom
  app.kubernetes.io/name: nvml-mock
@@ -143,7 +143,7 @@ should match snapshot with b200 profile:
  template:
  metadata:
  annotations:
- checksum/config: a6a14db699978d788fdc32ee04278a4be30d980dfa36a8862e2c9810ec99840f
+ checksum/config: 3f424b3ca8461088c7b01c5753e4f75f59b7ed0bd48cb6875a4c7786d6b05a91
  labels:
  app.kubernetes.io/instance: RELEASE-NAME
  app.kubernetes.io/name: nvml-mock
@@ -247,7 +247,7 @@ should match snapshot with default values:
  template:
  metadata:
  annotations:
- checksum/config: d0268a92daa1ffc7755ffb80e1f7e55577cecf07ea4793e2fc793a61e1c85c64
+ checksum/config: 9b3eb20285d8ff1481993ed8af7a3f69d2999cbf8e9097c308c13cc191a79c44
  labels:
  app.kubernetes.io/instance: RELEASE-NAME
  app.kubernetes.io/name: nvml-mock

docs/configuration.md

Lines changed: 1 addition & 1 deletion
@@ -336,7 +336,7 @@ devices:
  nvlink:
  version: 4
  links_per_gpu: 18
- bandwidth_per_link_mbps: 25000
+ bandwidth_per_link_mbps: 26562
  c2c_enabled: false
  links:
  - link: 0

pkg/gpu/mocknvml/configs/mock-nvml-config-a100.yaml

Lines changed: 1 addition & 1 deletion
@@ -371,7 +371,7 @@ devices:
  nvlink:
  version: 3
  links_per_gpu: 12
- bandwidth_per_link_mbps: 50000 # 50 GB/s per link = 600 GB/s total per GPU
+ bandwidth_per_link_mbps: 25781 # NVLink3 25.781 GB/s/link unidirectional (`nvlink -s`); 12 links ~= 309 GB/s/GPU (600 GB/s bidir)
  c2c_enabled: false # no Grace C2C on DGX A100
  # DGX A100: 6x NVSwitch -> NV12 all-to-all. Switch-link auto-expansion
  # (switches declared + links_per_gpu > 0, and no explicit per-device links)

pkg/gpu/mocknvml/configs/mock-nvml-config-b200.yaml

Lines changed: 1 addition & 1 deletion
@@ -380,7 +380,7 @@ devices:
  nvlink:
  version: 5
  links_per_gpu: 18
- bandwidth_per_link_mbps: 100000 # 100 GB/s per link = 1.8 TB/s total per GPU
+ bandwidth_per_link_mbps: 53125 # NVLink5 53.125 GB/s/link unidirectional (`nvlink -s`); 18 links ~= 0.95 TB/s/GPU (1.8 TB/s bidir)
  c2c_enabled: false # no Grace C2C
  # Standalone B200: negative-control profile -- no NVSwitch, no fabricmanager.
  # topo -m shows PCIe paths only (no NV#); nvmlDeviceGetGpuFabricInfo returns

pkg/gpu/mocknvml/configs/mock-nvml-config-h100.yaml

Lines changed: 1 addition & 1 deletion
@@ -389,7 +389,7 @@ devices:
  nvlink:
  version: 4
  links_per_gpu: 18
- bandwidth_per_link_mbps: 50000 # 50 GB/s per link = 900 GB/s total per GPU
+ bandwidth_per_link_mbps: 26562 # NVLink4 26.562 GB/s/link unidirectional (`nvlink -s`); 18 links ~= 478 GB/s/GPU (900 GB/s bidir)
  c2c_enabled: false # no Grace C2C on HGX H100
  # Switch-link auto-expansion: switches declared + links_per_gpu > 0 (and no
  # explicit per-device links) makes the engine synthesize 18 active links per
