Commit c888476

svcnvidia-nemo-ci, malay-nagda, nemo-autobot[bot], Automation Bot, and dimapihtar authored
cp: 26.06 perf summary updates (4384) into r0.5.0 (#4398)
Signed-off-by: Malay Nagda <malayn@nvidia.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Signed-off-by: NeMo Bot <nemo-bot@nvidia.com>
Co-authored-by: malay-nagda <malayn@nvidia.com>
Co-authored-by: nemo-autobot[bot] <272199896+nemo-autobot[bot]@users.noreply.github.com>
Co-authored-by: Automation Bot <nemo-toolkit@nvidia.com>
Co-authored-by: dimapihtar <37850217+dimapihtar@users.noreply.github.com>
1 parent fcbb603 commit c888476

1 file changed

Lines changed: 3 additions & 4 deletions

docs/performance-summary.md

@@ -62,17 +62,16 @@ The performance data includes:
 
 | System | #-GPUs | Precision | GBS | MBS | Sequence Length | TP | PP | CP | VP | EP | Tokens / sec / GPU | Model TFLOP / sec / GPU |
 |--------|--------|-----------|-----|-----|-----------------|----|----|----|----|----|-----------------------|-------------------------|
-| DGX-GB300 | 64 | BF16 | 1280 | 4 | 4096 | 1 | 1 | 1 | n/a | 64 | 20635 | 673 |
-| DGX-GB200 | 64 | BF16 | 1280 | 4 | 4096 | 1 | 1 | 1 | n/a | 64 | 17770 | 580 |
-| DGX-H100 | 64 | BF16 | 1280 | 1 | 4096 | 1 | 4 | 1 | n/a | 8 | 5860 | 191 |
+| DGX-GB300 | 64 | MXFP8 | 1280 | 4 | 4096 | 1 | 1 | 1 | n/a | 16 | 33166 | 1081 |
+| DGX-GB200 | 64 | MXFP8 | 1280 | 4 | 4096 | 1 | 1 | 1 | n/a | 64 | 28947 | 943 |
 
 #### Model: Qwen3_30B_a3B
 
 | System | #-GPUs | Precision | GBS | MBS | Sequence Length | TP | PP | CP | VP | EP | Tokens / sec / GPU | Model TFLOP / sec / GPU |
 |--------|--------|-----------|-----|-----|-----------------|----|----|----|----|----|-----------------------|-------------------------|
 | DGX-GB300 | 8 | MXFP8 | 512 | 8 | 4096 | 1 | 1 | 1 | n/a | 8 | 45275 | 1041 |
 | DGX-GB200 | 8 | MXFP8 | 512 | 4 | 4096 | 1 | 1 | 1 | n/a | 8 | 40706 | 936 |
-| DGX-H100 | 16 | FP8 | 1024 | 1 | 4096 | 1 | 1 | 1 | n/a | 16 | 8467 | 195 |
+| DGX-H100 | 16 | FP8 | 1024 | 1 | 4096 | 1 | 1 | 1 | n/a | 16 | 8826 | 203 |
 
 #### Model: Qwen3_235B_a22B
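
Reviewer note: the DGX-H100 row for Qwen3_30B_a3B is the only changed row that keeps the same precision and parallelism settings, so it is the one like-for-like comparison in this diff. A minimal sketch of the arithmetic, using only the numbers in that row; the helper `pct_change` and the dictionary keys are hypothetical names chosen for illustration, not part of the repository:

```python
def pct_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# Values copied from the removed and added DGX-H100 rows (Qwen3_30B_a3B, FP8, 16 GPUs).
h100_old = {"tokens_per_sec_per_gpu": 8467, "model_tflops_per_gpu": 195}
h100_new = {"tokens_per_sec_per_gpu": 8826, "model_tflops_per_gpu": 203}

# Per-metric relative change, rounded to one decimal place.
changes = {k: round(pct_change(h100_old[k], h100_new[k]), 1) for k in h100_old}

# Aggregate throughput across the 16 GPUs listed in the row.
aggregate_tokens_per_sec = h100_new["tokens_per_sec_per_gpu"] * 16

print(changes)
print(aggregate_tokens_per_sec)
```

Both metrics move by roughly the same percentage, which is what one would expect if the model FLOPs per token are unchanged. The GB300 and GB200 rows in the first table also switch from BF16 to MXFP8, so their deltas reflect a precision change and should not be read the same way.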
