DCGM Version: 4.4.1 / 4.5.2
Description:
The dcgmi diag output displays a field labeled "GPU Device IDs Detected" which shows identical values for all GPUs:
| GPU Device IDs Detected | 3182, 3182, 3182, 3182, 3182, 3182, 3182 |
This is misleading because:
- "GPU Device IDs" implies unique per-GPU identifiers
- Users expect N different values for N GPUs
- The actual values shown are PCI Device IDs (hardware SKU), which are identical for all GPUs of the same model
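To make the mismatch concrete, here is a minimal sketch that parses the "GPU Device IDs Detected" row and compares the number of values with the number of distinct values. The function name `device_ids_from_diag` is hypothetical and not part of DCGM; the sample row is copied from the output in this issue.

```python
def device_ids_from_diag(output: str) -> list[str]:
    """Return the values listed in the 'GPU Device IDs Detected' row."""
    for line in output.splitlines():
        # Drop the outer table borders, then split into the two cells.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) == 2 and cells[0] == "GPU Device IDs Detected":
            return [v.strip() for v in cells[1].split(",") if v.strip()]
    return []


# Row taken verbatim from the dcgmi diag output reported above.
SAMPLE = "| GPU Device IDs Detected | 3182, 3182, 3182, 3182, 3182, 3182, 3182 |"

ids = device_ids_from_diag(SAMPLE)
print(len(ids), "values,", len(set(ids)), "distinct")
```

One value per GPU but a single distinct value is what a per-model identifier looks like, not a per-GPU one.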
Expected behavior:
Either:
- Rename the field to accurately reflect what it shows: "PCI Device IDs Detected" or "GPU Model IDs"
- Or show actual unique GPU identifiers (UUIDs, indices, or serial numbers)
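As a rough illustration of the second option, the sketch below lists per-GPU identifiers next to the PCI device ID using `nvidia-smi --query-gpu`. It assumes `nvidia-smi` is on the PATH and that the `index`, `uuid`, and `pci.device_id` query fields are available (see `nvidia-smi --help-query-gpu`). The helper names are hypothetical, and this is not how DCGM itself gathers the data.

```python
import csv
import io
import subprocess

# Query fields as listed by `nvidia-smi --help-query-gpu`.
FIELDS = ["index", "uuid", "pci.device_id"]


def parse_gpu_rows(csv_text: str) -> list[dict[str, str]]:
    """Parse `--format=csv,noheader` output into one dict per GPU."""
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    return [dict(zip(FIELDS, (c.strip() for c in row))) for row in reader if row]


def query_gpus() -> list[dict[str, str]]:
    """Run nvidia-smi and return the parsed rows (requires an NVIDIA driver)."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={','.join(FIELDS)}", "--format=csv,noheader"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_gpu_rows(out)


def summarize(rows: list[dict[str, str]]) -> dict[str, int]:
    """Count GPUs and how many distinct values each identifier takes."""
    return {
        "gpus": len(rows),
        "distinct_uuids": len({r["uuid"] for r in rows}),
        "distinct_pci_device_ids": len({r["pci.device_id"] for r in rows}),
    }
```

Calling `summarize(query_gpus())` on a node with identical GPUs should report as many distinct UUIDs as GPUs, while the PCI device ID collapses to one value.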
Reproduction:
dcgmi diag --run 1
On any system with multiple identical GPUs.
Repro stdout examples:
Successfully ran diagnostic for group.
+---------------------------+------------------------------------------------+
| Diagnostic                | Result                                         |
+===========================+================================================+
|-----  Metadata  ----------+------------------------------------------------|
| DCGM Version              | 4.4.1                                          |
| Driver Version Detected   | 580.95.05                                      |
| GPU Device IDs Detected   | 3182, 3182, 3182, 3182, 3182, 3182, 3182, 3182 |
|-----  Deployment  --------+------------------------------------------------|
| software                  | Pass                                           |
|                           | GPU0: Pass                                     |
|                           | GPU1: Pass                                     |
|                           | GPU2: Pass                                     |
|                           | GPU3: Pass                                     |
|                           | GPU4: Pass                                     |
|                           | GPU5: Pass                                     |
|                           | GPU6: Pass                                     |
|                           | GPU7: Pass                                     |
+---------------------------+------------------------------------------------+
Successfully ran diagnostic for group.
+---------------------------+------------------------------------------------+
| Diagnostic                | Result                                         |
+===========================+================================================+
|-----  Metadata  ----------+------------------------------------------------|
| DCGM Version              | 4.5.2                                          |
| Driver Version Detected   | 580.126.09                                     |
| GPU Device IDs Detected   | 3182, 3182, 3182, 3182, 3182, 3182, 3182       |
|-----  Deployment  --------+------------------------------------------------|
| software                  | Pass                                           |
|                           | GPU0: Pass                                     |
|                           | GPU1: Pass                                     |
|                           | GPU2: Pass                                     |
|                           | GPU3: Pass                                     |
|                           | GPU4: Pass                                     |
|                           | GPU5: Pass                                     |
|                           | GPU6: Pass                                     |
+---------------------------+------------------------------------------------+