Does the device plugin support GB10 (NVIDIA DGX Spark)? #1482

@sceneryback

I deployed the NVIDIA device plugin on a new NVIDIA DGX Spark machine. The output of nvidia-smi is:

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.95.05              Driver Version: 580.95.05      CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GB10                    On  |   0000000F:01:00.0 Off |                  N/A |
| N/A   40C    P8              4W /  N/A  | Not Supported          |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|

The device plugin fails to start, with this error in the logs:

E1029 10:26:46.968061       1 main.go:173] error starting plugins: error getting plugins: unable to create plugins: failed to construct resource managers: error building device map: error building device map from config.resources: error building GPU device map: error visiting device: error building Device: error getting device memory: Not Supported

It seems the plugin cannot read the GB10's memory, which matches nvidia-smi reporting Memory-Usage as "Not Supported".
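
To make the failure point easier to see, here is a minimal Python sketch that splits the log line above into its wrapped error layers. It assumes the line uses the usual klog header (severity, date, time, thread ID, `file:line]`) and that each layer is joined with `": "`, as is conventional for wrapped Go errors. The helper name `unwrap_error_chain` is hypothetical and not part of the device plugin.

```python
import re

# Assumed klog header layout: severity letter, MMDD, time, thread ID, file:line, "] message".
KLOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<date>\d{4})\s+(?P<time>[\d:.]+)\s+(?P<tid>\d+)\s+"
    r"(?P<source>[^\]\s]+)\]\s(?P<message>.*)$"
)

# The log line exactly as reported in this issue.
LOG_LINE = (
    "E1029 10:26:46.968061       1 main.go:173] error starting plugins: "
    "error getting plugins: unable to create plugins: "
    "failed to construct resource managers: error building device map: "
    "error building device map from config.resources: "
    "error building GPU device map: error visiting device: "
    "error building Device: error getting device memory: Not Supported"
)


def unwrap_error_chain(line):
    """Return (source, layers) for a klog line carrying a wrapped error.

    Hypothetical helper: splitting on ": " is a heuristic and would
    mis-split a layer whose own text contains ": ".
    """
    match = KLOG_LINE.match(line.strip())
    if match is None:
        raise ValueError("not a klog-formatted line")
    layers = match.group("message").split(": ")
    return match.group("source"), layers


source, chain = unwrap_error_chain(LOG_LINE)
print(f"logged at {source}")
for depth, layer in enumerate(chain):
    # Indent each layer to show the outermost-to-innermost wrapping.
    print("  " * depth + layer)
```

The innermost two layers are "error getting device memory" and "Not Supported", so the memory query is the step that fails while the device map is being built.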
I also noticed that driver 580.95.05 is quite new:

[Image]

So is GB10 supported now?
