vfio: Add NvSwitches to the list of devices #150
Conversation
Pull request overview
This pull request adds support for binding NVIDIA NVSwitches to the vfio-pci driver for DGX/HGX systems requiring virtualization. This enables full passthrough and ServiceVM virtualization modes.
Changes:
- Added NVSwitch device discovery and binding in the bindAll() method
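The page does not show the bindAll() changes themselves, but the discovery and filtering they describe can be sketched as below. This is a hypothetical illustration, not the PR's actual code: the helper names, the struct shape, and the assumption that NvSwitches enumerate with PCI class 0x0680 ("Bridge: Other") are all mine; only the NVIDIA vendor ID (0x10de) and the 3D-controller class for datacenter GPUs (0x0302) are standard.

```go
package main

import "fmt"

const (
	nvidiaVendorID = 0x10de // NVIDIA PCI vendor ID
	class3D        = 0x0302 // 3D controller: datacenter NVIDIA GPUs
	classBridge    = 0x0680 // "Bridge: Other": NvSwitches (assumption)
)

type pciDevice struct {
	Address string // PCI bus address, e.g. "0000:85:00.0"
	Vendor  uint16
	Class   uint16
}

// devicesToBind returns the NVIDIA GPUs and, when bindSwitches is true,
// the NvSwitches that should be handed to vfio-pci (virt-model 1).
// With bindSwitches false (fabric-manager on the host), only GPUs are bound.
func devicesToBind(devs []pciDevice, bindSwitches bool) []string {
	var out []string
	for _, d := range devs {
		if d.Vendor != nvidiaVendorID {
			continue
		}
		switch d.Class {
		case class3D:
			out = append(out, d.Address)
		case classBridge:
			if bindSwitches {
				out = append(out, d.Address)
			}
		}
	}
	return out
}

func main() {
	devs := []pciDevice{
		{"0000:07:00.0", nvidiaVendorID, class3D},
		{"0000:85:00.0", nvidiaVendorID, classBridge},
	}
	fmt.Println(devicesToBind(devs, true))  // [0000:07:00.0 0000:85:00.0]
	fmt.Println(devicesToBind(devs, false)) // [0000:07:00.0]
}
```

The boolean mirrors the enable/disable flag this PR adds: the same discovery loop serves both virt-models by simply skipping the switch class when binding is disabled.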
Force-pushed 391bbe8 to df87aec
/ok to test df87aec
/ok to test e16d9b6
Force-pushed e16d9b6 to eef1c70
/ok to test eef1c70
Force-pushed eef1c70 to 2ed804a
Tested on a Viking: all four NvSwitches and all eight GPUs are bound to vfio-pci.
Added a boolean to enable/disable binding. On an NVLINK4 system, if we want to support virt-model 1, the 4 NvSwitches and the 8 GPUs are bound to vfio-pci; otherwise only the 8 GPUs are bound to vfio-pci.
jojimt left a comment
LGTM
Force-pushed d189a3c to 498a27d
When using DGX/HGX systems with virtualization, we need to bind the NvSwitches to vfio-pci as well. This enables us to use two virtualization modes: 1. Full passthrough 2. ServiceVM. We're explicitly not considering mode 3 with vGPUs. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
Despite its name, GetGPUByPciBusID returns any NVIDIA PCI device (GPU, NVSwitch, etc.) at the specified address, not just GPUs. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
On an NVLINK4 system, if we want to support virt-model 1, we need to bind the switches to vfio-pci. When supporting virt-model 3 with fabric-manager running on the host, we do not need to bind the switches to vfio-pci. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>