Description
Enhance the DRA driver to support RDMA (Remote Direct Memory Access) device mounts for SR-IOV Virtual Functions, similar to how the sriov-network-device-plugin handles RDMA-capable devices.
Motivation
Many SR-IOV network adapters (particularly those from Mellanox/NVIDIA and Intel) support RDMA capabilities through technologies like InfiniBand, RoCE (RDMA over Converged Ethernet), and iWARP. Applications using these RDMA-capable VFs need access to the appropriate device nodes to utilize RDMA features for high-performance, low-latency networking.
Currently, the driver supports:
- Standard kernel networking driver mode
- VFIO-PCI driver binding with device node injection (/dev/vfio/*)
- Vhost-user socket mounting (/dev/vhost-net, /dev/net/tun)
However, it lacks support for mounting RDMA-specific device nodes that are required for RDMA operations.
Current Implementation
The driver currently creates CDI (Container Device Interface) specifications with device nodes in pkg/devicestate/state.go:
// If device is bound to vfio-pci, add VFIO device nodes
if config.Driver == "vfio-pci" {
    deviceNodes = append(deviceNodes, &cdispec.DeviceNode{
        Path:     devFileContainer,
        HostPath: devFileHost,
        Type:     "c", // character device
    })
}
// if addVhostMount is true, we add a volume mount for the vhost device
if config.AddVhostMount {
    deviceNodes = append(deviceNodes, &cdispec.DeviceNode{
        Path:     "/dev/vhost-net",
        HostPath: "/dev/vhost-net",
        Type:     "c", // character device
    })
}
Proposed Solution
Implement RDMA device mount support similar to the sriov-network-device-plugin approach:
1. Add VfConfig Parameter
Add a new optional parameter to the VfConfig API (pkg/api/virtualfunction/v1alpha1/api.go):
type VfConfig struct {
    metav1.TypeMeta       `json:",inline"`
    Driver                string `json:"driver,omitempty"`
    AddVhostMount         bool   `json:"addVhostMount,omitempty"`
    AddRdmaMount          bool   `json:"addRdmaMount,omitempty"` // NEW
    IfName                string `json:"ifName,omitempty"`
    NetAttachDefName      string `json:"netAttachDefName,omitempty"`
    NetAttachDefNamespace string `json:"netAttachDefNamespace,omitempty"`
}
2. Detect RDMA Capability
Add detection logic to identify RDMA-capable devices:
- Check for RDMA devices in /sys/class/infiniband/ or /sys/class/net/<interface>/device/infiniband/
- Identify the associated RDMA device name for each VF
- Store RDMA capability as a device attribute during discovery
3. Mount RDMA Device Nodes
When AddRdmaMount is enabled, inject appropriate RDMA device nodes into the container:
For InfiniBand/RoCE devices:
if config.AddRdmaMount {
    // Mount the character devices for RDMA operations
    rdmaDevices := []string{
        "/dev/infiniband/rdma_cm", // Connection Manager
        "/dev/infiniband/uverbs0", // User Verbs (device-specific, may vary)
        "/dev/infiniband/issm0",   // InfiniBand Subnet Manager (if applicable)
        "/dev/infiniband/umad0",   // InfiniBand Management Datagram (if applicable)
    }
    for _, rdmaDevice := range rdmaDevices {
        if hostFileExists(rdmaDevice) {
            deviceNodes = append(deviceNodes, &cdispec.DeviceNode{
                Path:     rdmaDevice,
                HostPath: rdmaDevice,
                Type:     "c",
            })
        }
    }
}
4. Update Host Helper Functions
Add helper functions in pkg/host/host.go:
// GetRDMADeviceForPCI returns the RDMA device names associated with a PCI address
func GetRDMADeviceForPCI(pciAddr string) ([]string, error)
// VerifyRDMACapability checks if a device supports RDMA
func VerifyRDMACapability(pciAddr string) (bool, error)
5. Add Device Attributes
During device discovery in pkg/devicestate/discovery.go, add RDMA-related attributes:
- rdmaCapable: boolean indicating RDMA support
- rdmaDevices: list of associated RDMA device paths
- rdmaProtocol: type of RDMA protocol (InfiniBand, RoCE, iWARP)
Implementation Details
Files to modify:
- pkg/api/virtualfunction/v1alpha1/api.go
  - Add AddRdmaMount bool field to VfConfig struct
- pkg/devicestate/state.go
  - Add RDMA device node injection logic in applyConfigOnDevice()
  - Add environment variables for RDMA device information
- pkg/devicestate/discovery.go
  - Add RDMA capability detection during device discovery
  - Store RDMA-related attributes for each VF
- pkg/host/host.go
  - Add GetRDMADeviceForPCI() function
  - Add VerifyRDMACapability() function
  - Add helper to list RDMA devices in /sys/class/infiniband/
- Documentation
  - Update README.md VfConfig parameters section
  - Add new demo example in demo/rdma-config/
  - Update Helm chart values and templates
Example Usage
apiVersion: resource.k8s.io/v1
kind: ResourceClaimTemplate
metadata:
  name: sriov-rdma-vf
spec:
  spec:
    devices:
      requests:
      - name: rdma-vf
        exactly:
          deviceClassName: sriovnetwork.openshift.io
          selectors:
          - cel:
              expression: device.attributes["sriovnetwork.openshift.io"].rdmaCapable == true
      configuration:
        opaque:
          parameters:
            apiVersion: sriovnetwork.openshift.io/v1alpha1
            kind: VfConfig
            addRdmaMount: true
            netAttachDefName: rdma-network
Reference Implementation
The sriov-network-device-plugin already implements RDMA support and serves as the reference for this feature.
Testing Requirements
- Unit tests for RDMA device detection
- Unit tests for RDMA device mount logic
- Integration tests with RDMA-capable hardware
- Verify compatibility with:
- Mellanox/NVIDIA ConnectX NICs
- Intel E810 series with RoCE
- InfiniBand adapters
Benefits
- Enable RDMA workloads to use SR-IOV VFs through DRA
- Support high-performance computing (HPC) applications
- Enable AI/ML training workloads requiring RDMA
- Provide feature parity with sriov-network-device-plugin
- Maintain compatibility with existing RDMA libraries and tools
Acceptance Criteria
- VfConfig API includes addRdmaMount parameter
- RDMA capability is detected during device discovery
- RDMA devices are properly attributed to VFs
- RDMA device nodes are mounted when addRdmaMount: true
- Appropriate environment variables are set for RDMA devices
- Documentation is updated with RDMA configuration examples
- Unit and integration tests validate RDMA functionality
- Helm chart supports RDMA configuration
Additional Context
RDMA provides significant performance benefits for:
- Storage applications (NVMe over Fabrics, iSER)
- Distributed computing and MPI workloads
- High-frequency trading applications
- AI/ML training with distributed frameworks
- Database replication and clustering
This feature is essential for users migrating from the sriov-network-device-plugin to the DRA driver without losing RDMA functionality.