
Commit cf48037

VEP-183: NetworkDevicesWithDRA API design
Document support for DRA-provisioned network devices, specifically SR-IOV NICs.

Key additions:
- ResourceClaimNetworkSource API for specifying DRA networks in spec.networks.
- SR-IOV integration details: device allocation via DRA, configuration via the existing KubeVirt networks API.
- Custom MAC address support through kubevirt.io/dra-networks.
- NetworkDevicesWithDRA feature gate (Alpha).
- Example VMI YAML with DRA SR-IOV network configuration.

Network DRA maintains mutual exclusivity with traditional Multus-based SR-IOV per VM. The existing Multus SR-IOV API remains fully supported and unchanged.

Assisted-by: claude-4.5-sonnet
Signed-off-by: Or Shoval <oshoval@redhat.com>
1 parent c64c5f2 commit cf48037

1 file changed, 292 additions and 0 deletions: veps/sig-network/183-dra-network
@@ -0,0 +1,292 @@
1+
# VEP #183: SR-IOV Network DRA Support
2+
3+
## Release Signoff Checklist
4+
5+
Items marked with (R) are required *prior to targeting to a milestone / release*.
6+
7+
- [x] (R) Enhancement issue created, which links to VEP dir in [kubevirt/enhancements] (not the initial VEP PR)
8+
9+
## Overview
10+
11+
This proposal adds support for DRA (Dynamic Resource Allocation) provisioned SR-IOV network devices in KubeVirt.
12+
It extends the existing KubeVirt networks API with a new `ResourceClaimNetworkSource` type, allowing SR-IOV NICs to be allocated via DRA while maintaining compatibility with the existing Multus-based SR-IOV approach.
13+
14+
This VEP builds upon the core DRA infrastructure defined in VEP #10 ([kubevirt/enhancements/pull/11](https://github.com/kubevirt/enhancements/pull/11)) to add support for network devices, specifically SR-IOV NICs.
15+
16+
## Motivation
17+
18+
DRA adoption for network devices is important for KubeVirt so that network device vendors can expect
19+
the same level of control when using Virtual Machines as they have with Containers.
20+
DRA allows network device vendors fine-grained control over device allocation and topology.
21+
22+
## Goals
23+
24+
- Introduce the API changes needed to consume DRA-enabled SR-IOV network devices in KubeVirt
25+
- Define how KubeVirt consumes SR-IOV devices via external DRA drivers
26+
- Seamlessly support DRA-based SR-IOV use cases available to containers in KubeVirt VMIs
27+
- Support custom MAC addresses for DRA-based SR-IOV networks
28+
29+
## Non Goals
30+
31+
- Replace existing Multus-based SR-IOV network integration (remains fully supported)
32+
- Deploy DRA SR-IOV driver (handled by sriov-network-operator)
33+
- Support coexistence of DRA SR-IOV and device-plugin SR-IOV
34+
- Live migration of VMs with DRA network devices
35+
36+
## Definition of Users
37+
38+
- **User**: A person who wants to attach SR-IOV network devices to a VM
39+
- **Admin**: A person who manages infrastructure and configures DRA device classes and drivers
40+
- **Developer**: A person familiar with the CNCF ecosystem who develops automation using these APIs
41+
42+
## User Stories
43+
44+
- As a user, I want to consume SR-IOV network devices via DRA in my VMs
45+
- As a user, I want to specify custom MAC addresses for DRA-provisioned SR-IOV interfaces
46+
- As an admin, I want to use DRA drivers to manage SR-IOV device allocation with fine-grained control
47+
- As a developer, I want extensible APIs to build automation for DRA-based networking
48+
49+
## Use Cases
50+
51+
### Supported Use Cases
52+
53+
1. SR-IOV network devices where the DRA driver publishes the required attributes in device metadata files:
   - `resources.kubernetes.io/pciBusID` for SR-IOV VF passthrough
55+
56+
### Future Use Cases
57+
1. Scalable Functions network devices
58+
2. Live migration of VMIs using DRA network devices (will have a VEP amendment)
59+
60+
## Repos
61+
62+
kubevirt/kubevirt
63+
64+
## Design
65+
66+
This design introduces a new feature gate: `NetworkDevicesWithDRA`.
67+
All the API changes will be gated behind this feature gate so as not to break existing functionality.
68+
69+
### API Changes
70+
71+
A new network source type `ResourceClaimNetworkSource` is added to the existing `NetworkSource` type:
72+
73+
```go
// Represents the source resource that will be connected to the vm.
// Only one of its members may be specified.
type NetworkSource struct {
	Pod           *PodNetwork                 `json:"pod,omitempty"`
	Multus        *MultusNetwork              `json:"multus,omitempty"`
	ResourceClaim *ResourceClaimNetworkSource `json:"resourceClaim,omitempty"`
}

// ResourceClaimNetworkSource represents a network resource requested
// via a Kubernetes ResourceClaim.
type ResourceClaimNetworkSource struct {
	// ClaimName references the name of an entry in the
	// VMI's spec.resourceClaims[] array.
	// +kubebuilder:validation:MinLength=1
	ClaimName string `json:"claimName"`

	// RequestName specifies which request from the
	// ResourceClaim.spec.devices.requests array this network
	// source corresponds to.
	// +kubebuilder:validation:MinLength=1
	RequestName string `json:"requestName"`
}
```
97+
98+
The VMI must also include the resource claim in `spec.resourceClaims[]` (consistent with GPU and HostDevice DRA usage).
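
The relationship between `spec.networks[].resourceClaim.claimName` and `spec.resourceClaims[]` can be illustrated with a small check. This is only a sketch under stated assumptions: the types below are simplified stand-ins for the KubeVirt API types, and `findUndeclaredClaims` is a hypothetical helper, not a function named anywhere in this VEP.

```go
package main

import "fmt"

// Simplified stand-ins for the KubeVirt API types (assumption: only the
// fields needed for this sketch are modelled).
type ResourceClaimNetworkSource struct {
	ClaimName   string
	RequestName string
}

type Network struct {
	Name          string
	ResourceClaim *ResourceClaimNetworkSource
}

// PodResourceClaim models one entry of spec.resourceClaims[].
type PodResourceClaim struct {
	Name                      string
	ResourceClaimTemplateName string
}

// findUndeclaredClaims (hypothetical helper) returns the names of networks
// whose claimName has no matching entry in spec.resourceClaims[].
func findUndeclaredClaims(networks []Network, claims []PodResourceClaim) []string {
	declared := make(map[string]struct{}, len(claims))
	for _, c := range claims {
		declared[c.Name] = struct{}{}
	}
	var undeclared []string
	for _, n := range networks {
		if n.ResourceClaim == nil {
			continue // pod or multus network, not DRA
		}
		if _, ok := declared[n.ResourceClaim.ClaimName]; !ok {
			undeclared = append(undeclared, n.Name)
		}
	}
	return undeclared
}

func main() {
	networks := []Network{
		{Name: "default"},
		{Name: "sriov-net", ResourceClaim: &ResourceClaimNetworkSource{
			ClaimName:   "sriov-network-claim",
			RequestName: "sriov-nic-request",
		}},
	}
	claims := []PodResourceClaim{{
		Name:                      "sriov-network-claim",
		ResourceClaimTemplateName: "sriov-network-claim-template",
	}}
	// Every DRA network references a declared claim, so nothing is reported.
	fmt.Println(len(findUndeclaredClaims(networks, claims)))
}
```

Whether such a check would live in the admission webhook or elsewhere is not specified by this proposal.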
99+
100+
### Status Reporting
101+
102+
For consistency with GPUs and HostDevices, DRA-provisioned network devices populate the same `vmi.status.deviceStatus.hostDeviceStatuses[]` array. The DRA controller in virt-controller:
103+
104+
1. Identifies networks with `resourceClaim` source type
105+
2. Extracts device information from the allocated ResourceClaim and ResourceSlice
106+
3. Populates `hostDeviceStatuses` with network name and allocated device attributes (PCI address)
107+
108+
The status entry name matches the network name from `spec.networks[].name`, allowing virt-launcher to correlate the network configuration with its allocated DRA device.
109+
110+
The detailed mechanism for extracting device information from Pod status, ResourceClaim, and ResourceSlice follows the same approach described in VEP #10.
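
The three steps above could be sketched roughly as follows. The types are simplified stand-ins rather than the real KubeVirt or Kubernetes structs, `buildNetworkDeviceStatuses` is a hypothetical name, and the `allocated` map is assumed to have already been derived from the ResourceClaim and ResourceSlice as described in VEP #10.

```go
package main

import "fmt"

// ClaimRequest identifies one request inside one claim.
type ClaimRequest struct {
	ClaimName   string
	RequestName string
}

type Network struct {
	Name          string
	ResourceClaim *ClaimRequest
}

// AllocatedDevice is an assumed, flattened view of what the controller
// learns from the allocated ResourceClaim and the ResourceSlice.
type AllocatedDevice struct {
	DeviceName        string
	ResourceClaimName string
	PCIAddress        string
}

// DeviceStatus is a flattened stand-in for one hostDeviceStatuses[] entry.
type DeviceStatus struct {
	Name              string // matches spec.networks[].name
	DeviceName        string
	ResourceClaimName string
	PCIAddress        string
}

// buildNetworkDeviceStatuses (hypothetical) creates one status entry per
// network that uses a resourceClaim source and already has an allocation.
func buildNetworkDeviceStatuses(networks []Network, allocated map[ClaimRequest][]AllocatedDevice) []DeviceStatus {
	var statuses []DeviceStatus
	for _, n := range networks {
		if n.ResourceClaim == nil {
			continue // step 1: only networks with a resourceClaim source
		}
		devices := allocated[*n.ResourceClaim]
		if len(devices) == 0 {
			continue // nothing allocated yet
		}
		first := devices[0] // the VEP states the first allocated device is consumed
		statuses = append(statuses, DeviceStatus{
			Name:              n.Name,
			DeviceName:        first.DeviceName,
			ResourceClaimName: first.ResourceClaimName,
			PCIAddress:        first.PCIAddress,
		})
	}
	return statuses
}

func main() {
	ref := ClaimRequest{ClaimName: "sriov-network-claim", RequestName: "sriov-nic-request"}
	networks := []Network{{Name: "default"}, {Name: "sriov-net", ResourceClaim: &ref}}
	allocated := map[ClaimRequest][]AllocatedDevice{
		ref: {{DeviceName: "0000-05-00-1", ResourceClaimName: "example-claim", PCIAddress: "0000:05:00.1"}},
	}
	for _, s := range buildNetworkDeviceStatuses(networks, allocated) {
		fmt.Printf("%s -> %s\n", s.Name, s.PCIAddress)
	}
}
```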
111+
112+
### SR-IOV Integration
113+
114+
When a network interface has `sriov` binding and references a network with `resourceClaim` source:
115+
116+
1. The network admitter validates that exactly one network source type (pod, multus, or resourceClaim) is specified
117+
2. Virt-controller adds the resource claim to the virt-launcher pod spec via the `WithNetworksDRA()` render option
118+
3. The DRA controller populates `vmi.status.deviceStatus` with the PCI address from the ResourceSlice
119+
4. Virt-launcher reads the PCI address from device status and generates the appropriate libvirt hostdev XML (at [`generateConverterContext`](https://github.com/kubevirt/kubevirt/blob/ffa91c8156fecf1d91dd865c6197865a0a3e525b/pkg/virt-launcher/virtwrap/manager.go#L1163), alongside the existing `sriov.CreateHostDevices` call), identical to traditional Multus-based SR-IOV
120+
121+
This approach provides clean separation: DRA handles device provisioning, KubeVirt networks API handles configuration.
122+
123+
**Important:** Traditional Multus-based SR-IOV (using `multus` network source) and DRA-based SR-IOV (using `resourceClaim` network source) are **mutually exclusive per VM**. A single VMI must not mix both approaches; the webhook rejects such a VMI. The existing Multus-based SR-IOV API remains fully supported and unchanged.
124+
125+
### Custom MAC Address Support
126+
127+
To support custom MAC addresses for DRA-based SR-IOV networks, KubeVirt will annotate the virt-launcher pod with requested MAC addresses. The MAC address will be taken from the existing `spec.domain.devices.interfaces[].macAddress` field:
128+
129+
```
130+
kubevirt.io/dra-networks: '[{"claimName":"sriov","requestName":"vf","mac":"de:ad:00:00:be:ef"}]'
131+
```
132+
133+
This mirrors the structure of the `k8s.v1.cni.cncf.io/networks` annotation, but each entry is keyed by claimName/requestName instead of a NetworkAttachmentDefinition (NAD).
134+
135+
The SR-IOV DRA driver reads this annotation and passes the claim/request identifier along with the MAC address to the SR-IOV CNI, ensuring the network interface is configured with the specified MAC address.
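
As an illustration, the annotation value could be assembled along these lines. This is a sketch, not the proposed implementation: the types are simplified stand-ins, `buildDRANetworksAnnotation` is a hypothetical helper, and skipping interfaces without a MAC address is an assumption made here (the VEP does not say how such interfaces are handled).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Simplified stand-ins for the VMI interface and network types.
type Interface struct {
	Name       string
	MacAddress string
}

type ClaimRef struct {
	ClaimName   string
	RequestName string
}

type Network struct {
	Name          string
	ResourceClaim *ClaimRef
}

// draNetworkEntry follows the JSON keys shown in the annotation example.
type draNetworkEntry struct {
	ClaimName   string `json:"claimName"`
	RequestName string `json:"requestName"`
	MAC         string `json:"mac"`
}

// buildDRANetworksAnnotation (hypothetical) returns the value for the
// kubevirt.io/dra-networks annotation, or "" when there is nothing to report.
// Assumption: interfaces without a MAC address are skipped.
func buildDRANetworksAnnotation(ifaces []Interface, networks []Network) (string, error) {
	macByName := make(map[string]string, len(ifaces))
	for _, i := range ifaces {
		macByName[i.Name] = i.MacAddress
	}
	var entries []draNetworkEntry
	for _, n := range networks {
		mac := macByName[n.Name]
		if n.ResourceClaim == nil || mac == "" {
			continue
		}
		entries = append(entries, draNetworkEntry{
			ClaimName:   n.ResourceClaim.ClaimName,
			RequestName: n.ResourceClaim.RequestName,
			MAC:         mac,
		})
	}
	if len(entries) == 0 {
		return "", nil
	}
	out, err := json.Marshal(entries)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	ifaces := []Interface{{Name: "sriov-net", MacAddress: "de:ad:00:00:be:ef"}}
	networks := []Network{{
		Name:          "sriov-net",
		ResourceClaim: &ClaimRef{ClaimName: "sriov-network-claim", RequestName: "sriov-nic-request"},
	}}
	value, err := buildDRANetworksAnnotation(ifaces, networks)
	if err != nil {
		panic(err)
	}
	fmt.Println(value)
}
```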
136+
137+
**Design Rationale:** The annotation-based approach was chosen because it solves the case where ResourceClaim/ResourceClaimTemplate is created by the admin (not by KubeVirt). Since this approach handles the more complex admin-created claim scenario, it naturally also works for the general case where KubeVirt creates the claims ("auto" mode), providing a unified solution for both scenarios.
138+
139+
### Validation
140+
141+
Webhook validations ensure:
142+
1. Networks with `resourceClaim` source have corresponding `sriov` binding interfaces
143+
2. Each network references a unique `claimName` + `requestName` combination. No two DRA entities (networks, hostDevices, or GPUs) share the same tuple, because each interface+network pair must map to exactly one device allocation
144+
3. No mixing of Multus-based and DRA-based SR-IOV in the same VMI.
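
A minimal sketch of these three checks is shown below. It uses simplified stand-in types and a hypothetical `validateDRANetworks` function, and for brevity it checks tuple uniqueness only across networks, whereas the rule above also spans hostDevices and GPUs.

```go
package main

import "fmt"

type ClaimRef struct {
	ClaimName   string
	RequestName string
}

// Network is a simplified stand-in; Multus is true for a multus source.
type Network struct {
	Name          string
	Multus        bool
	ResourceClaim *ClaimRef
}

// Interface is a simplified stand-in; SRIOV is true for the sriov binding.
type Interface struct {
	Name  string
	SRIOV bool
}

// validateDRANetworks (hypothetical) returns one message per violated rule.
func validateDRANetworks(ifaces []Interface, networks []Network) []string {
	sriovIfaces := make(map[string]bool, len(ifaces))
	for _, i := range ifaces {
		if i.SRIOV {
			sriovIfaces[i.Name] = true
		}
	}

	var errs []string
	seen := make(map[ClaimRef]string)
	hasDRA, hasMultusSRIOV := false, false

	for _, n := range networks {
		switch {
		case n.ResourceClaim != nil:
			hasDRA = true
			// Rule 1: a resourceClaim network needs an sriov binding interface.
			if !sriovIfaces[n.Name] {
				errs = append(errs, fmt.Sprintf("network %q: resourceClaim source requires an sriov binding interface", n.Name))
			}
			// Rule 2: the claimName+requestName tuple must be unique.
			if prev, dup := seen[*n.ResourceClaim]; dup {
				errs = append(errs, fmt.Sprintf("network %q: claim/request tuple already used by network %q", n.Name, prev))
			} else {
				seen[*n.ResourceClaim] = n.Name
			}
		case n.Multus && sriovIfaces[n.Name]:
			hasMultusSRIOV = true
		}
	}

	// Rule 3: Multus-based and DRA-based SR-IOV must not be mixed.
	if hasDRA && hasMultusSRIOV {
		errs = append(errs, "Multus-based and DRA-based SR-IOV cannot be mixed in the same VMI")
	}
	return errs
}

func main() {
	ifaces := []Interface{{Name: "sriov-net", SRIOV: true}}
	networks := []Network{{
		Name:          "sriov-net",
		ResourceClaim: &ClaimRef{ClaimName: "sriov-network-claim", RequestName: "sriov-nic-request"},
	}}
	fmt.Println(len(validateDRANetworks(ifaces, networks)), "validation errors")
}
```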
145+
146+
### Component Changes
147+
148+
**Virt-Controller:**
149+
- Renders virt-launcher pod spec with resource claims from `vmi.spec.resourceClaims[]` referenced by `vmi.spec.networks[].resourceClaim`
150+
- Annotates virt-launcher pod with `kubevirt.io/dra-networks` containing MAC addresses from `spec.domain.devices.interfaces[].macAddress`
151+
152+
**Virt-Launcher:**
153+
- For SR-IOV networks with DRA, virt-launcher uses `vmi.status.deviceStatus` to generate the domain XML, instead of the KubeVirt downwardAPI file used with device plugins
154+
- The `CreateDRAHostDevices()` function generates hostdev XML by:
  - Filtering VMI spec interfaces with `sriov` binding that reference networks with `resourceClaim` source
  - Looking up the corresponding VMI status device status entry by network name
  - Extracting the PCI address from VMI status device status attributes
  - Generating standard libvirt hostdev XML
159+
160+
- **Note:** If the ResourceClaim/ResourceClaimTemplate is allocating more than one device for the request, KubeVirt will consume the first device from the allocated devices
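
The last two steps of `CreateDRAHostDevices()` (extracting the PCI address and generating hostdev XML) might look roughly like the sketch below. The helper names `parsePCIAddress` and `renderHostDev` are hypothetical, and the rendered element is a simplified libvirt PCI `<hostdev>`; the attributes KubeVirt actually sets (for example `managed`, aliases, or guest addresses) are assumptions here and are not taken from this VEP.

```go
package main

import (
	"fmt"
	"regexp"
)

// PCIAddress holds the address parts in the hex form libvirt expects.
type PCIAddress struct {
	Domain   string
	Bus      string
	Slot     string
	Function string
}

// Matches addresses such as 0000:05:00.1 (domain:bus:slot.function).
var pciAddressRe = regexp.MustCompile(`^([0-9a-fA-F]{4}):([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$`)

// parsePCIAddress (hypothetical) splits a PCI address string into its parts.
func parsePCIAddress(addr string) (PCIAddress, error) {
	m := pciAddressRe.FindStringSubmatch(addr)
	if m == nil {
		return PCIAddress{}, fmt.Errorf("invalid PCI address %q", addr)
	}
	return PCIAddress{
		Domain:   "0x" + m[1],
		Bus:      "0x" + m[2],
		Slot:     "0x" + m[3],
		Function: "0x" + m[4],
	}, nil
}

// renderHostDev (hypothetical) renders a minimal libvirt PCI hostdev element.
// Assumption: managed='no'; the real converter may set additional fields.
func renderHostDev(a PCIAddress) string {
	return fmt.Sprintf(
		"<hostdev mode='subsystem' type='pci' managed='no'>\n"+
			"  <source>\n"+
			"    <address domain='%s' bus='%s' slot='%s' function='%s'/>\n"+
			"  </source>\n"+
			"</hostdev>",
		a.Domain, a.Bus, a.Slot, a.Function)
}

func main() {
	// PCI address as it appears in the status example of this VEP.
	addr, err := parsePCIAddress("0000:05:00.1")
	if err != nil {
		panic(err)
	}
	fmt.Println(renderHostDev(addr))
}
```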
161+
162+
## API Examples
163+
164+
### VMI with DRA SR-IOV Network
165+
166+
```yaml
---
apiVersion: resource.k8s.io/v1
kind: DeviceClass
metadata:
  name: sriov.network.example.com
spec:
  selectors:
  - cel:
      expression: device.driver == 'sriov.network.example.com'
---
apiVersion: resource.k8s.io/v1
kind: ResourceClaimTemplate
metadata:
  name: sriov-network-claim-template
  namespace: default
spec:
  spec:
    devices:
      requests:
      - name: sriov-nic-request
        exactly:
          deviceClassName: sriov.network.example.com
---
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstance
metadata:
  name: vmi-sriov-dra
  namespace: default
spec:
  domain:
    devices:
      interfaces:
      - name: sriov-net
        sriov: {}
        macAddress: "de:ad:00:00:be:ef"
  networks:
  - name: sriov-net
    resourceClaim:
      claimName: sriov-network-claim
      requestName: sriov-nic-request
  resourceClaims:
  - name: sriov-network-claim
    resourceClaimTemplateName: sriov-network-claim-template
status:
  deviceStatus:
    hostDeviceStatuses:
    - name: sriov-net
      deviceResourceClaimStatus:
        name: 0000-05-00-1
        resourceClaimName: virt-launcher-vmi-sriov-dra-sriov-network-claim-abc123
        attributes:
          pciAddress: 0000:05:00.1
---
apiVersion: v1
kind: Pod
metadata:
  name: virt-launcher-vmi-sriov-dra
  namespace: default
  annotations:
    kubevirt.io/dra-networks: '[{"claimName":"sriov-network-claim","requestName":"sriov-nic-request","mac":"de:ad:00:00:be:ef"}]'
spec:
  containers:
  - name: compute
    image: virt-launcher
    resources:
      claims:
      - name: sriov-network-claim
        request: sriov-nic-request
  resourceClaims:
  - name: sriov-network-claim
    resourceClaimTemplateName: sriov-network-claim-template
status:
  resourceClaimStatuses:
  - name: sriov-network-claim
    resourceClaimName: virt-launcher-vmi-sriov-dra-sriov-network-claim-abc123
```
243+
244+
## Scalability
245+
246+
The DRA controller in virt-controller uses existing shared informers (no additional watch calls) and filters events to relevant status sections. See [VEP #10](../../sig-compute/10-dra-devices/vep.md#scalability) for detailed scalability analysis.
247+
248+
## Update/Rollback Compatibility
249+
250+
- Changes are upgrade compatible
251+
- Rollback works as long as the feature gate is disabled
252+
- If the feature is enabled, VMIs using DRA network devices must be deleted and the feature gate disabled before attempting rollback
253+
254+
## Functional Testing Approach
255+
256+
- Unit tests covering the new code paths
257+
- New e2e test lane with all current SR-IOV tests using the new API
258+
(excluding migration tests, which will be added when migration is supported)
259+
260+
## Implementation History
261+
262+
- 2026-01-20: Initial design/VEP proposal for SR-IOV Network DRA support
263+
264+
## Graduation Requirements
265+
266+
### Alpha
267+
268+
- Code changes behind `NetworkDevicesWithDRA` feature gate
269+
- Unit tests
270+
- E2E tests with SR-IOV DRA driver (excluding migration)
271+
272+
### Beta
273+
274+
- Evaluate user and driver author experience
275+
- Consider additional use cases if any
276+
- Work with Kubernetes community on standardizing device information injection
277+
- Live migration support for DRA network devices
  - Live migration will use CDI (Container Device Interface) or NRI to inject device information as files into each pod (mappings of request/claim to PCI addresses)
  - Each virt-launcher reads its pod-specific device file, avoiding conflicts in VMI status
  - Might initially be implemented by the SR-IOV DRA driver; future Kubernetes support may generalize this (see [kubernetes/enhancements#5606](https://github.com/kubernetes/enhancements/pull/5606))
  - Details: https://github.com/k8snetworkplumbingwg/dra-driver-sriov/pull/62
282+
283+
### GA
284+
285+
- Upgrade/downgrade testing
286+
287+
## References
288+
289+
- DRA: https://kubernetes.io/docs/concepts/scheduling-eviction/dynamic-resource-allocation/
290+
- SR-IOV DRA driver: https://github.com/k8snetworkplumbingwg/dra-driver-sriov
291+
- VEP #10 (DRA devices): /veps/sig-compute/10-dra-devices/vep.md
292+
- Kubernetes DRA device information injection: https://github.com/kubernetes/enhancements/pull/5606
