doc: DRA integration #1035

Draft
rollandf wants to merge 1 commit into k8snetworkplumbingwg:master from rollandf:dra-design

@rollandf
Contributor

No description provided.

@github-actions

Thanks for your PR.
To run the vendor CIs, maintainers can use one of:

  • /test-all: To run all tests for all vendors.
  • /test-e2e-all: To run all E2E tests for all vendors.
  • /test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.

To skip the vendor CIs, maintainers can use one of:

  • /skip-all: To skip all tests for all vendors.
  • /skip-e2e-all: To skip all E2E tests for all vendors.
  • /skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor.

Best regards.

@github-actions github-actions bot added the docs label Feb 17, 2026
@gemini-code-assist

Summary of Changes

Hello @rollandf, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request adds a new design document detailing the integration of Kubernetes Dynamic Resource Allocation (DRA) into the SR-IOV Network Operator. The primary goal is to modernize SR-IOV resource management by replacing the existing device plugin framework with DRA, offering more flexible and advanced capabilities for allocating Virtual Functions. The document describes the new workflow, API extensions, and a phased approach to implementation, ensuring a smooth transition and enhanced resource control for SR-IOV devices.

Highlights

  • Dynamic Resource Allocation (DRA) Integration: Introduced a comprehensive design document outlining the integration of Kubernetes Dynamic Resource Allocation (DRA) framework into the SR-IOV Network Operator. This proposes using the dra-driver-sriov as an alternative to the traditional device plugin for exposing and allocating SR-IOV Virtual Functions.
  • Opt-in Feature Flag: The DRA integration will be an opt-in feature controlled by a new dynamicResourceAllocation feature flag within the SriovOperatorConfig CR, ensuring backward compatibility and gradual adoption.
  • Automated Resource Filter Generation: The operator will automatically generate SriovResourceFilter CRs from SriovNetworkNodePolicy CRs when DRA mode is enabled, providing advanced filtering capabilities and synchronization with node configuration states.
  • New Workflow for Resource Claims: Workloads will now request SR-IOV VFs using native Kubernetes ResourceClaimTemplate objects, moving away from device plugin resource requests and leveraging DRA's advanced resource modeling.
  • Detailed Implementation Plan: The document outlines a phased implementation plan, covering core infrastructure, policy-to-filter conversion, integration & testing, documentation, and optional advanced features like extended resource allocation.
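
The opt-in behaviour described in the highlights can be sketched as a simple gate check. This is a minimal illustration only: it assumes the `dynamicResourceAllocation` flag is exposed as a boolean entry in a feature-gate map, and the function name `allocationMode` and the mode strings are hypothetical, not taken from the design document.

```go
package main

import "fmt"

// draFeatureGate is the flag name mentioned in the PR summary.
const draFeatureGate = "dynamicResourceAllocation"

// allocationMode is a hypothetical helper: it reports which allocation path
// would be used for a given set of feature gates. A missing or false flag
// keeps the existing device plugin path, which preserves backward compatibility.
func allocationMode(featureGates map[string]bool) string {
	if featureGates[draFeatureGate] {
		return "dra"
	}
	return "device-plugin"
}

func main() {
	fmt.Println(allocationMode(nil))                                   // device-plugin
	fmt.Println(allocationMode(map[string]bool{draFeatureGate: true})) // dra
}
```

Defaulting to the device plugin path when the flag is absent is what makes the feature opt-in in this sketch.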

Changelog

  • doc/design/DRA-integration.md
    • Added a new design document detailing the integration of Kubernetes Dynamic Resource Allocation (DRA) into the SR-IOV Network Operator.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature | Command | Description
Code Review | /gemini review | Performs a code review for the current pull request in its current state.
Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state.
Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help | /gemini help | Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This is a very comprehensive and well-structured design document for integrating DRA into the SR-IOV Network Operator. The proposal is detailed, covering motivation, API changes, implementation phases, and testing. The phased approach is sensible, and the synchronization mechanism to prevent race conditions is well-thought-out. I've added a few comments to point out some minor inconsistencies and potential typos in the document that should be addressed for clarity.

Comment on lines +957 to +959
    expression: |
      device.driver == "sriov.k8snetworkplumbingwg.io" &&
      device.attributes["sriovnetwork.k8snetworkplumbingwg.io"].resourceName == "intel_nic"

Priority: high

The CEL expression for the DeviceClass selector uses device.driver == "sriov.k8snetworkplumbingwg.io". This is inconsistent with the driver name sriovnetwork.k8snetworkplumbingwg.io used in other parts of the document (e.g., lines 89 and 566). Please ensure the driver name is consistent throughout the design to avoid implementation errors.

Suggested change
-   expression: |
-     device.driver == "sriov.k8snetworkplumbingwg.io" &&
-     device.attributes["sriovnetwork.k8snetworkplumbingwg.io"].resourceName == "intel_nic"
+   expression: |
+     device.driver == "sriovnetwork.k8snetworkplumbingwg.io" &&
+     device.attributes["sriovnetwork.k8snetworkplumbingwg.io"].resourceName == "intel_nic"
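
An inconsistency like this one can be caught mechanically. The following is a rough sketch of such a check, assuming the selector expressions are available as plain strings; the helper `driverNames` and the first sample expression (standing in for the basic DeviceClass selector) are hypothetical and not taken from the design document.

```go
package main

import (
	"fmt"
	"regexp"
)

// driverRe captures the string literal compared against device.driver
// in a CEL selector expression.
var driverRe = regexp.MustCompile(`device\.driver\s*==\s*"([^"]+)"`)

// driverNames returns the distinct driver names referenced across the
// given expressions, in first-seen order.
func driverNames(exprs []string) []string {
	seen := map[string]bool{}
	var names []string
	for _, e := range exprs {
		for _, m := range driverRe.FindAllStringSubmatch(e, -1) {
			if !seen[m[1]] {
				seen[m[1]] = true
				names = append(names, m[1])
			}
		}
	}
	return names
}

func main() {
	exprs := []string{
		// Assumed form of the basic DeviceClass selector.
		`device.driver == "sriovnetwork.k8snetworkplumbingwg.io"`,
		// The expression flagged in this review comment.
		`device.driver == "sriov.k8snetworkplumbingwg.io" && ` +
			`device.attributes["sriovnetwork.k8snetworkplumbingwg.io"].resourceName == "intel_nic"`,
	}
	if names := driverNames(exprs); len(names) > 1 {
		fmt.Println("inconsistent driver names:", names)
	}
}
```

More than one distinct name in the result means at least one selector would never match devices published by the driver.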

Comment on lines +1020 to +1021
    `device.driver == "sriov.k8snetworkplumbingwg.io" && ` +
    `device.attributes["sriovnetwork.k8snetworkplumbingwg.io"].resourceName == "%s"`,

Priority: high

The CEL expression in generateDeviceClassSelector uses device.driver == "sriov.k8snetworkplumbingwg.io". This is inconsistent with the driver name sriovnetwork.k8snetworkplumbingwg.io defined for the basic DeviceClass on lines 89 and 566. Please ensure the driver name is used consistently throughout the design to avoid implementation errors.

Suggested change
-   `device.driver == "sriov.k8snetworkplumbingwg.io" && ` +
-   `device.attributes["sriovnetwork.k8snetworkplumbingwg.io"].resourceName == "%s"`,
+   `device.driver == "sriovnetwork.k8snetworkplumbingwg.io" && ` +
+   `device.attributes["sriovnetwork.k8snetworkplumbingwg.io"].resourceName == "%s"`,
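
One way to keep the name from drifting is to build the expression from named constants rather than repeating the literal. This is only a sketch: the function name comes from the review comment, but the signature, the constants `driverName` and `attributeDomain`, and the assumption that both hold the same value are illustrative rather than confirmed by the design document.

```go
package main

import "fmt"

const (
	// driverName is the DRA driver name the review asks to use consistently.
	driverName = "sriovnetwork.k8snetworkplumbingwg.io"
	// attributeDomain is the attribute key used in the quoted snippet;
	// assumed here to be identical to the driver name.
	attributeDomain = "sriovnetwork.k8snetworkplumbingwg.io"
)

// generateDeviceClassSelector builds the CEL selector for one resource name.
// The signature is assumed. resourceName is inserted verbatim, so this sketch
// expects a name without quote characters.
func generateDeviceClassSelector(resourceName string) string {
	return fmt.Sprintf(
		`device.driver == "%s" && `+
			`device.attributes["%s"].resourceName == "%s"`,
		driverName, attributeDomain, resourceName)
}

func main() {
	fmt.Println(generateDeviceClassSelector("intel_nic"))
	// Prints:
	// device.driver == "sriovnetwork.k8snetworkplumbingwg.io" && device.attributes["sriovnetwork.k8snetworkplumbingwg.io"].resourceName == "intel_nic"
}
```

With the literals defined once, a change to the driver name cannot leave one selector behind.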

Comment on lines +8 to +9
    creation-date: 11-02-2026
    last-updated: 11-02-2026

Priority: medium

The creation-date and last-updated are set to 2026. This appears to be a typo and should probably be the current year to avoid confusion.

Suggested change
-   creation-date: 11-02-2026
-   last-updated: 11-02-2026
+   creation-date: 11-02-2024
+   last-updated: 11-02-2024

Comment on lines +1109 to +1112
    11. Create ResourceClaim referencing the basic DeviceClass
    12. Create pod with ResourceClaim and verify VF allocation
    11. Test CNI integration with DRA-allocated VFs
    12. Test both kernel driver and VFIO modes

Priority: medium

The list of integration tests has duplicate numbering. Items 11. and 12. appear twice. Please renumber the list correctly.

Suggested change
-   11. Create ResourceClaim referencing the basic DeviceClass
-   12. Create pod with ResourceClaim and verify VF allocation
-   11. Test CNI integration with DRA-allocated VFs
-   12. Test both kernel driver and VFIO modes
+   11. Create ResourceClaim referencing the basic DeviceClass
+   12. Create pod with ResourceClaim and verify VF allocation
+   13. Test CNI integration with DRA-allocated VFs
+   14. Test both kernel driver and VFIO modes

Comment on lines +1165 to +1171
    3. **Advanced Filtering**
    - Create user-managed `SriovResourceFilter` CRs with advanced criteria
    - Verify they coexist with auto-generated filters
    - Deploy pods requesting specific resource types
    - Verify correct VF allocation based on filters

    3. **Migration Scenario**

Priority: medium

The E2E Tests section has two items numbered 3. The "Migration Scenario" should likely be numbered 4., and subsequent items renumbered accordingly.

Suggested change
-   3. **Advanced Filtering**
-   - Create user-managed `SriovResourceFilter` CRs with advanced criteria
-   - Verify they coexist with auto-generated filters
-   - Deploy pods requesting specific resource types
-   - Verify correct VF allocation based on filters
-   3. **Migration Scenario**
+   3. **Advanced Filtering**
+   - Create user-managed `SriovResourceFilter` CRs with advanced criteria
+   - Verify they coexist with auto-generated filters
+   - Deploy pods requesting specific resource types
+   - Verify correct VF allocation based on filters
+   4. **Migration Scenario**

@coveralls

coveralls commented Feb 17, 2026

Pull Request Test Coverage Report for Build 22132182544

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 11 unchanged lines in 2 files lost coverage.
  • Overall coverage increased (+0.03%) to 62.897%

Files with Coverage Reduction | New Missed Lines | %
pkg/daemon/status.go | 5 | 70.83%
controllers/sriovnetworkpoolconfig_controller.go | 6 | 64.71%

Totals | Coverage Status
Change from base Build 22102567607: | 0.03%
Covered Lines: | 9305
Relevant Lines: | 14794

💛 - Coveralls

Signed-off-by: Fred Rolland <frolland@nvidia.com>