address illegal SR-IOV VF MTU configuration #213

Merged
k8s-ci-robot merged 3 commits into kubernetes-sigs:main from kanlkan:fix/vf-mtu on Jun 4, 2026

Conversation

@kanlkan
Member

@kanlkan kanlkan commented Jun 1, 2026

What type of PR is this?

/kind feature

What this PR does / why we need it:

MTU can be configured via opaque.parameters.interface.mtu.
However, for SR-IOV VFs, nothing currently validates that value, so it is possible to set an MTU larger than the parent PF’s MTU. This PR adds that validation and rejects such requests.

Which issue(s) this PR is related to:

Fixes #178

Special notes for your reviewer:

The example output below shows the case where validation fails. An error is logged, and the Pod remains in the ContainerCreating state.

Events:
  Type     Reason                         Age   From               Message
  ----     ------                         ----  ----               -------
  Normal   Scheduled                      8s    default-scheduler  Successfully assigned default/pod-a-2-mlnx-nomac to eagle04
  Warning  FailedPrepareDynamicResources  7s    kubelet            Failed to prepare dynamic resources: prepare dynamic resources: NodePrepareResources failed for ResourceClaim vf-rc-a-2: claim 21c6d996-0dfe-43d7-a88d-846db605d4d8 contain errors: requested MTU 1600 for SR-IOV VF ens1f0v1 exceeds parent PF ens1f0np0 MTU 1500
kanda-masaharu@eagle04:~/work/repo/dci-community-team/k8s_demo/sriov_vf_test_dranet/pods$ k get pod pod-a-2-mlnx-nomac
NAME                 READY   STATUS              RESTARTS   AGE
pod-a-2-mlnx-nomac   0/1     ContainerCreating   0          55s
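
The rejected case in the output above comes down to a single comparison between the requested MTU and the parent PF's MTU. The following is a minimal sketch of that check, not the code merged in this PR: the function name matches the `validateVFMTU` mentioned in the review, but its signature here is an assumption, and the error text is copied from the event message.

```go
package main

import "fmt"

// validateVFMTU is a sketch with an assumed signature. It returns an error
// when the MTU requested for a VF is larger than the MTU of its parent PF.
func validateVFMTU(vfName, pfName string, requestedMTU, pfMTU int) error {
	if requestedMTU > pfMTU {
		return fmt.Errorf("requested MTU %d for SR-IOV VF %s exceeds parent PF %s MTU %d",
			requestedMTU, vfName, pfName, pfMTU)
	}
	return nil
}

func main() {
	// Values taken from the example event: VF MTU 1600 against PF MTU 1500.
	if err := validateVFMTU("ens1f0v1", "ens1f0np0", 1600, 1500); err != nil {
		fmt.Println("rejected:", err)
	}
	// A request equal to the PF MTU passes the check.
	if err := validateVFMTU("ens1f0v1", "ens1f0np0", 1500, 1500); err == nil {
		fmt.Println("accepted")
	}
}
```

The comparison is strict, so a VF MTU equal to the PF MTU is allowed.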

Does this PR introduce a user-facing change?

Fixed an issue where SR-IOV VF MTU could be configured larger than the MTU of its parent PF. Such configurations are now rejected.

Signed-off-by: Masaharu Kanda <kanlkan.naklnak@gmail.com>
@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. labels Jun 1, 2026
@netlify

netlify Bot commented Jun 1, 2026

Deploy Preview for dranet canceled.

Name Link
🔨 Latest commit 5f8645a
🔍 Latest deploy log https://app.netlify.com/projects/dranet/deploys/6a21111fa07d2c000883859e

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jun 1, 2026
@aojea aojea requested a review from Copilot June 1, 2026 11:09

Copilot AI left a comment


Pull request overview

This PR adds validation to prevent configuring an SR-IOV Virtual Function (VF) MTU larger than its parent Physical Function (PF) MTU, avoiding invalid network setups that can leave pods stuck during resource preparation.

Changes:

  • Add sysfs helper(s) to resolve a VF’s parent PF interface name.
  • Add unit tests for PF-name resolution via a temporary sysfs-like directory tree.
  • Enforce VF MTU ≤ PF MTU during NodePrepareResources / claim preparation (rejecting invalid claims).

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
pkg/inventory/sysfs.go Adds PF interface lookup helper for a given VF using sysfs.
pkg/inventory/sysfs_test.go Adds tests covering PF lookup success and failure scenarios.
pkg/driver/dra_hooks.go Adds claim-time MTU validation for SR-IOV VFs against parent PF MTU.
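
To illustrate the sysfs lookup summarized above, here is a hedged sketch of how a VF's parent PF name can be resolved. It relies on the standard Linux layout in which a VF's `device/physfn` link points at the PF's PCI device, whose `net/` directory holds the PF interface name. `pfInterfaceName` and `makeFakeSysfs` are hypothetical names; the actual `inventory.GetPFInterfaceName` in this PR may be implemented differently. Passing the sysfs root as a parameter is an assumption that makes the function testable against a temporary directory tree.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// pfInterfaceName (hypothetical) lists <root>/class/net/<vf>/device/physfn/net
// and returns the first entry, which names the parent PF's network interface.
func pfInterfaceName(sysfsRoot, vfName string) (string, error) {
	netDir := filepath.Join(sysfsRoot, "class", "net", vfName, "device", "physfn", "net")
	entries, err := os.ReadDir(netDir)
	if err != nil {
		return "", fmt.Errorf("reading %s: %w", netDir, err)
	}
	if len(entries) == 0 {
		return "", fmt.Errorf("no PF interface found under %s", netDir)
	}
	return entries[0].Name(), nil
}

// makeFakeSysfs (hypothetical) builds a temporary sysfs-like tree in which
// vfName has pfName as its parent PF. The caller removes the returned root.
func makeFakeSysfs(vfName, pfName string) (string, error) {
	root, err := os.MkdirTemp("", "fake-sysfs-")
	if err != nil {
		return "", err
	}
	pfDir := filepath.Join(root, "class", "net", vfName, "device", "physfn", "net", pfName)
	if err := os.MkdirAll(pfDir, 0o755); err != nil {
		os.RemoveAll(root)
		return "", err
	}
	return root, nil
}

func main() {
	root, err := makeFakeSysfs("eth1", "eth0")
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	defer os.RemoveAll(root)

	name, err := pfInterfaceName(root, "eth1")
	fmt.Println(name, err)
}
```

On a real host the same function would be called with `/sys` as the root; in the fake tree the `device` and `physfn` entries are plain directories rather than symlinks, which is sufficient for exercising the path logic.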

Comment thread pkg/driver/dra_hooks.go Outdated
Comment on lines +278 to +294
// For SR-IOV VFs, check that the requested MTU does not exceed the parent PF's MTU.
// If it does, log an error and fail the Pod creation to avoid silent misconfiguration.
if deviceCfg.NetworkInterfaceConfigInPod.Interface.MTU != nil {
	if pfName, err := inventory.GetPFInterfaceName(ifName); err == nil {
		if pfLink, err := nlHandle.LinkByName(pfName); err == nil {
			pfMTU := pfLink.Attrs().MTU
			requestedMTU := int(*deviceCfg.NetworkInterfaceConfigInPod.Interface.MTU)
			if requestedMTU > pfMTU {
				klog.Errorf("requested MTU %d for SR-IOV VF %s exceeds parent PF %s MTU %d",
					requestedMTU, ifName, pfName, pfMTU)
				errorList = append(errorList, fmt.Errorf("requested MTU %d for SR-IOV VF %s exceeds parent PF %s MTU %d",
					requestedMTU, ifName, pfName, pfMTU))
				continue
			}
		}
	}
}
Member Author

I fixed the behavior when an error happens in each branch of this validation: b4c5e84
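
For context, the nested `err == nil` form in the snippet above skips validation silently whenever a lookup fails. A minimal sketch of a flattened alternative follows. `checkVFMTU`, `errNotVF`, and the injected `lookupPF` / `lookupMTU` functions are hypothetical names used for illustration, and the choice of which branches skip and which fail is an assumption, not necessarily what b4c5e84 does.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotVF is a hypothetical sentinel meaning "this interface has no parent PF".
var errNotVF = errors.New("not an SR-IOV VF")

// checkVFMTU makes every failure path explicit. lookupPF and lookupMTU are
// stand-ins for the sysfs and netlink lookups used in the real driver.
func checkVFMTU(vfName string, requestedMTU int,
	lookupPF func(vf string) (string, error),
	lookupMTU func(ifName string) (int, error)) error {

	pfName, err := lookupPF(vfName)
	if errors.Is(err, errNotVF) {
		return nil // not a VF: nothing to validate
	}
	if err != nil {
		return fmt.Errorf("resolving parent PF of %s: %w", vfName, err)
	}

	pfMTU, err := lookupMTU(pfName)
	if err != nil {
		return fmt.Errorf("reading MTU of PF %s: %w", pfName, err)
	}

	if requestedMTU > pfMTU {
		return fmt.Errorf("requested MTU %d for SR-IOV VF %s exceeds parent PF %s MTU %d",
			requestedMTU, vfName, pfName, pfMTU)
	}
	return nil
}

func main() {
	pf := func(string) (string, error) { return "eth0", nil }
	brokenMTU := func(string) (int, error) { return 0, errors.New("link not found") }

	// The lookup failure is reported instead of being silently ignored.
	fmt.Println(checkVFMTU("eth1", 1600, pf, brokenMTU))
}
```

Injecting the lookups as function values keeps the branching testable without real hardware or a netlink handle.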

Comment thread pkg/driver/dra_hooks.go Outdated
Comment on lines +278 to +282
// For SR-IOV VFs, check that the requested MTU does not exceed the parent PF's MTU.
// If it does, log an error and fail the Pod creation to avoid silent misconfiguration.
if deviceCfg.NetworkInterfaceConfigInPod.Interface.MTU != nil {
	if pfName, err := inventory.GetPFInterfaceName(ifName); err == nil {
		if pfLink, err := nlHandle.LinkByName(pfName); err == nil {
Member Author

I added a unit test for this validation: b4c5e84

Comment thread pkg/driver/dra_hooks.go Outdated
Comment thread pkg/driver/dra_hooks.go Outdated
Signed-off-by: Masaharu Kanda <kanlkan.naklnak@gmail.com>
@kanlkan kanlkan requested a review from tamilmani1989 June 2, 2026 05:51
@kanlkan
Member Author

kanlkan commented Jun 2, 2026

The e2e IPv4 test failed, but it does not seem to be related to my changes.
I would like to re-run this test, but I do not have permission to trigger it manually. The only workaround I can think of is to push an empty commit, but that is not ideal because it would make the commit history less clean.
Is there another way to re-run the test?

@gauravkghildiyal
Member

gauravkghildiyal commented Jun 2, 2026

Try retest? (Edit: Doesn't work, tests don't use prow)

/retest

@kanlkan
Member Author

kanlkan commented Jun 2, 2026

@gauravkghildiyal
Thank you for the advice. Your retest command worked perfectly.
I had previously tried using retest on another PR, but it didn't seem to trigger a re-run, so I didn't try it this time.
I'll make use of it in the future when tests fail due to transient or unrelated issues.

Member

@gauravkghildiyal gauravkghildiyal left a comment


Thanks @kanlkan. Minor suggestions.

Comment thread pkg/driver/dra_hooks.go Outdated
if deviceCfg.NetworkInterfaceConfigInPod.Interface.MTU != nil {
	pfName, err := inventory.GetPFInterfaceName(ifName)
	if err != nil {
		// Not an SR-IOV VF, or the parent PF cannot be determined. This is
Member

Can we only perform the check if this is an actual VF? We have an isSriovVf function to check that.

Member Author

@kanlkan kanlkan Jun 4, 2026


Thank you for your feedback. I've added that check. 5f8645a
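
The repository's `isSriovVf` is not shown in this thread, so the following is only a sketch of one common way to detect a VF: on Linux, a VF's PCI device directory in sysfs contains a `physfn` link to its parent PF, while other devices do not. `hasPhysfn` is a hypothetical name, and the configurable sysfs root is an assumption made for testability.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// hasPhysfn (hypothetical) reports whether ifName has a device/physfn entry
// under the given sysfs root, which is how a VF points at its parent PF.
func hasPhysfn(sysfsRoot, ifName string) bool {
	_, err := os.Stat(filepath.Join(sysfsRoot, "class", "net", ifName, "device", "physfn"))
	return err == nil
}

func main() {
	// The loopback interface is virtual and has no backing PCI device,
	// so it is not reported as a VF.
	fmt.Println(hasPhysfn("/sys", "lo"))
}
```

Gating the MTU validation on a check like this means non-VF interfaces never reach the PF lookup at all.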

Comment thread pkg/driver/dra_hooks_test.go Outdated
if err == nil {
	t.Fatalf("validateVFMTU() expected error, got nil")
}
want := fmt.Sprintf("requested MTU %d for SR-IOV VF eth1 exceeds parent PF eth0 MTU %d", tc.requestedMTU, tc.pfMTU)
Member

nit: validateVFMTU is a simple function with a single kind of error. Let's not unnecessarily validate the returned string. We can simplify the logic to check if err != tc.WantErr

Member Author

Thank you for your feedback. I simplified the test. 5f8645a
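
As an illustration of the simplification discussed in this thread, here is a sketch of a table-driven check that compares only whether an error occurred against a `wantErr` flag, without matching the message text. The `validateVFMTU` signature, the `mtuCase` type, and the case values are assumptions; in a real `_test.go` file the loop would sit inside a test function and report mismatches with `t.Errorf`.

```go
package main

import "fmt"

// validateVFMTU uses an assumed signature; see the hedge in the text above.
func validateVFMTU(vfName, pfName string, requestedMTU, pfMTU int) error {
	if requestedMTU > pfMTU {
		return fmt.Errorf("requested MTU %d for SR-IOV VF %s exceeds parent PF %s MTU %d",
			requestedMTU, vfName, pfName, pfMTU)
	}
	return nil
}

// mtuCase is a hypothetical test-table row.
type mtuCase struct {
	name         string
	requestedMTU int
	pfMTU        int
	wantErr      bool
}

var cases = []mtuCase{
	{name: "below PF MTU", requestedMTU: 1400, pfMTU: 1500, wantErr: false},
	{name: "equal to PF MTU", requestedMTU: 1500, pfMTU: 1500, wantErr: false},
	{name: "above PF MTU", requestedMTU: 1600, pfMTU: 1500, wantErr: true},
}

// failedCases returns the names of cases whose outcome differs from wantErr.
func failedCases() []string {
	var failed []string
	for _, tc := range cases {
		err := validateVFMTU("eth1", "eth0", tc.requestedMTU, tc.pfMTU)
		// Only the presence of an error is compared, not its text.
		if (err != nil) != tc.wantErr {
			failed = append(failed, tc.name)
		}
	}
	return failed
}

func main() {
	fmt.Println("failing cases:", len(failedCases()))
}
```

Checking only error presence keeps the test stable if the message wording changes later.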

@gauravkghildiyal gauravkghildiyal self-assigned this Jun 4, 2026
@gauravkghildiyal
Member

Please also add a release note

Signed-off-by: Masaharu Kanda <kanlkan.naklnak@gmail.com>
@gauravkghildiyal
Member

@kanlkan Please remember a release note as well :) (Unhold once done)

/lgtm
/approve
/hold

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Jun 4, 2026
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gauravkghildiyal, kanlkan, tamilmani1989

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 4, 2026
@kanlkan
Member Author

kanlkan commented Jun 4, 2026

Please remember a release note as well :) (Unhold once done)

@gauravkghildiyal
Sorry. I added it.

@gauravkghildiyal
Member

/unhold

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 4, 2026
@k8s-ci-robot k8s-ci-robot merged commit 4fcedee into kubernetes-sigs:main Jun 4, 2026
13 checks passed
@kanlkan kanlkan deleted the fix/vf-mtu branch June 5, 2026 00:20
Development

Successfully merging this pull request may close these issues.

Missing validation for SR-IOV VF MTU exceeding parent PF MTU

5 participants