Clean stale VF representor ports from OVS bridges during switchdev reconciliation#1048
Conversation
|
Thanks for your PR,
To skip the vendors CIs, Maintainers can use one of:
|
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses an issue where stale VF representor ports remain in OVS bridges after a host reboot, leading to potential errors during new pod creation. The changes enhance the OVS bridge cleanup mechanism to proactively remove these stale representors and the PF uplink during switchdev reconciliation. This ensures a clean state for VF re-creation and improves the robustness of network interface management by correctly identifying and detaching all relevant interfaces. Highlights
Changelog
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request effectively addresses the issue of stale VF representor ports in OVS bridges after a host reboot. However, while introducing this cleanup mechanism, the implementation introduces a security regression by trusting interface names provided in the external configuration spec for deletion operations, which could allow an attacker with control over SRIOV policies to delete arbitrary OVS ports. The suggested fix uses trusted information from the local store to perform the cleanup safely. Additionally, there is one critical comment regarding a potential syntax error in a test file.
10b44cb to
52cf61e
Compare
Pull Request Test Coverage Report for Build 22727085437Details
💛 - Coveralls |
| funcLog.Error(err, "RemoveInterfaceFromOVSBridge(): failed to read data from store") | ||
| return fmt.Errorf("failed to read data from store: %v", err) | ||
| } | ||
| var relatedBridges []*sriovnetworkv1.OVSConfigExt |
There was a problem hiding this comment.
why is this change needed ? i dont think its related to what you are trying to fix is it ?
There was a problem hiding this comment.
The old code only checked kc.Uplinks[0], so for multi-uplink bridges (like multiplane setups with more than one PF on each bridge), if the PF being detached was at index 1, the lookup would miss it entirely and skip the cleanup. The refactor iterates all uplinks to find the matching PCI address. Also simplified from a relatedBridges slice to a single brConf pointer with early break — a PCI address can only belong to one bridge, so collecting multiple matches and warning was unnecessary.
| if err := o.deleteInterfaceByName(ctx, dbClient, brConf.Uplinks[0].Name); err != nil { | ||
| // Remove stale VF representor interfaces | ||
| for i := 0; i < numVfs; i++ { | ||
| repName := fmt.Sprintf("%s_%d", pfName, i) |
There was a problem hiding this comment.
note: thats how we name representors via udev in the operator.
There was a problem hiding this comment.
i wonder if its always safe to use the passed in numVFs e.g node state spec changed during reboot
we could list the ports and just delete ports that have this type of name format. not sure if its a good idea just a thought
There was a problem hiding this comment.
Added a code comment referencing the udev naming convention.
Re iterating over numVFs, it is safe because deleteInterfaceByName is a no-op when the interface doesn't exist in OVSDB — so if numVfs is larger than what's actually in the bridge, the extra iterations are harmless. On the other hand, if there were previously more VFs than the currently configured numVfs, the leftover representors won't conflict with the current configuration since they have different names. If the user changes numVfs back, they will get cleaned up. Added a comment in the code explaining this.
…conciliation
After host reboot, VF representor ports remain stuck in OVS bridges if
pods were not deleted before reboot. When the daemon reconciles and
re-creates VFs, the new representors get the same names, causing errors
during new pod creation trying to use the same rail resource.
Extend RemoveInterfaceFromOVSBridge to accept pfName and numVfs to clean
representor interfaces ({pfName}_{vfIndex}) alongside the PF uplink.
Propagate pfName through the setEswitchModeAndNumVFs call chain so
detachPFFromBridge always has the info needed for representor cleanup.
Add a new detachPFFromBridge call in createVFs() after the skip check,
so cleanup only runs when VFs actually need re-creation. Fix multi-uplink
bridge lookup to iterate all uplinks instead of only checking Uplinks[0].
Signed-off-by: Alexander Maslennikov <amaslennikov@nvidia.com>
52cf61e to
720d25a
Compare
|
CI failures don't seem related |
After host reboot, VF representor ports remain stuck in OVS bridges if pods were not deleted before reboot. When the daemon reconciles and re-creates VFs, the new representors get the same names, causing errors during new pod creation trying to use the same rail resource.
Extend RemoveInterfaceFromOVSBridge to accept pfName and numVfs to clean representor interfaces ({pfName}_{vfIndex}) alongside the PF uplink. Propagate pfName through the setEswitchModeAndNumVFs call chain so detachPFFromBridge always has the info needed for representor cleanup.
Add a new detachPFFromBridge call in createVFs() after the skip check, so cleanup only runs when VFs actually need re-creation. Fix multi-uplink bridge lookup to iterate all uplinks instead of only checking Uplinks[0].