fix: getCurrentBridgeState to return all uplinks for grouped bridges #131
Merged
almaslennikov merged 1 commit into Mellanox:network-operator-26.1.x on Jan 15, 2026
The getCurrentBridgeState() function was only processing the first uplink (Uplinks[0]) and ignoring the rest. This caused bridges with multiple uplinks (created via groupingPolicy: all) to constantly reconfigure because the current state comparison always showed a mismatch.

Additionally, NeedToUpdateBridges was comparing the entire Bridges struct including the GroupingPolicy field. However, GroupingPolicy is a policy directive used during configuration and is not stored in OVS, so it will never appear in the discovered status. This was a second source of constant mismatch detection.

Changes:
- Modify getCurrentBridgeState() to iterate through all uplinks in knownConfig.Uplinks instead of only processing the first one
- Modify NeedToUpdateBridges to only compare OVS configurations, ignoring the GroupingPolicy field
- Add test cases for bridge with multiple uplinks
- Add test case for partial uplinks when some interfaces are missing
- Add test case for groupingPolicy difference being ignored

Fixes infinite reconciliation loop where bridge ports kept being removed and re-added on every sync cycle.

Signed-off-by: Alexander Maslennikov <amaslennikov@nvidia.com>
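The difference between reading only Uplinks[0] and iterating over every known uplink can be sketched as follows. This is a minimal illustration, not the operator's code: the `Uplink`, `BridgeConfig`, and `lookupFunc` types and both function names are hypothetical stand-ins, and skipping interfaces that are missing is an assumption about how the partial-uplink case behaves.

```go
package main

import "fmt"

// Uplink and BridgeConfig are simplified, hypothetical stand-ins for the
// operator's real bridge types.
type Uplink struct {
	Name string
}

type BridgeConfig struct {
	Name    string
	Uplinks []Uplink
}

// lookupFunc is a hypothetical stand-in for an OVSDB query. It reports
// whether the named interface is currently attached to the bridge.
type lookupFunc func(bridge, iface string) (Uplink, bool)

// firstUplinkOnly mirrors the behaviour described as buggy: only
// Uplinks[0] is queried, so the result holds at most one uplink.
func firstUplinkOnly(known BridgeConfig, lookup lookupFunc) BridgeConfig {
	current := BridgeConfig{Name: known.Name}
	if len(known.Uplinks) == 0 {
		return current
	}
	if u, ok := lookup(known.Name, known.Uplinks[0].Name); ok {
		current.Uplinks = append(current.Uplinks, u)
	}
	return current
}

// allUplinks mirrors the described fix: every known uplink is queried and
// the ones that are found are appended. Skipping missing interfaces is an
// assumption made for this sketch.
func allUplinks(known BridgeConfig, lookup lookupFunc) BridgeConfig {
	current := BridgeConfig{Name: known.Name}
	for _, k := range known.Uplinks {
		if u, ok := lookup(known.Name, k.Name); ok {
			current.Uplinks = append(current.Uplinks, u)
		}
	}
	return current
}

func main() {
	known := BridgeConfig{
		Name:    "br-0",
		Uplinks: []Uplink{{Name: "p0"}, {Name: "p1"}, {Name: "p2"}, {Name: "p3"}},
	}
	// Pretend p2 is missing from the bridge.
	present := map[string]bool{"p0": true, "p1": true, "p3": true}
	lookup := func(_ string, iface string) (Uplink, bool) {
		return Uplink{Name: iface}, present[iface]
	}

	fmt.Println("first only:", firstUplinkOnly(known, lookup).Uplinks)
	fmt.Println("all uplinks:", allUplinks(known, lookup).Uplinks)
}
```

With four configured uplinks, the first variant can never report more than one, so a comparison against the desired configuration fails on every sync even when OVS is already in the right state.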
Greptile Summary
This PR fixes an infinite reconciliation loop that occurred when using bridge grouping policies with multiple uplinks. The fix addresses two critical issues.
Confidence Score: 5/5
Sequence Diagram
sequenceDiagram
participant Operator as SR-IOV Operator
participant Helper as NeedToUpdateBridges
participant OVS as OVS Manager
participant Store as Config Store
participant OVSDB as OVS Database
Note over Operator,OVSDB: Bridge Reconciliation Flow (Before Fix)
Operator->>Helper: Compare spec vs status
Helper->>Helper: DeepEqual(bridgeSpec, bridgeStatus)<br/>including GroupingPolicy
Note over Helper: ❌ Always mismatches because<br/>GroupingPolicy not in status
Helper-->>Operator: Update needed
Operator->>OVS: CreateOVSBridge(conf)
OVS->>Store: GetManagedOVSBridge(name)
Store-->>OVS: knownConfig with all uplinks
OVS->>OVS: getCurrentBridgeState(knownConfig)
OVS->>OVSDB: Query uplinks[0] only
OVSDB-->>OVS: First uplink state
Note over OVS: ❌ Only returns first uplink<br/>Ignores p1, p2, p3
OVS-->>OVS: currentState with 1 uplink
OVS->>OVS: Compare conf vs currentState
Note over OVS: ❌ Mismatch: 4 uplinks vs 1 uplink
OVS->>OVSDB: Reconfigure (remove/add ports)
Note over Operator,OVSDB: Infinite reconciliation loop
Note over Operator,OVSDB: Bridge Reconciliation Flow (After Fix)
Operator->>Helper: Compare spec vs status
Helper->>Helper: DeepEqual(bridgeSpec.OVS, bridgeStatus.OVS)<br/>ignoring GroupingPolicy
Note over Helper: ✅ Correct comparison of<br/>actual OVS config
Helper-->>Operator: No update if OVS matches
Operator->>OVS: CreateOVSBridge(conf)
OVS->>Store: GetManagedOVSBridge(name)
Store-->>OVS: knownConfig with all uplinks
OVS->>OVS: getCurrentBridgeState(knownConfig)
loop For each uplink in knownConfig
OVS->>OVSDB: Query uplink[i]
OVSDB-->>OVS: Uplink state
OVS->>OVS: Append to currentConfig.Uplinks
end
Note over OVS: ✅ Returns all uplinks<br/>(p0, p1, p2, p3)
OVS-->>OVS: currentState with all uplinks
OVS->>OVS: Compare conf vs currentState
Note over OVS: ✅ Match: both have same uplinks
OVS-->>Operator: No reconfiguration needed
Note over Operator,OVSDB: Stable state achieved
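The spec-versus-status comparison in the diagram can be sketched in a few lines. This is an illustration under stated assumptions: the `OVSConfig` and `Bridges` types are simplified, hypothetical stand-ins for the operator's API types, and the two function names are invented for the example. Only the use of `reflect.DeepEqual` on the OVS field comes from the diagram.

```go
package main

import (
	"fmt"
	"reflect"
)

// OVSConfig and Bridges are simplified, hypothetical stand-ins for the
// operator's API types.
type OVSConfig struct {
	Name    string
	Uplinks []string
}

type Bridges struct {
	OVS []OVSConfig
	// GroupingPolicy is a configuration-time directive. It is not stored
	// in OVS, so a status discovered from OVS never carries it.
	GroupingPolicy string
}

// needUpdateWholeStruct mirrors the behaviour before the fix: the whole
// struct is compared, including GroupingPolicy.
func needUpdateWholeStruct(spec, status Bridges) bool {
	return !reflect.DeepEqual(spec, status)
}

// needUpdateOVSOnly mirrors the described fix: only the OVS bridge
// configurations are compared.
func needUpdateOVSOnly(spec, status Bridges) bool {
	return !reflect.DeepEqual(spec.OVS, status.OVS)
}

func main() {
	ovs := []OVSConfig{{Name: "br-0", Uplinks: []string{"p0", "p1"}}}
	spec := Bridges{OVS: ovs, GroupingPolicy: "all"}
	status := Bridges{OVS: ovs} // discovered state: no GroupingPolicy

	fmt.Println("whole struct:", needUpdateWholeStruct(spec, status))
	fmt.Println("OVS only:", needUpdateOVSOnly(spec, status))
}
```

Comparing only the OVS field keeps real configuration drift detectable, since a change in bridge names or uplinks still differs, while a field that can only ever exist on the spec side no longer forces an update.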
e0ne approved these changes on Jan 15, 2026
Merged commit ef37ee1 into Mellanox:network-operator-26.1.x; 11 of 14 checks passed.