Conversation
…verwrite calcSubnetStatusIP previously used subnet.Status.Bytes() which serialized the entire SubnetStatus and patched all fields to etcd. This caused a race condition where handleUpdateSubnetStatus could overwrite U2OInterconnectionVPC with stale data from its informer cache, leading to flaky e2e test failures in "should support underlay to overlay subnet interconnection". The race condition occurs when: 1. handleAddOrUpdateSubnet sets U2OInterconnectionVPC and patches status 2. handleUpdateSubnetStatus retries (from IP inconsistency requeue), reads stale cache without U2OInterconnectionVPC, and calcSubnetStatusIP overwrites all status fields including U2OInterconnectionVPC="" Fix by using a targeted JSON merge patch that only includes the 8 IP-related fields, leaving non-IP fields like U2OInterconnectionVPC untouched. Signed-off-by: Mengxin Liu <liumengxinfly@gmail.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary of ChangesHello @oilbeater, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request resolves a critical race condition within the subnet status update mechanism. By switching from a full status overwrite to a precise, field-specific JSON patch, it prevents unintended data loss and ensures that different controllers can update their respective parts of the subnet status without interfering with each other. This change significantly improves the reliability of subnet interconnection features and related end-to-end tests. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request correctly addresses a race condition in calcSubnetStatusIP by replacing a full status patch with a targeted JSON merge patch for IP-related fields. This change effectively prevents other status fields, such as U2OInterconnectionVPC, from being overwritten by stale cache data. The fix is well-implemented. I have provided one suggestion to improve the code's conciseness.
| ipStatusPatch := struct { | ||
| Status struct { | ||
| V4AvailableIPs float64 `json:"v4availableIPs"` | ||
| V4AvailableIPRange string `json:"v4availableIPrange"` | ||
| V4UsingIPs float64 `json:"v4usingIPs"` | ||
| V4UsingIPRange string `json:"v4usingIPrange"` | ||
| V6AvailableIPs float64 `json:"v6availableIPs"` | ||
| V6AvailableIPRange string `json:"v6availableIPrange"` | ||
| V6UsingIPs float64 `json:"v6usingIPs"` | ||
| V6UsingIPRange string `json:"v6usingIPrange"` | ||
| } `json:"status"` | ||
| }{} | ||
| ipStatusPatch.Status.V4AvailableIPs = v4availableIPs | ||
| ipStatusPatch.Status.V4AvailableIPRange = v4AvailableIPStr | ||
| ipStatusPatch.Status.V4UsingIPs = v4UsingIPs | ||
| ipStatusPatch.Status.V4UsingIPRange = v4UsingIPStr | ||
| ipStatusPatch.Status.V6AvailableIPs = v6availableIPs | ||
| ipStatusPatch.Status.V6AvailableIPRange = v6AvailableIPStr | ||
| ipStatusPatch.Status.V6UsingIPs = v6UsingIPs | ||
| ipStatusPatch.Status.V6UsingIPRange = v6UsingIPStr |
There was a problem hiding this comment.
While the use of a struct to create the patch is type-safe, it's a bit verbose. For improved readability and conciseness, you could consider using a map[string]any to construct the patch payload. This is a common pattern for creating JSON patches and avoids the lengthy struct definition and separate field assignments.
| ipStatusPatch := struct { | |
| Status struct { | |
| V4AvailableIPs float64 `json:"v4availableIPs"` | |
| V4AvailableIPRange string `json:"v4availableIPrange"` | |
| V4UsingIPs float64 `json:"v4usingIPs"` | |
| V4UsingIPRange string `json:"v4usingIPrange"` | |
| V6AvailableIPs float64 `json:"v6availableIPs"` | |
| V6AvailableIPRange string `json:"v6availableIPrange"` | |
| V6UsingIPs float64 `json:"v6usingIPs"` | |
| V6UsingIPRange string `json:"v6usingIPrange"` | |
| } `json:"status"` | |
| }{} | |
| ipStatusPatch.Status.V4AvailableIPs = v4availableIPs | |
| ipStatusPatch.Status.V4AvailableIPRange = v4AvailableIPStr | |
| ipStatusPatch.Status.V4UsingIPs = v4UsingIPs | |
| ipStatusPatch.Status.V4UsingIPRange = v4UsingIPStr | |
| ipStatusPatch.Status.V6AvailableIPs = v6availableIPs | |
| ipStatusPatch.Status.V6AvailableIPRange = v6AvailableIPStr | |
| ipStatusPatch.Status.V6UsingIPs = v6UsingIPs | |
| ipStatusPatch.Status.V6UsingIPRange = v6UsingIPStr | |
| ipStatusPatch := map[string]any{ | |
| "status": map[string]any{ | |
| "v4availableIPs": v4availableIPs, | |
| "v4availableIPrange": v4AvailableIPStr, | |
| "v4usingIPs": v4UsingIPs, | |
| "v4usingIPrange": v4UsingIPStr, | |
| "v6availableIPs": v6availableIPs, | |
| "v6availableIPrange": v6AvailableIPStr, | |
| "v6usingIPs": v6UsingIPs, | |
| "v6usingIPrange": v6UsingIPStr, | |
| }, | |
| } |
…verwrite (#6350) calcSubnetStatusIP previously used subnet.Status.Bytes() which serialized the entire SubnetStatus and patched all fields to etcd. This caused a race condition where handleUpdateSubnetStatus could overwrite U2OInterconnectionVPC with stale data from its informer cache, leading to flaky e2e test failures in "should support underlay to overlay subnet interconnection". The race condition occurs when: 1. handleAddOrUpdateSubnet sets U2OInterconnectionVPC and patches status 2. handleUpdateSubnetStatus retries (from IP inconsistency requeue), reads stale cache without U2OInterconnectionVPC, and calcSubnetStatusIP overwrites all status fields including U2OInterconnectionVPC="" Fix by using a targeted JSON merge patch that only includes the 8 IP-related fields, leaving non-IP fields like U2OInterconnectionVPC untouched. Signed-off-by: Mengxin Liu <liumengxinfly@gmail.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Summary
calcSubnetStatusIP()where it usedsubnet.Status.Bytes()to patch the entire SubnetStatus, causinghandleUpdateSubnetStatusto overwriteU2OInterconnectionVPCwith stale cache dataU2OInterconnectionVPC,U2OInterconnectionIP, andconditionsuntouchedU2OInterconnectionVPCto be setRoot Cause
The race condition occurs after a controller restart when U2O is re-enabled:
handleUpdateSubnetStatusrequeuedhandleAddOrUpdateSubnetruns, setsU2OInterconnectionVPC = vpc.Status.Router, patches status ✓handleUpdateSubnetStatusretries, reads subnet from stale informer cache (withoutU2OInterconnectionVPC) →calcSubnetStatusIPpatches the entire status includingU2OInterconnectionVPC = ""→ overwrites the correct value ✗U2OInterconnectionVPCremains empty → timeoutTest plan
make lintpasses (0 issues)go build ./pkg/controller/...compilesgo test ./pkg/controller/...passes🤖 Generated with Claude Code