Commit 0074ae3
fix(controller): defer subnet processing until vlan is fully ready (#6527)
* fix(controller): clear subnet IP status when vlan conflict is detected

When two VLANs with the same ID are created on the same provider network,
a race condition causes the subnet attached to the conflicting vlan to
retain its IP range in status. This happens because:

1. The subnet handler can process the subnet before the vlan handler
   marks the vlan as conflicting (informer cache propagation delay of
   ~3ms), allowing IPAM allocation and IP status to be set.
2. When the vlan handler detects a conflict, it returns early without
   re-enqueuing associated subnets, so the subnet is never re-validated.
3. Even when the subnet is re-processed and detects the vlan conflict,
   patchSubnetStatus serializes the full status (including the stale IP
   range) without clearing IP fields.

Fix all three issues:

- In handleAddVlan/handleUpdateVlan: re-enqueue subnets referencing the
  conflicting vlan so they can re-validate.
- In handleAddOrUpdateSubnet: when vlan validation fails, remove the
  subnet from IPAM and clear all IP status fields before patching.
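
The cleanup described in the second bullet can be pictured with a small, self-contained sketch. The `subnetStatus` and `ipam` types and the `clearOnVlanFailure` helper below are hypothetical stand-ins, not the controller's real types; only the idea (drop the subnet from IPAM, then zero the IP fields before the status is patched) comes from the message above.

```go
package main

import "fmt"

// subnetStatus is a hypothetical, trimmed-down stand-in for the real subnet
// status; only a few IP-accounting fields are modelled here.
type subnetStatus struct {
	V4AvailableIPs     float64
	V4UsingIPs         float64
	V4AvailableIPRange string
	V4UsingIPRange     string
}

// ipam is a hypothetical in-memory allocator keyed by subnet name.
type ipam struct {
	subnets map[string][]string
}

// deleteSubnet forgets every allocation recorded for the subnet.
func (i *ipam) deleteSubnet(name string) {
	delete(i.subnets, name)
}

// clearOnVlanFailure removes the subnet from IPAM and zeroes the IP fields so
// that a later status patch does not serialize a stale range.
func clearOnVlanFailure(i *ipam, name string, st *subnetStatus) {
	i.deleteSubnet(name)
	st.V4AvailableIPs = 0
	st.V4UsingIPs = 0
	st.V4AvailableIPRange = ""
	st.V4UsingIPRange = ""
}

func main() {
	alloc := &ipam{subnets: map[string][]string{"subnet2": {"10.0.0.2"}}}
	st := &subnetStatus{V4AvailableIPs: 253, V4AvailableIPRange: "10.0.0.2-10.0.0.254"}
	clearOnVlanFailure(alloc, "subnet2", st)
	fmt.Printf("status=%+v ipamEntries=%d\n", *st, len(alloc.subnets))
}
```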
Signed-off-by: Mengxin Liu <liumengxinfly@gmail.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(controller): address review feedback for vlan conflict handling

Address Copilot review comments:

1. Prevent queue churn: only re-enqueue subnets when vlan conflict
   status transitions from false to true (not on every retry). Save
   wasConflict before checkVlanConflict and gate the re-enqueue.
2. Scope IPAM cleanup: introduce errVlanConflict sentinel error in
   validateSubnetVlan and only clear IPAM/IP status fields when
   errors.Is(err, errVlanConflict) is true. Transient lister errors
   no longer trigger IPAM deletion.
3. Add unit tests: Test_validateSubnetVlan_conflict verifies the
   sentinel error is correctly returned for conflict vs normal vs
   missing vlans. Test_handleAddOrUpdateSubnet_clearsIPStatusOnVlanConflict
   verifies IPAM removal and status field clearing on conflict.
Signed-off-by: Mengxin Liu <liumengxinfly@gmail.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: revert vlan.go changes, keep minimal subnet-only fix
Remove vlan handler re-enqueue logic — controller logs confirm subnet2
is already re-processed through the normal error retry mechanism (the
reconcileVlan check at subnet.go:1687 detects conflict and returns
error, triggering requeue). The only fix needed is clearing IP status
fields when the subnet handler detects a vlan conflict on re-processing.
Signed-off-by: Mengxin Liu <liumengxinfly@gmail.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(controller): defer subnet processing until vlan is fully ready

Instead of cleaning up stale IPAM state after a race condition, prevent
the race entirely: do not process a subnet until its vlan has been fully
handled by the vlan controller.

A newly created vlan has Status.Conflict=false (zero value), which is
indistinguishable from a processed non-conflicting vlan. Use
pn.Status.Vlans as a "vlan ready" signal: it is only populated after
handleAddVlan completes successfully (including the conflict check).

In validateSubnetVlan, after confirming the vlan is not conflicting,
verify it appears in the provider network's Status.Vlans. If not, the
vlan has not been fully processed yet, so return an error to defer
subnet processing until the vlan handler re-enqueues it.

In handleAddVlan, re-enqueue subnets referencing a conflicting vlan so
they can see the updated conflict status and be properly rejected.

This replaces the previous sentinel-error + IPAM-cleanup approach with
a simpler ordering guarantee that eliminates the race window entirely.
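
The readiness check described above might look roughly like the following sketch. The `vlan` and `providerNetwork` structs are hypothetical simplifications, and the `Vlans` field stands in for `pn.Status.Vlans`; the real `validateSubnetVlan` signature is not shown in this commit message, so the one used here is an assumption.

```go
package main

import "fmt"

// vlan is a hypothetical stand-in; Conflict mirrors Status.Conflict.
type vlan struct {
	Name     string
	Provider string
	Conflict bool
}

// providerNetwork is a hypothetical stand-in; Vlans mirrors pn.Status.Vlans.
type providerNetwork struct {
	Name  string
	Vlans []string
}

// validateSubnetVlan rejects a conflicting vlan, then requires the vlan to be
// listed by its provider network before the subnet may be processed.
func validateSubnetVlan(v vlan, pn providerNetwork) error {
	if v.Conflict {
		return fmt.Errorf("vlan %q conflicts with another vlan", v.Name)
	}
	for _, name := range pn.Vlans {
		if name == v.Name {
			return nil // the vlan handler has finished with this vlan
		}
	}
	return fmt.Errorf("vlan %q is not yet listed by provider network %q", v.Name, pn.Name)
}

func main() {
	v := vlan{Name: "vlan2", Provider: "pn1"}

	// Freshly created: Conflict is still the zero value and the vlan is unlisted.
	fmt.Println(validateSubnetVlan(v, providerNetwork{Name: "pn1"}))

	// After the vlan handler has run: the vlan is listed, so validation passes.
	fmt.Println(validateSubnetVlan(v, providerNetwork{Name: "pn1", Vlans: []string{"vlan2"}}))
}
```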
Signed-off-by: Mengxin Liu <liumengxinfly@gmail.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(controller): address review - silent requeue for unready vlan

- Introduce errVlanNotReady sentinel: when the vlan hasn't been fully
  processed yet (empty provider, provider network not found, or vlan
  not in pn.Status.Vlans), return errVlanNotReady instead of a generic
  error. handleAddOrUpdateSubnet checks for this and requeues silently
  without patching subnet status as ValidateSubnetVlanFailed, avoiding
  misleading Warning events for a normal transient condition.
- Treat empty vlan.Spec.Provider as "not ready" (vlan handler hasn't
  run yet to default the provider) rather than falling back to
  DefaultProviderName.
- Add unit test for empty-provider vlan case.
Signed-off-by: Mengxin Liu <liumengxinfly@gmail.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(controller): maintain pn.Status.Vlans in handleUpdateVlan

handleUpdateVlan did not update the provider network's Status.Vlans
list after a successful conflict check. This caused a regression where
validateSubnetVlan's new pn.Status.Vlans readiness check blocked
subnet processing after a vlan provider change: the vlan was not
registered in the new provider network's Vlans list, so all associated
subnets were stuck in "vlan not ready" state.

Add the same pn.Status.Vlans maintenance logic that handleAddVlan
already has: after checkVlanConflict passes, ensure the vlan appears
in its provider network's Status.Vlans.
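
The "ensure the vlan appears" step amounts to an idempotent append. A minimal sketch, assuming the list is a plain string slice; `ensureVlanListed` is a hypothetical helper, and the status patch that would follow a change is omitted.

```go
package main

import "fmt"

// ensureVlanListed returns the list with name present exactly once, plus a
// flag telling the caller whether the provider network status needs a patch.
func ensureVlanListed(vlans []string, name string) ([]string, bool) {
	for _, v := range vlans {
		if v == name {
			return vlans, false // already registered, nothing to patch
		}
	}
	// Copy before appending so the caller's slice is never mutated in place.
	out := make([]string, 0, len(vlans)+1)
	out = append(out, vlans...)
	return append(out, name), true
}

func main() {
	vlans := []string{"vlan10"}

	vlans, changed := ensureVlanListed(vlans, "vlan20")
	fmt.Println(vlans, changed)

	// A second call for the same vlan reports no change.
	vlans, changed = ensureVlanListed(vlans, "vlan20")
	fmt.Println(vlans, changed)
}
```

Returning the `changed` flag lets both the add and update paths skip the status write when the vlan is already registered.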
Signed-off-by: Mengxin Liu <liumengxinfly@gmail.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Signed-off-by: Mengxin Liu <liumengxinfly@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

1 parent edd3de2 commit 0074ae3
4 files changed in pkg/controller: +159 -1