Add pod collision detection#612
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
/hold still running local tests to make sure we don't introduce any flakes |
There was a problem hiding this comment.
Code Review
This pull request renames the vmicollision package to maccollision and introduces a new PodReconciler to detect MAC address collisions for Pods, including cross-type collisions between Pods and Virtual Machine Instances (VMIs). The PoolManager logic was updated to allow duplicate MAC allocations, delegating collision detection and reporting to the controller layer. Feedback includes removing an unused logger parameter and refactoring duplicated logic for finding running pods into a shared helper function to improve maintainability.
ed5d8ee to
b2f79e3
Compare
|
Change: Address gemini review and lint issues, |
Prepare the collision detection package for supporting Pod MAC collision detection alongside the existing VMI support. - Rename pkg/controller/vmicollision/ to pkg/controller/maccollision/ - Rename VMICollisionReconciler to VMIReconciler - Update controller name, log names, and imports This commit is a pure refactor with no logic changes. Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Ram Lavi <ralavi@redhat.com>
… check Add the building blocks needed for Pod collision detection: - IndexPodByMAC using vendored multus ParsePodNetworkAnnotation, with deduplication and normalization; returns nil when network-status annotation is absent (multus hasn't processed the Pod yet) - StripPodForCollisionDetection cache transform keeping only metadata (Name, Namespace, UID, Labels, Annotations) and Status.Phase - IsKubevirtOwned to identify virt-launcher Pods via the kubevirt.io label, matching kubevirt's own identification pattern Register StripPodForCollisionDetection as a cache transform for Pods in the manager. The existing Pod controller (pkg/controller/pod) reads only metadata and annotations from the cache, which are preserved by the transform. Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Ram Lavi <ralavi@redhat.com>
Create new controller for detecting MAC collisions among running Pods: - PodReconciler with Client and PoolManager - SetupPodControllerWithManager() registers Pod MAC address indexer - Basic Reconcile() skeleton with NotFound, virt-launcher, and namespace checks - Placeholder for collision detection logic (to be implemented) Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Ram Lavi <ralavi@redhat.com>
Fill in PodReconciler with MAC collision detection: query field index for colliding Pods, filter by managed namespaces and non-kubevirt ownership, update collision map, and emit events matching the VMI controller pattern. Currently handles Pod-Pod collisions only; cross-type Pod-VMI collision detection will be added in subsequent commits. Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Ram Lavi <ralavi@redhat.com>
Add podCollisionRelevantChanges() predicate filtering Pod events by phase change, network-attachment annotation change, or network-status annotation appearance. Build the controller with source.Kind watch using this predicate, and register it via add_podcollision.go init function. Add PodReconciler unit tests covering early-return paths, collision detection, event emission, and filtering of virt-launcher / unmanaged pods. Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Ram Lavi <ralavi@redhat.com>
Extend PodReconciler to also query the VMI MAC index when kubevirt is enabled, detecting collisions between Pods and running VMIs. Events are emitted on colliding Pods; VMI-side events are handled by the VMI controller to avoid duplication. Add IsKubevirtEnabled to PoolManagerInterface and cross-type unit tests covering Pod-VMI detection, filtering, and kubevirt-disabled path. Extract shared VMI candidate helpers (listVMIsByMAC, listRunningVMIsByMAC, listRunningVMIsByMACWithExcludeUID, filterManagedVMIs) and use controller-specific filterVMIsWithMAC wrappers. Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Ram Lavi <ralavi@redhat.com>
Extend VMIReconciler to also query the Pod MAC index, detecting collisions between VMIs and running Pods. Filter out virt-launcher pods and unmanaged namespaces. Events emitted on colliding VMIs; Pod-side events are handled by the Pod controller. Use ObjectReference.Key() for type-prefixed collision messages (vmi/ns/name, pod/ns/name). Add cross-type unit tests and update existing VMI-VMI test expectations. Update E2E test helper to match new message format. Extract shared Pod candidate helpers (listRunningPodsWithMAC, excludePodUID, filterCollisionCandidatePods) and use controller-specific call paths. Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Ram Lavi <ralavi@redhat.com>
This commit adds cross-type secondary watch, using the TypedEnqueueRequestsFromMapFunc feature [0] on controller-runtime. Without these secondary watches, each controller only reconciles when its own resource type changes. This means that when a Pod is deleted, the VMI controller retains stale collision data for the VMIs that previously collided with that Pod, until the VMI happens to be reconciled for some other reason (and vice-versa for VMI deletions affecting the Pod controller). With this change, each controller now adds a secondary watch on the other resource type using handler.EnqueueRequestsFromMapFunc. When a Pod changes, the VMI controller's handler extracts the Pod's MACs and enqueues VMIs sharing those MACs for re-evaluation. The Pod controller does the same in reverse for VMI changes, conditioned on Kubevirt being enabled. [0] https://pkg.go.dev/sigs.k8s.io/controller-runtime/pkg/handler#TypedEnqueueRequestsFromMapFunc Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Ram Lavi <ralavi@redhat.com>
The pod webhook previously rejected pod creation when a requested MAC was already allocated to another pod in the pool map. With the MAC collision detection controller now in place, this blocking is no longer needed — collisions are surfaced via Kubernetes events and metrics. Simplify allocatePodRequestedMac to match the VM webhook's behavior (allocateRequestedVirtualMachineInterfaceMac): just register the MAC via createOrUpdateEntry without any collision check. Remove the now unused macAlreadyBelongsToPodAndNetwork function. Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Ram Lavi <ralavi@redhat.com>
The VMI collision and migration test AfterEach blocks wait for VMI objects to be deleted but not for the virt-launcher pods they spawned to terminate. This leaves pods in test namespaces that can cause subsequent tests to fail with "There should be no Pods" assertions. Add pod termination waits in both the VMI collision detection AfterEach and the VMI migration test AfterEach. Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Ram Lavi <ralavi@redhat.com>
b2f79e3 to
af00f3d
Compare
|
Change: Refactor commit message |
|
/hold |
|
/cherry-pick release-0.51 |
|
@RamLavi: once the present PR merges, I will cherry-pick it on top of DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
KubeMacPool now detects MAC collisions on Pods with Multus secondary interfaces (in addition to VMIs) [0]. Add a diagnosis step showing how to find colliding Pods by querying the network-status annotation. Also refactor step 3 to use MAC env var. [0] k8snetworkplumbingwg/kubemacpool#612 Assisted-by: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Ram Lavi <ralavi@redhat.com>
Add E2E tests for Pod-Pod and cross-type Pod-VMI MAC collision detection, mirroring the existing VMI collision test structure. Pod-Pod tests cover: inter-Pod collisions (with MAC format variations), out-of-range MACs, 3+ Pods, partial collisions on multi-interface Pods, multiple colliding MACs, and collision clearing on Pod deletion. Cross-type tests cover: detecting collisions between a Pod and a VMI sharing the same MAC, and verifying collision clears when either the Pod or the VMI is deleted. Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Ram Lavi <ralavi@redhat.com> Made-with: Cursor
The certificate recovery tests (certificates_test.go) verify that a deleted secret/caBundle gets regenerated, but do not wait for the webhook server to actually reload and serve the new certificate. Due to kubelet secret volume propagation delay (~60s), subsequent tests can hit "x509: certificate signed by unknown authority" if they call the webhook before the new cert is in place. Add a probe in checkCertLibraryRecovery that creates a VM via the webhook and retries until it succeeds, ensuring the new certificate is fully operational before the test completes. Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Ram Lavi <ralavi@redhat.com>
af00f3d to
c55432d
Compare
|
Change: use dedicated opt in function to avoid opt in/out race flakes |
This PR introduces a pod MAC collision detection controller, that will replace the current pod MAC collision rejection webhook behavior.
It introduces a controller for pod collisions, and also handles cross-pod-vmi coliision cases.
In order to keep consistency between the cross-collision objects, a secondary watch is added, along with e2e that cover all above scenarios.
What this PR does / why we need it:
last commit belongs to #611
Special notes for your reviewer:
Release note: