fix: Pod IP Deletion Leak in eBPF FilterMap #2114
Signed-off-by: Alex Castilio dos Santos <alexsantos@microsoft.com>
Retina Code Coverage Report: Total coverage no change
A few gaps I noticed from my investigation that the two PRs don't cover:
Description
Fix: Pod IP Deletion Leak in eBPF FilterMap
Problem
Pod IPs accumulate indefinitely in the eBPF filtermap because DELETE operations fail in two ways:
1. PodCallBackFn guard drops delete events: When a namespace is removed from the include list or a pod annotation is removed, nsOfInterest() and podOfInterest() both return false, so the PodDeleted event is silently discarded before reaching handlePodEvent().
2. applyDirtyPodsDelete uses wrong metadata: Even if the event reaches the delete path, the Annotated and Namespaced flags are re-evaluated at delete time against current state (not the state when the IP was added). The filtermanager requires a matching (Requestor, RequestMetadata) pair to remove a reference, so a delete with the wrong metadata is a no-op.

This causes "no space left on device" errors when the eBPF filtermap fills up (255 entries).
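To make the second failure mode concrete, here is a minimal sketch of a reference-tracking store in which a delete only takes effect when its (requestor, metadata) pair matches one that was added. The names `filterStore`, `ref`, `add`, and `remove` are hypothetical and are not Retina's filtermanager API; the model only assumes the matching rule stated above.

```go
package main

import "fmt"

// ref is a hypothetical stand-in for the (Requestor, RequestMetadata)
// pair that identifies one reference to an IP.
type ref struct {
	requestor string
	metadata  string
}

// filterStore is a toy model of a reference-tracking filter map.
type filterStore struct {
	refs map[string]map[ref]struct{} // IP -> set of references
}

func newFilterStore() *filterStore {
	return &filterStore{refs: make(map[string]map[ref]struct{})}
}

// add records a reference r to ip.
func (s *filterStore) add(ip string, r ref) {
	if s.refs[ip] == nil {
		s.refs[ip] = make(map[ref]struct{})
	}
	s.refs[ip][r] = struct{}{}
}

// remove drops reference r from ip and reports whether the IP was
// evicted. Removing a reference that was never added is a no-op.
func (s *filterStore) remove(ip string, r ref) bool {
	set, ok := s.refs[ip]
	if !ok {
		return false
	}
	delete(set, r)
	if len(set) == 0 {
		delete(s.refs, ip)
		return true
	}
	return false
}

// size returns the number of IPs still held in the store.
func (s *filterStore) size() int { return len(s.refs) }

func main() {
	s := newFilterStore()
	// The IP is added while its namespace is of interest.
	s.add("10.0.0.5", ref{"metrics", "namespace"})
	// At delete time the flags are re-evaluated and the wrong metadata is used.
	evicted := s.remove("10.0.0.5", ref{"metrics", "pod"})
	fmt.Println(evicted, s.size()) // prints "false 1": the entry leaks
}
```

In this model the entry stays in the store until a delete arrives with the same pair that was used on add, which mirrors how stale entries can pile up until a fixed-size map is full.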
Fix
Two changes in pkg/module/metrics/metrics_module.go:
1. Bypass guard for PodDeleted events: PodCallBackFn now skips the nsOfInterest/podOfInterest check when event.Type == EventTypePodDeleted, ensuring delete events always reach handlePodEvent.
2. Always delete with both metadata types: applyDirtyPodsDelete unconditionally issues DeleteIPs with both modulePodReqMetadata ("pod") and moduleReqMetadata ("namespace") for every IP in the delete list. The filtermanager's deleteIP is a safe no-op when the metadata doesn't exist for an IP, so extra calls cause no harm.

Additional minor fix: Replaced zap.Any with fmt.Sprintf for []net.IP log fields to fix unsupported value type errors in log output.
Tests
Unit tests were added and manual testing was done.
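As an illustration of the behavior such tests would exercise, the two changes can be modeled in a small, self-contained sketch. Every name here (`tracker`, `podEvent`, `handle`, `newTracker`) is hypothetical; the real logic lives in PodCallBackFn and applyDirtyPodsDelete, and this is not a copy of either.

```go
package main

import "fmt"

type eventType int

const (
	podAdded eventType = iota
	podDeleted
)

// podEvent is a simplified, hypothetical pod event.
type podEvent struct {
	kind      eventType
	ip        string
	namespace string
}

// tracker models the namespaces of interest and, per IP, the set of
// metadata tags ("pod", "namespace") under which the IP was added.
type tracker struct {
	included map[string]bool
	refs     map[string]map[string]bool
}

func newTracker(namespaces ...string) *tracker {
	tr := &tracker{
		included: make(map[string]bool),
		refs:     make(map[string]map[string]bool),
	}
	for _, ns := range namespaces {
		tr.included[ns] = true
	}
	return tr
}

// handle applies one event. Delete events bypass the interest guard,
// and deletes are issued for both metadata tags unconditionally.
func (tr *tracker) handle(ev podEvent) {
	if ev.kind != podDeleted && !tr.included[ev.namespace] {
		return // guard applies to non-delete events only
	}
	switch ev.kind {
	case podAdded:
		if tr.refs[ev.ip] == nil {
			tr.refs[ev.ip] = make(map[string]bool)
		}
		tr.refs[ev.ip]["namespace"] = true
	case podDeleted:
		// Deleting a tag that was never set is a harmless no-op.
		for _, md := range []string{"pod", "namespace"} {
			delete(tr.refs[ev.ip], md)
		}
		if len(tr.refs[ev.ip]) == 0 {
			delete(tr.refs, ev.ip)
		}
	}
}

func main() {
	tr := newTracker("demo")
	tr.handle(podEvent{kind: podAdded, ip: "10.0.0.5", namespace: "demo"})
	// The namespace leaves the include list before the pod is deleted.
	delete(tr.included, "demo")
	tr.handle(podEvent{kind: podDeleted, ip: "10.0.0.5", namespace: "demo"})
	fmt.Println(len(tr.refs)) // prints "0": no leaked entry
}
```

Issuing both deletes trades a redundant call for not having to remember which metadata was used at add time, which is the design choice described in the Fix section.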
Manual validation
Scenario 1 — Namespace filter change (annotations mode)
Test:
Logs:
Scenario 2 — Namespace filter change (MetricsConfiguration CRD mode)
Test:
Logs:
Scenario 3 — Pod annotation removed then deleted
Test:
Logs:
Related Issue
#2085
Checklist
git commit -S -s ...). See this documentation on signing commits.