Fix pod IP deletion leak and namespace filtering issues #2116

Open
aanchal22 wants to merge 1 commit into microsoft:main from aanchal22:2085/fix-pod-ip-leak-namespace-filtering

Conversation

@aanchal22
Contributor

Fix: Pod IP Deletion Leak and Namespace Filtering (#2085)

Pod IPs were leaking in the eBPF filtermap because metadata flags (pod/namespace) were re-evaluated at DELETE time instead of using values recorded at ADD time. This caused mismatches during IP reuse, namespace filter changes, and annotation changes.

Additionally, namespace exclude filtering was non-functional:

  • appendExcludeList() was empty (not implemented)
  • updateNamespaceLists() used sequential if instead of if/else if
  • nsOfInterest() had incorrect default behavior (returned false instead of true)
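
As a rough illustration of the three problems above, the following is a minimal sketch, not the code in this PR. The `nsFilter` type, its fields, and the `appendExcludeList` signature are hypothetical, and the choice that an include list takes precedence over an exclude list is an assumption. It shows mutually exclusive branching and a default of `true` when no filter is configured.

```go
package main

import "fmt"

// nsFilter is a hypothetical stand-in for the namespace filter state.
// The real types and field names in the codebase may differ.
type nsFilter struct {
	include map[string]struct{}
	exclude map[string]struct{}
}

// nsOfInterest reports whether a namespace should be tracked.
// Exactly one case runs, which is the effect of if/else if
// compared with a chain of sequential ifs.
func (f *nsFilter) nsOfInterest(ns string) bool {
	switch {
	case len(f.include) > 0: // assumption: include wins over exclude
		_, ok := f.include[ns]
		return ok
	case len(f.exclude) > 0:
		_, excluded := f.exclude[ns]
		return !excluded
	default:
		// No filtering configured: every namespace is of interest.
		return true
	}
}

// appendExcludeList records the excluded namespaces and returns the
// namespaces from all that remain of interest. Here all stands in for
// the result of a GetAllNamespaces() call (assumed to yield names).
func (f *nsFilter) appendExcludeList(excluded, all []string) []string {
	if f.exclude == nil {
		f.exclude = make(map[string]struct{})
	}
	for _, ns := range excluded {
		f.exclude[ns] = struct{}{}
	}
	var kept []string
	for _, ns := range all {
		if f.nsOfInterest(ns) {
			kept = append(kept, ns)
		}
	}
	return kept
}

func main() {
	f := &nsFilter{}
	fmt.Println(f.nsOfInterest("default")) // no filter configured
	kept := f.appendExcludeList(
		[]string{"kube-system"},
		[]string{"default", "kube-system", "app"},
	)
	fmt.Println(kept)
}
```

With an empty exclude list implementation, the initial set of tracked namespaces is never computed, which matches the "no metrics" symptom described in this PR.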

Changes:

  • Add metadataTrackingInfo struct to track which metadata was used during ADD
  • Use tracked metadata during DELETE operations regardless of current state
  • Implement appendExcludeList() with proper initial setup via GetAllNamespaces()
  • Fix updateNamespaceLists() if/else logic and nsOfInterest() default
  • Add DELETE event protection and warning logs for deleteIP failures
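
The tracking idea in the first two changes can be sketched as follows. This is a toy model under stated assumptions, not the PR's implementation: `metadataTrackingInfo` is the struct name from this PR, but its fields, the `ipTracker` type, the requestor names, and the refcounted map standing in for the eBPF filtermap are all hypothetical. The sketch also assumes an IP is deleted before it is added again.

```go
package main

import "fmt"

// requestor names which metadata source asked for an IP to be filtered.
type requestor string

const (
	podRequestor requestor = "pod-annotation" // hypothetical name
	nsRequestor  requestor = "namespace"      // hypothetical name
)

// metadataTrackingInfo records which metadata was used at ADD time.
// The field names are assumptions made for this sketch.
type metadataTrackingInfo struct {
	usedPod       bool
	usedNamespace bool
}

type refKey struct {
	ip  string
	req requestor
}

// ipTracker pairs a refcounted stand-in for the filtermap with the
// per-IP record of what was added.
type ipTracker struct {
	refs    map[refKey]int
	tracked map[string]metadataTrackingInfo
}

func newIPTracker() *ipTracker {
	return &ipTracker{
		refs:    map[refKey]int{},
		tracked: map[string]metadataTrackingInfo{},
	}
}

// addIP evaluates the current flags once and remembers the outcome.
func (t *ipTracker) addIP(ip string, annotated, nsOfInterest bool) {
	var info metadataTrackingInfo
	if annotated {
		t.refs[refKey{ip, podRequestor}]++
		info.usedPod = true
	}
	if nsOfInterest {
		t.refs[refKey{ip, nsRequestor}]++
		info.usedNamespace = true
	}
	if info.usedPod || info.usedNamespace {
		t.tracked[ip] = info
	}
}

// deleteIP takes no flags: it releases exactly what addIP recorded,
// regardless of how annotations or namespace filters changed since.
func (t *ipTracker) deleteIP(ip string) {
	info, ok := t.tracked[ip]
	if !ok {
		return // never added, nothing to release
	}
	if info.usedPod {
		t.release(refKey{ip, podRequestor})
	}
	if info.usedNamespace {
		t.release(refKey{ip, nsRequestor})
	}
	delete(t.tracked, ip)
}

// release decrements a refcount and drops the entry at zero.
func (t *ipTracker) release(k refKey) {
	if t.refs[k] <= 1 {
		delete(t.refs, k)
		return
	}
	t.refs[k]--
}

func main() {
	t := newIPTracker()
	t.addIP("10.0.0.5", false, true) // added via namespace metadata
	t.deleteIP("10.0.0.5")           // namespace filter may have changed since
	fmt.Println("remaining filtermap entries:", len(t.refs))
}
```

Because `deleteIP` never consults the current flags, a namespace filter change or annotation change between ADD and DELETE cannot leave a refcount above zero in this model.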

Related Issue

Fixes #2085

Checklist

  • I have read the contributing documentation.
  • I signed and signed-off the commits (git commit -S -s ...).
  • I have correctly attributed the author(s) of the code.
  • I have tested the changes locally.
  • I have followed the project's style guidelines.
  • [] I have updated the documentation, if necessary.
  • I have added tests, if applicable.

Testing Completed

  • Built and deployed multi-arch images (amd64 + arm64) successfully
  • go build passes

Additional Notes

  • The metadata tracking overhead is ~24 bytes per tracked IP
  • No breaking changes — default behavior is preserved
  • Windows stubs updated to match new function signatures

Fixes a critical issue causing metrics collection failures

Pod IPs were leaking in the eBPF filtermap due to metadata mismatch between
ADD and DELETE operations. Metadata flags (pod/namespace) were re-evaluated
at DELETE time instead of using values from ADD time, causing mismatches in:
- IP reuse (tracked → untracked namespace)
- Namespace filter changes after pod add
- Annotation changes between add and delete

**Solution:** Track which metadata was used during ADD and use the same
metadata during DELETE, regardless of state changes.

Namespace exclude filtering was broken, which resulted in either no metrics being collected or eBPF map exhaustion.
Problems:
- appendExcludeList() was empty (not implemented)
- updateNamespaceLists() used sequential ifs instead of if/else
- nsOfInterest() had incorrect default behavior
- No protection against spurious DELETE events

**Solution:** Implement namespace exclude filtering and correct the filter-selection logic and defaults.

- Add metadataTrackingInfo struct to track metadata per IP
- Record pod/namespace metadata after successful AddIPs
- Use tracked metadata (not current flags) during DeleteIPs
- Implement appendExcludeList() with proper initial setup
- Fix updateNamespaceLists() if/else logic
- Fix nsOfInterest() default to return true when no filtering
- Add DELETE event protection (check cache before deleting)
- Add GetAllNamespaces() to cache interface
- Add warning logs for deleteIP failures
- Eliminates memory leak (refcount reaches zero)
- Fixes namespace exclude filtering
- Handles IP reuse correctly
- No breaking changes
- Minimal overhead (~24 bytes per tracked IP)

Signed-off-by: Aanchal Khandelwal <akhandelwal@adobe.com>
@alexcastilio
Contributor

This is already being addressed by #2114 and #2118

@aanchal22
Contributor Author

aanchal22 commented Mar 16, 2026

> This is already being addressed by #2114 and #2118

A few gaps I noticed from my investigation that the two PRs don't cover:

  1. Spurious DELETE event protection
    When a pod DELETE event fires, neither PR verifies that the pod is actually gone from the cache before processing it. Because of the cache timing issue (the cache is updated before the event is published), spurious DELETE events during startup or rapid pod churn could remove valid IPs from the filtermap. Our branch added a cache check:

        if endpoint := m.daemonCache.GetPodByIP(ip.String()); endpoint != nil {
            // Pod still exists in the cache: ignore the spurious DELETE.
            return
        }

  2. Forced Annotated = true on IP reuse (in handlePodEvent)
    When a pod IP is reused by an untracked pod, the current code forces podCacheEntry.Annotated = true before adding the entry to the delete cache. This makes the delete use pod-annotation metadata even if the original IP was added with namespace metadata, potentially leaving a stale entry. The brute-force "delete with both" approach in PR #2114 may mask this, but the forced flag is still incorrect.

  3. Filtermanager observability
    No warning logs are emitted when deleteIP fails in the filtermanager cache (requestor not found, IP not found). This makes leak issues harder to diagnose in production. Adding warnings to pkg/managers/filtermanager/cache.go for these failure paths would improve debuggability.

  4. eBPF filter map size configurability
    The retina_filter eBPF map max_entries is hardcoded at 255. For clusters with many tracked pods, this can cause "no space left on device" errors. I have a separate PR, #2117, that makes this configurable via Helm values or an environment variable.
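
To make the observability point concrete, here is a minimal sketch of what warning on the two failure paths could look like. It is an illustration only: the `filterCache` type, its layout, the error values, and the use of the standard `log` package are assumptions, and the real code in pkg/managers/filtermanager/cache.go may use different structures and a different logger.

```go
package main

import (
	"errors"
	"log"
)

// Hypothetical sentinel errors for the two failure paths.
var (
	errRequestorNotFound = errors.New("requestor not found")
	errIPNotFound        = errors.New("ip not found")
)

// filterCache is a hypothetical stand-in for the filtermanager cache:
// requestor name -> set of IPs that requestor added.
type filterCache struct {
	data map[string]map[string]struct{}
}

// deleteIP removes ip for the given requestor and logs a warning
// instead of failing silently when there is nothing to remove.
func (c *filterCache) deleteIP(requestor, ip string) error {
	ips, ok := c.data[requestor]
	if !ok {
		log.Printf("warning: deleteIP: requestor %q not found (ip=%s)", requestor, ip)
		return errRequestorNotFound
	}
	if _, ok := ips[ip]; !ok {
		log.Printf("warning: deleteIP: ip %s not found for requestor %q", ip, requestor)
		return errIPNotFound
	}
	delete(ips, ip)
	if len(ips) == 0 {
		// Drop the empty requestor entry so it does not linger.
		delete(c.data, requestor)
	}
	return nil
}

func main() {
	c := &filterCache{data: map[string]map[string]struct{}{
		"pod": {"10.0.0.5": {}},
	}}
	_ = c.deleteIP("namespace", "10.0.0.5") // logs: requestor not found
	_ = c.deleteIP("pod", "10.0.0.9")       // logs: ip not found
	_ = c.deleteIP("pod", "10.0.0.5")       // succeeds silently
}
```

Returning a distinct error per path also lets callers count or surface these cases, though whether that fits the existing interface is a design question for the maintainers.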

Development

Successfully merging this pull request may close these issues.

Pod IP Deletion Leak in eBPF Filter and Namespace Filtering Issues in MetricConfiguration CRD

2 participants