WIP OCPNODE-4560: Migrate OCP-56266 verify kubelet/crio deletes netns when pod deleted#31242

Open
BhargaviGudi wants to merge 1 commit into openshift:main from BhargaviGudi:migrate-ocp-56266
Conversation

@BhargaviGudi
Contributor

@BhargaviGudi BhargaviGudi commented Jun 1, 2026

Migrates testcase OCP-56266 from openshift-tests-private to origin.

What this test validates

This test verifies that kubelet/CRI-O properly clean up the network namespace file when a pod is deleted.

The test:

  • Creates a pod with proper security context
  • Waits for the pod to be ready
  • Retrieves the pod's network namespace path from CRI-O journal logs
  • Deletes the pod
  • Verifies that the NetNS file has been cleaned up from the node

Implementation details

  • Helper functions in node_utils.go follow Ginkgo best practices (no assertions, return errors)
  • Uses framework.Logf() for logging instead of assertions in helpers
  • Pod created inline with proper security contexts

Summary by CodeRabbit

  • Tests
    • Added test helpers to extract a pod’s network namespace from node diagnostics and to confirm on-node cleanup.
    • Test exercises pod lifecycle and validates node-side namespace removal.
  • Documentation
    • Updated test documentation to include the new network namespace cleanup test entry.

@openshift-ci-robot

@BhargaviGudi: No Jira issue with key OCP-56266 exists in the tracker at https://redhat.atlassian.net.
Once a valid jira issue is referenced in the title of this pull request, request a refresh with /jira refresh.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-merge-bot
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: automatic mode

@coderabbitai

coderabbitai Bot commented Jun 1, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds node utilities to extract a pod's NetNS path from CRI-O journal and verify its removal, plus a Ginkgo e2e that creates a pod, captures its NetNS, deletes the pod, and asserts on-node NetNS cleanup; documents the test in the node README.

Changes

Pod network namespace cleanup validation

Layer / File(s) | Summary
Node utils: NetNS extraction and cleanup check (test/extended/node/node_utils.go) | Adds regexp import, GetPodNetNs to parse NetNS:<path> from CRI‑O journal output via journalctl, and CheckNetNsCleaned which runs a node-side test -e <netNsPath> and treats absence as success.
Network namespace cleanup e2e test (test/extended/node/node_e2e/netns_cleanup.go, test/extended/node/README.md) | New Ginkgo test creates a restricted pod, captures its node and NetNS via the helper, deletes the pod and waits for its object to disappear, then verifies the NetNS path was removed on the node; README updated under openshift/disruptive-longrunning.
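
The NetNS extraction step summarized above could look roughly like the following minimal sketch. It is not the PR's GetPodNetNs: the function name extractNetNsPath, the regular expression, and the sample journal line are all assumptions made for illustration, and real CRI-O log lines may be formatted differently.

```go
package main

import (
	"fmt"
	"regexp"
)

// netNsRe captures the text that follows "NetNS:" up to the next whitespace.
// The pattern is an assumption about the log format, not taken from the PR.
var netNsRe = regexp.MustCompile(`NetNS:(\S+)`)

// extractNetNsPath returns the first NetNS path found in the journal text,
// or an error when no NetNS entry is present.
func extractNetNsPath(journal string) (string, error) {
	m := netNsRe.FindStringSubmatch(journal)
	if m == nil {
		return "", fmt.Errorf("no NetNS entry found in journal output")
	}
	return m[1], nil
}

func main() {
	// Illustrative line only (hypothetical pod name and namespace path).
	line := "Got pod network &{Name:hello-pod Namespace:demo NetNS:/var/run/netns/0a1b2c3d Networks:[]}"
	path, err := extractNetNsPath(line)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(path)
}
```

Returning an error instead of asserting keeps the helper in line with the "no assertions in helpers" convention stated in the PR description. Note that `\S+` would also swallow a closing brace if the path were the last field on the line, so a stricter pattern may be needed in practice.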
sequenceDiagram
  participant E2E as E2E test
  participant NodeExec as Node exec/chroot
  participant CRIO as CRI-O journal
  participant NodeFS as Node filesystem

  E2E->>NodeExec: run journalctl filter for pod NetNS (GetPodNetNs)
  NodeExec->>CRIO: query NetNS entries
  CRIO-->>NodeExec: returns NetNS:<path>
  NodeExec-->>E2E: netns path
  E2E->>Kubelet: delete pod
  E2E->>NodeExec: run 'test -e <netns>' (CheckNetNsCleaned)
  NodeExec->>NodeFS: check path existence
  NodeFS-->>NodeExec: path absent -> success
  NodeExec-->>E2E: return success

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

ready-for-human-review

Suggested reviewers

  • celebdor
  • zaneb
🚥 Pre-merge checks | ✅ 14 | ❌ 1

❌ Failed checks (1 warning)

Check name | Status | Explanation | Resolution
Ipv6 And Disconnected Network Test Compatibility | ⚠️ Warning | Test pulls image from quay.io (external registry) without [Skipped:Disconnected] tag; will fail in disconnected IPv6-only environments. | Add [Skipped:Disconnected] tag to test name or use internal image registry/mirror instead of quay.io/openshifttest/hello-openshift for disconnected compatibility.
✅ Passed checks (14 passed)
Check name | Status | Explanation
Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names | ✅ Passed | Both test names are static string literals with no dynamic content: no fmt.Sprintf, variables, pod/node names, timestamps, UUIDs, or IPs. Test titles are descriptive and stable across runs.
Test Structure And Quality | ✅ Passed | Single It() block tests one behavior. SetupProject auto-cleans namespace. Proper timeouts specified. All assertions have messages. Helper functions return errors with logging, no assertions.
Microshift Test Compatibility | ✅ Passed | Test uses standard Kubernetes APIs (Pod, SecurityContext) and MicroShift-aware SetupProject() helper; no OpenShift-specific APIs or multi-node assumptions detected.
Single Node Openshift (Sno) Test Compatibility | ✅ Passed | Test creates a single pod without scheduling constraints and validates netns cleanup on that node's filesystem. It does not assume multiple nodes or multi-node topology.
Topology-Aware Scheduling Compatibility | ✅ Passed | PR adds only e2e test code with test pod having no scheduling constraints (no affinity, nodeSelector, or topology requirements). Check applies to operators/deployments, not test code.
Ote Binary Stdout Contract | ✅ Passed | No OTE Binary Stdout Contract violations found. All code is contained within Ginkgo test blocks, all logging uses framework.Logf() (properly intercepted), and no process-level stdout writes exist.
No-Weak-Crypto | ✅ Passed | PR contains no weak cryptography, custom implementations, or unsafe comparisons on secrets or tokens. Code uses standard Go libraries and Kubernetes frameworks only.
Container-Privileges | ✅ Passed | No privileged configurations found. Pod uses RunAsNonRoot: true, AllowPrivilegeEscalation: false, and drops all capabilities—security best practices.
No-Sensitive-Data-In-Logs | ✅ Passed | No sensitive data (passwords, tokens, API keys, PII, session IDs) exposed in logs. Code safely extracts only filesystem paths from journalctl output without logging raw journal content.
Title check | ✅ Passed | The title clearly summarizes the main change: migrating a test (OCP-56266) that validates kubelet/CRI-O network namespace cleanup when pods are deleted.
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.


@openshift-ci
Contributor

openshift-ci Bot commented Jun 1, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: BhargaviGudi
Once this PR has been reviewed and has the lgtm label, please assign mrunalp for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot requested review from cpmeadors and sairameshv June 1, 2026 07:19
@BhargaviGudi BhargaviGudi changed the title Migrate OCP-56266: verify kubelet/crio deletes netns when pod deleted WIP Migrate OCP-56266: verify kubelet/crio deletes netns when pod deleted Jun 1, 2026
@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 1, 2026

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/extended/node/node_e2e/netns_cleanup.go`:
- Around line 102-105: Replace the string match on "not found" with the
Kubernetes API error helper: in the wait.PollUntilContextTimeout callback check
apierrors.IsNotFound(pollErr) instead of strings.Contains(pollErr.Error(), "not
found") (i.e., use k8s.io/apimachinery/pkg/api/errors as apierrors and call
apierrors.IsNotFound(pollErr)); keep the same Pod GET via
oc.KubeClient().CoreV1().Pods(namespace).Get(...) and the e2e.Logf call, and
remove the unused "strings" import after this change.
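
Why a typed check is preferable to matching on the text "not found" can be illustrated with a dependency-free sketch. The statusError type and isNotFound function below are hypothetical stand-ins written for this example; in the real test the equivalent check is apierrors.IsNotFound from k8s.io/apimachinery/pkg/api/errors, as the comment above recommends.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"strings"
)

// statusError is a hypothetical stand-in for a typed API error that
// carries an HTTP-like status code alongside its message.
type statusError struct {
	Code int
	Msg  string
}

func (e *statusError) Error() string { return e.Msg }

// isNotFound reports whether err, or any error it wraps, is a
// statusError with code 404. It never inspects the message text.
func isNotFound(err error) bool {
	var se *statusError
	return errors.As(err, &se) && se.Code == http.StatusNotFound
}

func main() {
	// A wrapped 404: the typed check still recognizes it.
	missing := fmt.Errorf("get pod: %w", &statusError{Code: http.StatusNotFound, Msg: `pods "demo" not found`})
	// A 500 whose message happens to contain "not found".
	other := &statusError{Code: http.StatusInternalServerError, Msg: "backend endpoint not found"}

	fmt.Println(isNotFound(missing), isNotFound(other))
	// The substring match misclassifies the 500 as "not found".
	fmt.Println(strings.Contains(other.Error(), "not found"))
}
```

The second error is the interesting case: a substring match would end the poll early on an unrelated failure, while the typed check keeps polling or surfaces the real error.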

In `@test/extended/node/node_utils.go`:
- Line 779: The test currently logs raw CRI-O journal output via
framework.Logf("NetNs journal output: %v", netNsStr) which may expose
host/runtime details; replace that call so it does not print the full journal
content — log only the sanitized NetNS identifier/path (the extracted token you
already compute, e.g., netNsToken or netNsPath) or omit the value entirely, and
remove any usage of netNsStr in logs; update the framework.Logf invocation in
node_utils.go to output only the safe identifier instead of netNsStr.
- Around line 805-813: CheckNetNsCleaned currently inspects stdout for the "No
such file or directory" message but ExecOnNodeWithChroot discards stderr; modify
the ExecOnNodeWithChroot invocation in CheckNetNsCleaned so the error text is
captured (e.g., run via shell with stderr redirected into stdout like invoking
"sh -c 'ls -l <netNsPath> 2>&1'" or call a combined-output helper if available)
so the function can reliably detect the "No such file or directory" string from
the returned result; update the call that uses ExecOnNodeWithChroot(nodeName,
"ls", "-l", netNsPath) accordingly.
- Around line 773-774: The code builds a shell pipeline into cmd using raw
podName which allows command/regex injection; change the call that constructs
cmd and the ExecOnNodeWithChroot invocation to pass a safely quoted/escaped
podName (or use grep -F --) instead of raw interpolation. Concretely, update the
cmd creation (the variable named cmd) to use an escaping/quoting function (e.g.,
fmt.Sprintf with %q or otherwise escape shell metacharacters) and prefer grep -F
-- %s to force literal matching, then call ExecOnNodeWithChroot(oc, nodeName,
"/bin/bash", "-c", cmd) with that sanitized value so podName cannot inject extra
shell syntax (target symbols: cmd, podName, ExecOnNodeWithChroot, nodeName).
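
The quoting concern in the last comment can be sketched as follows. This is an illustration under assumptions, not the PR's code: shellQuote and buildJournalFilter are hypothetical names, and the `journalctl -u crio` pipeline is a guess at the general shape of the command, since the actual command string is not shown in this conversation.

```go
package main

import (
	"fmt"
	"strings"
)

// shellQuote wraps s in single quotes for a POSIX shell and escapes any
// embedded single quote by closing the quote, emitting \', and reopening.
func shellQuote(s string) string {
	return "'" + strings.ReplaceAll(s, "'", `'\''`) + "'"
}

// buildJournalFilter is hypothetical: it builds a pipeline in which grep
// matches podName as a fixed string (-F) and "--" ends option parsing,
// so a name starting with "-" is not read as a flag.
func buildJournalFilter(podName string) string {
	return "journalctl -u crio | grep -F -- " + shellQuote(podName)
}

func main() {
	fmt.Println(buildJournalFilter("hello-pod"))
	// A hostile name stays inside one quoted argument.
	fmt.Println(buildJournalFilter("x'; rm -rf /tmp/demo; echo '"))
}
```

Single quotes are used here rather than Go's %q verb because %q produces a Go-style double-quoted string, and a shell still expands `$(...)` and backticks inside double quotes.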

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 46e20e07-b293-4e83-a1cb-1a5460a50299

📥 Commits

Reviewing files that changed from the base of the PR and between 76ed5a8 and 033cec1.

📒 Files selected for processing (3)
  • test/extended/node/README.md
  • test/extended/node/node_e2e/netns_cleanup.go
  • test/extended/node/node_utils.go

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/extended/node/README.md`:
- Line 12: Update the README entry for "node_e2e/netns_cleanup.go" that
currently lists the invalid Jira key "OCP-56266": either replace "OCP-56266"
with the correct tracker ID if known, or remove the tracker ID entirely (leaving
the description "Network namespace cleanup (OCP-56266) - Verifies
kubelet/CRI-O..." adjusted accordingly) so the catalog no longer references the
non-existent key; ensure the line still reads clearly and retains the test name
"node_e2e/netns_cleanup.go" and its description.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 8ab06ab2-79c0-4bf7-8487-90c7bb09ad79

📥 Commits

Reviewing files that changed from the base of the PR and between 033cec1 and 07bfcca.

📒 Files selected for processing (3)
  • test/extended/node/README.md
  • test/extended/node/node_e2e/netns_cleanup.go
  • test/extended/node/node_utils.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • test/extended/node/node_e2e/netns_cleanup.go
  • test/extended/node/node_utils.go

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
test/extended/node/node_e2e/netns_cleanup.go (1)

117-119: ⚡ Quick win

Consider polling the on-node cleanup check.

NetNS teardown by kubelet/CRI-O can lag slightly behind removal of the pod object from the API. A single CheckNetNsCleaned call immediately after the pod is NotFound can flake; wrapping it in a short poll/Eventually makes the test resilient to that timing.

♻️ Suggested poll
-	g.By("Verify that the NetNS file has been cleaned up on the node")
-	err = nodeutils.CheckNetNsCleaned(oc, nodeName, netNsPath)
-	o.Expect(err).NotTo(o.HaveOccurred(), "NetNS file was not cleaned up")
+	g.By("Verify that the NetNS file has been cleaned up on the node")
+	err = wait.PollUntilContextTimeout(ctx, 2*time.Second, 1*time.Minute, true, func(ctx context.Context) (bool, error) {
+		if cleanErr := nodeutils.CheckNetNsCleaned(oc, nodeName, netNsPath); cleanErr != nil {
+			e2e.Logf("NetNS not cleaned yet: %v", cleanErr)
+			return false, nil
+		}
+		return true, nil
+	})
+	o.Expect(err).NotTo(o.HaveOccurred(), "NetNS file was not cleaned up")
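
For readers unfamiliar with the polling helper used in the suggestion, the behavior it relies on (check immediately, then retry at a fixed interval until success, error, or timeout) can be approximated with the standard library alone. The pollUntil and runDemo functions below are hypothetical sketches, not a reimplementation of wait.PollUntilContextTimeout, and the simulated check simply succeeds on its third attempt.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// pollUntil calls cond immediately and then once per interval until cond
// returns true, returns an error, or the timeout elapses.
func pollUntil(ctx context.Context, interval, timeout time.Duration, cond func() (bool, error)) error {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()

	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	for {
		done, err := cond()
		if err != nil {
			return err
		}
		if done {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
		}
	}
}

// runDemo simulates a cleanup check that starts succeeding on the third
// attempt and returns how many attempts were made.
func runDemo() (int, error) {
	attempts := 0
	err := pollUntil(context.Background(), 10*time.Millisecond, time.Second, func() (bool, error) {
		attempts++
		return attempts >= 3, nil
	})
	return attempts, err
}

func main() {
	attempts, err := runDemo()
	fmt.Println(attempts, err)

	// A condition that never succeeds ends with a deadline error.
	err = pollUntil(context.Background(), 10*time.Millisecond, 50*time.Millisecond, func() (bool, error) {
		return false, nil
	})
	fmt.Println(errors.Is(err, context.DeadlineExceeded))
}
```

As in the suggested diff, a "not cleaned yet" result is reported as (false, nil) so the loop keeps retrying; only a returned error or the deadline stops it.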
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/extended/node/node_e2e/netns_cleanup.go` around lines 117 - 119, Wrap
the single call to nodeutils.CheckNetNsCleaned(oc, nodeName, netNsPath) in a
short poll using Gomega's o.Eventually (or wait.PollImmediate) so the test
retries until the NetNS is actually gone; e.g., poll the function that calls
nodeutils.CheckNetNsCleaned and assert the returned error is nil with a
reasonable timeout (e.g., ~1m) and short interval (e.g., ~2-5s) before calling
o.Expect(...).NotTo(o.HaveOccurred()), referencing the existing
nodeutils.CheckNetNsCleaned, oc, nodeName and netNsPath symbols.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/extended/node/node_utils.go`:
- Around line 805-810: Capture the command output and error from
ExecOnNodeWithChroot (replace "_, err := ExecOnNodeWithChroot(...)" with "out,
err := ExecOnNodeWithChroot(...)"), then: if err == nil treat that as the file
still present and return an error; if err != nil inspect it—if it encodes an
exit status indicating "file missing" (exit code 1 or stderr like "No such file
or directory") then log success via framework.Logf and return nil, otherwise log
the full error and output with framework.Logf and return that error so real exec
failures are not masked; reference ExecOnNodeWithChroot, netNsPath, and
framework.Logf when making these changes.
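
The three-way classification described above (status 0 means the path still exists, status 1 means it is gone, anything else is a real failure) can be sketched without cluster dependencies. Everything here is an assumption for illustration: netNsGone, runner, exitCoder, and fakeExit are hypothetical names, and whether the error returned by ExecOnNodeWithChroot exposes an exit code in this way is not confirmed by this conversation.

```go
package main

import (
	"errors"
	"fmt"
)

// runner executes a command and returns nil when it exits with status 0.
type runner func(name string, args ...string) error

// exitCoder is satisfied by *exec.ExitError and by the fake used below.
type exitCoder interface{ ExitCode() int }

// fakeExit is a hypothetical error carrying an exit status for the demo.
type fakeExit int

func (f fakeExit) Error() string { return fmt.Sprintf("exit status %d", int(f)) }
func (f fakeExit) ExitCode() int { return int(f) }

// netNsGone reports whether path is absent, based on the exit status of
// `test -e`: 0 means present, 1 means absent, anything else is an error.
func netNsGone(run runner, path string) (bool, error) {
	err := run("test", "-e", path)
	if err == nil {
		return false, nil // status 0: the path still exists
	}
	var ec exitCoder
	if errors.As(err, &ec) && ec.ExitCode() == 1 {
		return true, nil // status 1: the path does not exist
	}
	// Exec failures and other statuses are surfaced, not treated as success.
	return false, fmt.Errorf("checking %s: %w", path, err)
}

func main() {
	exists := func(string, ...string) error { return nil }
	missing := func(string, ...string) error { return fakeExit(1) }
	broken := func(string, ...string) error { return errors.New("connection refused") }

	for _, r := range []runner{exists, missing, broken} {
		gone, err := netNsGone(r, "/var/run/netns/demo")
		fmt.Println(gone, err)
	}
}
```

Passing the command runner as a parameter keeps the classification logic testable without a node, and it makes the "real exec failures are not masked" requirement explicit in the third branch.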


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: a57532f7-ba8e-42e3-9094-557d1b7c7533

📥 Commits

Reviewing files that changed from the base of the PR and between 07bfcca and 4cd2aca.

📒 Files selected for processing (3)
  • test/extended/node/README.md
  • test/extended/node/node_e2e/netns_cleanup.go
  • test/extended/node/node_utils.go
✅ Files skipped from review due to trivial changes (1)
  • test/extended/node/README.md

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
test/extended/node/node_e2e/netns_cleanup.go (1)

117-119: ⚡ Quick win

Consider polling the on-node NetNS check to avoid flakiness.

CheckNetNsCleaned performs a single test -e with no retry. Although the pod object reaching NotFound usually implies sandbox/network teardown has completed, NetNS removal on a slow node can lag slightly behind API-server pod deletion, making this assertion intermittently flaky. Wrapping the check in a short poll makes the test more robust.

♻️ Proposed polling wrapper
 	g.By("Verify that the NetNS file has been cleaned up on the node")
-	err = nodeutils.CheckNetNsCleaned(oc, nodeName, netNsPath)
-	o.Expect(err).NotTo(o.HaveOccurred(), "NetNS file was not cleaned up")
+	err = wait.PollUntilContextTimeout(ctx, 2*time.Second, 1*time.Minute, true, func(ctx context.Context) (bool, error) {
+		if cleanErr := nodeutils.CheckNetNsCleaned(oc, nodeName, netNsPath); cleanErr != nil {
+			e2e.Logf("NetNS not yet cleaned: %v", cleanErr)
+			return false, nil
+		}
+		return true, nil
+	})
+	o.Expect(err).NotTo(o.HaveOccurred(), "NetNS file was not cleaned up")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/extended/node/node_e2e/netns_cleanup.go` around lines 117 - 119, Replace
the single one-off call to nodeutils.CheckNetNsCleaned with a short polling
retry so the on-node test -e check is retried until success or timeout; e.g.,
use a polling helper (wait.PollImmediate or o.Eventually) to call
nodeutils.CheckNetNsCleaned(oc, nodeName, netNsPath) repeatedly with a small
interval (e.g., 500ms) and a reasonable timeout (e.g., 20–30s), and fail the
test if the poll times out while still returning an error.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 39136102-75cb-4671-875e-402c4b5df910

📥 Commits

Reviewing files that changed from the base of the PR and between 4cd2aca and 1212a5f.

📒 Files selected for processing (3)
  • test/extended/node/README.md
  • test/extended/node/node_e2e/netns_cleanup.go
  • test/extended/node/node_utils.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/extended/node/node_utils.go

@openshift-ci openshift-ci Bot added the ready-for-human-review Indicates a PR has been reviewed by automated tools and is ready for human review label Jun 1, 2026
@BhargaviGudi BhargaviGudi changed the title WIP Migrate OCP-56266: verify kubelet/crio deletes netns when pod deleted WIP OCPNODE-4560: Migrate OCP-56266 verify kubelet/crio deletes netns when pod deleted Jun 1, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jun 1, 2026
@openshift-ci-robot

openshift-ci-robot commented Jun 1, 2026

@BhargaviGudi: This pull request references OCPNODE-4560 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

@openshift-merge-bot
Contributor

Scheduling required tests:
/test e2e-aws-csi
/test e2e-aws-ovn-fips
/test e2e-aws-ovn-microshift
/test e2e-aws-ovn-microshift-serial
/test e2e-aws-ovn-serial-1of2
/test e2e-aws-ovn-serial-2of2
/test e2e-gcp-csi
/test e2e-gcp-ovn
/test e2e-gcp-ovn-upgrade
/test e2e-metal-ipi-ovn-ipv6
/test e2e-vsphere-ovn
/test e2e-vsphere-ovn-upi

@BhargaviGudi
Contributor Author

/retest-required

@openshift-trt

openshift-trt Bot commented Jun 2, 2026

Risk analysis has seen new tests most likely introduced by this PR.
Please ensure that new tests meet guidelines for naming and stability.

New Test Risks for sha: bb67b91

Job Name | New Test Risk
pull-ci-openshift-origin-main-e2e-aws-ovn-fips | High - "[sig-node] [Jira:Node/Kubelet] Network namespace cleanup [OTP] kubelet/crio will delete netns when a pod is deleted [OCP-56266] [Suite:openshift/conformance/parallel]" is a new test that was not present in all runs against the current commit.
pull-ci-openshift-origin-main-e2e-aws-ovn-microshift | High - "[sig-node] [Jira:Node/Kubelet] Network namespace cleanup [OTP] kubelet/crio will delete netns when a pod is deleted [OCP-56266] [Suite:openshift/conformance/parallel]" is a new test that was not present in all runs against the current commit, and also failed 1 time(s).
pull-ci-openshift-origin-main-e2e-metal-ipi-ovn-ipv6 | High - "[sig-node] [Jira:Node/Kubelet] Network namespace cleanup [OTP] kubelet/crio will delete netns when a pod is deleted [OCP-56266] [Suite:openshift/conformance/parallel]" is a new test that was not present in all runs against the current commit, and also failed 1 time(s).
pull-ci-openshift-origin-main-e2e-vsphere-ovn | High - "[sig-node] [Jira:Node/Kubelet] Network namespace cleanup [OTP] kubelet/crio will delete netns when a pod is deleted [OCP-56266] [Suite:openshift/conformance/parallel]" is a new test that was not present in all runs against the current commit.
pull-ci-openshift-origin-main-e2e-vsphere-ovn-upi | High - "[sig-node] [Jira:Node/Kubelet] Network namespace cleanup [OTP] kubelet/crio will delete netns when a pod is deleted [OCP-56266] [Suite:openshift/conformance/parallel]" is a new test that was not present in all runs against the current commit.

New tests seen in this PR at sha: bb67b91

  • "[sig-node] [Jira:Node/Kubelet] Network namespace cleanup [OTP] kubelet/crio will delete netns when a pod is deleted [OCP-56266] [Suite:openshift/conformance/parallel]" [Total: 11, Pass: 9, Fail: 2, Flake: 0]

@BhargaviGudi
Contributor Author

/test periodic-ci-openshift-origin-release-5.0-e2e-aws-ovn

@openshift-merge-bot
Contributor

Scheduling required tests:
/test e2e-aws-csi
/test e2e-aws-ovn-fips
/test e2e-aws-ovn-microshift
/test e2e-aws-ovn-microshift-serial
/test e2e-aws-ovn-serial-1of2
/test e2e-aws-ovn-serial-2of2
/test e2e-gcp-csi
/test e2e-gcp-ovn
/test e2e-gcp-ovn-upgrade
/test e2e-metal-ipi-ovn-ipv6
/test e2e-vsphere-ovn
/test e2e-vsphere-ovn-upi

@openshift-trt

openshift-trt Bot commented Jun 3, 2026

Risk analysis has seen new tests most likely introduced by this PR.
Please ensure that new tests meet guidelines for naming and stability.

New Test Risks for sha: 3e0ee2d

Job Name | New Test Risk
pull-ci-openshift-origin-main-e2e-aws-ovn-microshift | High - "[sig-node] [Jira:Node/Kubelet] Network namespace cleanup [OTP] kubelet/crio will delete netns when a pod is deleted [OCP-56266] [Suite:openshift/conformance/parallel]" is a new test that failed 1 time(s) against the current commit.

New tests seen in this PR at sha: 3e0ee2d

  • "[sig-node] [Jira:Node/Kubelet] Network namespace cleanup [OTP] kubelet/crio will delete netns when a pod is deleted [OCP-56266] [Suite:openshift/conformance/parallel]" [Total: 6, Pass: 5, Fail: 1, Flake: 0]

@openshift-merge-bot
Contributor

Scheduling required tests:
/test e2e-aws-csi
/test e2e-aws-ovn-fips
/test e2e-aws-ovn-microshift
/test e2e-aws-ovn-microshift-serial
/test e2e-aws-ovn-serial-1of2
/test e2e-aws-ovn-serial-2of2
/test e2e-gcp-csi
/test e2e-gcp-ovn
/test e2e-gcp-ovn-upgrade
/test e2e-metal-ipi-ovn-ipv6
/test e2e-vsphere-ovn
/test e2e-vsphere-ovn-upi

@openshift-ci
Contributor

openshift-ci Bot commented Jun 4, 2026

@BhargaviGudi: all tests passed!

@openshift-trt

openshift-trt Bot commented Jun 4, 2026

Risk analysis has seen new tests most likely introduced by this PR.
Please ensure that new tests meet guidelines for naming and stability.

New tests seen in this PR at sha: 89bd515

  • "[sig-node] [Jira:Node/Kubelet] Network namespace cleanup [OTP] kubelet/crio will delete netns when a pod is deleted [OCP-56266] [Suite:openshift/conformance/parallel]" [Total: 5, Pass: 5, Fail: 0, Flake: 0]

Labels

  • do-not-merge/work-in-progress: Indicates that a PR should not merge because it is a work in progress.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • ready-for-human-review: Indicates a PR has been reviewed by automated tools and is ready for human review