PoC: auto-detect NetworkManager and use nmstatectl kernel mode #1455

Draft
qinqon wants to merge 1 commit into nmstate:main from qinqon:nmstate-kernel-mode

@qinqon
Member

@qinqon commented Mar 13, 2026

Summary

This is a Proof of Concept — not intended to merge as-is.

Auto-detect NetworkManager presence at handler startup and transparently switch to nmstatectl kernel mode (-k flag) when NM is absent. This enables kubernetes-nmstate to run on nodes without NetworkManager (e.g., kind containers, minimal hosts).

  • Handler calls nm.Version() at startup; if NM is unavailable, sets kernelMode = true
  • nmstatectl show/apply use -k flag in kernel mode
  • Checkpoint/rollback/commit/probes are skipped (not supported without NM)
  • NNS reporting works via kernel mode — verified on a kind cluster
  • Liveness probe adapts to use -k when /tmp/kernel-mode flag file exists
  • E2E tests get feature Label() decorators for mode-based filtering (--label-filter)
  • New make test-e2e-handler-kernel target runs only kernel-supported tests
  • GitHub Actions workflow added for kernel mode CI on kind

Kernel mode limitations: no checkpoint/rollback, no DHCP, changes are volatile (lost on reboot).

Depends-on:

Verified on kind

  • 3-node kind cluster (no NetworkManager)
  • Handler auto-detects kernel mode at startup
  • NNS created for all 3 nodes with full network state
  • hostNetworkManagerVersion reports N/A (kernel mode)
  • All handler pods Running/Ready, liveness probes passing

Test plan

  • Unit tests pass (go test ./pkg/... ./controllers/...)
  • Operator controller tests pass (liveness probe assertions updated)
  • All packages compile
  • Manual verification on kind cluster — NNS working
  • E2E kernel mode tests on kind (make test-e2e-handler-kernel)
  • E2E NM mode tests unchanged (make test-e2e-handler)

🤖 Generated with Claude Code

@kubevirt-bot
Collaborator

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@kubevirt-bot
Collaborator

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.


@kubevirt-bot added labels on Mar 13, 2026:
  • do-not-merge/work-in-progress (Indicates that a PR should not merge because it is a work in progress.)
  • dco-signoff: yes (Indicates the PR's author has DCO signed all their commits.)
  • do-not-merge/release-note-label-needed (Indicates that a PR should not merge because it's missing one of the release note labels.)

@kubevirt-bot
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign phoracek for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This Proof of Concept pull request introduces a significant enhancement to kubernetes-nmstate by enabling it to operate effectively in environments lacking NetworkManager. By automatically detecting NetworkManager's absence and transparently switching to nmstatectl's kernel mode, the handler can now manage network configurations on minimal hosts and containerized environments like kind clusters. This change broadens the applicability of kubernetes-nmstate, although it necessitates certain trade-offs, such as the unavailability of checkpointing, rollback, and DHCP functionalities, and the ephemeral nature of network changes in kernel mode.

Highlights

  • Automatic NetworkManager Detection: The handler now automatically detects the presence of NetworkManager at startup.
  • Transparent Kernel Mode Switching: If NetworkManager is not found, the handler transparently switches to using nmstatectl in kernel mode (with the -k flag).
  • Expanded Compatibility: This change enables kubernetes-nmstate to run on nodes without NetworkManager, such as kind containers or minimal hosts.
  • nmstatectl Command Adaptation: nmstatectl show and apply commands are modified to automatically include the -k flag when operating in kernel mode.
  • Skipped Unsupported Operations: Checkpointing, rollback, commit, and probes are skipped when the handler is running in kernel mode, as these functionalities are not supported without NetworkManager.
  • NNS Reporting in Kernel Mode: Node Network State (NNS) reporting has been verified to work correctly even when the handler is in kernel mode.
  • Dynamic Liveness Probe: The liveness probe dynamically adapts its command to use nmstatectl show -k if a /tmp/kernel-mode flag file exists, indicating kernel mode operation.
  • E2E Test Enhancements: E2E tests have been updated with feature Label() decorators, allowing for mode-based filtering of tests.
  • New Kernel Mode E2E Test Target: A new make test-e2e-handler-kernel target has been introduced to specifically run E2E tests that are supported in kernel mode.
  • CI for Kernel Mode: A GitHub Actions workflow has been added to enable continuous integration for kernel mode on kind clusters.
  • Kernel Mode Limitations: It is noted that kernel mode has limitations, including no checkpoint/rollback, no DHCP support, and network changes being volatile (lost on reboot).
Changelog
  • Makefile
    • Defined KERNEL_MODE_FEATURES for kernel-supported E2E tests.
    • Added test-e2e-handler-kernel target to run kernel-mode E2E tests using label filtering.
    • Included the new kernel-mode E2E test target in check-gen.
  • cmd/handler/main.go
    • Imported the nm package.
    • Implemented detectAndSetKernelMode to check for NetworkManager and activate kernel mode if it's missing.
    • Integrated detectAndSetKernelMode into the handler's environment setup.
    • Created a /tmp/kernel-mode flag file when operating in kernel mode.
  • controllers/handler/node_controller.go
    • Modified getDependencyVersions to report "N/A (kernel mode)" for NetworkManager version when in kernel mode.
  • controllers/operator/nmstate_controller_test.go
    • Updated liveness probe test assertions to reflect conditional nmstatectl show -k usage.
  • deploy/handler/operator.yaml
    • Adjusted the handler's liveness probe command to dynamically include the -k flag for nmstatectl show if /tmp/kernel-mode exists.
  • pkg/client/client.go
    • Implemented conditional logic in ApplyDesiredState to bypass checkpointing, rollback, and probes, directly applying state with nmstatectl -k when in kernel mode.
  • pkg/nmstatectl/nmstatectl.go
    • Introduced kernelMode state variable and accessor functions (SetKernelMode, IsKernelMode).
    • Modified ShowWithArgumentsAndOutputs and Show to automatically add the -k flag when in kernel mode.
    • Updated Set to use the -k flag and omit --no-commit and --timeout arguments in kernel mode.
    • Implemented early exit and logging for Commit and Rollback functions when in kernel mode, as these operations are not supported.
  • pkg/nmstatectl/nmstatectl_test.go
    • Extended test setup to manage kernelMode state.
    • Added new test cases (TestSetKernelMode, TestShowKernelMode, TestCommitKernelMode, TestRollbackKernelMode) to validate kernel mode behavior for Set, Show, Commit, and Rollback commands.
  • test/e2e/handler/bonding_default_interface_test.go
    • Labeled the test suite with "DHCP".
  • test/e2e/handler/default_bridged_network_test.go
    • Labeled the test suite with "DHCP".
  • test/e2e/handler/default_bridged_network_with_nmpolicy_test.go
    • Labeled the test suite with "DHCP".
  • test/e2e/handler/default_ovs_bridged_network_test.go
    • Labeled the test suite with "OVS".
  • test/e2e/handler/dns_test.go
    • Labeled the test suite with "DHCP".
  • test/e2e/handler/examples_test.go
    • Labeled the test suite with "Examples".
  • test/e2e/handler/lldp_with_nmpolicy_test.go
    • Labeled the test suite with "LLDP".
  • test/e2e/handler/metrics_test.go
    • Labeled the test suite with "Metrics".
  • test/e2e/handler/multiple_policies_for_same_node_test.go
    • Labeled the test suite with "MultiplePolicies".
  • test/e2e/handler/nnce_conditions_test.go
    • Labeled the test suite with "EnactmentCondition".
  • test/e2e/handler/nnce_desiredstate_test.go
    • Labeled the test suite with "EnactmentDesiredState".
  • test/e2e/handler/nncp_cleanup_test.go
    • Labeled the test suite with "PolicyCleanup".
  • test/e2e/handler/nncp_parallel_test.go
    • Labeled the test suite with "PolicyParallel".
  • test/e2e/handler/nns_dependencies_test.go
    • Labeled the test suite with "NNSDependencies".
  • test/e2e/handler/nns_filter_test.go
    • Labeled the test suite with "NNSFilter".
  • test/e2e/handler/nns_ovn.go
    • Labeled the test suite with "OVN".
  • test/e2e/handler/nns_update_timestamp_test.go
    • Labeled the test suite with "NNSTimestamp".
  • test/e2e/handler/node_selector_test.go
    • Labeled the test suite with "NodeSelector".
  • test/e2e/handler/nodes_test.go
    • Labeled the test suite with "Nodes".
  • test/e2e/handler/pending_checkpoint_test.go
    • Labeled the test suite with "Checkpoint".
  • test/e2e/handler/rollback_test.go
    • Labeled the test suite with "Rollback".
  • test/e2e/handler/simple_bridge_and_bond_test.go
    • Labeled the test suite with "DHCP".
  • test/e2e/handler/simple_ovs_bridge_and_bond_test.go
    • Labeled the test suite with "OVS".
  • test/e2e/handler/simple_ovs_bridge_test.go
    • Labeled the test suite with "OVS".
  • test/e2e/handler/simple_vlan_and_ip_test.go
    • Labeled the test suite with "VLAN".
  • test/e2e/handler/static_addr_and_route_test.go
    • Labeled the test suite with "StaticAddress".
  • test/e2e/handler/upgrade_test.go
    • Labeled the test suite with "Upgrade".
  • test/e2e/handler/user_guide_test.go
    • Labeled the test suite with "UserGuide".
  • test/e2e/handler/vrf_test.go
    • Labeled the test suite with "VRF".
  • test/e2e/handler/webhook_test.go
    • Labeled both test suites with "Webhook".
Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/e2e-kernel-mode.yml
Activity
  • Unit tests for pkg and controllers packages have passed.
  • Operator controller tests, including updated liveness probe assertions, have passed.
  • All packages successfully compile.
  • Manual verification on a kind cluster confirmed NNS functionality.
  • E2E kernel mode tests on kind (make test-e2e-handler-kernel) are pending.
  • E2E NetworkManager mode tests (make test-e2e-handler) are pending.

@qinqon force-pushed the nmstate-kernel-mode branch from 3c80bc6 to 635900b on March 13, 2026 12:02
@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a proof-of-concept for running kubernetes-nmstate in a 'kernel mode' when NetworkManager is not available. The changes include auto-detection of NetworkManager at handler startup, a global flag to enable kernel mode, and modifications to nmstatectl commands to use the -k flag and bypass checkpointing features. The liveness probe is also updated to work in both modes. The changes are logical and well-contained for a PoC. I have a few suggestions to improve maintainability and robustness, mainly around the use of a hardcoded flag file path, global state management, and the complexity of the liveness probe command.

detectAndSetKernelMode()

if nmstatectl.IsKernelMode() {
	if err := file.Touch("/tmp/kernel-mode"); err != nil {

medium

The file path /tmp/kernel-mode is hardcoded. To improve maintainability and avoid magic strings, it would be better to define this path as a constant in a shared package (e.g., in pkg/environment or a new pkg/constants). This path is also used in the liveness probe definition in deploy/handler/operator.yaml and related tests, so a constant would ensure consistency.

- bash
- -c
- "nmstatectl show {{ .HandlerReadinessProbeExtraArg }} 2>&1"
- "if [ -f /tmp/kernel-mode ]; then nmstatectl show -k {{ .HandlerReadinessProbeExtraArg }} 2>&1; else nmstatectl show {{ .HandlerReadinessProbeExtraArg }} 2>&1; fi"

medium

The shell command for the liveness probe is complex and is duplicated in tests, which can be brittle and hard to maintain.

if [ -f /tmp/kernel-mode ]; then nmstatectl show -k {{ .HandlerReadinessProbeExtraArg }} 2>&1; else nmstatectl show {{ .HandlerReadinessProbeExtraArg }} 2>&1; fi

Consider adding a small wrapper script to the container image (e.g., /usr/bin/liveness-probe.sh) that encapsulates this logic. The liveness probe would then simply execute this script. This would make the YAML manifest cleaner and centralize the probe logic.

Comment on lines +37 to +41
var (
	debugMode  bool
	kernelMode bool
	log        = logf.Log.WithName("nmstatectl")
)

medium

The use of global variables kernelMode and debugMode makes this package stateful, which can lead to issues with concurrency and makes testing more complex (requiring state manipulation with defer). While this might be acceptable for a PoC where it's set once at startup, for a production-ready feature, it would be more robust to pass configuration through a context or a configuration struct to the functions in this package. This would make the functions stateless and easier to reason about.

@qinqon force-pushed the nmstate-kernel-mode branch 5 times, most recently from a70b99c to 191a8e0 on March 13, 2026 12:43
@qinqon force-pushed the nmstate-kernel-mode branch 3 times, most recently from b758985 to 3f5bdb5 on March 16, 2026 11:06
@qinqon force-pushed the nmstate-kernel-mode branch 5 times, most recently from b379d69 to e4c4a50 on March 16, 2026 11:52
@qinqon force-pushed the nmstate-kernel-mode branch from e4c4a50 to 034c590 on March 16, 2026 12:02
@qinqon force-pushed the nmstate-kernel-mode branch 11 times, most recently from d97feb1 to 6d0a5b3 on March 23, 2026 13:11
@qinqon force-pushed the nmstate-kernel-mode branch 2 times, most recently from fcf4e98 to 006f178 on March 23, 2026 14:10

When NetworkManager is not available (e.g., kind containers, minimal hosts),
the handler now auto-detects this at startup and transparently switches to
nmstatectl kernel mode (-k flag). This affects both applying network config
(NNCP) and reporting state (NNS).

Key changes:
- Add kernelMode flag to nmstatectl package with -k flag for show/apply
- Skip checkpoint/rollback/commit/probes in kernel mode (not supported)
- Auto-detect NM absence at handler startup via nm.Version()
- Guard NM version reporting in node controller
- Update liveness probe to use -k flag when kernel-mode file exists
- Add init container to ensure dbus socket path exists on nodes without dbus
- Add feature Labels to all e2e handler tests for mode-based filtering
- Make BeforeSuite work in kernel mode (no DHCP reset)
- Add test-e2e-handler-kernel Makefile target with label filter
- Add GitHub Actions CI workflow for kernel mode e2e on kind

Kernel mode limitations: no checkpoint/rollback, no DHCP, volatile changes,
cannot create interfaces via nmstatectl apply -k (nispor plugin limitation).

Signed-off-by: Enrique Llorente <ellorent@redhat.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Enrique Llorente <ellorent@redhat.com>
@qinqon force-pushed the nmstate-kernel-mode branch from 006f178 to f17650e on March 23, 2026 14:27
Labels

  • dco-signoff: yes (Indicates the PR's author has DCO signed all their commits.)
  • do-not-merge/release-note-label-needed (Indicates that a PR should not merge because it's missing one of the release note labels.)
  • do-not-merge/work-in-progress (Indicates that a PR should not merge because it is a work in progress.)
  • size/XL

2 participants