Skip to content

Revert "daemon: fix kernel argument management and logging errors"#147

Merged
e0ne merged 1 commit intoMellanox:network-operator-26.1.xfrom
rollandf:revert-grub
Feb 2, 2026
Merged

Revert "daemon: fix kernel argument management and logging errors"#147
e0ne merged 1 commit intoMellanox:network-operator-26.1.xfrom
rollandf:revert-grub

Conversation

@rollandf
Copy link
Member

@rollandf rollandf commented Feb 2, 2026

This reverts commit d19331c.

@github-actions
Copy link

github-actions bot commented Feb 2, 2026

Thanks for your PR,
To run vendors CIs, Maintainers can use one of:

  • /test-all: To run all tests for all vendors.
  • /test-e2e-all: To run all E2E tests for all vendors.
  • /test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.

To skip the vendors CIs, Maintainers can use one of:

  • /skip-all: To skip all tests for all vendors.
  • /skip-e2e-all: To skip all E2E tests for all vendors.
  • /skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor.
    Best regards.

This reverts commit d19331c.

Signed-off-by: Fred Rolland <frolland@nvidia.com>
@greptile-apps
Copy link

greptile-apps bot commented Feb 2, 2026

Greptile Overview

Greptile Summary

This PR reverts a prior change that adjusted Ubuntu GRUB kernel-argument handling and a logging statement in the daemon.

Key impact areas:

  • bindata/scripts/kargs.sh: Ubuntu GRUB edits are again performed via a host-level /tmp/grub intermediate copy before update-grub.
  • pkg/daemon/daemon.go: checkOnNodeStateChange logging was reverted, but the current code logs dn.mainPlugin.Name (not Name()), which is inconsistent with the rest of the file and likely breaks logging/compilation depending on the plugin type.

Confidence Score: 2/5

  • This PR has a high chance of introducing a build/runtime issue due to a broken logging reference, and it reintroduces a race-prone tmp-file behavior.
  • Confidence is reduced because dn.mainPlugin.Name appears to be a method reference rather than an invoked method, which can fail compilation or produce unusable logs. The revert in kargs.sh also brings back a fixed /tmp/grub temp file that can collide across runs and is not cleaned up.
  • pkg/daemon/daemon.go, bindata/scripts/kargs.sh

Important Files Changed

Filename Overview
bindata/scripts/kargs.sh Reverts to using a fixed /tmp/grub copy for Ubuntu GRUB edits; this reintroduces cross-run interference and leaves a stale tmp file behind.
pkg/daemon/daemon.go Reverts a logging change in checkOnNodeStateChange but leaves a likely broken reference to dn.mainPlugin.Name (should be Name()).

Sequence Diagram

sequenceDiagram
    participant D as sriov-config-daemon
    participant P as mainPlugin (VendorPlugin)
    participant H as Host (/host)
    participant G as GRUB config (/etc/default/grub)

    D->>P: OnNodeStateChange(desiredNodeState)
    alt plugin returns error
        D-->>D: log error (mainPluginName)
    else plugin returns requirements
        D-->>D: aggregate drain/reboot requirement
    end

    opt Ubuntu kernel args update
        D->>H: chroot /host read /etc/os-release
        D->>G: copy to /tmp/grub
        D->>G: edit /tmp/grub (sed/grep)
        D->>G: copy /tmp/grub back to /etc/default/grub
        D->>H: chroot /host update-grub
    end

Loading

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 files reviewed, 2 comments

Edit Code Review Agent Settings | Greptile

Comment on lines 298 to 301
reqDrain, reqReboot, err := dn.mainPlugin.OnNodeStateChange(desiredNodeState)
if err != nil {
funcLog.Error(err, "OnNodeStateChange plugin error", "mainPluginName", dn.mainPlugin.Name())
funcLog.Error(err, "OnNodeStateChange plugin error", "mainPluginName", dn.mainPlugin.Name)
return false, false, err
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P0] dn.mainPlugin.Name is a method, not a field

plugin.VendorPlugin is used elsewhere as an interface with a Name() method (see line 304). Logging dn.mainPlugin.Name will print a function value (or fail compilation depending on the concrete type), and it’s inconsistent with the other log entries.

Suggested change
reqDrain, reqReboot, err := dn.mainPlugin.OnNodeStateChange(desiredNodeState)
if err != nil {
funcLog.Error(err, "OnNodeStateChange plugin error", "mainPluginName", dn.mainPlugin.Name())
funcLog.Error(err, "OnNodeStateChange plugin error", "mainPluginName", dn.mainPlugin.Name)
return false, false, err
funcLog.Error(err, "OnNodeStateChange plugin error", "mainPluginName", dn.mainPlugin.Name())

Comment on lines 35 to +38
if ${IS_OS_UBUNTU} ; then
grub_config="/etc/default/grub"
# Operate on the copy of the file
# Use host's /tmp because container root might be read-only
tmp_grub_path=$(chroot "$chroot_path" mktemp)
tmp_grub_in_container="${chroot_path}${tmp_grub_path}"
cp ${chroot_path}/${grub_config} "${tmp_grub_in_container}"
cp ${chroot_path}/${grub_config} /tmp/grub
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P1] /tmp/grub is a fixed path and isn’t cleaned up

On Ubuntu this script copies ${chroot_path}/${grub_config} to a host-level /tmp/grub and later writes it back, but it never deletes it and it’s not unique per invocation. This can cause cross-run interference (concurrent daemons/nodes/containers writing the same file) and can leave stale content behind if the script exits early.

@e0ne e0ne merged commit a713642 into Mellanox:network-operator-26.1.x Feb 2, 2026
9 of 12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants