Installer script breaks when certain kernel modules are loaded before driver restart #202

@kitfoman

Certain kernel modules, such as irdma, cause the installer to fail at the openibd restart step. This is particularly problematic in environments where some Mellanox NICs run in Ethernet mode for regular traffic. The restart fails with:

Function: generate_ofed_modules_blacklist
Unloading ib_uverbs [FAILED]
rmmod: ERROR: Module ib_uverbs is in use by: irdma
[16-Jul-25_16:25:49] Command "/etc/init.d/openibd restart" failed with exit code: 1

When this happens, the Ethernet NICs are unclaimed, so the node loses network connectivity and is left in an unrecoverable state.
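To check a node for this condition before running the installer, the modules holding a reference on ib_uverbs can be read from sysfs. The following is only a rough sketch: `module_holders` is a hypothetical helper name, not something in the installer, and it assumes the standard `/sys/module/<name>/holders` layout.

```shell
#!/bin/sh
# Hypothetical helper, not part of the installer script.
# Prints the modules that currently hold a reference on a given module.
# $1: module name
# $2: sysfs module root (optional, defaults to /sys/module)
module_holders() (
    holders_dir="${2:-/sys/module}/$1/holders"
    # Module not loaded, or no holders directory: nothing to report.
    [ -d "$holders_dir" ] || exit 0
    for entry in "$holders_dir"/*; do
        # If the directory is empty the glob stays unexpanded; skip it.
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        basename "$entry"
    done
)
```

On an affected node, `module_holders ib_uverbs` would be expected to list irdma, matching the rmmod error in the log.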

I propose introducing a custom environment variable, CUSTOM_UNLOAD_MODULES, through which a list of kernel modules that need to be unloaded before the restart can be passed to the script.

The implementation can be similar to UNLOAD_STORAGE_MODULES: the list is appended to UNLOAD_MODULES, and openibd handles the unloading. See: https://github.com/Mellanox/doca-driver-build/blob/main/entrypoint.sh#L472
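As a rough illustration of the idea, and not the code from the PR or from entrypoint.sh, the append step could look something like the sketch below. The function name `append_custom_unload_modules`, the accepted separators (spaces or commas), and the name validation are all assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical sketch, not the actual entrypoint.sh implementation.
# Appends user-supplied module names to an existing unload list.
# $1: current list (space-separated)
# $2: extra modules (space- or comma-separated)
append_custom_unload_modules() (
    current=$1
    # Treat commas as separators too (an assumption of this sketch).
    extra=$(printf '%s' "$2" | tr ',' ' ')
    set -f  # disable globbing while the list is word-split
    for mod in $extra; do
        # Reject anything that is not a plausible module name.
        case $mod in
            *[!A-Za-z0-9_-]*)
                echo "skipping invalid module name: $mod" >&2
                continue ;;
        esac
        # Append only if the module is not already in the list.
        case " $current " in
            *" $mod "*) ;;
            *) current="${current:+$current }$mod" ;;
        esac
    done
    printf '%s\n' "$current"
)

# Example wiring: both variables default to empty when unset.
UNLOAD_MODULES=$(append_custom_unload_modules "${UNLOAD_MODULES:-}" "${CUSTOM_UNLOAD_MODULES:-}")
```

With CUSTOM_UNLOAD_MODULES="irdma" set in the container environment, irdma would end up in UNLOAD_MODULES ahead of the openibd restart. Leaving the variable unset keeps the current behavior.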

If this approach sounds reasonable, I have a PR ready. Thanks!
