Fix thick plugin STATUS and GC handling for plugin-level commands #1480

Open
jkary wants to merge 2 commits into k8snetworkplumbingwg:master from jkary:fix-status-gc-no-pod-context

Conversation

@jkary jkary commented Mar 5, 2026

STATUS and GC are plugin-level commands with no pod context per the
CNI 1.1.0 spec. The thick plugin daemon incorrectly required
CNI_CONTAINERID, CNI_NETNS, and K8S_POD_NAME/K8S_POD_NAMESPACE for
these commands, causing failures when invoked by kubelet.
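
The distinction described above can be sketched as a small validation step. This is only an illustration, not the daemon's actual code: the function names, the flat map of fields, and the exact set of fields required for pod-level commands are all assumptions.

```go
package main

import "fmt"

// isPluginLevel reports whether a CNI command carries no pod context.
// Only STATUS and GC are treated as plugin-level in this sketch.
func isPluginLevel(cmd string) bool {
	return cmd == "STATUS" || cmd == "GC"
}

// checkContext validates the fields supplied with a request.
// The required field list below is an assumption for illustration.
func checkContext(cmd string, fields map[string]string) error {
	if isPluginLevel(cmd) {
		// Plugin-level commands must not fail because pod fields are absent.
		return nil
	}
	for _, key := range []string{"CNI_CONTAINERID", "K8S_POD_NAME", "K8S_POD_NAMESPACE"} {
		if fields[key] == "" {
			return fmt.Errorf("%s: missing required field %s", cmd, key)
		}
	}
	return nil
}

func main() {
	empty := map[string]string{}
	fmt.Println("STATUS:", checkContext("STATUS", empty))
	fmt.Println("GC:", checkContext("GC", empty))
	fmt.Println("ADD:", checkContext("ADD", empty))
}
```

With empty fields, STATUS and GC pass while ADD is rejected, which is the behavior the description asks for.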

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical issue where the Multus thick plugin daemon failed to correctly process CNI STATUS and GC commands due to an erroneous requirement for pod-specific context. By aligning with the CNI 1.1.0 specification, these commands are now properly recognized as plugin-level operations, resolving failures encountered by kubelet. Additionally, the daemon now incorporates a graceful shutdown mechanism, enhancing its resilience and operational stability. The changes also include updates to build processes and CNI version handling to improve compatibility and maintainability.

Highlights

  • CNI STATUS and GC Command Fix: Fixed the handling of CNI STATUS and GC commands for thick plugins, which previously incorrectly required pod context (CNI_CONTAINERID, CNI_NETNS, K8S_POD_NAME/NAMESPACE). These commands are now treated as plugin-level operations as per CNI 1.1.0 specification, resolving failures encountered by kubelet.
  • Graceful Shutdown Implementation: Introduced graceful shutdown logic for the Multus daemon, allowing a configurable SigTermCancelAfter duration (10 seconds) before forced termination, and increased terminationGracePeriodSeconds in daemonset configurations from 10 to 30 seconds to accommodate this.
  • Updated CNI Version Handling: Updated CNI version parsing in the thin_entrypoint to correctly handle CNI versions greater than or equal to 1.0.0 for conflist generation, improving compatibility with newer CNI specifications.
  • Refactored Dockerfiles and Entrypoint: Refactored Dockerfiles for MicroShift and OpenShift, consolidating build logic and introducing a new images/entrypoint.sh script for more flexible configuration generation.

Changelog

  • .ci-operator.yaml
    • Added a build root image configuration for OpenShift CI.
  • .snyk
    • Added global exclusion rules for test files and vendor directories to Snyk scans.
  • Dockerfile.microshift
    • Added a new Dockerfile specifically for building Multus for MicroShift, using RHEL 9 and Go 1.24.
  • Dockerfile.openshift
    • Added a new Dockerfile for building Multus for OpenShift, supporting both RHEL 9 and RHEL 8 with Go 1.24.
  • Makefile
    • Appended a newline character to the 'test' target.
  • OWNERS
    • Added a new OWNERS file specifying reviewers, approvers, and component/subcomponent for the repository.
  • cmd/multus-daemon/main.go
    • Imported the 'time' package.
    • Introduced 'SigTermCancelAfter' for graceful shutdown.
    • Added a 'sigTermCtx' and 'isInGracefulShutdownMode' function.
    • Updated 'startMultusDaemon' and 'NewCNIServer' to incorporate graceful shutdown logic.
  • cmd/thin_entrypoint/main.go
    • Imported 'cniversion'.
    • Updated the CNI version check in 'createMultusConfig' to use 'GreaterThanOrEqualTo' for CNI versions 1.0.0 and above.
  • cmd/thin_entrypoint/main_test.go
    • Added a new test case for 'createMultusConfig' with CNI version 1.1.0.
    • Added new test cases to verify propagation of delegate STATUS errors and single plugin delegate GC.
  • deployments/multus-daemonset-crio.yml
    • Increased 'terminationGracePeriodSeconds' from 10 to 30.
  • deployments/multus-daemonset-thick.yml
    • Increased 'terminationGracePeriodSeconds' from 10 to 30.
  • deployments/multus-daemonset.yml
    • Increased 'terminationGracePeriodSeconds' from 10 to 30.
  • go.mod
    • Updated the Go language version requirement from '1.24' to '1.24.0'.
  • images/Dockerfile.openshift
    • Removed the old Dockerfile for OpenShift.
  • images/entrypoint.sh
    • Added a new bash script to serve as the entrypoint for Multus CNI, handling configuration generation and kubeconfig creation.
  • pkg/kubeletclient/kubeletclient_test.go
    • Adjusted formatting for the 'PoolName' field in a test struct.
  • pkg/multus/multus.go
    • Imported 'stderrors' and 'cniversion'.
    • Added 'confStatus' and 'conflistStatus' functions for CNI STATUS command handling.
    • Updated 'CmdAdd' to use 'stderrors.Is' for 'errPodNotFound'.
    • Refactored 'CmdStatus' and 'CmdGC' to correctly handle single plugin delegates and CNI errors.
  • pkg/multus/multus_cni100_test.go
    • Imported 'errors', 'cnitypes', and 'types'.
    • Added new test cases for 'CmdAdd' when a pod is not found, delegate STATUS error propagation, and single plugin delegate GC.
  • pkg/netutils/netutils.go
    • Imported 'cniversion'.
    • Introduced 'isSupportedGatewayResultVersion' to check CNI versions.
    • Updated 'deleteDefaultGWResult' and 'addDefaultGWResult' to use this new version check.
  • pkg/netutils/netutils_test.go
    • Added test cases to verify gateway result updates for CNI version 1.1.0 and to confirm rejection of unsupported pre-1.0.0 CNI versions.
  • pkg/server/api/api.go
    • Imported 'cnitypes'.
    • Defined 'MultusReadyAPIEndpoint'.
    • Updated 'DoCNI' to unmarshal CNI errors from HTTP responses.
    • Changed 'WaitUntilAPIReady' to poll the new 'MultusReadyAPIEndpoint'.
  • pkg/server/api/shim.go
    • Imported 'stderrors'.
    • Updated 'postRequest' to handle CNI errors returned from 'DoCNI'.
  • pkg/server/server.go
    • Imported 'errors'.
    • Added 'cmdDelegateStatus' to handle delegate STATUS requests.
    • Updated 'NewCNIServer' and 'newCNIServer' to accept a graceful shutdown mode function.
    • Introduced 'writeCNIErrorResponse' and 'wrapCNIRequestError' for consistent error handling.
    • Modified 'handleCNIRequest', 'handleDelegateRequest', 'extractCniData', 'cmdGC', and 'cmdStatus' to correctly process plugin-level commands without requiring pod context.
  • pkg/server/thick_cni_test.go
    • Added a new test context to verify STATUS and GC commands succeed without pod context.
    • Updated 'startCNIServer' to pass a mock 'isInGracefulShutdownMode' function.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request aims to fix an issue where the thick plugin daemon required pod context for plugin-level CNI commands like STATUS and GC, and also includes changes for graceful shutdown. However, it introduces critical security vulnerabilities in the new entrypoint script for the Multus CNI container. Specifically, the script is vulnerable to JSON injection via the CNI_VERSION parameter, path traversal leading to arbitrary file renames on the host via the MASTER_PLUGIN parameter, and potential configuration corruption due to insecure shell scripting and improper JSON escaping. These issues are particularly critical given that the Multus container typically runs with high privileges and host filesystem access. It is recommended to use JSON-aware tools for configuration generation and to strictly validate all command-line inputs to address these security concerns. Additionally, a bug was identified in the new STATUS handling logic, and there are opportunities for cleanup in the Dockerfiles and shell scripts.


MASTER_PLUGIN_LOCATION=$MULTUS_AUTOCONF_DIR/$MASTER_PLUGIN
if [ "$FORCE_CNI_VERSION" == true ]; then
MASTER_PLUGIN_JSON="$(cat $MASTER_PLUGIN_LOCATION | sed -e "s/\"cniVersion.*/\"cniVersion\": \"$CNI_VERSION\",/g")"

security-high high

The CNI_VERSION variable, which is populated from the --cni-version command-line argument, is used directly in a sed command without any sanitization or validation. An attacker who can control the arguments to the script can inject arbitrary JSON keys and values into the generated CNI configuration. This could be used to inject a malicious CNI plugin or alter the behavior of existing ones, potentially leading to a compromise of the node's networking stack.

Recommendation: Validate that CNI_VERSION matches an expected version format (e.g., ^[0-9]+\.[0-9]+\.[0-9]+$) before using it in the sed command. Alternatively, use a JSON-aware tool like jq or a Python script to modify the JSON content safely.
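
As a rough illustration of the validation recommended above, the sketch below applies the same pattern in Go. The function name validCNIVersion is hypothetical, and the reviewed script is shell, so this shows the check itself rather than a drop-in fix.

```go
package main

import (
	"fmt"
	"regexp"
)

// cniVersionRe is the version pattern quoted in the review comment.
var cniVersionRe = regexp.MustCompile(`^[0-9]+\.[0-9]+\.[0-9]+$`)

// validCNIVersion reports whether v looks like MAJOR.MINOR.PATCH.
func validCNIVersion(v string) bool {
	return cniVersionRe.MatchString(v)
}

func main() {
	candidates := []string{
		"1.1.0",
		`0.4.0", "type": "injected`, // attempt to smuggle in an extra JSON key
	}
	for _, v := range candidates {
		fmt.Printf("%q accepted: %v\n", v, validCNIVersion(v))
	}
}
```

Rejecting anything that is not three dot-separated numbers means quotes, commas, and sed metacharacters never reach the substitution.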


# If we're not performing the cleanup on exit, we can safely rename the config file.
if [ "$RENAME_SOURCE_CONFIG_FILE" == true ]; then
mv ${MULTUS_AUTOCONF_DIR}/${MASTER_PLUGIN} ${MULTUS_AUTOCONF_DIR}/${MASTER_PLUGIN}.old

security-high high

The MASTER_PLUGIN variable, which can be controlled via the --multus-master-cni-file-name command-line argument, is used to construct a file path for a mv command without validation. Since the container typically runs as privileged with host mounts, an attacker could use path traversal sequences (e.g., ../../etc/shadow) to rename arbitrary files on the host filesystem, potentially leading to a full system compromise or Denial of Service.

Recommendation: Validate that MASTER_PLUGIN is a simple filename that contains no path separators (/) and is not a parent-directory reference (..).
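
A minimal sketch of such a filename check, written in Go for illustration. The function name validPluginFileName is hypothetical, and rejecting backslashes as well is an extra precaution that the review comment does not ask for.

```go
package main

import (
	"fmt"
	"strings"
)

// validPluginFileName reports whether name is a plain file name that
// cannot escape the directory it is joined onto.
func validPluginFileName(name string) bool {
	if name == "" || name == "." || name == ".." {
		return false
	}
	// Any separator means the value is a path, not a file name.
	return !strings.ContainsAny(name, `/\`)
}

func main() {
	for _, name := range []string{"10-flannel.conf", "../../etc/shadow", ".."} {
		fmt.Printf("%q accepted: %v\n", name, validPluginFileName(name))
	}
}
```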

Comment on lines +379 to +381
if gt, _ := cniversion.GreaterThanOrEqualTo(confList.CNIVersion, "1.1.0"); !gt {
logging.Debugf("conflistStatus: skipping STATUS for network list %q cniVersion %q (< 1.1.0)", confList.Name, confList.CNIVersion)
}

high

This if block logs that it is skipping the STATUS call for older CNI versions, but it doesn't actually skip the execution. It should return nil to prevent cniNet.GetStatusNetworkList from being called, similar to how it's handled in confStatus.

Suggested change
- if gt, _ := cniversion.GreaterThanOrEqualTo(confList.CNIVersion, "1.1.0"); !gt {
- 	logging.Debugf("conflistStatus: skipping STATUS for network list %q cniVersion %q (< 1.1.0)", confList.Name, confList.CNIVersion)
- }
+ if gt, _ := cniversion.GreaterThanOrEqualTo(confList.CNIVersion, "1.1.0"); !gt {
+ 	logging.Debugf("conflistStatus: skipping STATUS for network list %q cniVersion %q (< 1.1.0)", confList.Name, confList.CNIVersion)
+ 	return nil
+ }

EOF
)
tmpfile=$(mktemp)
echo $CONF > $tmpfile

security-medium medium

Multiple variables, including $CONF on line 454, are used without double quotes in shell commands. This causes the shell to perform word splitting and globbing on the variable's content. If the generated JSON configuration contains shell metacharacters (like *), they will be expanded to filenames in the current directory, leading to a corrupted configuration file. It also strips newlines and extra whitespace, which can result in invalid JSON.

Recommendation: Always wrap shell variables in double quotes when they are used in commands, especially when they contain data that should be preserved literally. For example, change line 454 to echo "$CONF" > "$tmpfile".

Comment on lines +433 to +451
CONF=$(cat <<-EOF
{
$CNI_VERSION_STRING
"name": "$MASTER_PLUGIN_NET_NAME",
"type": "multus",
$NESTED_CAPABILITIES_STRING
$ISOLATION_STRING
$GLOBAL_NAMESPACES_STRING
$LOG_TO_STDERR_STRING
$LOG_LEVEL_STRING
$LOG_FILE_STRING
$ADDITIONAL_BIN_DIR_STRING
$READINESS_INDICATOR_FILE_STRING
"kubeconfig": "$MULTUS_KUBECONFIG_FILE_HOST",
"delegates": [
$MASTER_PLUGIN_JSON
]
}
EOF

security-medium medium

Values extracted from external CNI configuration files (such as MASTER_PLUGIN_NET_NAME on line 393) are embedded directly into a JSON template using a heredoc without proper JSON escaping. If these values contain double quotes or other characters that are special in JSON, the resulting 00-multus.conf file will be malformed. This can cause the Multus daemon to fail to start or fail to process networking requests for pods.

Recommendation: Use a JSON-aware tool to generate the configuration file instead of a shell heredoc. If shell must be used, ensure all variables are properly escaped for JSON using a tool like jq or a Python one-liner.

Comment on lines +8 to +9
RUN ./hack/build-go.sh && \
cd /usr/src/multus-cni/bin

medium

The cd /usr/src/multus-cni/bin command in this RUN instruction has no effect on subsequent instructions. Each RUN command executes in its own shell, and the working directory is reset for the next one based on the WORKDIR directive. This can be removed to avoid confusion. The same issue is present on lines 18-19.

RUN ./hack/build-go.sh

Comment on lines +133 to +135
--cni-bin-dir)
CNI_BIN_DIR=$VALUE
;;

medium

This case for --cni-bin-dir is a duplicate of the one on lines 127-129. This redundant block should be removed.

Comment on lines +67 to +72
isInGracefulShutdownMode := func() bool {
if sigTermCtx.Err() == nil {
return false
}
return true
}

medium

This function can be simplified to a single line for better readability.

	isInGracefulShutdownMode := func() bool { return sigTermCtx.Err() != nil }

STATUS and GC are plugin-level commands with no pod context per the
CNI 1.1.0 spec. The thick plugin daemon incorrectly required
CNI_CONTAINERID, CNI_NETNS, and K8S_POD_NAME/K8S_POD_NAMESPACE for
these commands, causing failures when invoked by kubelet.

Signed-off-by: Jason Kary <jkary@redhat.com>
@jkary jkary force-pushed the fix-status-gc-no-pod-context branch from d351be1 to 571a3ce Compare March 5, 2026 16:23
@bpickard22
Collaborator

@jkary thanks for the contribution! The bot has left a pretty good review, take a look and address that and then I will take a look again

@bpickard22 bpickard22 self-assigned this Mar 11, 2026
The version check logged that STATUS would be skipped for older CNI
versions but did not actually return, causing GetStatusNetworkList to
be called regardless. Add the missing return nil.
@jkary
Author

jkary commented Mar 11, 2026

@bpickard22 thanks for taking a look! I went through the Gemini comments and it looks like the bot reviewed the full branch diff against main rather than just this PR. Most of the comments are about files that aren't changed in this PR at all, like images/entrypoint.sh, Dockerfile.openshift, and cmd/multus-daemon/main.go.

This PR only touches pkg/server/server.go and pkg/server/thick_cni_test.go, so those comments don't apply here.

One thing the bot did catch that is worth fixing is a bug in pkg/multus/multus.go where conflistStatus logs that it will skip STATUS for older CNI versions but doesn't actually return, so it falls through and calls GetStatusNetworkList anyway. I pushed a fix for that.

Collaborator

@pliurh pliurh left a comment

Apart from the nits, it looks good to me.

var err error
K8sClient = fakeK8sClient()
// Touch the default network file.
os.OpenFile(configPath, os.O_RDONLY|os.O_CREATE, 0755)
Collaborator

The return values are ignored. We should check the error returned by this call and make sure it is nil.

}
}

result, err := s.HandleCNIRequest(cmdType, k8sArgs, cniCmdArgs)
Collaborator

Consider documenting that k8sArgs may be nil for STATUS/GC in the HandleCNIRequest() signature, or adding a nil guard in it.
