The NFD tests verify the deployment of the Node Feature Discovery (NFD) operator and its NodeFeatureDiscovery custom resource instance, which automatically detects and labels the hardware and software capabilities of cluster nodes.
The Node Feature Discovery Operator manages the deployment and lifecycle of NFD in a Kubernetes cluster. NFD discovers the hardware features available on each node and advertises those features using node labels.
- Regular cluster with multiple master nodes (VMs or BMs) and worker nodes (VMs or BMs), or SNO
- Public cloud clusters (AWS, GCP, and Azure)
- On-premises clusters
| Name | Description |
|---|---|
| features | Tests related to NFD operator deployment and feature discovery |
| 2upgrade | Tests related to NFD operator upgrade functionality |
Notes:
- The `2upgrade` name is used to ensure that ginkgo runs the upgrade tests before `features`.
- `features` contains the main NFD functionality tests, including label discovery, pod status checks, and node feature detection.
- Utilities to obtain various objects like pod status, logs, restart count, node feature labels, CPU features, and resource counts.
- Helpers for waiting for various states of pods, nodes, and label discovery completion.
- Helper for setting CPU feature configurations including blacklist/whitelist management and custom NFD worker configurations.
- Configuration management that captures and processes environment variables used for NFD test execution.
- Helpers for cleaning up NFD-specific labels from cluster nodes and custom resources after test execution.
  - `custom_resources.go`: Comprehensive NFD custom resource cleanup with finalizer handling for operator uninstallation.
  - `feature_labels.go`: Node label cleanup utilities for test isolation.
- Common utilities for searching and string manipulation operations.
- Constants and variables for NFD helper functions including valid pod name lists and container definitions.
- Configuration constants including default NFD worker configuration templates.
Parameters for the script are controlled by the following environment variables:
- ECO_HWACCEL_NFD_CR_IMAGE: Custom NFD container image to use for the NodeFeatureDiscovery CR. If not specified, the default operator image is used - optional
- ECO_HWACCEL_NFD_CATALOG_SOURCE: Custom catalog source to be used for NFD operator installation. If not specified, the default "certified-operators" catalog is used - optional
- ECO_HWACCEL_NFD_AWS_TESTS: Enable AWS-specific tests, including day2 worker node addition. Set to "true" when running NFD tests against AWS clusters - optional
- ECO_HWACCEL_NFD_CPU_FLAGS_HELPER_IMAGE: Container image used for CPU flags helper tests - required for AWS day2 worker tests
- ECO_HWACCEL_NFD_SUBSCRIPTION_NAME: Name of the subscription used to deploy the NFD operator - required for upgrade tests
- ECO_HWACCEL_NFD_CUSTOM_NFD_CATALOG_SOURCE: Custom catalog source name used for performing operator upgrades - required for upgrade tests
- ECO_HWACCEL_NFD_UPGRADE_TARGET_VERSION: Expected version of the operator after upgrade completion - required for upgrade tests
- ECO_TEST_LABELS: ginkgo query passed to the label-filter option for including/excluding tests - optional
- ECO_VERBOSE_SCRIPT: prints verbose script information when executing the script - optional
- ECO_TEST_VERBOSE: executes ginkgo with verbose test output - optional
- ECO_TEST_TRACE: includes the full stack trace from ginkgo tests when a failure occurs - optional
- ECO_TEST_FEATURES: list of features to be tested; must include "nfd" for NFD tests - required
If the required inputs are not set, the tests are skipped.
$ export KUBECONFIG=/path/to/kubeconfig
$ export ECO_DUMP_FAILED_TESTS=true
$ export ECO_REPORTS_DUMP_DIR=/tmp/eco-gotests-logs-dir
$ export ECO_TEST_FEATURES="nfd"
$ export ECO_TEST_LABELS='nfd,discovery-of-labels'
$ export ECO_VERBOSE_LEVEL=100
$ export ECO_HWACCEL_NFD_CATALOG_SOURCE="certified-operators"
$ export ECO_HWACCEL_NFD_CR_IMAGE="registry.redhat.io/openshift4/ose-node-feature-discovery:latest"
$ make run-tests

$ export KUBECONFIG=/path/to/kubeconfig
$ export ECO_TEST_FEATURES="nfd"
$ export ECO_TEST_LABELS='nfd'
$ export ECO_HWACCEL_NFD_AWS_TESTS=true
$ export ECO_HWACCEL_NFD_CPU_FLAGS_HELPER_IMAGE="<cpu-flags-helper image specific to cluster architecture>"
$ export ECO_HWACCEL_NFD_CATALOG_SOURCE="certified-operators"
$ make run-tests

$ export KUBECONFIG=/path/to/kubeconfig
$ export ECO_TEST_FEATURES="nfd"
$ export ECO_TEST_LABELS='nfd_upgrade'
$ export ECO_HWACCEL_NFD_SUBSCRIPTION_NAME='nfd-subscription'
$ export ECO_HWACCEL_NFD_UPGRADE_TARGET_VERSION='4.17.0'
$ export ECO_HWACCEL_NFD_CUSTOM_NFD_CATALOG_SOURCE='custom-nfd-catalog'
$ export ECO_HWACCEL_NFD_CATALOG_SOURCE="certified-operators"
$ make run-tests

The NFD test suite covers the following scenarios:
- Pod Status Verification: Ensures all NFD pods (controller-manager, master, worker, topology) are running correctly
- Log Analysis: Checks NFD pod logs for error messages and exceptions
- Restart Count Monitoring: Verifies that NFD pods have zero restart counts
- Feature Label Discovery: Tests detection and labeling of CPU features, kernel configurations, and hardware capabilities
- NUMA Detection: Validates NUMA topology detection and labeling (when supported)
- Blacklist/Whitelist Functionality: Tests CPU feature filtering using blacklist and whitelist configurations
- Day2 Worker Addition: Tests NFD functionality when adding new worker nodes to AWS clusters
- Operator Upgrade: Tests upgrading NFD operator to newer versions
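At its core, the feature-label-discovery check filters a node's labels down to those in the NFD namespace (`feature.node.kubernetes.io/`). A minimal sketch of that filtering step, with `discoveredFeatures` as an illustrative helper name:

```go
package main

import (
	"fmt"
	"strings"
)

// nfdLabelPrefix is the namespace NFD uses for discovered feature labels.
const nfdLabelPrefix = "feature.node.kubernetes.io/"

// discoveredFeatures filters a node's label map down to NFD feature labels,
// which is the core of the label-discovery assertions.
func discoveredFeatures(nodeLabels map[string]string) map[string]string {
	found := map[string]string{}
	for k, v := range nodeLabels {
		if strings.HasPrefix(k, nfdLabelPrefix) {
			found[k] = v
		}
	}
	return found
}

func main() {
	labels := map[string]string{
		"kubernetes.io/hostname":                    "worker-0",
		"feature.node.kubernetes.io/cpu-cpuid.SSE2": "true",
	}
	fmt.Println(len(discoveredFeatures(labels))) // prints 1
}
```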
The NFD tests use a generic operator installer together with standalone custom-resource utilities:
installConfig := nfdhelpers.GetStandardNFDConfig(apiClient)
installer := deploy.NewOperatorInstaller(installConfig)
err := installer.Install()
// Wait for operator readiness
ready, err := installer.IsReady(5 * time.Minute)
// Deploy the NFD custom resource
crUtils := deploy.NewNFDCRUtils(apiClient, "openshift-nfd")
nfdConfig := deploy.NFDCRConfig{
EnableTopology: true,
Image: "registry.redhat.io/openshift4/ose-node-feature-discovery:latest",
}
err = crUtils.DeployNFDCR("nfd-instance", nfdConfig)
// Wait for CR readiness
crReady, err := crUtils.IsNFDCRReady("nfd-instance", 3*time.Minute)

// Use helper with custom catalog source
options := &nfdhelpers.NFDInstallConfigOptions{
CatalogSource: nfdhelpers.StringPtr("custom-catalog"),
}
installConfig := nfdhelpers.GetDefaultNFDInstallConfig(apiClient, options)
installer := deploy.NewOperatorInstaller(installConfig)

// Deploy NFD CR with custom worker configuration for CPU features blacklist/whitelist
workerConfig := `
sources:
cpu:
cpuid:
attributeBlacklist: ["BMI1", "BMI2"]
attributeWhitelist: ["SSE", "SSE2"]
`
err = crUtils.DeployNFDCRWithWorkerConfig("nfd-custom", nfdConfig, workerConfig)

// Individual CR deletion (for tests)
err := nfdCRUtils.DeleteNFDCR("nfd-instance")
// Complete uninstallation with automatic CR cleanup
uninstallConfig := nfdhelpers.GetDefaultNFDUninstallConfig(
apiClient,
"nfd-operator-group",
"nfd-subscription")
uninstaller := deploy.NewOperatorUninstaller(uninstallConfig)
err = uninstaller.Uninstall()
// ✅ Automatically deletes CRs first (including finalizer handling)
// ✅ Then removes operator, subscription, operator group

// Direct cleanup of all NFD CRs (useful for tests)
err := nfddelete.AllNFDCustomResources(apiClient, "openshift-nfd")
// Or cleanup specific CRs
err := nfddelete.AllNFDCustomResources(apiClient, "openshift-nfd",
	"nfd-instance", "nfd-instance-custom")

- Clean Separation: Operator installation is separate from custom resource management
- Reusable Components: The generic installer works for NFD, KMM, AMD GPU, and other operators
- Standalone Utilities: NFD CR utilities can be used independently of operator installation
- Better Error Handling: Comprehensive validation and improved logging for troubleshooting
- Simplified Usage: Direct, intuitive API without complex factory patterns
- Configuration Helpers: NFD-specific helper functions eliminate repetitive configuration setup
- Reduced Duplication: Common NFD patterns are centralized in reusable helper functions
NFD tests verify that hardware features are properly detected and labeled on cluster nodes. The tests support both standard feature detection and custom configurations through worker config modifications. Special attention is given to CPU feature detection, topology awareness, and proper cleanup of labels after test execution.
The new deployment framework provides better reliability and easier maintenance while supporting all existing test scenarios and configurations.