Important Note: NVIDIA AI Enterprise customers can get support from NVIDIA Enterprise support. Please open a case here.
Describe the bug
The nvidia-cuda-validator pod in the nvidia-gpu-operator namespace is stuck in Init:CrashLoopBackOff.
% oc logs nvidia-cuda-validator-4bnbm -c cuda-validation -n nvidia-gpu-operator
Failed to allocate device vector A (error code initialization error)!
[Vector addition of 50000 elements]
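Additional commands that could help narrow this down (the pod name is taken from the log above; these are suggestions, not output collected for this report):
# Events and init-container state for the failing validator pod
oc describe pod nvidia-cuda-validator-4bnbm -n nvidia-gpu-operator
# Logs from the previous (crashed) attempt of the cuda-validation init container
oc logs nvidia-cuda-validator-4bnbm -c cuda-validation -n nvidia-gpu-operator --previous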
To Reproduce
Install the NVIDIA GPU Operator 25.10.0 from OperatorHub on Red Hat OpenShift Container Platform 4.18.22.
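Assuming the operator is installed into the nvidia-gpu-operator namespace, as in this report, the installed version and overall readiness can be checked with:
# Installed operator version (ClusterServiceVersion) in the operator namespace
oc get csv -n nvidia-gpu-operator
# ClusterPolicy created for the operator; its status reflects overall readiness
oc get clusterpolicy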
Expected behavior
The NVIDIA GPU Operator v25.10.0 installs successfully, with all pods in the nvidia-gpu-operator namespace in a Running state.
Environment (please provide the following information):
Bare-metal OpenShift cluster 4.18.22
1 GPU node with L40S GPU cards
Node Feature Discovery Operator 4.18.0 installed
NVIDIA GPU Operator 25.10.0 installed
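The cluster and GPU-node details above can be confirmed with commands such as the following (the node label is the standard NFD PCI label for NVIDIA's vendor ID 10de and assumes the default Node Feature Discovery configuration):
oc version
# GPU nodes as labeled by Node Feature Discovery
oc get nodes -l feature.node.kubernetes.io/pci-10de.present=true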
Information to attach (optional if deemed irrelevant)
- kubernetes pods status: kubectl get pods -n OPERATOR_NAMESPACE
- kubernetes daemonset status: kubectl get ds -n OPERATOR_NAMESPACE
- If a pod/ds is in an error state or pending state: kubectl describe pod -n OPERATOR_NAMESPACE POD_NAME
- If a pod/ds is in an error state or pending state: kubectl logs -n OPERATOR_NAMESPACE POD_NAME --all-containers
- Output from running nvidia-smi from the driver container: kubectl exec DRIVER_POD_NAME -n OPERATOR_NAMESPACE -c nvidia-driver-ctr -- nvidia-smi
- containerd logs: journalctl -u containerd > containerd.log
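For this report, the placeholders above map to the nvidia-gpu-operator namespace and the failing validator pod; the driver pod name below is only an example (list the pods first to find the real name):
kubectl get pods -n nvidia-gpu-operator
kubectl get ds -n nvidia-gpu-operator
kubectl describe pod -n nvidia-gpu-operator nvidia-cuda-validator-4bnbm
kubectl logs -n nvidia-gpu-operator nvidia-cuda-validator-4bnbm --all-containers
# Driver pod name is an example; the actual suffix will differ
kubectl exec nvidia-driver-daemonset-xxxxx -n nvidia-gpu-operator -c nvidia-driver-ctr -- nvidia-smi
journalctl -u containerd > containerd.log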
Collecting full debug bundle (optional):
curl -o must-gather.sh -L https://raw.githubusercontent.com/NVIDIA/gpu-operator/main/hack/must-gather.sh
chmod +x must-gather.sh
./must-gather.sh
NOTE: please refer to the must-gather script for debug data collected.
This bundle can be submitted to us via email: [email protected]