nmn3m commented Oct 18, 2025

Description

This PR fixes an issue where the kubelet-plugin init container fails to detect NVIDIA libraries installed in /usr/lib, causing the pod to remain stuck in Init:0/1 status indefinitely.
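
For readers unfamiliar with this kind of failure, here is a minimal sketch of a library lookup that probes several candidate directories, including /usr/lib. It is not the code changed in this PR: the function name `find_library`, the library name `libnvidia-ml.so.1`, and the directory list are assumptions chosen for illustration, and the real init container may check different names and paths.

```python
from pathlib import Path
from typing import Iterable, Optional

# Hypothetical values for illustration only; the real init container may
# probe a different library name and a different set of directories.
LIBRARY_NAME = "libnvidia-ml.so.1"
CANDIDATE_DIRS = (
    "usr/lib64",
    "usr/lib/x86_64-linux-gnu",
    "usr/lib/aarch64-linux-gnu",
    "usr/lib",  # the location this PR describes as not being detected
)


def find_library(
    driver_root: str,
    name: str = LIBRARY_NAME,
    candidate_dirs: Iterable[str] = CANDIDATE_DIRS,
) -> Optional[Path]:
    """Return the first existing path to `name` under `driver_root`, or None."""
    root = Path(driver_root)
    for rel_dir in candidate_dirs:
        candidate = root / rel_dir / name
        if candidate.exists():
            return candidate
    return None
```

If a directory such as usr/lib is missing from the candidate list, a lookup like this returns None even though the library is present on the host, and a caller that waits for a match would wait indefinitely.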

Fixes #692

After the fix:

$ kubectl get pods -n nvidia-dra-driver-gpu
NAME                                               READY   STATUS    RESTARTS   AGE
nvidia-dra-driver-gpu-controller-b65d7c4d9-rbr7t   1/1     Running   0          21m
nvidia-dra-driver-gpu-kubelet-plugin-vbjgc         2/2     Running   0          17s

Init container successfully completes:

copy-pr-bot bot commented Oct 18, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

nmn3m force-pushed the fix/init-container-lib-search-path branch from bea88b6 to da46b95 (October 20, 2025 19:06)

nmn3m commented Oct 23, 2025

@klueska @jgehrcke, can you please take a look at this change?

Projects

Status: Backlog

Development

Successfully merging this pull request may close these issues.

NVIDIA DRA Driver Kubelet Plugin Pod Stuck in Init:0/1 Status
