nvidia drivers and nfs-utils not compatible together talos 1.12.0 - 1.12.2 #12658
Unanswered
epictralala
asked this question in
Q&A
Talos 1.12.0 - 1.12.2
Description
We are running Talos on bare metal/VMware. Whenever we deploy Talos with the NVIDIA drivers (open or proprietary), the nfs-utils extension fails to start rpc-statd. Our setup uses NetApp with Trident to provide RWX volumes for storing LLM data.
Observation:
Problem
The nfs-utils service cannot run alongside the NVIDIA drivers because nvidia-container-toolkit provides /usr/local/lib/libcap.so.2. This library is loaded by ext-rpc-statd from nfs-utils, even though nfs-utils does not require or use /usr/local/lib/libcap.so.2.
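The missing symbol in the error, `__isoc23_strtoul`, is to my knowledge a glibc symbol (added with the C23 support in glibc 2.38), and the "Error relocating ...: symbol not found" wording looks like musl's dynamic loader. That would suggest a glibc-linked libcap.so.2 is being picked up by a musl-linked rpc.statd, but this is an inference from the log message, not something I have confirmed. A rough way to check which files carry the reference, assuming the extension contents have been unpacked to a local directory (the directory is hypothetical, and the helper names are mine), is a plain byte search:

```python
import sys
from pathlib import Path


def references_symbol(path: Path, symbol: str) -> bool:
    """Return True if the file's raw bytes contain the symbol name.

    This is only a heuristic: it does not parse the ELF symbol table,
    so a match means "the name appears in the file", nothing more.
    """
    return symbol.encode("ascii") in path.read_bytes()


def find_referencing_files(root: Path, symbol: str) -> list[Path]:
    """List regular files under root whose bytes contain the symbol name."""
    return sorted(
        p
        for p in root.rglob("*")
        if p.is_file() and not p.is_symlink() and references_symbol(p, symbol)
    )


if __name__ == "__main__":
    # Usage: python find_symbol.py <unpacked-extension-dir> <symbol>
    for hit in find_referencing_files(Path(sys.argv[1]), sys.argv[2]):
        print(hit)
```

If libcap.so.2 from the NVIDIA extension shows up for `__isoc23_strtoul` and nothing from nfs-utils does, that would at least be consistent with the theory above.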
Question:
Is there a way to prevent nfs-utils from loading /usr/local/lib/libcap.so.2?
Extensions
intel-ucode 20251111
iscsi-tools v0.2.0
multipath-tools v0.0.1
nfs-utils v0.1.1
nfsd v1.12.2
nvidia-container-toolkit-lts 580.126.09-v1.18.1
nvidia-open-gpu-kernel-modules-lts 580.126.09-v1.12.2
trident-iscsi-tools v0.0.1
util-linux-tools 2.41.2
schematic 9b4b72f0bcafb0012890bfc8c1aef54b9e8814965cece511f9ae5081f5645167
modules.dep 6.18.5-talos
dmesg
These two lines repeat every second in the dashboard and in dmesg:
192.168.0.10: user: warning: [2026-01-23T17:22:37.610643189Z]: [talos] serviceext-rpc-statd: Started task ext-rpc-statd (PID 232589) for container ext-rpc-statd
192.168.0.10: user: warning: [2026-01-23T17:22:37.636555189Z]: [talos] serviceext-rpc-statd: Error running Containerd(ext-rpc-statd), going to restart forever: task "ext-rpc-statd" failed: exit code 127 (last log "Error relocating /usr/local/lib/libcap.so.2: __isoc23_strtoul: symbol not found")
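Since the same pair of lines repeats constantly, it can help to pull just the library path and symbol out of saved log output. A minimal sketch, assuming the message always has the "Error relocating <library>: <symbol>: symbol not found" shape seen above (the function name is made up for illustration):

```python
import re

# Matches the loader message embedded in the Talos service log line.
RELOCATION_RE = re.compile(
    r"Error relocating (?P<library>\S+): (?P<symbol>\w+): symbol not found"
)


def parse_relocation_error(line: str):
    """Return (library, symbol) if the line contains a relocation error, else None."""
    match = RELOCATION_RE.search(line)
    if match is None:
        return None
    return match.group("library"), match.group("symbol")


line = (
    '[talos] serviceext-rpc-statd: Error running Containerd(ext-rpc-statd), '
    'going to restart forever: task "ext-rpc-statd" failed: exit code 127 '
    '(last log "Error relocating /usr/local/lib/libcap.so.2: '
    '__isoc23_strtoul: symbol not found")'
)
print(parse_relocation_error(line))
# prints ('/usr/local/lib/libcap.so.2', '__isoc23_strtoul')
```

Running this over a saved dmesg dump and collecting the distinct results would show whether libcap.so.2 is the only library involved.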