Description
This is my deployment:
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: nvidia-exporter
  namespace: monitoring
spec:
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: nvidia-exporter
    spec:
      containers:
      - name: nvidia-exporter
        securityContext:
          privileged: true
        image: bugroger/nvidia-exporter:latest
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 9401
        volumeMounts:
        - mountPath: /usr/local/nvidia
          name: nvidia
      volumes:
      - name: nvidia
        hostPath:
          path: /home/zy/cuda
When I exec into the nvidia-exporter container and run
ls /usr/local/nvidia/lib64
the output includes libnvidia-ml.so.1.
But the container logs always show:
Failed to collect metrics: could not load NVML library
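
A file showing up in `ls` does not by itself mean the dynamic loader can find it: a library opened by bare name is searched for in `LD_LIBRARY_PATH`, the loader cache, and the default system directories, and `/usr/local/nvidia/lib64` is not one of those defaults. The following is a minimal diagnostic sketch, not part of the exporter. It assumes the exporter loads `libnvidia-ml.so.1` by name through the dynamic loader and that a Python 3 interpreter is available in an environment with the same mounts (the exporter image may not ship one). The helper name `try_load` is hypothetical.

```python
import ctypes
import os


def try_load(name: str = "libnvidia-ml.so.1"):
    """Ask the dynamic loader to open `name`.

    Returns (True, "") on success, or (False, reason) where `reason`
    is the loader's own error message.
    """
    try:
        ctypes.CDLL(name)
        return True, ""
    except OSError as exc:
        return False, str(exc)


if __name__ == "__main__":
    # Show the search path the loader will consult for bare library names.
    print("LD_LIBRARY_PATH =", os.environ.get("LD_LIBRARY_PATH", "<unset>"))

    # Compare loading by bare name (uses the search path) with loading by
    # the absolute path of the mounted file (bypasses the search path).
    candidates = (
        "libnvidia-ml.so.1",
        "/usr/local/nvidia/lib64/libnvidia-ml.so.1",
    )
    for candidate in candidates:
        ok, reason = try_load(candidate)
        print(candidate, "->", "loaded" if ok else f"failed: {reason}")
```

If the absolute path loads but the bare name does not, the library is present but outside the loader's search path. If both fail, the loader's message usually points at the cause, for example a missing dependency or a file that does not match the host driver.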