Describe the bug
When a Kubernetes cluster runs an application that mounts the host directory /var/lib in read-only mode, the Mayastor remount process breaks. The Mayastor CSI driver can't unmount the disk because the operating system reports that some process is still holding /var/lib/kubelet/plugins/kubernetes.io/csi/io.openebs.csi-mayastor/{volume_id}/globalmount. The globalmount looks unmounted because the directory is empty, but the ext4 journaling process is still running and the NVMe device still exists.
If you manually delete the /var/lib/kubelet/plugins/kubernetes.io/csi/io.openebs.csi-mayastor/{volume_id} directory with rm -r, the ext4 journaling process stops and the CSI driver can continue unmounting.
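The manual workaround can be sketched as a dry run. This is only an illustration: the function name print_cleanup_commands is made up for this example, and it assumes that an empty globalmount directory marks a stuck volume, which is a heuristic rather than a guarantee. It prints the rm -r commands instead of running them, so each directory can be checked first.

```shell
#!/bin/sh
# Print (never run) an rm -r command for every volume directory whose
# globalmount subdirectory is empty.
# $1: CSI plugin directory; defaults to the Mayastor path from this report.
# Assumption: an empty globalmount means the volume is stuck. Verify before deleting.
print_cleanup_commands() {
    plugin_dir="${1:-/var/lib/kubelet/plugins/kubernetes.io/csi/io.openebs.csi-mayastor}"
    find "$plugin_dir" -type d -name globalmount -empty 2>/dev/null |
    while IFS= read -r mount_dir; do
        # The parent of globalmount is the {volume_id} directory
        printf "rm -r '%s'\n" "$(dirname "$mount_dir")"
    done
}
```

Running print_cleanup_commands with no argument on an affected node lists the candidate directories. Nothing is deleted until you run the printed commands yourself.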
To Reproduce
- Install the Vector log aggregator into the Kubernetes cluster via its Helm chart. By default, Vector mounts /var/lib (I'm not sure why) and may try to read the entries in it:
  https://github.com/vectordotdev/helm-charts/blob/23f60fec2332b20a301796c80bf7c5b49b383045/charts/vector/values.yaml#L353
- Create pods with volumes and restart them several times, for example 30 pods with volumes, restarted three times.
  Some of the pods (around 20% or more) will get stuck in the Pending phase, and the CSI driver logs will show the error: proc entry still exists
Script to find empty directories on the node:
find /var/lib/kubelet/plugins/kubernetes.io/csi/io.openebs.csi-mayastor -type d -name "globalmount" -empty -exec dirname {} \;
Script to find which processes still have the NVMe device in their mount table, by device name:
DEVICE='nvme0n1'
# Walk every PID directory under /proc
for dir in /proc/[0-9]*; do
    pid=${dir#/proc/}
    # Skip processes whose mount table is unreadable or already gone
    if [ -r "$dir/mounts" ] && grep -q "$DEVICE" "$dir/mounts" 2>/dev/null; then
        echo "Mounted in PID $pid:"
        grep "$DEVICE" "$dir/mounts"
    fi
done
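A variant of the script above that also prints the process name can make it quicker to spot which application is holding the device. This is a rough sketch, not part of the original investigation: the function names mounts_for_device and scan_processes are invented for this example, and the optional second argument (an alternative proc root) exists only so the logic can be exercised against a fake directory tree.

```shell
#!/bin/sh
# Print the mount points in a mounts-format file whose source device
# contains the given device name (nvme0n1 also matches /dev/nvme0n1p1).
# $1: file in /proc/PID/mounts format, $2: device name
mounts_for_device() {
    awk -v dev="$2" 'index($1, dev) > 0 { print $2 }' "$1" 2>/dev/null
}

# Report every process whose mount table still references the device.
# $1: device name, $2: proc root (hypothetical test hook, defaults to /proc)
scan_processes() {
    proc_root="${2:-/proc}"
    for mounts in "$proc_root"/[0-9]*/mounts; do
        [ -r "$mounts" ] || continue
        points=$(mounts_for_device "$mounts" "$1")
        [ -n "$points" ] || continue
        # Extract the PID from the path .../PID/mounts
        pid=${mounts%/mounts}
        pid=${pid##*/}
        name=$(cat "$proc_root/$pid/comm" 2>/dev/null)
        printf 'PID %s (%s):\n%s\n' "$pid" "${name:-unknown}" "$points"
    done
}
```

For example, scan_processes nvme0n1 run as root on the affected node lists each PID, its command name, and the mount points that still reference the device.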
Expected behaviour
This behaviour is understandable, but it can be tricky to investigate. For comparison, I can't reproduce it with the AWS CSI driver for EBS volumes. I'm not sure, but maybe it can be mitigated at the Mayastor code level.
OS info:
- Distro: Ubuntu 24.04
- Kernel version: 6.14.0-29-generic
- MayaStor revision or container image: 2.9.2