AKS Azure Fileshare CSI PV/PVC Suddenly producing "Read-Only" errors #2141
Open
Description
What happened:
We have a PV and an associated PVC, shown below, mounted by a pod that writes to the volume. This worked for a long time without issues, but the pod recently started throwing the following error:
---> System.IO.IOException: Read-only file system :
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: {{ .Values.fileshare.micsiName }}
provisioner: file.csi.azure.com
allowVolumeExpansion: true
parameters:
  shareName: {{ .Values.fileshare.shareName }}
reclaimPolicy: Delete
volumeBindingMode: Immediate
mountOptions:
  - dir_mode=0777
  - file_mode=0777
  - uid=0
  - gid=0
  - mfsymlinks
  - cache=strict # https://linux.die.net/man/8/mount.cifs
  - nosharesock
  - actimeo=30
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: {{ .Values.fileshare.geospatial.mipvName }}
spec:
  capacity:
    storage: {{ .Values.fileshare.capacity }}
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: {{ .Values.fileshare.micsiName }}
  mountOptions:
    - dir_mode=0777
    - file_mode=0777
    - uid=0
    - gid=0
    - mfsymlinks
    - cache=strict # https://linux.die.net/man/8/mount.cifs
    - nosharesock
    - actimeo=30
  csi:
    driver: file.csi.azure.com
    readOnly: false
    volumeHandle: {{ .Values.fileshare.geospatial.mivolumeHandleName }}
    volumeAttributes:
      resourceGroup: {{ .Values.environment.resourcegroup }}
      storageAccount: {{ .Values.fileshare.storageAccountName }}
      shareName: {{ .Values.fileshare.shareName }}
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: {{ .Values.fileshare.geospatial.mipvcName }}
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: {{ .Values.fileshare.micsiName }}
  resources:
    requests:
      storage: {{ .Values.fileshare.capacity }}
  volumeName: {{ .Values.fileshare.geospatial.mipvName }}
What you expected to happen:
We expect the pod to continue writing to the volume without errors.
How to reproduce it:
Not sure at this point. The failure seems to occur randomly and differs between regions.
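Since there is no known reproduction, one option is to run a small write probe inside an affected pod and record when writes start failing. The following is a minimal sketch, not something taken from the driver or the AKS docs: `probe_write` is a hypothetical helper name, and the directory passed to it is whatever path the PVC is mounted at in the pod. It only distinguishes a read-only failure (`EROFS`) from other I/O errors.

```python
import errno
import os
import tempfile


def probe_write(directory):
    """Create and delete a small file in `directory`.

    Returns "ok", "read-only" (errno EROFS), or "error: <ERRNO NAME>".
    """
    path = None
    try:
        # mkstemp creates a uniquely named file, so concurrent probes do not collide.
        fd, path = tempfile.mkstemp(prefix=".write-probe-", dir=directory)
        try:
            os.write(fd, b"probe")
        finally:
            os.close(fd)
        return "ok"
    except OSError as exc:
        if exc.errno == errno.EROFS:
            return "read-only"
        return "error: " + errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        # Best-effort cleanup; ignore failures (for example if the share went read-only).
        if path is not None:
            try:
                os.unlink(path)
            except OSError:
                pass
```

Calling `probe_write` on the mount path periodically and logging the result with a timestamp would show whether the failure is permanent or intermittent on a given node.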
Anything else we need to know?:
The same setup runs fine in some regions and produces this error in others. The behavior is not consistent.
On the storage account, "Allow Blob anonymous access" is enabled.
On the storage account, "Allow storage account key access" is enabled.
The file share uses SMB.
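To compare a working region with a failing one, it may help to check whether the kernel reports the SMB mount itself as read-only on the node. Below is a minimal sketch that parses text in the `/proc/mounts` format and looks for the `ro` option on a given mount point. The function names are hypothetical, and the sketch assumes the mount path contains no spaces (the kernel escapes those as `\040`).

```python
def mount_options(mounts_text, mount_point):
    """Return the option list for `mount_point` from /proc/mounts-style text, or None."""
    found = None
    for line in mounts_text.splitlines():
        # Fields: device, mount point, fs type, comma-separated options, dump, pass
        fields = line.split()
        if len(fields) < 4:
            continue
        if fields[1] == mount_point:
            # Keep the last match: later entries are mounted on top of earlier ones.
            found = fields[3].split(",")
    return found


def is_read_only(mounts_text, mount_point):
    """True if the mount point is present and carries the `ro` option."""
    opts = mount_options(mounts_text, mount_point)
    return opts is not None and "ro" in opts
```

On a node or in a pod, the text can be read with `open("/proc/mounts").read()` and the mount point set to the path where the volume is mounted. If the mount still shows `rw` while writes fail, the read-only condition is being reported from somewhere other than the mount flags.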
Environment:
- CSI Driver version: image: mcr.microsoft.com/oss/kubernetes-csi/azurefile-csi:v1.30.5
- Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.2", GitCommit:"fc04e732bb3e7198d2fa44efa5457c7c6f8c0f5b", GitTreeState:"clean", BuildDate:"2023-02-22T13:39:03Z", GoVersion:"go1.19.6", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"29", GitVersion:"v1.29.8", GitCommit:"234bc63696ad15dcf62584b6ba48671bf0f25fb6", GitTreeState:"clean", BuildDate:"2024-08-15T17:13:33Z", GoVersion:"go1.22.5", Compiler:"gc", Platform:"linux/amd64"}
- OS (e.g. from /etc/os-release):
- Kernel (e.g. uname -a):
Kernel Version: 5.15.164.1-1.cm2
OS Image: CBL-Mariner/Linux
- Install tools: Followed docs at: https://learn.microsoft.com/en-us/azure/aks/azure-files-csi
- Others: