Description
What happened:
On the first boot, no CNI binary is present on the node, so k8s automatically creates the /var/run/azure-vnet directory with 0755 permissions, because the directory is one of the mounts of the azure-cns daemonset. The CNI is deployed afterwards.
The /var/run directory is not preserved between reboots.
When the VM reboots, the CNI binary may run before k8s creates the /var/run/azure-vnet directory. When the CNI binary runs first, it creates the directory with 0644 permissions, which causes permission denied errors for azure-cns. Even if k8s creates/mounts the /var/run/azure-vnet directory later, it sees that the directory already exists and does not recreate it with 0755 permissions.
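To illustrate why the mode matters: on a directory, the execute bits act as search (traversal) permission, so a 0644 directory cannot be entered by a process that does not bypass permission checks. The following is a minimal Go sketch of that bit check only. `searchBits` is a hypothetical helper written for this illustration and is not taken from the CNI or CNS code.

```go
package main

import (
	"fmt"
	"os"
)

// searchBits reports which classes (owner, group, other) hold the execute
// bit. On a directory, that bit is what allows access to entries inside it.
// Hypothetical helper, for illustration only.
func searchBits(mode os.FileMode) (owner, group, other bool) {
	p := mode.Perm()
	return p&0o100 != 0, p&0o010 != 0, p&0o001 != 0
}

func main() {
	// Compare the mode created by the CNI binary (0644) with the expected one (0755).
	for _, m := range []os.FileMode{0o644, 0o755} {
		owner, group, other := searchBits(m)
		fmt.Printf("%04o owner=%v group=%v other=%v\n", uint32(m.Perm()), owner, group, other)
	}
}
```

With 0644 none of the three classes can traverse the directory, which is consistent with the permission denied errors once capabilities are dropped.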
What you expected to happen:
The CNI binary should create the directory with 0755 permissions.
How to reproduce it:
Reboot the VM with the azure-cns security context configured to drop all capabilities (so that it does not bypass permission checks). There is a chance that the azure-cns pod will get stuck in CrashLoopBackOff.
Orchestrator and Version (e.g. Kubernetes, Docker):
Operating System (Linux/Windows):
Kernel (e.g. uname -a for Linux or $(Get-ItemProperty -Path "C:\windows\system32\hal.dll").VersionInfo.FileVersion for Windows):
Anything else we need to know?:
[Miscellaneous information that will assist in solving the issue.]