During core20 refresh / channel switch, apiserver-kicker falsely detects changes and restarts all MicroK8s containers #5346

@salmon111

Summary

In a MicroK8s environment installed via snap (classic), microk8s.daemon-apiserver-kicker can falsely detect a certificate change while the core20 base snap is being refreshed or switched to another channel.
It then restarts key MicroK8s components such as kube-apiserver and containerd.

As a result, all containers running on the affected node are restarted.

The issue is timing-dependent, but we confirmed that it reproduces with high probability when the core20 refresh or channel switch is repeated.

What Should Happen Instead?

Refreshing core20 or switching its channel should not affect the certificate state of MicroK8s.
microk8s.daemon-apiserver-kicker should not treat such operations as certificate changes,
and kube-apiserver or containerd should not be restarted as a result.

Even if binaries inside the snap become temporarily non-executable during base snap mount/unmount operations,
this situation should be handled safely and should not be interpreted as a configuration or certificate change.
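
One way to picture the safe handling described above is a guard that refuses to draw any conclusion from a failed probe. The sketch below is only an illustration under stated assumptions: `probe` and `classify_change` are hypothetical helper names, not functions from the MicroK8s scripts, and the actual logic in apiserver-kicker may be structured quite differently.

```shell
#!/bin/sh
# Hypothetical helpers; not taken from the MicroK8s source tree.

# Run a command and print its output only if it exited successfully
# and produced non-empty output. Any failure is reported via exit status.
probe() {
  probe_out="$("$@" 2>/dev/null)" || return 1
  [ -n "$probe_out" ] || return 1
  printf '%s\n' "$probe_out"
}

# Compare a previously recorded value with the current probe result.
# Prints one of: skip, unchanged, changed.
classify_change() {
  previous="$1"
  shift
  if ! current="$(probe "$@")"; then
    # The probe could not run (for example, the binary was temporarily
    # not executable), so there is no evidence of a real change.
    echo "skip"
  elif [ "$current" = "$previous" ]; then
    echo "unchanged"
  else
    echo "changed"
  fi
}
```

With a guard of this kind, a restart would be considered only for the "changed" result. A probe that fails because a binary such as `ip` or `hostname` cannot be executed yields "skip", so the check can simply be retried on the next cycle.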

Reproduction Steps

This issue is timing-dependent, but we confirmed it can be reproduced with a high probability by repeating the steps below.

  1. Prepare a node running MicroK8s installed via snap (classic).

    sudo snap install microk8s --classic
    sudo snap refresh --hold microk8s

  2. Confirm that core20 is installed as a base snap.

    $ snap list
    Name      Version   Rev    Tracking       Publisher   Notes
    core20    20251031  2686   latest/stable  canonical✓  base
    microk8s  v1.32.9   8511   1.32/stable    canonical✓  classic,held
    snapd     2.72      25577  latest/stable  canonical✓  snapd

  3. Ensure MicroK8s is running normally and pods/containers are active on the node.

  4. Perform a core20 refresh or channel switch.

    sudo snap refresh core20 --channel=latest/stable
    sudo snap refresh core20 --channel=latest/edge

  5. Repeat step 4 until the issue reproduces.
    Because the issue is timing-dependent, multiple attempts may be required.
    In our testing, reproduction occurred after approximately 60 attempts.

  6. While core20 is being mounted/unmounted, microk8s.daemon-apiserver-kicker runs.

  7. During this window, binaries inside the snap temporarily become non-executable, and errors like the following are logged.

    /snap/microk8s/8511/bin/ip: cannot execute: required file not found
    /snap/microk8s/8511/bin/hostname: cannot execute: required file not found
    /snap/microk8s/8511/usr/bin/gawk: cannot execute: required file not found

  8. Even though no actual certificate change has occurred,
    microk8s.daemon-apiserver-kicker interprets this situation as a certificate change and triggers restarts of kube-apiserver and containerd.

  9. As a result, all containers running on the node are restarted.
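
The repeat-until-reproduced loop in the steps above could be automated roughly as follows. This is a sketch, not the script we used. The function names, the `MAX_ATTEMPTS` and `SETTLE_SECONDS` values, and the choice to search the whole system journal for the error message are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical reproduction helper; not part of MicroK8s.

MAX_ATTEMPTS="${MAX_ATTEMPTS:-100}"      # assumption: upper bound on attempts
SETTLE_SECONDS="${SETTLE_SECONDS:-10}"   # assumption: wait after each switch
PATTERN='cannot execute: required file not found'

# Alternate between the two channels used in the report:
# odd attempts switch to latest/edge, even attempts back to latest/stable.
next_channel() {
  if [ $(( $1 % 2 )) -eq 1 ]; then
    echo "latest/edge"
  else
    echo "latest/stable"
  fi
}

# Succeeds if the text on stdin contains the error message from the report.
has_exec_error() {
  grep -q -- "$PATTERN"
}

# Switch core20 channels repeatedly until the error shows up in the journal.
reproduce() {
  attempt=1
  while [ "$attempt" -le "$MAX_ATTEMPTS" ]; do
    since="$(date '+%Y-%m-%d %H:%M:%S')"
    sudo snap refresh core20 --channel="$(next_channel "$attempt")"
    sleep "$SETTLE_SECONDS"
    if sudo journalctl --since "$since" --no-pager | has_exec_error; then
      echo "error message observed on attempt $attempt"
      return 0
    fi
    attempt=$(( attempt + 1 ))
  done
  echo "error message not observed after $MAX_ATTEMPTS attempts"
  return 1
}

# Only touch the system when explicitly asked to: ./repro.sh run
if [ "${1:-}" = "run" ]; then
  reproduce
fi
```

Seeing the error message only indicates that the non-executable window was hit. Whether kube-apiserver and containerd were actually restarted still needs to be checked separately, for example by looking at container restart counts on the node.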

Introspection Report

inspection-report-20251218_124437.tar.gz

Can you suggest a fix?

At this time, we do not have a concrete proposal for a fix.
This issue is intended to report the observed behavior, reproduction conditions, and impact.

Are you interested in contributing with a fix?

Yes.
We are happy to help with additional investigation, testing, or validation if needed.
