Add Kubernetes mitigation manifest#29

Open
ClemDNL wants to merge 1 commit into V4bel:master from ClemDNL:add-kubernetes-mitigation

ClemDNL commented May 8, 2026

Summary

Adds a self-contained Kubernetes DaemonSet manifest under k8s/ that applies the mitigation from the README to every Linux node in a cluster, and re-applies it automatically on any new node that joins (autoscaling, node-image upgrade, scale-set rolling update).

This is helpful because the disclosure's recommended mitigation must run on every node, and on managed-Kubernetes platforms (AKS / EKS / GKE) operators don't typically have direct shell access to nodes — a DaemonSet is the cleanest way to roll a host-level fix across the fleet.

Files

  • k8s/dirtyfrag-mitigation.yaml: single-file manifest (ServiceAccount + ClusterRole + ClusterRoleBinding + DaemonSet) that can be applied in one shot with kubectl apply -f. It uses a privileged init container that runs nsenter against PID 1 to enter the host namespaces and:

    1. Write /etc/modprobe.d/disable-dirtyfrag.conf blacklisting esp4, esp6, rxrpc.
    2. For each module loaded with refcnt=0, run modprobe -r to unload it from the live kernel.
    3. Run sync; echo 3 > /proc/sys/vm/drop_caches (per the README's mitigation guidance; gated on DROP_CACHES, default true).
    4. For any module that remains loaded with refcnt > 0, emit a single aggregated Warning Kubernetes Event (reason=DirtyFragModulesInUse) on the affected Node listing the in-use modules so operators can drain & reboot. No auto-cordon — operators decide when to recycle nodes.

    A long-running pause container keeps the pod in Running so the init container is only re-executed on pod recreation (i.e. on each new node).

  • k8s/README.md — apply / verify / revert instructions, plus the compatibility note that esp4/esp6 = IPsec ESP and rxrpc = AFS RxRPC sockets, and pointers for clusters that need to keep one of these modules.

  • README.md — adds a short Kubernetes subsection inside Mitigation pointing to k8s/.
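
As a rough illustration of steps 1, 2 and 4 above, the sketch below shows one way the init script could build the drop-in and split modules into "safe to unload" and "still in use". It is not the code from the manifest: the function names are hypothetical, the `install <module> /bin/false` line format is an assumption (the actual drop-in may use `blacklist` lines instead), and the classification reads a /proc/modules-style file whose third column is the reference count.

```shell
#!/bin/sh
# Illustrative sketch only; write_dropin and classify_modules are hypothetical names.

# Step 1: write one line per module into the modprobe drop-in.
# Assumption: "install <module> /bin/false" makes later modprobe calls for that
# module fail; the real manifest may write "blacklist" lines instead.
write_dropin() {
  dest=$1; shift
  : > "$dest"
  for m in "$@"; do
    printf 'install %s /bin/false\n' "$m" >> "$dest"
  done
}

# Steps 2 and 4: classify modules from a /proc/modules-style file
# (columns: name size refcount ...).
# Prints "unload <name>" when the refcount is 0 and "in-use <name>" otherwise.
# Modules that are not listed are not loaded and produce no output.
classify_modules() {
  modules_file=$1; shift
  for m in "$@"; do
    awk -v mod="$m" \
      '$1 == mod { print ($3 == 0 ? "unload" : "in-use"), mod }' \
      "$modules_file"
  done
}
```

On a node, `classify_modules /proc/modules esp4 esp6 rxrpc` would yield the candidates for `modprobe -r` and the list to aggregate into the DirtyFragModulesInUse event.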

Apply

kubectl apply -f https://raw.githubusercontent.com/V4bel/dirtyfrag/master/k8s/dirtyfrag-mitigation.yaml
kubectl -n kube-system rollout status ds/dirtyfrag-mitigation
kubectl -n default get events --field-selector reason=DirtyFragModulesInUse

Revert (when upstream patches roll out)

The modprobe drop-in persists for the lifetime of each node, so a cleanup pass is provided:

kubectl -n kube-system set env ds/dirtyfrag-mitigation CLEANUP_MODE=true
kubectl -n kube-system rollout restart ds/dirtyfrag-mitigation
kubectl -n kube-system rollout status  ds/dirtyfrag-mitigation
kubectl delete -f https://raw.githubusercontent.com/V4bel/dirtyfrag/master/k8s/dirtyfrag-mitigation.yaml
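
The order of these commands matters: the cleanup runs in the init container, so the DaemonSet should finish rolling out on every node before it is deleted, otherwise nodes that were not yet restarted keep the drop-in. A minimal sketch of a wrapper that stops at the first failing step is shown below. The function name `revert_mitigation` and the `--timeout=10m` value are assumptions, not part of this PR. Note that `kubectl set env` already changes the pod template and so triggers a rollout by itself; the explicit restart is kept only to mirror the commands above.

```shell
#!/bin/sh
# Hypothetical wrapper around the revert commands; stops at the first failure
# so the DaemonSet is never deleted before the cleanup pass has completed.
revert_mitigation() {
  manifest=$1
  ds=ds/dirtyfrag-mitigation
  kubectl -n kube-system set env "$ds" CLEANUP_MODE=true || return 1
  kubectl -n kube-system rollout restart "$ds" || return 1
  # Assumption: 10 minutes is enough for the rollout; large clusters may need more.
  kubectl -n kube-system rollout status "$ds" --timeout=10m || return 1
  kubectl delete -f "$manifest"
}
```

It would be called with the same manifest path or URL that was used for the apply step.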

Validation

  • kubectl apply --dry-run=server -f k8s/dirtyfrag-mitigation.yaml against a real cluster — all four resources (ServiceAccount, ClusterRole, ClusterRoleBinding, DaemonSet) accepted.
  • A slightly different variant of this manifest is deployed in production on AKS, where it currently runs across dev / staging / prod / backup-prod node pools; this PR is the upstream-friendly generalization of that variant.

Compatibility note for reviewers

esp4/esp6 provide IPsec ESP transforms; rxrpc provides the RxRPC socket family. None of these are required by a typical workload-only Kubernetes cluster, but operators with node-level IPsec or AFS clients should edit the MODULES env var (or label-exclude the affected node pool) before applying. This is documented in k8s/README.md.

Adds a self-contained DaemonSet manifest under k8s/ that applies the
mitigation from the README (modprobe blacklist of esp4/esp6/rxrpc +
page-cache flush) to every Linux node in a Kubernetes cluster, and
re-applies it automatically on any new node that joins the cluster
(autoscaling, node-image upgrade, scale-set rolling update).

  - k8s/dirtyfrag-mitigation.yaml — single-file manifest applyable with
    kubectl apply -f. Uses an init container that nsenter's into PID 1
    to write /etc/modprobe.d/disable-dirtyfrag.conf, modprobe -r each
    module that has refcnt=0, and echo 3 > /proc/sys/vm/drop_caches.
    For any module that remains loaded with refcnt > 0, emits a single
    aggregated Warning Kubernetes Event on the Node (no auto-cordon).
    A long-running pause container keeps the pod Running so the init
    container is only re-executed on pod recreation.
  - k8s/README.md — apply / verify / revert instructions and
    compatibility notes (esp4/esp6 = IPsec, rxrpc = AFS).
  - README.md — short Kubernetes section in Mitigation pointing to k8s/.

Tested on AKS (Azure) running Kubernetes 1.30, in a production
environment across staging and production clusters.

ClemDNL force-pushed the add-kubernetes-mitigation branch from 02ca786 to 44af5b1 on May 8, 2026 08:41

if [ "${CLEANUP_MODE}" = "true" ]; then
echo "[dirtyfrag] CLEANUP mode on node ${NODE_NAME}: removing mitigation"
nsenter -t 1 -m -u -i -n -p -- sh -c "rm -f ${MODPROBE_FILE}; depmod -a 2>/dev/null || true; for m in ${MODULES}; do modprobe -r \$m 2>/dev/null || true; done; true"

What is the reason for entering all host namespaces here?

Writing the modprobe configuration could be achieved via a hostPath mount of /etc/modprobe.d, and module management could be achieved by granting the container the SYS_MODULE capability.
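
For comparison, the cleanup line quoted above might look roughly like the sketch below under the approach described in this comment. This is a hypothetical variant, not code from the PR. It assumes the host's /etc/modprobe.d is mounted into the container at a path passed as the first argument and that the container holds CAP_SYS_MODULE. It uses rmmod rather than modprobe -r on the assumption that modprobe would also need the host's /lib/modules tree to resolve module names. The drop_caches step is not covered here, since /proc/sys is typically read-only in non-privileged containers.

```shell
#!/bin/sh
# Hypothetical cleanup without nsenter.
# $1: container path where the host's /etc/modprobe.d is mounted (assumption).
# Remaining arguments: module names, e.g. esp4 esp6 rxrpc.
cleanup_without_nsenter() {
  host_modprobe_dir=$1; shift
  rm -f "$host_modprobe_dir/disable-dirtyfrag.conf"
  for m in "$@"; do
    # Mirrors the quoted line: attempt removal and ignore failures
    # (module not loaded, or still in use).
    rmmod "$m" 2>/dev/null || true
  done
}
```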
