Description
While integrating Node Readiness Controller (NRC) with Cilium-managed clusters, we observed potential overlap in bootstrap taint management.
Background:
In clusters that use Cilium as the CNI, Cilium currently manages its own bootstrap taint. By default, the taint is:
node.cilium.io/agent-not-ready:NoSchedule
Behavior:
- Cilium adds this taint during node initialization.
- It removes the taint once the Cilium agent is ready.
- It also sets the built-in Kubernetes node condition:
NetworkUnavailable=False
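To make the behavior above concrete, here is a minimal sketch of how a consumer could decide whether Cilium's bootstrap phase has finished on a node. It assumes the node is available as a plain dict shaped like the Kubernetes Node JSON (`spec.taints`, `status.conditions`). The helper names (`has_taint`, `network_available`, `cilium_bootstrap_done`) are hypothetical and not part of Cilium or NRC.

```python
# Default taint key described in this issue.
CILIUM_TAINT_KEY = "node.cilium.io/agent-not-ready"


def has_taint(node, key):
    """Return True if the node carries a taint with the given key (any effect)."""
    taints = node.get("spec", {}).get("taints") or []
    return any(t.get("key") == key for t in taints)


def network_available(node):
    """Return True only if the NetworkUnavailable condition is reported as "False"."""
    conditions = node.get("status", {}).get("conditions") or []
    for cond in conditions:
        if cond.get("type") == "NetworkUnavailable":
            return cond.get("status") == "False"
    # Assumption: a condition that has not been reported yet means "not ready".
    return False


def cilium_bootstrap_done(node, taint_key=CILIUM_TAINT_KEY):
    """Bootstrap is treated as done once the taint is gone and the network is available."""
    return not has_taint(node, taint_key) and network_available(node)


# Hypothetical node snapshots for illustration.
booting = {
    "spec": {"taints": [{"key": CILIUM_TAINT_KEY, "effect": "NoSchedule"}]},
    "status": {"conditions": []},
}
ready = {
    "spec": {},
    "status": {"conditions": [{"type": "NetworkUnavailable", "status": "False"}]},
}
```

With these snapshots, `cilium_bootstrap_done(booting)` is `False` and `cilium_bootstrap_done(ready)` is `True`.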
Configuration Option:
The taint key is configurable via: --agent-not-ready-taint-key
This allows overriding the default key (e.g. to readiness.k8s.io/network-not-ready).
Potential Overlap:
If users configure NRC to enforce network readiness via taints and Cilium’s bootstrap taint management remains enabled, then both components may attempt to manage taints for the same network readiness domain.
One possible approach to resolve this would be:
If users choose to manage this taint with NRC, they could disable Cilium’s taint management component so that NRC becomes the sole manager of the network readiness taint.
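The overlap described above could be detected with a check along these lines. This is only a sketch: the `(component, taint_key, domain)` tuples and the `find_overlaps` helper are hypothetical, and neither NRC nor Cilium exposes a "domain" label today. It groups taint managers by readiness domain and reports any domain claimed by more than one component, even when the taint keys differ.

```python
from collections import defaultdict


def find_overlaps(managers):
    """Group taint managers by readiness domain.

    managers: iterable of (component, taint_key, domain) tuples (hypothetical model).
    Returns {domain: [(component, taint_key), ...]} for every domain that is
    managed by more than one distinct component.
    """
    by_domain = defaultdict(list)
    for component, taint_key, domain in managers:
        by_domain[domain].append((component, taint_key))
    return {
        domain: entries
        for domain, entries in by_domain.items()
        if len({component for component, _ in entries}) > 1
    }


# Both components enabled: Cilium with its default key, NRC with a readiness.k8s.io key.
managers = [
    ("cilium", "node.cilium.io/agent-not-ready", "network"),
    ("nrc", "readiness.k8s.io/network-not-ready", "network"),
]
```

Here `find_overlaps(managers)` reports the `network` domain, while `find_overlaps(managers[1:])` (Cilium's taint management disabled, NRC as sole manager) returns an empty dict.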
Motivation:
We are working toward standardizing readiness taints under a fixed prefix (e.g. readiness.k8s.io/*) so that:
- Readiness taints can be categorized consistently.
- Autoscaling components (e.g. Cluster Autoscaler) can treat readiness taints differently from permanent taints.
- We avoid per-vendor allow-listing in autoscaling configurations.
- UX improves when integrating readiness enforcement across components.
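As an illustration of the first three points, a component such as an autoscaler could classify taints purely by prefix instead of keeping a per-vendor allow-list. The sketch below assumes the `readiness.k8s.io/` prefix from this issue. `is_readiness_taint` and `split_taints` are hypothetical helpers, not existing Cluster Autoscaler APIs.

```python
# Proposed standardized prefix for readiness taints (from this issue).
READINESS_PREFIX = "readiness.k8s.io/"


def is_readiness_taint(key, prefix=READINESS_PREFIX):
    """Return True if the taint key falls under the readiness prefix."""
    return key.startswith(prefix)


def split_taints(taints):
    """Split a list of taint dicts into (readiness, other) by key prefix."""
    readiness = [t for t in taints if is_readiness_taint(t["key"])]
    other = [t for t in taints if not is_readiness_taint(t["key"])]
    return readiness, other


# Hypothetical taints on a freshly created node.
taints = [
    {"key": "readiness.k8s.io/network-not-ready", "effect": "NoSchedule"},
    {"key": "dedicated", "value": "gpu", "effect": "NoSchedule"},
]
readiness, other = split_taints(taints)
```

With Cilium's default key, `is_readiness_taint("node.cilium.io/agent-not-ready")` is `False`, which is the case that currently requires vendor-specific handling.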
Since Cilium already allows overriding the taint key via --agent-not-ready-taint-key, it appears technically possible to align the bootstrap taint with a standardized readiness prefix without requiring changes to Cilium itself.