|
1 | 1 |
|
2 | 2 | ## Getting Started |
3 | 3 |
|
| 4 | +This guide covers creating and configuring `NodeReadinessRule` resources. |
| 5 | + |
| 6 | +> **Prerequisites**: Node Readiness Controller must be installed. See [Installation](./installation.md). |
| 7 | +
|
4 | 8 | ### API Spec |
5 | 9 |
|
6 | 10 | #### Example: Storage Readiness Rule (Bootstrap-only) |
@@ -43,82 +47,19 @@ spec: |
43 | 47 | | `nodeSelector` | Label selector to target specific nodes | No | |
44 | 48 | | `dryRun` | Preview changes without applying them | No | |
45 | 49 |
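Combining the optional fields above, a rule that previews taint changes on worker nodes only might look like the following sketch. The `apiVersion` is an assumption inferred from the controller's finalizer API group (`readiness.node.x-k8s.io`), not copied from a published example; verify it against the installed CRD.

```yaml
# Sketch only: apiVersion and the metadata name are assumptions.
apiVersion: readiness.node.x-k8s.io/v1alpha1
kind: NodeReadinessRule
metadata:
  name: storage-ready-preview        # illustrative name
spec:
  nodeSelector:                      # optional: target only worker nodes
    matchLabels:
      node-role.kubernetes.io/worker: ""
  taint:
    key: "readiness.k8s.io/StorageReady"
    effect: "NoSchedule"
  dryRun: true                       # optional: preview without applying taints
```

With `dryRun: true`, the rule's planned changes can be inspected in `status.dryRunResults` before enforcement is enabled.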
|
46 | | -### Deployment |
47 | | - |
48 | | -#### Option 1: Install official release images |
49 | | - |
50 | | -The Node Readiness Controller offers two variants of the container image to support different cluster architectures. |
51 | | - |
52 | | -Released container images are available for: |
53 | | -* **x86_64** (AMD64) |
54 | | -* **Arm64** (AArch64) |
55 | | - |
56 | | -The controller image is available in the Kubernetes staging registry: |
57 | | - |
58 | | -```sh |
59 | | -REPO="us-central1-docker.pkg.dev/k8s-staging-images/node-readiness-controller/node-readiness-controller" |
60 | | -
|
61 | | -TAG=$(skopeo list-tags docker://$REPO | jq .Tags[-1] | tr -d '"') |
62 | | -
|
63 | | -docker pull $REPO:$TAG |
64 | | -``` |
65 | | - |
66 | | -#### Option 2: Deploy Using Make Commands |
67 | | - |
68 | | -**Build and push your image to the location specified by `IMG_PREFIX` and `IMG_TAG`:** |
69 | | - |
70 | | -```sh |
71 | | -make docker-build docker-push IMG_PREFIX=<some-registry>/nrr-controller IMG_TAG=<tag> |
72 | | -``` |
73 | | - |
74 | | -```sh |
75 | | -# Install the CRDs |
76 | | -make install |
77 | | -
|
78 | | -# Deploy the controller |
79 | | -make deploy IMG_PREFIX=<some-registry>/nrr-controller IMG_TAG=<tag> |
80 | | -
|
81 | | -# Create sample rules |
82 | | -kubectl apply -f examples/network-readiness-rule.yaml |
83 | | -``` |
84 | | - |
85 | | -#### Option 3: Deploy Using Kustomize Directly |
86 | | - |
87 | | -```sh |
88 | | -# Install CRDs |
89 | | -kubectl apply -k config/crd |
90 | | -
|
91 | | -# Deploy controller and RBAC |
92 | | -kubectl apply -k config/default |
93 | | -
|
94 | | -# Create sample rules |
95 | | -kubectl apply -f examples/network-readiness-rule.yaml |
96 | | -``` |
97 | | - |
98 | | -### Uninstallation |
99 | | - |
100 | | -> **Important**: Follow this order to avoid stuck resources due to finalizers. |
101 | | - |
102 | | -The controller adds a finalizer (`readiness.node.x-k8s.io/cleanup-taints`) to each `NodeReadinessRule` to ensure node taints are cleaned up before the rule is deleted. This means you must delete CRs **while the controller is still running**. |
103 | | - |
104 | | -```sh |
105 | | -# 1. Delete all rule instances first (while controller is running) |
106 | | -kubectl delete nodereadinessrules --all |
107 | | -
|
108 | | -# 2. Delete the controller |
109 | | -make undeploy |
110 | | -
|
111 | | -# 3. Delete the CRDs |
112 | | -make uninstall |
113 | | -``` |
114 | | - |
115 | | -#### Recovering from Stuck Resources |
| 50 | +### Enforcement Modes |
116 | 51 |
|
117 | | -If you deleted the controller before removing the CRs, the finalizer will block CR deletion. To recover, manually remove the finalizer: |
| 52 | +#### Bootstrap-only Mode |
| 53 | +- Removes bootstrap taint when conditions are first satisfied |
| 54 | +- Marks completion with node annotation |
| 55 | +- Stops monitoring after successful removal (fail-safe) |
| 56 | +- Ideal for one-time setup conditions (storage, installing node daemons such as a security agent, or kernel-module updates) |
118 | 57 |
|
119 | | -```sh |
120 | | -kubectl patch nodereadinessrule <rule-name> -p '{"metadata":{"finalizers":[]}}' --type=merge |
121 | | -``` |
| 58 | +#### Continuous Mode |
| 59 | +- Continuously monitors conditions |
| 60 | +- Adds taint when any condition becomes unsatisfied |
| 61 | +- Removes taint when all conditions become satisfied |
| 62 | +- Ideal for ongoing health monitoring (network connectivity, resource availability) |
122 | 63 |
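As a sketch of how the two modes might be selected: the `enforcementMode` field name below is an assumption for illustration only; consult the API Spec of the installed CRD for the authoritative field name.

```yaml
# Sketch only: the enforcementMode field name is an assumption.
spec:
  enforcementMode: continuous        # or "bootstrap-only" for one-shot rules
  taint:
    key: "readiness.k8s.io/NetworkReady"
    effect: "NoSchedule"             # NoSchedule avoids evicting running pods
```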
|
123 | 64 | ## Operations |
124 | 65 |
|
@@ -162,19 +103,65 @@ Check dry run results: |
162 | 103 | kubectl get nodereadinessrule <rule-name> -o jsonpath='{.status.dryRunResults}' |
163 | 104 | ``` |
164 | 105 |
|
165 | | -### Enforcement Modes |
| 106 | +### Rule Validation and Constraints |
166 | 107 |
|
167 | | -#### Bootstrap-only Mode |
168 | | -- Removes bootstrap taint when conditions are first satisfied |
169 | | -- Marks completion with node annotation |
170 | | -- Stops monitoring after successful removal (fail-safe) |
171 | | -- Ideal for one-time setup conditions (storage, installing node daemons e.g: security agent or kernel-module update) |
| 108 | +#### NoExecute Taint Effect Warning |
| 109 | + |
| 110 | +**`NoExecute` with `continuous` enforcement mode will evict existing workloads when conditions fail.** |
| 111 | + |
| 112 | +If a critical component becomes temporarily unavailable (e.g., CNI daemon restart), all pods without matching tolerations are immediately evicted from the node. Use `NoSchedule` to prevent new scheduling without disrupting running workloads. |
| 113 | + |
| 114 | +The admission webhook emits a warning when a rule uses `NoExecute`. |
| 115 | + |
| 116 | +See [Kubernetes taints documentation](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/) for taint behavior details. |
| 117 | + |
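Workloads that must keep running while a `NoExecute` readiness taint is present need a matching toleration. This is standard Kubernetes Pod syntax; the taint key here is illustrative:

```yaml
# Pod spec fragment: tolerate the readiness taint so the pod is not
# evicted while the condition is unsatisfied (key is illustrative).
tolerations:
- key: "readiness.k8s.io/NetworkReady"
  operator: "Exists"
  effect: "NoExecute"
```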
| 118 | +#### Avoiding Taint Key Conflicts |
| 119 | + |
| 120 | +The admission webhook prevents multiple rules from using the same `taint.key` and `taint.effect` on overlapping node selectors. |
| 121 | + |
| 122 | +**Example conflict:** |
| 123 | +```yaml |
| 124 | +# Rule 1 |
| 125 | +spec: |
| 126 | + nodeSelector: |
| 127 | + matchLabels: |
| 128 | + node-role.kubernetes.io/worker: "" |
| 129 | + taint: |
| 130 | + key: "readiness.k8s.io/network" |
| 131 | + effect: "NoSchedule" |
| 132 | +
|
| 133 | +# Rule 2 - This will be REJECTED |
| 134 | +spec: |
| 135 | + nodeSelector: |
| 136 | + matchLabels: |
| 137 | + node-role.kubernetes.io/worker: "" |
| 138 | + taint: |
| 139 | + key: "readiness.k8s.io/network" # Same key + effect = conflict |
| 140 | + effect: "NoSchedule" |
| 141 | +``` |
| 142 | + |
| 143 | +Use unique, descriptive taint keys for different readiness checks. |
| 144 | + |
| 145 | +#### Taint Key Naming |
| 146 | + |
| 147 | +Follow [Kubernetes naming conventions](https://kubernetes.io/docs/concepts/overview/working-with-objects/names/). |
| 148 | + |
| 149 | +Taint keys must have the `readiness.k8s.io/` prefix to clearly identify readiness-related taints and avoid conflicts with other controllers. |
| 150 | + |
| 151 | +**Valid:** |
| 152 | +```yaml |
| 153 | +taint: |
| 154 | + key: "readiness.k8s.io/NetworkReady" |
| 155 | + key: "readiness.k8s.io/StorageReady" |
| 156 | +``` |
| 157 | + |
| 158 | +**Invalid:** |
| 159 | +```yaml |
| 160 | +taint: |
| 161 | + key: "network-ready" # Missing prefix |
| 162 | + key: "node.kubernetes.io/ready" # Wrong prefix |
| 163 | +``` |
172 | 164 |
|
173 | | -#### Continuous Mode |
174 | | -- Continuously monitors conditions |
175 | | -- Adds taint when any condition becomes unsatisfied |
176 | | -- Removes taint when all conditions become satisfied |
177 | | -- Ideal for ongoing health monitoring (network connectivity, resource availability) |
178 | 165 |
|
179 | 166 | ## Configuration |
180 | 167 |
|
|