@@ -630,6 +630,8 @@ use container runtime versions that have the needed changes.
 - Gather and address feedback from the community
 - Be able to configure UID/GID ranges to use for pods
+- Add unit tests that exercise the feature gate switch (see section "Are there
+  any tests for feature enablement/disablement?")
 - This feature is not supported on Windows.
 - Get review from VM container runtimes maintainers (not blocker, as VM runtimes should just ignore
   the field, but nice to have)
@@ -670,6 +672,26 @@ enhancement:
 CRI or CNI may require updating that component before the kubelet.
 -->

+#### Kubelet and Kube-apiserver skew
+
+The apiserver and kubelet feature gate enablement work fine in any combination:
+
+1. If the apiserver has the feature gate enabled and the kubelet doesn't, the pod will show
+   the field and the kubelet will ignore it. The pod is then created without user namespaces.
+1. If the apiserver has the feature gate disabled and the kubelet has it enabled, the pod won't
+   show the field and therefore the kubelet won't act on a field that isn't present. The pod is
+   created as if the feature gate were disabled.
+
+The kubelet can still create pods with user namespaces if static pods are configured with
+pod.spec.hostUsers and the kubelet has the feature gate enabled.
+
+If the kube-apiserver doesn't support the feature at all (< 1.25), a pod with user namespaces will
+be rejected.
+
+If the kubelet doesn't support the feature (< 1.25), it will ignore the pod.spec.hostUsers field.
+
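+The rules above can be tried out with a manifest like the following (a sketch:
+the pod name and image are placeholders); for the static-pod case, the same
+manifest would be placed in the kubelet's static pod path rather than applied
+via kubectl:
+
+```
+kubectl apply -f - <<EOF
+apiVersion: v1
+kind: Pod
+metadata:
+  name: userns-example
+spec:
+  hostUsers: false
+  containers:
+  - name: app
+    image: registry.k8s.io/pause:3.9
+EOF
+```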
+#### Kubelet and container runtime skews
+
 Some definitions first:
 - New kubelet: kubelet with CRI proto files that includes the changes proposed in
   this KEP.
@@ -794,6 +816,9 @@ We will also unit test that, if pods were created with the new field
 pod.specHostUsers, then if the featuregate is disabled all works as expected (no
 user namespace is used).

+We will add tests exercising the `switch` of the feature gate itself (what happens
+if the feature gate is disabled after objects were written with the new field).
+
 <!--
 The e2e framework does not currently support enabling or disabling feature
 gates. However, unit tests in each component dealing with managing data, created
@@ -815,16 +840,18 @@ This section must be completed when targeting beta to a release.
815
840
816
841
###### How can a rollout or rollback fail? Can it impact already running workloads?
817
842
818
- The rollout is just a feature flag on the kubelet and the kube-apiserver.
843
+ If one APIserver is upgraded while other's aren't and you are talking to a not
844
+ upgraded one, the pod will be accepted (if the apiserver is >= 1.25, rejected if
845
+ < 1.25).
819
846
820
- If one APIserver is upgraded while other's aren't and you are talking to a not upgraded the pod
821
- will be accepted (if the apiserver is >= 1.25). If it is scheduled to a node that the kubelet has
822
- the feature flag activated and the node meets the requirements to use user namespaces, then the
823
- pod will be created with the namespace. If it is scheduled to a node that has the feature disabled,
824
- it will be created without the user namespace.
847
+ If it is scheduled to a node where the kubelet has the feature flag activated
848
+ and the node meets the requirements to use user namespaces, then the pod will be
849
+ created with the namespace. If it is scheduled to a node that has the feature
850
+ disabled, it will be created without the user namespace.
825
851
826
- On a rollback, pods created while the feature was active (created with user namespaces) will have to
827
- be re-created without user namespaces.
852
+ On a rollback, pods created while the feature was active (created with user
853
+ namespaces) will have to be re-created to run without user namespaces. If those
854
+ weren't recreated, they will continue to run in a user namespace.
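+To see which running pods would need that re-creation, one could list pods that
+request user namespaces, for example with a jsonpath filter (a sketch; filter
+expressions must be supported by your kubectl version):
+
+```
+kubectl get pods --all-namespaces \
+  -o jsonpath='{range .items[?(@.spec.hostUsers==false)]}{.metadata.namespace}{"/"}{.metadata.name}{"\n"}{end}'
+```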

 <!--
 Try to be as paranoid as possible - e.g., what if some components will restart
@@ -841,8 +868,15 @@ will rollout across nodes.
 On Kubernetes side, the kubelet should start correctly.

 On the node runtime side, a pod created with pod.spec.hostUsers=false should be on RUNNING state if
-all node requirements are met. If the CRI runtime or the handler do not support the feature, the kubelet
-returns an error.
+all node requirements are met. If the CRI runtime or the handler do not support the feature, the
+kubelet returns an error.
+
+When a pod hits this error returned by the kubelet, the status in `kubectl` is shown as
+`ContainerCreating` and the pod events show:
+
+```
+Warning  FailedCreatePodSandBox  12s (x23 over 5m6s)  kubelet  Failed to create pod sandbox: user namespaces is not supported by the runtime
+```
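+
+These events can be inspected with, for example (the pod name is a placeholder):
+
+```
+kubectl describe pod userns-example
+```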

 <!--
 What signals should users be paying attention to when the feature is young
@@ -1201,7 +1235,8 @@ No changes to current kubelet behaviors. The feature only uses kubelet-local inf
 levels that could help debug the issue?
 Not required until feature graduated to beta.

-The error is returned on pod-creation, no need to search for logs.
+The kubelet will get an error from the runtime and will propagate it to the pod
+(visible in the pod events).

 The idmap mount is created by the OCI runtime, not at the kubelet layer. At the kubelet layer, this
 is just another OCI runtime error.
@@ -1231,7 +1266,8 @@ No changes to current kubelet behaviors. The feature only uses kubelet-local inf
 - Detection: How can it be detected via metrics? Stated another way:
   how can an operator troubleshoot without logging into a master or worker node?

-Errors are returned on pod creation, directly to the user. No need to use metrics.
+Errors are returned on pod creation, directly to the user (visible in the pod
+events). No need to use metrics.

 See the pod events, it should contain something like:
@@ -1249,7 +1285,7 @@ No changes to current kubelet behaviors. The feature only uses kubelet-local inf
 levels that could help debug the issue?
 Not required until feature graduated to beta.

-No extra logs, the error is returned to the user.
+No extra logs, the error is returned to the user (visible in the pod events).

 - Testing: Are there any tests for failure mode? If not, describe why.
@@ -1262,8 +1298,8 @@ writing to this file.
 - Detection: How can it be detected via metrics? Stated another way:
   how can an operator troubleshoot without logging into a master or worker node?

-Errors are returned to the operation failed (like pod creation), no need to see metrics nor
-logs.
+Errors are returned when the operation fails (like pod creation, visible in the
+pod events); no need to see metrics or logs.

 Errors are returned to the either on:
 * Kubelet initialization: the initialization fails if the feature gate is active and there is a