🐛 Fix ClusterClass MachinePool rollout on BootstrapConfig changes#13110
🐛 Fix ClusterClass MachinePool rollout on BootstrapConfig changes#13110bnallapeta wants to merge 1 commit intokubernetes-sigs:mainfrom
Conversation
Signed-off-by: Bharath Nallapeta <nr.bharath97@gmail.com>
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
@richardcase @AndiDog |
| desired: desiredMP.BootstrapObject, | ||
| }); err != nil { | ||
| if createdInfra { | ||
| infrastructureMachinePoolCleanupFunc = func() { |
There was a problem hiding this comment.
Just curious, why are we creating an inline function instead of defining it somewhere else? Not as familiar with Kubernetes code bases so disregard if there there's a good reason.
There was a problem hiding this comment.
a. I'd like to stay consistent with the current style where this is used in multiple other places already.
b. I think inline functions help us keep the logic right where we need to make the decision (should we cleanup or not). The alternative would require us to send all the context, obj, logger and handle the errors differently and this feels a bit too much for the purpose.
|
Just finished testing the fix manually with CAPD and it's working as expected! Summary:
$ kubectl get kubeadmconfig -l cluster.x-k8s.io/cluster-name=test-cluster
NAME AGE
test-cluster-mp-0-2krrr 58m # old config
test-cluster-mp-0-sw2m8 2m # new config (rotated!)
$ kubectl get machinepool $MACHINEPOOL_NAME -o jsonpath='{.spec.template.spec.bootstrap.configRef.name}'
test-cluster-mp-0-sw2m8 # <-- reference updated correctlyWhy CAPD? Next steps: (Detailed testing steps available if anyone wants to reproduce) |
| infraLog := log.WithValues(desiredMP.InfrastructureMachinePoolObject.GetKind(), klog.KObj(desiredMP.InfrastructureMachinePoolObject)) | ||
| infraCtx := ctrl.LoggerInto(ctx, infraLog) | ||
| infrastructureMachinePoolCleanupFunc := func() {} | ||
| createdInfra, err := r.reconcileReferencedTemplate(infraCtx, reconcileReferencedTemplateInput{ |
There was a problem hiding this comment.
Neither the InfraMachinePool nor the BootstrapConfig is a template.
Will this rotate the template? If yes, is it actually supported by infra providers that we rotate the InfraMachinePool object? (& standardized in the contract)
There was a problem hiding this comment.
I think this warrants a bigger discussion. Last time I talked about this, we were leaning towards updating these things in the controller docs and NOT in the contract (ref).
I am going to do bring this up again, discuss, and update the contract/controller docs and then get back to this. Thank you.
| bootstrapLog := log.WithValues(desiredMP.BootstrapObject.GetKind(), klog.KObj(desiredMP.BootstrapObject)) | ||
| bootstrapCtx := ctrl.LoggerInto(ctx, bootstrapLog) | ||
| bootstrapCleanupFunc := func() {} | ||
| createdBootstrap, err := r.reconcileReferencedTemplate(bootstrapCtx, reconcileReferencedTemplateInput{ |
There was a problem hiding this comment.
Infrastructure providers (like CAPA) watch for configRef.name changes to trigger rollouts, but since the name never changed, no rollout occurred.
Is this standardized via the MachinePool contract?
There was a problem hiding this comment.
No, it's not explicitly stated in the contract. It made sense to do this in CAPA as we needed some way to roll out changes to KubeadmConfig (where suffixing the object name with a hash is the easiest solution at least with Helm).
What this PR does / why we need it:
ClusterClass-managed MachinePools were not triggering node rollouts when BootstrapConfig or InfrastructureMachinePool templates changed. This happened because
updateMachinePool()usedreconcileReferencedObject()which patches objects in-place without changing their names. Infrastructure providers (like CAPA) watch forconfigRef.namechanges to trigger rollouts, but since the name never changed, no rollout occurred.This PR changes
updateMachinePool()to usereconcileReferencedTemplate()instead. This is the same approach used byupdateMachineDeployment().Which issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)format, will close the issue(s) when PR gets merged):Fixes #10496
/area machinepool
/area clusterclass