ClusterClass Upgrade Fails Due to ManagedFields Behavior Change in CAPI v1.6 → v1.8 #11857

Open
@okozachenko1203

What steps did you take and what happened?

I encountered an issue when upgrading a Cluster to use a new ClusterClass after migrating from CAPI v1.6 → v1.8. The upgrade fails because old variables persist in spec.topology.variables, even though they should have been removed by Server-Side Apply (SSA).

Steps to Reproduce

  1. Create an initial ClusterClass (oldClass)

    • This ClusterClass contains a required variable: oldVariable.
  2. Create a Cluster using oldClass

    • The cluster’s spec.topology.variables includes oldVariable.
  3. Create a new ClusterClass (newClass)

    • newClass has a required variable newVariable instead of oldVariable.
    • Only the name differs; the format and value remain the same.
  4. Upgrade the cluster to use newClass via SSA patch

    • Expected result: oldVariable should be removed, and newVariable should be added.
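The patch in step 4 can be sketched roughly as follows. This is only an illustration of the apply body's shape: the helper `build_apply_patch`, the cluster name, the namespace, the Kubernetes version, and the variable value are all hypothetical placeholders, and the actual request that sends the body with a field manager is not shown.

```python
import json


def build_apply_patch(cluster_name, namespace, class_name, version, variables):
    """Build a Server-Side Apply body that switches the ClusterClass.

    Hypothetical helper: only newVariable is listed, so the applier
    expects oldVariable to be dropped from spec.topology.variables.
    """
    return {
        "apiVersion": "cluster.x-k8s.io/v1beta1",
        "kind": "Cluster",
        "metadata": {"name": cluster_name, "namespace": namespace},
        "spec": {
            "topology": {
                "class": class_name,
                "version": version,
                "variables": [
                    {"name": name, "value": value}
                    for name, value in variables.items()
                ],
            }
        },
    }


# Placeholder values, not taken from the affected environment.
patch = build_apply_patch(
    "my-cluster", "default", "newClass", "v1.28.0",
    {"newVariable": "example-value"},
)
print(json.dumps(patch, indent=2))
```

The body deliberately omits oldVariable; whether the API server then removes it from the live object is exactly what differs between the scenarios below.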

Scenarios Tested

Scenario 1: Running on CAPI v1.6.x

Deployment Versions:

NAMESPACE                           NAME                       VERSION  
capi-kubeadm-bootstrap-system       bootstrap-kubeadm          v1.6.0  
capi-kubeadm-control-plane-system   control-plane-kubeadm      v1.6.0  
capi-system                         cluster-api                v1.6.0  
capo-system                         infrastructure-openstack   v0.9.0  

Result:

  • The upgrade works correctly.
  • After upgrading the ClusterClass, only newVariable exists, and oldVariable is removed.

Scenario 2: Running on CAPI v1.8.x

Deployment Versions:

NAMESPACE                           NAME                       VERSION  
capi-kubeadm-bootstrap-system       bootstrap-kubeadm          v1.8.4  
capi-kubeadm-control-plane-system   control-plane-kubeadm      v1.8.4  
capi-system                         cluster-api                v1.8.4  
capo-system                         infrastructure-openstack   v0.11.2  

Result:

  • The upgrade works correctly.
  • After upgrading the ClusterClass, only newVariable exists, and oldVariable is removed.

Scenario 3: Upgrading from CAPI v1.6 → v1.8 and then upgrading ClusterClass

  1. Deploy Cluster using CAPI v1.6.0 (Scenario 1 setup).
  2. Upgrade CAPI to v1.8.4 (Scenario 2 setup).
  3. Attempt to upgrade Cluster to use newClass (SSA patch).
    • 🔴 The upgrade fails because oldVariable still exists, despite not being defined in newClass.
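A quick way to confirm the failure condition is to compare the variables left on the Cluster with the variables that newClass defines. The sketch below assumes both have already been fetched and parsed; `stale_variables` and the sample values are hypothetical.

```python
def stale_variables(cluster_variables, class_variable_names):
    """Return names present on the Cluster but not defined by the ClusterClass."""
    allowed = set(class_variable_names)
    return [v["name"] for v in cluster_variables if v["name"] not in allowed]


# Sample state after the failed upgrade in Scenario 3 (values are placeholders).
cluster_variables = [
    {"name": "oldVariable", "value": "example-value"},
    {"name": "newVariable", "value": "example-value"},
]
print(stale_variables(cluster_variables, ["newVariable"]))  # ['oldVariable']
```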

What did you expect to happen?

  • When upgrading a cluster to use a new ClusterClass, SSA should correctly remove old variables that are no longer part of the new ClusterClass.
  • This behavior should remain consistent across CAPI versions.

Cluster API version

The Cluster API versions are listed in the scenarios above (v1.6.0 and v1.8.4).

Kubernetes version

I tested on Kubernetes 1.25.x and 1.28.x.

Anything else you would like to add?

What I Found

managedFields behaves differently between versions.

  • In CAPI v1.6.x (older versions), spec.topology.variables was tracked as a single field:
    f:variables: {}
  • In CAPI v1.8.x (newer versions), each variable in spec.topology.variables is tracked individually:
    f:variables:
      k:{"name":"apiServerTLSCipherSuites"}:
        .: {}
        f:name: {}
        f:value: {}
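To tell which of the two shapes a given managedFields entry has, something like the following could be used. It is a sketch that assumes the entry's fieldsV1 has already been loaded as a Python dict; the function name `variables_tracking` and its return labels are hypothetical.

```python
def variables_tracking(fields_v1):
    """Classify how spec.topology.variables appears in one fieldsV1 dict."""
    variables = (
        fields_v1.get("f:spec", {}).get("f:topology", {}).get("f:variables")
    )
    if variables is None:
        return "absent"
    # Keys such as k:{"name":"..."} mean each list item is tracked separately.
    if any(key.startswith("k:") for key in variables):
        return "per-item"
    return "single-field"


old_style = {"f:spec": {"f:topology": {"f:variables": {}}}}
new_style = {
    "f:spec": {
        "f:topology": {
            "f:variables": {
                'k:{"name":"apiServerTLSCipherSuites"}': {
                    ".": {},
                    "f:name": {},
                    "f:value": {},
                }
            }
        }
    }
}
print(variables_tracking(old_style))  # single-field
print(variables_tracking(new_style))  # per-item
```

A cluster created under v1.6 and then migrated would be expected to still report "single-field" for the entry written before the upgrade.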

Hypothesis:

  • In CAPI v1.6, SSA tracks spec.topology.variables as a single field, so applying a patch replaces the whole list and any variable omitted from the patch is dropped.
  • In CAPI v1.8, SSA tracks each variable separately, preventing removal if ownership conflicts exist.
  • This change breaks ClusterClass upgrades for clusters that were created on v1.6 and then migrated to v1.8.
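The hypothesis can be illustrated with a toy model of per-item apply. This is a deliberately simplified sketch, not the API server's actual merge logic: it assumes an apply removes only those items the same manager owned after its previous apply, and that an entry recorded as a bare `f:variables: {}` carries no per-item ownership. `apply_variables` is a hypothetical name.

```python
def apply_variables(live, owned, applied):
    """Toy per-item apply. Returns (new_live, new_owned).

    live:    variables currently on the object (name -> value)
    owned:   names this manager owned per item after its last apply
    applied: variables in the new apply body (name -> value)
    """
    new_owned = set(applied)
    # Only items previously owned per item, and now omitted, are removed.
    removed = {name for name in owned if name not in new_owned}
    new_live = {n: v for n, v in live.items() if n not in removed}
    new_live.update(applied)
    return new_live, new_owned


live = {"oldVariable": "x"}
applied = {"newVariable": "x"}

# Cluster created under per-item tracking: oldVariable is owned and gets removed.
fresh, _ = apply_variables(live, {"oldVariable"}, applied)

# Cluster created under single-field tracking: no per-item ownership exists,
# so oldVariable is left in place (assumed to mirror Scenario 3).
migrated, _ = apply_variables(live, set(), applied)

print(fresh)
print(migrated)
```

Under these assumptions the first case ends with only newVariable, while the second keeps both variables, which matches the symptom reported in Scenario 3.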

Possible Causes

  1. Changes in SSA handling of spec.topology.variables between CAPI v1.6 → v1.8.
  2. Stricter managedFields tracking in newer versions.
  3. Potential ownership conflicts preventing removal of fields.

Label(s) to be applied

/kind bug
One or more /area label. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels.

Metadata

Labels

  • help wanted: Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
  • kind/bug: Categorizes issue or PR as related to a bug.
  • priority/important-soon: Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
  • triage/accepted: Indicates an issue or PR is ready to be actively worked on.
