🐛 Add MachinesUpToDate condition handling in MachinePool controller #13234

Open
ramessesii2 wants to merge 1 commit into kubernetes-sigs:main from ramessesii2:RAMESSES/fix-capi-10059

@ramessesii2 commented Jan 14, 2026

What this PR does / why we need it:
MachinePool lacks a clear status signal during upgrades. When spec.version changes (and the generation increments), status.observedGeneration updates immediately to match, but the Ready/InfrastructureReady conditions only flip to False about 10 seconds later. See the linked GitHub issue for more details.

Following the pattern established in MachineDeployment, this PR adds a MachinesUpToDate condition to MachinePool that provides an immediate, reliable signal for upgrade status:

  • For MachinePools with Machine objects: Aggregates MachineUpToDateCondition from individual Machines
  • For managed MachinePools (like AKS): Derives the condition from InfrastructureProvisioned status
  • 10-second grace period: Filters out very new Machines to prevent condition flickering
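
The three rules above can be sketched in a few lines of Go. This is only an illustration under stated assumptions, not the code in this PR: the `machine` type, the `machinesUpToDate` function, treating an empty Machine list as a managed pool, and treating a Machine that has not reported yet as not up to date are all hypothetical choices, and the reason strings are assumed values.

```go
package main

import (
	"fmt"
	"time"
)

// gracePeriod is the 10-second window from the description: Machines younger
// than this are skipped so the condition does not flicker.
const gracePeriod = 10 * time.Second

// machine is a hypothetical, trimmed-down stand-in for a Machine object.
type machine struct {
	name     string
	age      time.Duration // time since the Machine was created
	upToDate *bool         // nil: the Machine has not reported its up-to-date state yet
}

// machinesUpToDate returns the status and reason for a MachinesUpToDate-style
// condition. The reason strings are assumed values, not read from the API package.
func machinesUpToDate(machines []machine, infraProvisioned *bool) (bool, string) {
	// Assumption: a pool with no Machine objects is a managed pool.
	hasMachinePoolMachines := len(machines) > 0
	if !hasMachinePoolMachines {
		// Managed pool: derive the condition from the infrastructure status.
		if infraProvisioned != nil && *infraProvisioned {
			return true, "UpToDate"
		}
		return false, "NotUpToDate"
	}
	for _, m := range machines {
		if m.age < gracePeriod {
			continue // too new to judge, ignore for now
		}
		// Assumption: a missing report counts as not up to date.
		if m.upToDate == nil || !*m.upToDate {
			return false, "NotUpToDate"
		}
	}
	return true, "UpToDate"
}

func main() {
	yes, no := true, false
	pool := []machine{
		{name: "m-old", age: time.Minute, upToDate: &yes},
		{name: "m-new", age: 3 * time.Second, upToDate: &no}, // inside the grace period
	}
	status, reason := machinesUpToDate(pool, nil)
	fmt.Println(status, reason)

	status, reason = machinesUpToDate(nil, &no) // managed pool, infrastructure not ready
	fmt.Println(status, reason)
}
```

Passing each Machine's age instead of a creation timestamp keeps the function free of clock reads, which makes the grace-period behavior easy to test.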

See the new condition in the MachinePool status:
status:
  availableReplicas: 1
  conditions:
  - lastTransitionTime: "2026-01-14T12:39:46Z"
    message: ""
    observedGeneration: 9
    reason: UpToDate
    status: "True"
    type: MachinesUpToDate
  - lastTransitionTime: "2026-01-14T11:36:09Z"
    message: ""
    observedGeneration: 9
    reason: NotPaused
    status: "False"
    type: Paused
  deprecated:
    v1beta1:
      availableReplicas: 1
      conditions:
      - lastTransitionTime: "2026-01-14T12:39:46Z"
        status: "True"
        type: Ready
      - lastTransitionTime: "2026-01-14T11:36:09Z"
        status: "True"
        type: BootstrapReady
      - lastTransitionTime: "2026-01-14T12:39:46Z"
        status: "True"
        type: InfrastructureReady
      - lastTransitionTime: "2026-01-14T11:45:50Z"
        status: "True"
        type: ReplicasReady
      readyReplicas: 1
  initialization:
    bootstrapDataSecretCreated: true
    infrastructureProvisioned: true
  nodeRefs:
  - apiVersion: v1
    kind: Node
    name: aks-pool2-34400930-vmss000000
    uid: 953a05d8-8b81-421e-bfca-4c500f3c0f68
  observedGeneration: 9
  phase: Running
  readyReplicas: 1
  replicas: 1
  upToDateReplicas: 1
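
A consumer waiting for an upgrade could read this status roughly as follows. This is a minimal sketch, assuming the caller already knows the MachinePool's metadata.generation; the `condition` struct and `upgradeSettled` function are hypothetical names that mirror only the fields shown in the status above.

```go
package main

import "fmt"

// condition mirrors the fields of the status conditions shown above.
type condition struct {
	Type               string
	Status             string
	Reason             string
	ObservedGeneration int64
}

// upgradeSettled reports whether the MachinesUpToDate condition is True and
// was computed for the given generation. A condition that still carries an
// older observedGeneration is treated as stale.
func upgradeSettled(generation int64, conds []condition) bool {
	for _, c := range conds {
		if c.Type != "MachinesUpToDate" {
			continue
		}
		return c.Status == "True" && c.ObservedGeneration == generation
	}
	return false // condition not reported at all
}

func main() {
	conds := []condition{
		{Type: "MachinesUpToDate", Status: "True", Reason: "UpToDate", ObservedGeneration: 9},
		{Type: "Paused", Status: "False", Reason: "NotPaused", ObservedGeneration: 9},
	}
	fmt.Println(upgradeSettled(9, conds))  // prints "true"
	fmt.Println(upgradeSettled(10, conds)) // prints "false": spec changed, condition not recomputed yet
}
```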

Fixes #10059

/area machinepool

@k8s-ci-robot added the do-not-merge/invalid-commit-message (indicates that a PR should not merge because it has an invalid commit message) and area/machinepool (issues or PRs related to machinepools) labels on Jan 14, 2026
@k8s-ci-robot added the size/L (denotes a PR that changes 100-499 lines, ignoring generated files) and needs-ok-to-test (indicates a PR that requires an org member to verify it is safe to test) labels on Jan 14, 2026
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign justinsb for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot
Contributor

Hi @ramessesii2. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot added the cncf-cla: yes (indicates the PR's author has signed the CNCF CLA) label on Jan 14, 2026
@ramessesii2 force-pushed the RAMESSES/fix-capi-10059 branch from 70ae825 to 0952e72 on January 14, 2026 11:28
@k8s-ci-robot removed the do-not-merge/invalid-commit-message (indicates that a PR should not merge because it has an invalid commit message) label on Jan 14, 2026

conditions.Set(mp, metav1.Condition{
	Type:   clusterv1.MachinesUpToDateCondition,
	Status: metav1.ConditionFalse,
	Reason: clusterv1.NotUpToDateReason,
Contributor

Instead of this, I would suggest using a MachinePool-specific constant. I already see MachineDeploymentMachinesNotUpToDateReason and other existing ones. If something is missing, you can add a new constant to keep it consistent with the others.

Author

Aha! Nice catch! But it looks like that is going to be separate work, as per https://github.com/kubernetes-sigs/cluster-api/blob/main/api/core/v1beta2/machinepool_types.go#L32-L65
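
For context, the reviewer's suggestion could look roughly like the sketch below. Everything here is hypothetical: the `MachinePool...` names do not exist in this PR, the name `UpToDateReason` is assumed, and the string values are assumptions taken from the status example in the description. The point is only to show resource-specific aliases layered over generic constants.

```go
package main

import "fmt"

// Generic constants. MachinesUpToDateCondition and NotUpToDateReason are named
// after the diff excerpt above; UpToDateReason and all string values are assumed.
const (
	MachinesUpToDateCondition = "MachinesUpToDate"
	UpToDateReason            = "UpToDate"
	NotUpToDateReason         = "NotUpToDate"
)

// Hypothetical MachinePool-specific aliases, mirroring the naming style of
// MachineDeploymentMachinesNotUpToDateReason mentioned in the review.
const (
	MachinePoolMachinesUpToDateCondition = MachinesUpToDateCondition
	MachinePoolMachinesUpToDateReason    = UpToDateReason
	MachinePoolMachinesNotUpToDateReason = NotUpToDateReason
)

func main() {
	// The aliases resolve to the same strings, so serialized conditions stay identical.
	fmt.Println(MachinePoolMachinesUpToDateCondition, MachinePoolMachinesNotUpToDateReason)
}
```

Aliasing rather than redefining keeps the wire values unchanged while letting each controller refer to constants scoped to its own resource.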

Contributor

@bnallapeta left a comment

Minor comments. Looks good.

if !hasMachinePoolMachines {
	// Check if infrastructure is ready by looking at the initialization status
	if mp.Status.Initialization.InfrastructureProvisioned == nil || !*mp.Status.Initialization.InfrastructureProvisioned {
		conditions.Set(mp, metav1.Condition{
Member

I wonder if it's also worth adding a message stating that this pool is "machine-less" and the infrastructure is not ready, just to give more context.

Author

That's a user-facing condition, so isn't that a provider-specific implementation detail?
It might be useful for devs debugging the controller itself, but they can go to the logs, whereas users seeing the "infra not provisioned" message would go to the infrastructure resource.

Member

In that case, adding a log with a high verbosity level may help with debugging.

Author

Sounds good. Added the log!
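
The idea agreed on in this thread, a log line that only appears at high verbosity, can be illustrated with the standard library. This is a sketch and not the log added in the PR: `log/slog` at Debug level stands in for whatever logger and verbosity level the controller actually uses, and the function names, message text, and key name are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
)

// logMachinelessPool emits a debug-level line when a pool without Machine
// objects has no provisioned infrastructure yet. Message and key are hypothetical.
func logMachinelessPool(log *slog.Logger, pool string, infraProvisioned *bool) {
	if infraProvisioned != nil && *infraProvisioned {
		return // nothing noteworthy to report
	}
	log.Debug("MachinePool has no Machines and its infrastructure is not provisioned yet",
		"machinePool", pool)
}

// render runs logMachinelessPool against an in-memory logger and returns what
// was written, so the effect of the verbosity setting is easy to observe.
func render(verbose bool, pool string, infraProvisioned *bool) string {
	level := slog.LevelInfo
	if verbose {
		level = slog.LevelDebug
	}
	var buf bytes.Buffer
	log := slog.New(slog.NewTextHandler(&buf, &slog.HandlerOptions{Level: level}))
	logMachinelessPool(log, pool, infraProvisioned)
	return buf.String()
}

func main() {
	fmt.Printf("default verbosity: %q\n", render(false, "pool2", nil))
	fmt.Printf("high verbosity:    %q\n", render(true, "pool2", nil))
}
```

At the default level nothing is written, so the condition stays free of provider-specific detail while the context remains available to anyone debugging the controller.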

@ramessesii2 force-pushed the RAMESSES/fix-capi-10059 branch from 0952e72 to adee449 on February 3, 2026 14:03
@richardcase
Member

/ok-to-test

@k8s-ci-robot added the ok-to-test (indicates a non-member PR verified by an org member that is safe to test) label and removed the needs-ok-to-test (indicates a PR that requires an org member to verify it is safe to test) label on Feb 3, 2026
    - Updated the updateStatus method to include the new condition
    - Added unit tests
    - Fixes 10059

Signed-off-by: Satyam Bhardwaj <sbhardwaj@mirantis.com>
@ramessesii2 force-pushed the RAMESSES/fix-capi-10059 branch from adee449 to fd0e7eb on February 3, 2026 15:05
@bnallapeta
Contributor

/lgtm

@k8s-ci-robot added the lgtm ("Looks good to me", indicates that a PR is ready to be merged) label on Feb 5, 2026
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 942c1eb5fd09168e53e5dddeb3cb8d301e37ab11

@richardcase
Member

/lgtm

@ramessesii2
Author

Need help getting this merged now that it has lgtm. cc @fabriziopandini

Labels

area/machinepool: Issues or PRs related to machinepools
cncf-cla: yes: Indicates the PR's author has signed the CNCF CLA.
lgtm: "Looks good to me", indicates that a PR is ready to be merged.
ok-to-test: Indicates a non-member PR verified by an org member that is safe to test.
size/L: Denotes a PR that changes 100-499 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

MachinePool observedGeneration is updated without changing conditions on upgrades

4 participants