PR#9357 Backport for 1.35: Ensure HasInstance and NodeGroupForNode derive from the same list, backport ignore string case #9385

Open
jlamillan wants to merge 4 commits into kubernetes:cluster-autoscaler-release-1.35 from jlamillan:jlamillan/cluster-autoscaler-release-1.35
Conversation

@jlamillan
Contributor

@jlamillan jlamillan commented Mar 18, 2026

Which component this PR applies to?

cluster-autoscaler (OCI provider code)

What type of PR is this?

/kind bug
/kind cleanup

What this PR does / why we need it:

  • Backport of PR 9357 to 1.35
  • Backport of related PR 9250 to 1.35

This PR fixes HasInstance() so that it and NodeGroupForNode() derive their instance lists the same way, giving consistent lifecycle mapping. It also includes several incremental improvements to the OCI (Instance-Pool) cloud provider that came out of debugging the issue:

  • Updates the HasInstance() implementation to align more closely with NodeGroupForNode(), avoiding potential mismatches for newly scaled nodes.
  • Adds log messages and debug statements for easier troubleshooting at higher log levels.
  • Avoids unnecessary searches for OKE nodes in instance pools.
  • Pulls in recent commit 1a7e964, which may actually address at least part of the root cause.
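
The first bullet can be illustrated with a minimal Go sketch. It is not the provider's actual code: `instanceRef`, `poolCache`, and `find` are hypothetical names, and the methods take a plain provider ID string rather than the node object the real cloud-provider interface uses. The point is only that both lookups go through one shared search over one list, so they cannot disagree about a newly scaled node.

```go
package main

import "fmt"

// instanceRef and poolCache are hypothetical stand-ins for illustration;
// they are not the OCI provider's real types.
type instanceRef struct {
	ID   string // instance identifier, assumed to match the node's provider ID
	Pool string // identifier of the owning instance pool
}

type poolCache struct {
	instances []instanceRef // single list that both lookups read from
}

// find is the one search routine shared by both methods below.
func (c *poolCache) find(providerID string) (instanceRef, bool) {
	for _, in := range c.instances {
		if in.ID == providerID {
			return in, true
		}
	}
	return instanceRef{}, false
}

// HasInstance reports whether the provider ID is backed by a known instance.
func (c *poolCache) HasInstance(providerID string) bool {
	_, ok := c.find(providerID)
	return ok
}

// NodeGroupForNode returns the owning pool, or "" when the instance is unknown.
func (c *poolCache) NodeGroupForNode(providerID string) string {
	in, _ := c.find(providerID)
	return in.Pool
}

func main() {
	c := &poolCache{instances: []instanceRef{{ID: "instance-a", Pool: "pool-1"}}}
	for _, id := range []string{"instance-a", "instance-b"} {
		fmt.Printf("%s: hasInstance=%v nodeGroup=%q\n", id, c.HasInstance(id), c.NodeGroupForNode(id))
	}
}
```

With this shape, HasInstance returning true implies NodeGroupForNode returns a non-empty pool for the same ID, and vice versa.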

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

I am the original author of the provider code in question and listed in the OWNERS file.

Does this PR introduce a user-facing change?

Improved logging and other incremental improvements to the OCI provider.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot added the release-note, kind/bug, kind/cleanup, cncf-cla: yes, and do-not-merge/needs-area labels on Mar 18, 2026
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jlamillan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the area/cluster-autoscaler, area/provider/oci, approved, and size/L labels and removed the do-not-merge/needs-area label on Mar 18, 2026
@jlamillan
Contributor Author

/retest

jlamillan and others added 4 commits April 9, 2026 15:18
This fixes an issue where existing nodes were categorized as upcoming. When the cluster state was updated and HasInstance was called, this function was invoked to check the instance state. The OCI API for instance pools actually returns "Running" (instead of the expected "RUNNING"), which caused the instance to be flagged as 'Deleted' in the readiness state. When newNodes was then calculated, a running node belonging to the instance pool was counted as deleted and not registered, so the upcoming node count was incorrectly non-zero.
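
The case mismatch described in this commit message can be sketched in Go as follows. This is an illustration only, not the code from the commit: `lifecycleRunning` and `isRunning` are hypothetical names. `strings.EqualFold` from the standard library compares strings case-insensitively, so "Running" and "RUNNING" are treated as the same state.

```go
package main

import (
	"fmt"
	"strings"
)

// lifecycleRunning is the state string the caller expects.
// The constant name is hypothetical, chosen for this sketch.
const lifecycleRunning = "RUNNING"

// isRunning reports whether state denotes a running instance,
// ignoring differences in letter case.
func isRunning(state string) bool {
	return strings.EqualFold(state, lifecycleRunning)
}

func main() {
	for _, s := range []string{"RUNNING", "Running", "Terminated"} {
		// The strict column shows the exact comparison that misses "Running".
		fmt.Printf("%-10s strict=%v ignoreCase=%v\n", s, s == lifecycleRunning, isRunning(s))
	}
}
```

A strict `==` comparison rejects "Running", which is the kind of mismatch that would let a live instance fall through to the 'Deleted' branch.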
@jlamillan jlamillan force-pushed the jlamillan/cluster-autoscaler-release-1.35 branch from f5cbbbc to 0a0f532 Compare April 9, 2026 22:18