Revert "Implemented HasInstance for GCE provider" as it is causing CA E2E test failures#9470

Open
Choraden wants to merge 1 commit into kubernetes:master from Choraden:revert-9319

Conversation

@Choraden
Contributor

@Choraden Choraden commented Apr 9, 2026

What type of PR is this?

/kind failing-test

What this PR does / why we need it:

This reverts commit 785b523.

Since April 3rd, both the CA presubmits and the periodic E2E tests have been failing:
https://prow.k8s.io/job-history/gs/kubernetes-ci-logs/pr-logs/directory/pull-autoscaling-e2e-gci-gce-ca-test
https://prow.k8s.io/job-history/gs/kubernetes-ci-logs/logs/ci-kubernetes-e2e-gci-gce-autoscaling

I have reason to think this change is causing the CA E2E test failures.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-area labels Apr 9, 2026
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Choraden
Once this PR has been reviewed and has the lgtm label, please assign towca for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot requested a review from x13n April 9, 2026 12:06
@k8s-ci-robot k8s-ci-robot added area/provider/gce size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed do-not-merge/needs-area labels Apr 9, 2026
@Choraden
Contributor Author

Choraden commented Apr 9, 2026

/test all

@Choraden
Contributor Author

Choraden commented Apr 9, 2026

The test succeeded. Rerunning to gather more datapoints.
/retest pull-autoscaling-e2e-gci-gce-ca-test

@Choraden
Contributor Author

Choraden commented Apr 9, 2026

/test pull-autoscaling-e2e-gci-gce-ca-test

1 similar comment
@jackfrancis
Contributor

/test pull-autoscaling-e2e-gci-gce-ca-test

@Choraden Choraden changed the title [Test] Revert "Implemented HasInstance for GCE provider" Revert "Implemented HasInstance for GCE provider" as it is causing CA E2E test failures Apr 10, 2026
@Choraden Choraden marked this pull request as ready for review April 10, 2026 08:20
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 10, 2026
@k8s-ci-robot k8s-ci-robot requested a review from towca April 10, 2026 08:20
@Choraden
Contributor Author

Choraden commented Apr 10, 2026

@domenicbozzuto @jbtk @x13n Since April 3rd, both the CA presubmits and the periodic E2E tests have been failing (or occasionally flaking, but mostly failing):
https://prow.k8s.io/job-history/gs/kubernetes-ci-logs/pr-logs/directory/pull-autoscaling-e2e-gci-gce-ca-test
https://prow.k8s.io/job-history/gs/kubernetes-ci-logs/logs/ci-kubernetes-e2e-gci-gce-autoscaling
By bisecting the git log, I was able to narrow the likely culprit down to #9319.
To verify, I reverted that change in this PR. Notice that 3 consecutive E2E test runs then succeeded.
It looks like the presubmit on the original PR passed by luck, which is how the change was able to merge.
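For reference, a bisection like this can be automated with `git bisect run`. The following is a self-contained toy session in a throwaway repo with a synthetic "regression" in the fourth commit; it is purely illustrative and has nothing to do with the actual autoscaler history:

```shell
#!/bin/sh
# Build a throwaway repo with 5 commits; pretend commit 4 introduced a failure.
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q
git config user.email bisect@example.com
git config user.name bisect
for i in 1 2 3 4 5; do
  echo "$i" > state
  git add state
  git commit -qm "commit $i"
done
# bad = HEAD, good = the root commit; the run script exits non-zero
# (i.e. marks the commit bad) once the tracked value reaches 4.
git bisect start HEAD "$(git rev-list --max-parents=0 HEAD)"
git bisect run sh -c 'test "$(cat state)" -lt 4'
# After the run, refs/bisect/bad points at the first bad commit.
git show -s --format=%s refs/bisect/bad
```

Running this prints `commit 4` as the first bad commit's subject; in the real case each `git bisect run` step would instead build the autoscaler and run the failing E2E job.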

TBH I don't fully understand how it breaks CA, but in the failing test cases there are usually 3 nodes and the test expects a scale-up to 5. For some reason CA scales 3->5 and then, in the very next loop, 5->6.
See the logs:

I0409 14:09:44.709864       1 static_autoscaler.go:296] Starting main loop
I0409 14:09:44.710269       1 static_autoscaler.go:1233] Found 45 pods in the cluster: 43 scheduled, 2 unschedulable, 0 unprocessed by scheduler, 0 ignored by allowed schedulers (most likely using custom scheduler), 0 ignored due to dissallowed schedulers
W0409 14:09:44.816994       1 templates.go:510] no os defined in AUTOSCALER_ENV_VARS; using default linux
W0409 14:09:44.817038       1 templates.go:641] no os-distribution defined in AUTOSCALER_ENV_VARS; using default cos
W0409 14:09:44.817073       1 templates.go:722] no evictionHard defined in AUTOSCALER_ENV_VARS;
W0409 14:09:44.817087       1 templates.go:232] unable to get evictionHardFromKubeEnv values, continuing without it.
I0409 14:09:44.817125       1 gce_reserved.go:143] evictionHard memory tag not found, using default
I0409 14:09:44.817137       1 gce_reserved.go:163] evictionHard ephemeral storage tag not found, using default
W0409 14:09:44.817174       1 templates.go:242] could not extract kube-reserved from kubeEnv for mig "kt2-6a7d0aef-4f19-minion-group", setting allocatable to capacity.
I0409 14:09:44.817465       1 clusterstate.go:661] Node kt2-6a7d0aef-4f19-minion-group-t8mj: using maxNodeStartupTime = 15m0s
I0409 14:09:44.817488       1 clusterstate.go:661] Node kt2-6a7d0aef-4f19-minion-group-t8mj: using maxNodeStartupTime = 15m0s
I0409 14:09:44.817500       1 clusterstate.go:661] Node kt2-6a7d0aef-4f19-master: using maxNodeStartupTime = 15m0s
I0409 14:09:44.817513       1 clusterstate.go:661] Node kt2-6a7d0aef-4f19-minion-group-dqn9: using maxNodeStartupTime = 15m0s
I0409 14:09:44.817523       1 clusterstate.go:661] Node kt2-6a7d0aef-4f19-minion-group-dqn9: using maxNodeStartupTime = 15m0s
I0409 14:09:44.817534       1 clusterstate.go:661] Node kt2-6a7d0aef-4f19-minion-group-gwd4: using maxNodeStartupTime = 15m0s
I0409 14:09:44.817542       1 clusterstate.go:661] Node kt2-6a7d0aef-4f19-minion-group-gwd4: using maxNodeStartupTime = 15m0s
I0409 14:09:44.817555       1 clusterstate.go:661] Node kt2-6a7d0aef-4f19-minion-group-96l9: using maxNodeStartupTime = 15m0s
I0409 14:09:44.817564       1 clusterstate.go:661] Node kt2-6a7d0aef-4f19-minion-group-96l9: using maxNodeStartupTime = 15m0s
I0409 14:09:44.817577       1 clusterstate.go:661] Node kt2-6a7d0aef-4f19-minion-group-ktwv: using maxNodeStartupTime = 15m0s
I0409 14:09:44.817587       1 clusterstate.go:661] Node kt2-6a7d0aef-4f19-minion-group-ktwv: using maxNodeStartupTime = 15m0s
I0409 14:09:44.817892       1 filter_out_schedulable.go:65] Filtering out schedulables
I0409 14:09:44.818367       1 taint_toleration.go:130] "node had untolerated taints" logger="Filter.TaintToleration" node="kt2-6a7d0aef-4f19-master" pod="autoscaling-5580/increase-size-pod-m9x7f" untoleratedTaint={"key":"node-role.kubernetes.io/control-plane","effect":"NoSchedule"}
I0409 14:09:44.818447       1 klogx.go:87] failed to find place for autoscaling-5580/increase-size-pod-m9x7f: can't schedule pod autoscaling-5580/increase-size-pod-m9x7f: couldn't find a matching Node with passing predicates
I0409 14:09:44.819029       1 klogx.go:87] failed to find place for autoscaling-5580/increase-size-pod-wmxd9 based on similar pods scheduling
I0409 14:09:44.819053       1 filter_out_schedulable.go:122] 0 pods marked as unschedulable can be scheduled.
I0409 14:09:44.819078       1 filter_out_schedulable.go:85] No schedulable pods
I0409 14:09:44.819139       1 filter_out_daemon_sets.go:47] Filtered out 0 daemon set pods, 2 unschedulable pods left
I0409 14:09:44.819162       1 klogx.go:87] Pod autoscaling-5580/increase-size-pod-m9x7f is unschedulable
I0409 14:09:44.819169       1 klogx.go:87] Pod autoscaling-5580/increase-size-pod-wmxd9 is unschedulable
I0409 14:09:44.819384       1 orchestrator.go:112] Upcoming 0 nodes
I0409 14:09:44.821455       1 waste.go:56] Expanding Node Group https://www.googleapis.com/compute/v1/projects/gke-hgrochowski-hosted-master/zones/us-central1-b/instanceGroups/kt2-6a7d0aef-4f19-minion-group would waste 100.00% CPU, 100.00% Memory, 100.00% Blended
I0409 14:09:44.821501       1 orchestrator.go:189] Best option to resize: https://www.googleapis.com/compute/v1/projects/gke-hgrochowski-hosted-master/zones/us-central1-b/instanceGroups/kt2-6a7d0aef-4f19-minion-group
I0409 14:09:44.821576       1 orchestrator.go:193] Estimated 2 nodes needed in https://www.googleapis.com/compute/v1/projects/gke-hgrochowski-hosted-master/zones/us-central1-b/instanceGroups/kt2-6a7d0aef-4f19-minion-group
I0409 14:09:44.821702       1 orchestrator.go:265] Final scale-up plan: [{https://www.googleapis.com/compute/v1/projects/gke-hgrochowski-hosted-master/zones/us-central1-b/instanceGroups/kt2-6a7d0aef-4f19-minion-group 3->5 (max: 6)}]
I0409 14:09:44.821999       1 executor.go:164] Scale-up: setting group https://www.googleapis.com/compute/v1/projects/gke-hgrochowski-hosted-master/zones/us-central1-b/instanceGroups/kt2-6a7d0aef-4f19-minion-group size to 5
I0409 14:09:44.822447       1 event_sink_logging_wrapper.go:48] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"kube-system", Name:"cluster-autoscaler-status", UID:"abfa659d-6408-4151-acf5-428376bb6cbe", APIVersion:"v1", ResourceVersion:"570473", FieldPath:""}): type: 'Normal' reason: 'ScaledUpGroup' Scale-up: setting group https://www.googleapis.com/compute/v1/projects/gke-hgrochowski-hosted-master/zones/us-central1-b/instanceGroups/kt2-6a7d0aef-4f19-minion-group size to 5 instead of 3 (max: 6)
I0409 14:09:44.822881       1 mig_info_provider.go:335] Regenerating MIG instances cache for gke-hgrochowski-hosted-master/us-central1-b/kt2-6a7d0aef-4f19-minion-group
I0409 14:09:45.290579       1 autoscaling_gce_client.go:339] Waiting for operation compute.instanceGroupManagers.createInstances/operation-1775743785055-64f0791853399-70244274-ffa28e68 (gke-hgrochowski-hosted-master/us-central1-b)
I0409 14:09:45.726417       1 autoscaling_gce_client.go:346] Operation compute.instanceGroupManagers.createInstances/operation-1775743785055-64f0791853399-70244274-ffa28e68 (gke-hgrochowski-hosted-master/us-central1-b) status: DONE
I0409 14:09:45.727208       1 event_sink_logging_wrapper.go:48] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"kube-system", Name:"cluster-autoscaler-status", UID:"abfa659d-6408-4151-acf5-428376bb6cbe", APIVersion:"v1", ResourceVersion:"570473", FieldPath:""}): type: 'Normal' reason: 'ScaledUpGroup' Scale-up: group https://www.googleapis.com/compute/v1/projects/gke-hgrochowski-hosted-master/zones/us-central1-b/instanceGroups/kt2-6a7d0aef-4f19-minion-group size set to 5 instead of 3 (max: 6)
I0409 14:09:45.862217       1 eventing_scale_up_processor.go:47] Skipping event processing for unschedulable pods since there is a ScaleUp attempt this loop
I0409 14:09:45.862756       1 static_autoscaler.go:631] Calculating unneeded nodes
I0409 14:09:45.863793       1 eligibility.go:113] Skipping kt2-6a7d0aef-4f19-minion-group-gwd4 from delete consideration - the node is currently being deleted
I0409 14:09:45.863862       1 eligibility.go:113] Skipping kt2-6a7d0aef-4f19-minion-group-ktwv from delete consideration - the node is currently being deleted
I0409 14:09:45.864178       1 klogx.go:87] Node kt2-6a7d0aef-4f19-minion-group-dqn9 - cpu requested is 9.6% of allocatable
I0409 14:09:45.864704       1 klogx.go:87] Node kt2-6a7d0aef-4f19-minion-group-96l9 - cpu requested is 9.6% of allocatable
I0409 14:09:45.864734       1 eligibility.go:104] Scale-down calculation: ignoring 1 nodes unremovable in the last 1m0s
I0409 14:09:45.864850       1 cluster.go:146] Simulating node kt2-6a7d0aef-4f19-minion-group-dqn9 removal
I0409 14:09:45.865353       1 event_sink_logging_wrapper.go:48] Event(v1.ObjectReference{Kind:"Pod", Namespace:"autoscaling-5580", Name:"increase-size-pod-m9x7f", UID:"08f20056-3184-4c85-a77a-56c8f222ea79", APIVersion:"v1", ResourceVersion:"570467", FieldPath:""}): type: 'Normal' reason: 'TriggeredScaleUp' pod triggered scale-up: [{https://www.googleapis.com/compute/v1/projects/gke-hgrochowski-hosted-master/zones/us-central1-b/instanceGroups/kt2-6a7d0aef-4f19-minion-group 3->5 (max: 6)}]
I0409 14:09:45.866182       1 taint_toleration.go:130] "node had untolerated taints" logger="Filter.TaintToleration" node="kt2-6a7d0aef-4f19-master" pod="autoscaling-5580/increase-size-pod-8jfmq" untoleratedTaint={"key":"node-role.kubernetes.io/control-plane","effect":"NoSchedule"}
I0409 14:09:45.869858       1 klogx.go:87] failed to find place for autoscaling-5580/increase-size-pod-8jfmq: can't schedule pod autoscaling-5580/increase-size-pod-8jfmq: couldn't find a matching Node with passing predicates
I0409 14:09:45.870816       1 cluster.go:161] Node kt2-6a7d0aef-4f19-minion-group-dqn9 is not suitable for removal: can reschedule only 0 out of 1 pods
I0409 14:09:45.871231       1 cluster.go:146] Simulating node kt2-6a7d0aef-4f19-minion-group-96l9 removal
I0409 14:09:45.873922       1 taint_toleration.go:130] "node had untolerated taints" logger="Filter.TaintToleration" node="kt2-6a7d0aef-4f19-master" pod="autoscaling-5580/increase-size-pod-qxntv" untoleratedTaint={"key":"node-role.kubernetes.io/control-plane","effect":"NoSchedule"}
I0409 14:09:45.874563       1 klogx.go:87] failed to find place for autoscaling-5580/increase-size-pod-qxntv: can't schedule pod autoscaling-5580/increase-size-pod-qxntv: couldn't find a matching Node with passing predicates
I0409 14:09:45.875002       1 cluster.go:161] Node kt2-6a7d0aef-4f19-minion-group-96l9 is not suitable for removal: can reschedule only 0 out of 1 pods
I0409 14:09:45.875234       1 planner.go:332] 2 nodes found to be unremovable in simulation, will re-check them at 2026-04-09 14:10:44.709841319 +0000 UTC m=+2712.305817519
I0409 14:09:45.876006       1 static_autoscaler.go:674] Scale down status: lastScaleUpTime=2026-04-09 14:09:44.709841319 +0000 UTC m=+2652.305817519 lastScaleDownDeleteTime=2026-04-09 14:09:36.415463385 +0000 UTC m=+2644.011439586 lastScaleDownFailTime=2026-04-09 12:25:33.2226787 +0000 UTC m=-3599.181345073 scaleDownForbidden=false scaleDownInCooldown=true
I0409 14:09:45.882519       1 event_sink_logging_wrapper.go:48] Event(v1.ObjectReference{Kind:"Pod", Namespace:"autoscaling-5580", Name:"increase-size-pod-wmxd9", UID:"55c59228-22b9-4431-bbf3-c20ffd0f7a08", APIVersion:"v1", ResourceVersion:"570470", FieldPath:""}): type: 'Normal' reason: 'TriggeredScaleUp' pod triggered scale-up: [{https://www.googleapis.com/compute/v1/projects/gke-hgrochowski-hosted-master/zones/us-central1-b/instanceGroups/kt2-6a7d0aef-4f19-minion-group 3->5 (max: 6)}]
I0409 14:09:45.913638       1 trigger.go:142] Autoscaler loop triggered immediately after a scale up
I0409 14:09:45.913809       1 static_autoscaler.go:296] Starting main loop
I0409 14:09:45.914497       1 static_autoscaler.go:1233] Found 45 pods in the cluster: 43 scheduled, 2 unschedulable, 0 unprocessed by scheduler, 0 ignored by allowed schedulers (most likely using custom scheduler), 0 ignored due to dissallowed schedulers
W0409 14:09:46.038928       1 templates.go:510] no os defined in AUTOSCALER_ENV_VARS; using default linux
W0409 14:09:46.038998       1 templates.go:641] no os-distribution defined in AUTOSCALER_ENV_VARS; using default cos
W0409 14:09:46.039045       1 templates.go:722] no evictionHard defined in AUTOSCALER_ENV_VARS;
W0409 14:09:46.039058       1 templates.go:232] unable to get evictionHardFromKubeEnv values, continuing without it.
I0409 14:09:46.039067       1 gce_reserved.go:143] evictionHard memory tag not found, using default
I0409 14:09:46.039074       1 gce_reserved.go:163] evictionHard ephemeral storage tag not found, using default
W0409 14:09:46.039717       1 templates.go:242] could not extract kube-reserved from kubeEnv for mig "kt2-6a7d0aef-4f19-minion-group", setting allocatable to capacity.
I0409 14:09:46.040078       1 clusterstate.go:661] Node kt2-6a7d0aef-4f19-minion-group-dqn9: using maxNodeStartupTime = 15m0s
I0409 14:09:46.040176       1 clusterstate.go:661] Node kt2-6a7d0aef-4f19-minion-group-dqn9: using maxNodeStartupTime = 15m0s
I0409 14:09:46.040192       1 clusterstate.go:661] Node kt2-6a7d0aef-4f19-minion-group-gwd4: using maxNodeStartupTime = 15m0s
I0409 14:09:46.040202       1 clusterstate.go:661] Node kt2-6a7d0aef-4f19-minion-group-gwd4: using maxNodeStartupTime = 15m0s
I0409 14:09:46.040219       1 clusterstate.go:661] Node kt2-6a7d0aef-4f19-minion-group-96l9: using maxNodeStartupTime = 15m0s
I0409 14:09:46.040228       1 clusterstate.go:661] Node kt2-6a7d0aef-4f19-minion-group-96l9: using maxNodeStartupTime = 15m0s
I0409 14:09:46.040241       1 clusterstate.go:661] Node kt2-6a7d0aef-4f19-minion-group-ktwv: using maxNodeStartupTime = 15m0s
I0409 14:09:46.040250       1 clusterstate.go:661] Node kt2-6a7d0aef-4f19-minion-group-ktwv: using maxNodeStartupTime = 15m0s
I0409 14:09:46.040263       1 clusterstate.go:661] Node kt2-6a7d0aef-4f19-minion-group-t8mj: using maxNodeStartupTime = 15m0s
I0409 14:09:46.040273       1 clusterstate.go:661] Node kt2-6a7d0aef-4f19-minion-group-t8mj: using maxNodeStartupTime = 15m0s
I0409 14:09:46.040291       1 clusterstate.go:661] Node kt2-6a7d0aef-4f19-master: using maxNodeStartupTime = 15m0s
I0409 14:09:46.040321       1 clusterstate.go:299] Scale up in group https://www.googleapis.com/compute/v1/projects/gke-hgrochowski-hosted-master/zones/us-central1-b/instanceGroups/kt2-6a7d0aef-4f19-minion-group finished successfully in 187.290163ms
I0409 14:09:46.040454       1 filter_out_schedulable.go:65] Filtering out schedulables
I0409 14:09:46.042209       1 taint_toleration.go:130] "node had untolerated taints" logger="Filter.TaintToleration" node="kt2-6a7d0aef-4f19-master" pod="autoscaling-5580/increase-size-pod-m9x7f" untoleratedTaint={"key":"node-role.kubernetes.io/control-plane","effect":"NoSchedule"}
I0409 14:09:46.043440       1 klogx.go:87] failed to find place for autoscaling-5580/increase-size-pod-m9x7f: can't schedule pod autoscaling-5580/increase-size-pod-m9x7f: couldn't find a matching Node with passing predicates
I0409 14:09:46.045511       1 klogx.go:87] failed to find place for autoscaling-5580/increase-size-pod-wmxd9 based on similar pods scheduling
I0409 14:09:46.045549       1 filter_out_schedulable.go:122] 0 pods marked as unschedulable can be scheduled.
I0409 14:09:46.045581       1 filter_out_schedulable.go:85] No schedulable pods
I0409 14:09:46.045741       1 filter_out_daemon_sets.go:47] Filtered out 0 daemon set pods, 2 unschedulable pods left
I0409 14:09:46.045772       1 klogx.go:87] Pod autoscaling-5580/increase-size-pod-m9x7f is unschedulable
I0409 14:09:46.048396       1 klogx.go:87] Pod autoscaling-5580/increase-size-pod-wmxd9 is unschedulable
I0409 14:09:46.048580       1 orchestrator.go:112] Upcoming 0 nodes
I0409 14:09:46.051438       1 threshold_based_limiter.go:59] Capping binpacking after exceeding threshold of 1 nodes
I0409 14:09:46.051646       1 waste.go:56] Expanding Node Group https://www.googleapis.com/compute/v1/projects/gke-hgrochowski-hosted-master/zones/us-central1-b/instanceGroups/kt2-6a7d0aef-4f19-minion-group would waste 100.00% CPU, 100.00% Memory, 100.00% Blended
I0409 14:09:46.051787       1 orchestrator.go:189] Best option to resize: https://www.googleapis.com/compute/v1/projects/gke-hgrochowski-hosted-master/zones/us-central1-b/instanceGroups/kt2-6a7d0aef-4f19-minion-group
I0409 14:09:46.051809       1 orchestrator.go:193] Estimated 1 nodes needed in https://www.googleapis.com/compute/v1/projects/gke-hgrochowski-hosted-master/zones/us-central1-b/instanceGroups/kt2-6a7d0aef-4f19-minion-group
I0409 14:09:46.051847       1 orchestrator.go:265] Final scale-up plan: [{https://www.googleapis.com/compute/v1/projects/gke-hgrochowski-hosted-master/zones/us-central1-b/instanceGroups/kt2-6a7d0aef-4f19-minion-group 5->6 (max: 6)}]
I0409 14:09:46.051878       1 executor.go:164] Scale-up: setting group https://www.googleapis.com/compute/v1/projects/gke-hgrochowski-hosted-master/zones/us-central1-b/instanceGroups/kt2-6a7d0aef-4f19-minion-group size to 6
I0409 14:09:46.052194       1 mig_info_provider.go:335] Regenerating MIG instances cache for gke-hgrochowski-hosted-master/us-central1-b/kt2-6a7d0aef-4f19-minion-group
I0409 14:09:46.052387       1 event_sink_logging_wrapper.go:48] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"kube-system", Name:"cluster-autoscaler-status", UID:"abfa659d-6408-4151-acf5-428376bb6cbe", APIVersion:"v1", ResourceVersion:"570480", FieldPath:""}): type: 'Normal' reason: 'ScaledUpGroup' Scale-up: setting group https://www.googleapis.com/compute/v1/projects/gke-hgrochowski-hosted-master/zones/us-central1-b/instanceGroups/kt2-6a7d0aef-4f19-minion-group size to 6 instead of 5 (max: 6)
I0409 14:09:46.359155       1 autoscaling_gce_client.go:339] Waiting for operation compute.instanceGroupManagers.createInstances/operation-1775743786143-64f079195cdde-9992c48e-beb50d51 (gke-hgrochowski-hosted-master/us-central1-b)
I0409 14:09:47.107997       1 autoscaling_gce_client.go:346] Operation compute.instanceGroupManagers.createInstances/operation-1775743786143-64f079195cdde-9992c48e-beb50d51 (gke-hgrochowski-hosted-master/us-central1-b) status: DONE
I0409 14:09:47.108309       1 event_sink_logging_wrapper.go:48] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"kube-system", Name:"cluster-autoscaler-status", UID:"abfa659d-6408-4151-acf5-428376bb6cbe", APIVersion:"v1", ResourceVersion:"570480", FieldPath:""}): type: 'Normal' reason: 'ScaledUpGroup' Scale-up: group https://www.googleapis.com/compute/v1/projects/gke-hgrochowski-hosted-master/zones/us-central1-b/instanceGroups/kt2-6a7d0aef-4f19-minion-group size set to 6 instead of 5 (max: 6)
I0409 14:09:47.199437       1 eventing_scale_up_processor.go:47] Skipping event processing for unschedulable pods since there is a ScaleUp attempt this loop
I0409 14:09:47.199533       1 static_autoscaler.go:631] Calculating unneeded nodes
I0409 14:09:47.199942       1 eligibility.go:113] Skipping kt2-6a7d0aef-4f19-minion-group-gwd4 from delete consideration - the node is currently being deleted
I0409 14:09:47.199971       1 eligibility.go:113] Skipping kt2-6a7d0aef-4f19-minion-group-ktwv from delete consideration - the node is currently being deleted
I0409 14:09:47.199985       1 eligibility.go:104] Scale-down calculation: ignoring 3 nodes unremovable in the last 1m0s
I0409 14:09:47.200026       1 static_autoscaler.go:674] Scale down status: lastScaleUpTime=2026-04-09 14:09:45.913791161 +0000 UTC m=+2653.509767361 lastScaleDownDeleteTime=2026-04-09 14:09:36.415463385 +0000 UTC m=+2644.011439586 lastScaleDownFailTime=2026-04-09 12:25:33.2226787 +0000 UTC m=-3599.181345073 scaleDownForbidden=false scaleDownInCooldown=true
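One quick way to confirm the double resize in any run's log is to filter for the orchestrator's plan lines. The sample lines below are copied from the log above (with the instance-group URL shortened to "..."):

```shell
#!/bin/sh
# Pull just the resize decisions out of a CA log.
set -e
cat > ca.log <<'EOF'
I0409 14:09:44.821702 1 orchestrator.go:265] Final scale-up plan: [{.../kt2-6a7d0aef-4f19-minion-group 3->5 (max: 6)}]
I0409 14:09:46.051847 1 orchestrator.go:265] Final scale-up plan: [{.../kt2-6a7d0aef-4f19-minion-group 5->6 (max: 6)}]
EOF
grep -c 'Final scale-up plan' ca.log   # 2 resize decisions, ~1.3s apart
grep -o '[0-9]->[0-9]' ca.log          # 3->5, then 5->6
```

On a healthy run of this test there should be a single plan line (3->5); the second one within the same second is the anomaly.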

@jbtk
Member

jbtk commented Apr 10, 2026

Since, from what I see in the comments, the tests passed in the end on this PR, can we run these tests 20 times with and without the revert and compare flakiness?

@Choraden
Contributor Author

Since, from what I see in the comments, the tests passed in the end on this PR, can we run these tests 20 times with and without the revert and compare flakiness?

I believe the presubmit dashboard should give us enough insight: https://prow.k8s.io/job-history/gs/kubernetes-ci-logs/pr-logs/directory/pull-autoscaling-e2e-gci-gce-ca-test

@domenicbozzuto
Contributor

👋 Sorry for the noise with this; I saw a similar pattern when I originally added my PR (comment) -- the multiple upscales looked exactly like the reason the test "shouldn't trigger additional scale-ups during processing scale-up" was skipped (`e2eskipper.Skipf("Test is flaky and disabled for now")`), and that predated the HasInstance change.

I'm fine with the revert if it's breaking a lot of tests. I can try to spend some time looking into why it's triggering multiple scale-ups and why that seems more common with this change.

@Choraden
Contributor Author

AFAIR, "shouldn't trigger additional scale-ups during processing scale-up" is a different issue. That one is related to the Kubernetes events about scale-ups, which CA emits in an unreliable way. The behavior introduced by #9319 looks new to me.

@jackfrancis
Contributor

/test pull-autoscaling-e2e-gci-gce-ca-test

@jackfrancis
Contributor

/test pull-autoscaling-e2e-gci-gce-ca-test


Labels

area/cluster-autoscaler area/provider/gce cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
