
KEP-2021: Support scaling HPA to/from zero pods for object/external metrics #2022

Open · wants to merge 5 commits into base: master
Conversation

@johanneswuerbach commented Sep 26, 2020

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Sep 26, 2020
@k8s-ci-robot (Contributor)

Welcome @johanneswuerbach!

It looks like this is your first PR to kubernetes/enhancements 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/enhancements has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Sep 26, 2020
@k8s-ci-robot (Contributor)

Hi @johanneswuerbach. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. sig/autoscaling Categorizes an issue or PR as relevant to SIG Autoscaling. labels Sep 26, 2020
@johanneswuerbach changed the title from "Initial KEP-2021 draft" to "KEP: Support scaling HPA to/from zero pods for object/external metrics" on Sep 26, 2020
@kikisdeliveryservice (Member) left a comment

Hi @johanneswuerbach

Enhancements Lead here, some notes on the current PR:

The dir structure should be: keps/sig-autoscaling/2021-scale-from-zero/ (Note we put the issue number in the title now)

The current file 20200926-scale-from-zero.md should be renamed README.md and a kep.yaml should also be added.

More detail can be found here: https://github.com/kubernetes/enhancements/tree/master/keps/NNNN-kep-template

Best,
Kirsten

@johanneswuerbach (Author)

@kikisdeliveryservice thank you, folder structure fixed.

@johanneswuerbach (Author) commented Sep 30, 2020

/assign @gjtempleton

I was hoping to get the current state documented first, before we start talking about how we could move towards beta. Does that make sense?

@gjtempleton (Member)

Hey @johanneswuerbach

I think that's a good plan for now.

@jeffreybrowning

@johanneswuerbach @gjtempleton how's this going so far? Can I be of any help?

@johanneswuerbach (Author)

@jeffreybrowning yes. I assumed we could merge this KEP as-is to document the current state and then iterate on it towards beta. I'll try to present it at the next sig-autoscaling meeting to get some input and discuss next steps, but if you have any input I'm happy to incorporate it now.

@jeffreybrowning

@johanneswuerbach missed the autoscaling meeting today -- do you have clarity on next steps?

@johanneswuerbach (Author)

Me too. I assumed those were bi-weekly, or are the meetings on-demand?

@jeffreybrowning

In all honesty, it would have been my first one -- the work you started on this feature for beta has encouraged me to get involved and help you push this through.

It will really help when using HPA with an async job queue to scale down to 0 workers when not processing tasks.

@gjtempleton (Member)

Hey, the meetings are held weekly at 14:00 UTC; you can see more info, including a link to the agenda if you want to raise it, here.

I've raised this at a previous meeting, so the community's already aware this work has started, but it would be good to have some more in-depth discussion.

@johanneswuerbach (Author)

Thanks, I got confused by the mention of biweekly here https://github.com/kubernetes/community/tree/master/sig-autoscaling#meetings and assumed the calendar invite was wrong. Will ask up front next time :-)

@jeffreybrowning

Pinging here. How's this enhancement coming along? What are the next steps?

@jeffreybrowning

Pinging back before the holidays hit.

What are the next steps?

@johanneswuerbach (Author)

@adrianmoisey not actively, I've tried to get this KEP reviewed, but there hasn't been much activity. Happy to help with the implementation once there is movement here, though.

@adrianmoisey (Member)

> @adrianmoisey not actively, I've tried to get this KEP reviewed, but there hasn't been much activity. Happy to help with the implementation once there is movement here, though.

Well... I'm not an approver, but I'm happy to help push this as much as I can.
Would it be possible to update the PR a bit, since it's quite out of date?

Something I'm not 100% sure about is that the feature seems to already exist, despite the KEP PR not being merged. But I'll do some digging and figure out the state of the world.

@adrianmoisey (Member)

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 16, 2025
@adrianmoisey (Member)

@johanneswuerbach can you make the suggested changes? Then I'll ping some folks to try to un-stick this KEP.

@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: johanneswuerbach, josephburnett, mwielgus
Once this PR has been reviewed and has the lgtm label, please ask for approval from johnbelamaric. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@johanneswuerbach (Author)

@adrianmoisey thanks, updated.

@adrianmoisey (Member)

@johanneswuerbach Sorry for the runaround. It's taking me a bit of time to get my head around where this KEP is in terms of process.

So... it's already implemented as alpha, and we need to promote it to beta.
There are a few sections in the KEP that are marked as required for beta, so do you mind filling those in?

Again, sorry for the runaround!

And just to summarise what I think needs to happen and the next steps:

  1. Update KEP to beta
  2. Get approval from sig-autoscaling chair
  3. Get approval for PRR

@johanneswuerbach (Author)

@adrianmoisey I've updated the KEP template; are you talking about any specific section? It would be great to reach out to sig-autoscaling again prior to iterating on this, to understand exactly what is missing at this point.

@adrianmoisey (Member)

From my standpoint it seems fine. @gjtempleton will also take a look

@gjtempleton (Member) left a comment

Mostly nits and typos, but some areas where we need to add some detail.

## Summary

[Horizontal Pod Autoscaler][] (HPA) automatically scales the number of pods in any resource which supports the `scale` subresource based on observed CPU utilization
(or, with custom metrics support, on some other application-provided metrics) from one to many replicas. This proposal adds support for scaling from zero to many replicas and back to zero for object and external metrics.
Member:

I'd like this to be a bit clearer on why users would want to do this, and on the limitations of these metrics.

Author:

Added, better?
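
For context, a minimal sketch of the kind of HPA this proposal enables, assuming the `HPAScaleToZero` feature gate is enabled on the kube-apiserver and that an external metrics adapter exposes a hypothetical `queue_messages_ready` metric; the manifest itself uses only standard `autoscaling/v2` fields:

```yaml
# Sketch only: assumes the HPAScaleToZero feature gate is enabled on the
# kube-apiserver and that an external metrics adapter serves a (hypothetical)
# queue_messages_ready metric for this workload's queue.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: queue-worker
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: queue-worker
  minReplicas: 0        # only permitted for object/external metrics, and only with the gate enabled
  maxReplicas: 10
  metrics:
    - type: External
      external:
        metric:
          name: queue_messages_ready
          selector:
            matchLabels:
              queue: work-items
        target:
          type: AverageValue
          averageValue: "30"
```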

Comment on lines 195 to 196
In cases of a frequently idle queue or a less latency-sensitive workload, there is no need to run one replica at all times; instead, you want to dynamically scale
to zero replicas, especially if those replicas have high resource requests. If replicas are scaled to 0, HPA also needs the ability to scale up once messages are available.
Member:

Worth making extra clear the cost/energy-saving potential brought about by the increasing use of GPU workloads?

Author:

Added as well.

Not required until feature graduated to beta.
- Testing: Are there any tests for failure mode? If not, describe why.
-->

Member:

What about the failure mode (albeit the same as currently) where the controller is unable to fetch the relevant object or external metrics being used for scale from zero?

Author:

Done.

-->

###### What steps should be taken if SLOs are not being met to determine the problem?

Member:

As above.

Author:

Done.

@johanneswuerbach (Author)

Thanks for the review @gjtempleton, I'll look into your questions by the end of the week.


- [x] Feature gate (also fill in values in `kep.yaml`)
- Feature gate name: `HPAScaleToZero`
- Components depending on the feature gate: `kube-apiserver`
Member:

kube-controller-manager? Where does HPA controller live?

Author:

Currently the `HPAScaleToZero` gate is only required on the kube-apiserver.
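
As a concrete illustration of the reply above, a hedged sketch of enabling the gate in a kubeadm-managed cluster (kubeadm is an assumption here; other distributions pass the flag to the kube-apiserver differently):

```yaml
# kubeadm ClusterConfiguration excerpt (sketch only): HPAScaleToZero needs to be
# enabled on the kube-apiserver, where it gates validation of minReplicas: 0.
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
apiServer:
  extraArgs:
    feature-gates: "HPAScaleToZero=true"
```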

- [x] Feature gate (also fill in values in `kep.yaml`)
- Feature gate name: `HPAScaleToZero`
- Components depending on the feature gate: `kube-apiserver`
- [x] Other
Member:

I don't think you need to fill in this section; the feature gate is sufficient.

-->


Yes. To downgrade the cluster to a version that does not support the scale-to-zero feature:
Member:

or if you turn off the feature gate?


###### What happens if we reenable the feature if it was previously rolled back?

Nothing, the feature can be re-enabled without problems.
Member:

for example, will "stuck at 0" workloads start getting scaled again?

Author:

Added.

will rollout across nodes.
-->

There are no expected side effects when the rollout fails, as the new `enableScaleToZero` flag should only be enabled once the version upgrade has completed and should be disabled before attempting a rollback.
Member:

If the new code in the hpa controller crashes, will it block other HPAs that don't use the feature?

Author:

I'm not entirely certain about the details of panic handling in k8s controllers, but the HPA controller essentially implements a queue with multiple workers, so if one of those workers consistently crashes, other HPAs would be affected as well, unless the default k8s queue implementation somehow protects against this.

are missing a bunch of machinery and tooling and can't do that now.
-->

Not yet, as no implementation is available.
Member:

don't you have an alpha implementation?

logs or events for this purpose.
-->

The feature is used if workloads are scaled to zero by the autoscaling controller.
Member:

If you have 50,000 clusters, a one-liner is not really helpful. Metrics are best. Can kube-state-metrics help?

If it's not reasonable/practical to have a metric, then other means can be OK.

implementation difficulties, etc.).
-->

No, with regard to this KEP.
Member:

see comments above...

Would it be useful to track how often scaling to/from zero happens?

Author:

The number of replicas for each workload that can be targeted by an HPA is already a metric, so tracking how often a workload is scaled to 0 works similarly to observing scaling activity today. Is this sufficient?

@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Feb 13, 2025
@johanneswuerbach (Author)

Thanks for the feedback, pushed some updates.

Comment on lines +258 to +264
Currently, disabling HPA is possible by manually setting the scaled resource to `replicas: 0`. This works because the HPA could never reach this state on its own.
As `replicas: 0` is now a possible state when using `minReplicas: 0`, it can no longer be used to differentiate between manually disabled and automatically scaled to zero.

Additionally, the `replicas: 0` state is problematic because updating an HPA object's `minReplicas` from `0` to `1` behaves differently depending on the current state: if `replicas` was `0` during the update, HPA
will be disabled for the resource; if it was `> 0`, HPA will continue with the new `minReplicas` value.

To resolve this issue, the KEP introduces an explicit `enableScaleToZero` property to enable/disable scaling from/to zero.
Member:

What if instead of having an enableScaleToZero property, we allow HPAs to scale to 0 if they are for object or external metrics, and introduce a disabled flag, allowing someone to disable the HPA completely?

In the world with enableScaleToZero enabled, how would someone disable that HPA?

@tcolgate commented Feb 15, 2025:

> What if instead of having an enableScaleToZero property, we allow HPAs to scale to 0 if they are for object or external metrics, and introduce a disabled flag, allowing someone to disable the HPA completely?
>
> In the world with enableScaleToZero enabled, how would someone disable that HPA?

Is it even true that 0 disables? Last time I tested, 0 was rejected as invalid; there was no option for disabling (I had to delete HPAs to "temporarily disable autoscaling").

(I'm not in a position to test this right now).

FWIW, I think an explicit "allow scaling to zero" flag is a better option anyway. And an explicit disable flag would be very useful regardless of scale to zero.

Member:

> Is it even true that 0 disables? Last time I tested, 0 was rejected as invalid; there was no option for disabling (I had to delete HPAs to "temporarily disable autoscaling").

On which resource specifically?

The HPA is disabled when the scaled resource's replicas is set to zero, see:

> FWIW, I think an explicit "allow scaling to zero" flag is a better option anyway.

Why so? Isn't setting minReplicas: 0 an explicit way to say "I'm OK with scaling down to zero pods"?
We can also safeguard it by only allowing object or external metrics.
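
To keep the discussion concrete, a sketch of how the proposed opt-in might look on an HPA object; the `enableScaleToZero` name comes from the KEP draft, but its exact placement and semantics are what this thread is debating, so treat it as illustrative only:

```yaml
# Hypothetical shape of the proposal under discussion, not a merged API.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: queue-worker
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: queue-worker
  minReplicas: 0
  maxReplicas: 10
  enableScaleToZero: true   # proposed explicit opt-in to scaling to/from zero, per the KEP draft
  metrics:
    - type: External
      external:
        metric:
          name: queue_messages_ready   # hypothetical metric name, as in the earlier sketch
        target:
          type: AverageValue
          averageValue: "30"
```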

replaces:

# The target maturity stage in the current dev cycle for this KEP.
stage: beta
Member:

I think this may need to stay as alpha, since enableScaleToZero is being introduced.
See: https://github.com/kubernetes/community/blob/ecfd8bbf586d7bcd9767171c4055f2cf6502bb85/contributors/devel/sig-architecture/feature-gates.md?plain=1#L80-L81

And:
https://github.com/kubernetes/community/blob/ecfd8bbf586d7bcd9767171c4055f2cf6502bb85/contributors/devel/sig-architecture/feature-gates.md?plain=1#L137-L150

The scenario I'm worried about is that someone uses enableScaleToZero in 1.33 (I know the feature missed 1.33, this is just an example) when the feature gate is set to beta, and enabled by default.

They then downgrade to 1.32, and now the API server fails on HPA resources with enableScaleToZero set.

If enableScaleToZero had been added in 1.32 with the alpha feature gate defaulted to off, this wouldn't have been a problem.

Labels: cncf-cla: yes · kind/kep · ok-to-test · sig/autoscaling · size/XXL