feat: support scaling direction-aware cooldown for task auto-scalers #19286

Merged
jtuglu1 merged 4 commits into apache:master from jtuglu1:support-direction-aware-cooldown-for-autoscaler
Apr 23, 2026

Contributor

@jtuglu1 jtuglu1 commented Apr 9, 2026

Description

Adds support for configuring a separate cooldown for each scaling direction. Both scaling actions temporarily disrupt ingestion, but scaling down can be more disruptive than scaling up, because it leaves fewer resources to recover from lag than were available before the change. To allow aggressive scale-up alongside a more conservative scale-down, this PR adds a cooldown period configuration for each direction. The cooldown for a given scaling direction is resolved in the following order, using the first value that is set:

  1. minScaleUpDelay/minScaleDownDelay
  2. minTriggerScaleActionFrequencyMillis (marked as deprecated)
  3. The default of 600000.
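
The fallback order above can be sketched as a small resolver. This is only an illustration, not the code added by this PR: the class name `CooldownResolver`, the `Direction` enum, and the use of nullable `Long` values are assumptions, and all values (including the default of 600000) are assumed to be milliseconds.

```java
/** Hypothetical helper illustrating the fallback order; not the class added by this PR. */
final class CooldownResolver {
    /** Default cooldown from the PR description (assumed to be milliseconds). */
    static final long DEFAULT_COOLDOWN_MILLIS = 600_000L;

    /** Hypothetical enum for the direction of a scale action. */
    enum Direction { SCALE_UP, SCALE_DOWN }

    /**
     * Resolves the cooldown for one direction. A null argument means
     * "not configured" (an assumption about how unset values are represented).
     */
    static long resolve(
            Direction direction,
            Long minScaleUpDelay,
            Long minScaleDownDelay,
            Long minTriggerScaleActionFrequencyMillis) {
        // 1. The direction-specific setting wins when present.
        Long directional = direction == Direction.SCALE_UP ? minScaleUpDelay : minScaleDownDelay;
        if (directional != null) {
            return directional;
        }
        // 2. Otherwise fall back to the deprecated shared setting.
        if (minTriggerScaleActionFrequencyMillis != null) {
            return minTriggerScaleActionFrequencyMillis;
        }
        // 3. Otherwise use the default.
        return DEFAULT_COOLDOWN_MILLIS;
    }
}
```

Under these assumptions, a supervisor that sets only `minScaleDownDelay` would use that value for scale-down, while scale-up would still fall back to `minTriggerScaleActionFrequencyMillis` or the default.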

This also does the following:

  1. Cleans up the core scaling logic to be clearer and more readable
  2. Adds a custom ScaleActionSupplier which makes the contract clear for auto-scalers that want to implement it.
  3. Fixes a few notification bugs, namely:
  • Metrics with skip reasons were emitted/logged irrespective of whether a scale action was actually planned (e.g. the auto-scaler function returns a value != -1, but that value is equal to the current task count). Now, a required task count metric is emitted with a specific skip reason.
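
One way to picture the distinction described in the last bullet is a decision function that separates "nothing requested", "already at the requested count", "blocked by cooldown", and an actual scale action. This is a rough sketch only: `ScaleDecisionSketch`, the `Decision` enum, and every parameter name are hypothetical, and reading -1 as "no scale requested" is an interpretation of the description, not confirmed API behavior.

```java
/** Hypothetical sketch of classifying an auto-scaler result; not Druid's actual logic. */
final class ScaleDecisionSketch {
    /** Hypothetical outcomes; each skip outcome could map to a distinct skip reason. */
    enum Decision { NO_ACTION, SKIP_ALREADY_AT_TARGET, SKIP_COOLDOWN, SCALE_UP, SCALE_DOWN }

    static Decision decide(
            int currentTaskCount,
            int desiredTaskCount,
            long millisSinceLastScale,
            long scaleUpCooldownMillis,
            long scaleDownCooldownMillis) {
        // Assumption: -1 means the auto-scaler did not request any change.
        if (desiredTaskCount == -1) {
            return Decision.NO_ACTION;
        }
        // A value equal to the current count is not a real scale action.
        if (desiredTaskCount == currentTaskCount) {
            return Decision.SKIP_ALREADY_AT_TARGET;
        }
        // Pick the cooldown that matches the direction of the requested change.
        boolean scalingUp = desiredTaskCount > currentTaskCount;
        long cooldownMillis = scalingUp ? scaleUpCooldownMillis : scaleDownCooldownMillis;
        if (millisSinceLastScale < cooldownMillis) {
            return Decision.SKIP_COOLDOWN;
        }
        return scalingUp ? Decision.SCALE_UP : Decision.SCALE_DOWN;
    }
}
```

In this sketch, a short scale-up cooldown combined with a long scale-down cooldown lets the same elapsed time permit a scale-up while still blocking a scale-down.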

Release note

Adds support for configuring a separate cooldown per scaling direction for streaming task auto-scalers.


This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@jtuglu1 jtuglu1 force-pushed the support-direction-aware-cooldown-for-autoscaler branch from c72942f to fc9c5cb Compare April 9, 2026 18:16
@jtuglu1 jtuglu1 force-pushed the support-direction-aware-cooldown-for-autoscaler branch 2 times, most recently from 52e327f to 164f891 Compare April 9, 2026 19:24
@jtuglu1 jtuglu1 marked this pull request as ready for review April 9, 2026 19:30
@jtuglu1 jtuglu1 requested a review from Fly-Style April 9, 2026 20:19
@jtuglu1 jtuglu1 force-pushed the support-direction-aware-cooldown-for-autoscaler branch from 164f891 to 7fd9d61 Compare April 10, 2026 00:38
@jtuglu1 jtuglu1 force-pushed the support-direction-aware-cooldown-for-autoscaler branch 2 times, most recently from 6fe7359 to ad264b0 Compare April 10, 2026 02:01
@jtuglu1 jtuglu1 requested a review from kfaraz April 10, 2026 02:04
@jtuglu1 jtuglu1 force-pushed the support-direction-aware-cooldown-for-autoscaler branch from ad264b0 to 757cbde Compare April 10, 2026 02:06
@jtuglu1 jtuglu1 marked this pull request as draft April 10, 2026 03:42
@jtuglu1 jtuglu1 force-pushed the support-direction-aware-cooldown-for-autoscaler branch from 757cbde to 96f9f68 Compare April 10, 2026 16:44
@jtuglu1 jtuglu1 requested a review from gianm April 10, 2026 16:52
@jtuglu1 jtuglu1 force-pushed the support-direction-aware-cooldown-for-autoscaler branch from 96f9f68 to 8f4bfc5 Compare April 10, 2026 17:23
@jtuglu1 jtuglu1 marked this pull request as ready for review April 10, 2026 23:06
@jtuglu1 jtuglu1 force-pushed the support-direction-aware-cooldown-for-autoscaler branch from 8f4bfc5 to cadba38 Compare April 10, 2026 23:20
Contributor

kfaraz commented Apr 11, 2026

@jtuglu1, the cost-based auto-scaler already uses a minScaleDownDelay config for a similar purpose. I feel we should simply expose that in AutoScalerConfig instead of adding two new min trigger fields. (I never liked the name minTriggerScaleActionFrequencyMillis, as it's verbose and still ambiguous.)

The functionality of the minScaleDownDelay may be modified as required by this PR.

cc: @Fly-Style

Contributor Author

jtuglu1 commented Apr 11, 2026

Sure, I always want to reduce config bloat. However, I'd prefer to settle on a conventional naming scheme if possible (e.g. minScaleUpDelay) rather than keep kicking the ambiguous can that is minTriggerScaleActionFrequencyMillis down the road. This way, we can deprecate that config and, in future releases, treat it as an alias for both new configs when they are unspecified.

@jtuglu1 jtuglu1 force-pushed the support-direction-aware-cooldown-for-autoscaler branch 2 times, most recently from c48f082 to 752b235 Compare April 13, 2026 17:29
@jtuglu1 jtuglu1 force-pushed the support-direction-aware-cooldown-for-autoscaler branch 3 times, most recently from 6d68660 to a9d469f Compare April 13, 2026 19:50
@jtuglu1 jtuglu1 requested a review from abhishekrb19 April 13, 2026 21:38
Contributor

@Fly-Style Fly-Style left a comment

Thank you for your contribution! The patch is heading in a nice direction; with some polishing, we will be glad to merge it.

@jtuglu1 jtuglu1 force-pushed the support-direction-aware-cooldown-for-autoscaler branch 3 times, most recently from 008d2f6 to 213d73d Compare April 16, 2026 05:45
@jtuglu1 jtuglu1 requested a review from Fly-Style April 16, 2026 05:51
@jtuglu1 jtuglu1 force-pushed the support-direction-aware-cooldown-for-autoscaler branch 3 times, most recently from 4db2965 to 68baa70 Compare April 21, 2026 08:19
@jtuglu1 jtuglu1 marked this pull request as draft April 21, 2026 08:25
@jtuglu1 jtuglu1 force-pushed the support-direction-aware-cooldown-for-autoscaler branch from 68baa70 to 50e1c9b Compare April 22, 2026 08:09
@jtuglu1 jtuglu1 force-pushed the support-direction-aware-cooldown-for-autoscaler branch from 1ff1d4b to 21579d8 Compare April 22, 2026 21:36
@jtuglu1 jtuglu1 marked this pull request as ready for review April 22, 2026 22:43
@jtuglu1 jtuglu1 requested a review from Fly-Style April 23, 2026 06:11
Contributor

@Fly-Style Fly-Style left a comment

Really like the improvement: clear differentiation between scale-up and scale-down, while preserving the old behaviour as a fallback. LGTM!

@jtuglu1 jtuglu1 force-pushed the support-direction-aware-cooldown-for-autoscaler branch from ee8ccc6 to 612e20f Compare April 23, 2026 16:02
@jtuglu1 jtuglu1 merged commit 58693ed into apache:master Apr 23, 2026
61 of 63 checks passed
@github-actions github-actions Bot added this to the 38.0.0 milestone Apr 23, 2026
5 participants