Skip to content

backlog size aware throttling #209

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

pezra
Copy link
Contributor

@pezra pezra commented Mar 20, 2025

This PR changes the retry_in calculation for concurrency throttled jobs using the :schedule strategy. The approach is to estimate when a throttled job will likely be runnable and schedule it for that time. This estimate is based on the current backlog size for the job group (estimated), the expected job runtime and the concurrency limit.

This should reduce/eliminate unnecessary delays in job processing caused by the current "ignore throttled jobs for TTL seconds" approach while retaining the benefits of the schedule strategy.

PS: I changed the appraisals file because install_if does all of the bundling work regardless of whether the condition evaluates to true or false. In fact, it fetches the packages list from gems.contribsys.com before it even checks the condition. If BUNDLE_GEMS__CONTRIBSYS__COM isn't set that fetch failed with an auth error.

install_if does all of the bundling work even if the condition is
false. That means it fetches the packages list from
gems.contribsys.com before it even checks the condition. If
BUNDLE_GEMS__CONTRIBSYS__COM isn't set it that fetch fails with an
auth error.
Copy link

codecov bot commented Mar 20, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.38%. Comparing base (9483b5a) to head (4c40c8e).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #209      +/-   ##
==========================================
+ Coverage   98.30%   98.38%   +0.07%     
==========================================
  Files          19       19              
  Lines         473      494      +21     
  Branches       79       86       +7     
==========================================
+ Hits          465      486      +21     
  Misses          8        8              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

This avoids unnecessary processing delays and smoothes the load on
sidekiq workers.
@pezra pezra force-pushed the queue-depth-aware-throttling branch from 880a11e to df35ceb Compare March 20, 2025 22:17
@pezra pezra changed the title Queue depth aware throttling backlog size aware throttling Mar 20, 2025
@pezra
Copy link
Contributor Author

pezra commented Apr 4, 2025

@ixti any chance we could get this into a release sometime soon? My team as been using is for a while now, and it solves some serious operational issues we were experiencing. I'm sure other would benefit from it too.

@ixti
Copy link
Owner

ixti commented Apr 7, 2025

Sorry, missed this one. Will handle the PR this week.

@pezra
Copy link
Contributor Author

pezra commented Apr 16, 2025

@ixti bump 😄

Copy link
Owner

@ixti ixti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somehow, I just noticed that I never pressed "submit review" :((

I'm about to release v2.0.0 with drop of Sidekiq-7 and added Sidekiq-8 support. Please rebase on top of main branch after that, and I'll make sure to get to it this week.

old_size = (old_size_str || 0).to_f
old_timestamp = (old_timestamp_str || Time.now).to_f

nonneg(old_size - jobs_lost_since(old_timestamp, job_args))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it's better to use clamp than add new method:

Suggested change
nonneg(old_size - jobs_lost_since(old_timestamp, job_args))
(old_size - jobs_lost_since(old_timestamp, job_args)).clamp(0, nil)


Sidekiq.redis { |redis| 1 == SCRIPT.call(redis, keys: keys, argv: argv) }
Sidekiq.redis { |redis| (1 == SCRIPT.call(redis, keys: keys, argv: argv)) }
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parenthesis are redundant here

Suggested change
Sidekiq.redis { |redis| (1 == SCRIPT.call(redis, keys: keys, argv: argv)) }
Sidekiq.redis { |redis| 1 == SCRIPT.call(redis, keys: keys, argv: argv) }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants