Fix task queue rate-limit for invalid tasks #7590

Merged
merged 12 commits into from
Apr 11, 2025


@carlydf (Contributor) commented Apr 8, 2025

What changed?

Implement RecycleToken in MultiRateLimiter and add functional tests.

Why?

ClockedRateLimiter is the base rate limiter for many rate limiters in our system, so I made the mistake of assuming that implementing RecycleToken there and then calling ClockedRateLimiter.RecycleToken from the wrapper rate limiters (such as MultiRateLimiter and DynamicRateLimiter) would "just work."

However, the RecycleToken implementation has a limitation: another process must already be waiting on a token (via ClockedRateLimiter.Wait) at the time RecycleToken is called.

DynamicRateLimiter.Wait calls ClockedRateLimiter.Wait, which means that DynamicRateLimiter.RecycleToken can simply call ClockedRateLimiter.RecycleToken.

However, MultiRateLimiter.Wait does not call the Wait method of its sub-rate-limiters; instead it implements a custom Wait method using the Reserve methods of its sub-rate-limiters. This means Wait is never called on the sub-rate-limiters, so nothing is ever waiting there to receive the recycled token. Giving MultiRateLimiter a custom RecycleToken implementation that pairs with the custom MultiRateLimiter.Wait solves this problem.

How did you test it?

Tested in a test server using sample workflows, and with functional tests for old and new matcher.
Note that the new matcher did not have this bug, because it does not use MultiRateLimiter for task queue rate limiting.

Potential risks

Documentation

Is hotfix candidate?

It fixes a bug, but the bug only occurs when many workflows that have backlogged tasks become invalid (i.e., via termination or deletion).

@carlydf carlydf requested a review from a team as a code owner April 8, 2025 02:01
@carlydf carlydf marked this pull request as draft April 8, 2025 02:01
@carlydf carlydf marked this pull request as ready for review April 8, 2025 04:46
@carlydf carlydf merged commit 5a1a3da into main Apr 11, 2025
50 checks passed
@carlydf carlydf deleted the cdf/rate-limit-recycle-fix branch April 11, 2025 19:10
3 participants