feat: add concurrency limits and refresh jitter to scheduler (#2150)
* feat: add concurrency limits and refresh jitter to scheduler
Add global and per-tenant concurrency semaphores to prevent DB
connection pool exhaustion when schedules align, and stop a single
noisy tenant from consuming all execution slots.
Add configurable jitter to schedule refresh interval to prevent
thundering herd across replicas.
Execution order: global semaphore -> tenant context -> per-tenant
semaphore -> distributed lock -> execute.
New config fields with defaults:
- MaxConcurrentExecutions: 20
- MaxConcurrentPerTenant: 3
- RefreshJitterMax: 0 (opt-in)
* fix: address review feedback on scheduler resilience
- Fix doc comment: RefreshJitterMax default is 0 (disabled), not 10s
- Use context-aware select instead of time.Sleep for jitter to avoid
blocking shutdown
- Move tenant context setup before global semaphore so skipped
execution audit records are properly tenant-scoped
* fix: increase test timeouts for slow CI runners
The concurrency semaphore tests waited on cron ticks with 5-second
timeouts, which are too tight for resource-constrained shared runners.
Increase them to 15s for initial execution waits and 10s for condition
polling to eliminate flaky failures.
* fix: fix flaky global semaphore test and remove duplicate secondsParser
- Buffer the blocked channel to match MaxConcurrentExecutions so
executor signals are never lost to the default branch
- Increase test timeouts for CI runners (15s wait, 10s await)
- Remove duplicate secondsParser declaration (already in catchup_test.go)
- Remove unused WithCronParser from cron_test.go (only needed for catch-up)
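
The buffered-channel fix can be illustrated in isolation. In this sketch, `countSignals` is a hypothetical helper rather than code from the test suite: each worker does a non-blocking send, as an executor signalling the test might. With an unbuffered channel and no receiver ready, every send falls through to `default` and is lost. Sizing the buffer to the number of senders guarantees that each signal is kept.

```go
package main

import (
	"fmt"
	"sync"
)

// countSignals starts n workers that each attempt one non-blocking send on a
// channel with the given buffer size, then reports how many signals were kept.
func countSignals(n, buffer int) int {
	blocked := make(chan struct{}, buffer)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			select {
			case blocked <- struct{}{}:
			default:
				// No receiver is ready and the buffer is full: signal dropped.
			}
		}()
	}
	wg.Wait()
	return len(blocked)
}

func main() {
	fmt.Println("unbuffered:", countSignals(20, 0))
	fmt.Println("buffered:", countSignals(20, 20))
}
```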
---------
Co-authored-by: Ben Coombs <bjcoombs@users.noreply.github.com>