Description
This is really the same issue that I raised years back with Appsignal: appsignal/appsignal-ruby#526
The concurrent rate limiters in sidekiq-enterprise raise Sidekiq::OverLimit when a lock cannot be acquired within a certain amount of time. (See https://github.com/sidekiq/sidekiq/wiki/Ent-Rate-Limiting#concurrent for details.) Sidekiq's own middleware rescues this exception and re-enqueues the job to be run later -- unless the job has exceeded its maximum number of retries, in which case the OverLimit exception is allowed to escape the middleware.
When Sidekiq re-enqueues the job in response to an OverLimit, this isn't an error, and we'd prefer it not show up as one in newrelic. On the other hand, if the exception escapes Sidekiq's middleware (i.e., the number of retries has been exceeded), we'd like to see it.
As with appsignal, the behavior would seem to be a function of the order of sidekiq's middleware vis-a-vis newrelic's. We could reorder the middleware chain on our end, but it would be a good idea for newrelic to ensure that its middleware sits earlier in the chain than sidekiq's own.
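To illustrate why ordering matters, here is a minimal, self-contained sketch in plain Ruby. It does not use Sidekiq or newrelic_rpm: OverLimit, ReporterMiddleware, RateLimitMiddleware, and run_chain are all hypothetical stand-ins, and the real middleware in either gem may be structured differently. It only models the behavior described above, where one middleware swallows the exception and reschedules, and another records whatever exception passes through it.

```ruby
# Hypothetical stand-in for Sidekiq::OverLimit (the real class ships with sidekiq-enterprise).
class OverLimit < StandardError; end

# Hypothetical stand-in for an APM agent's middleware: records any error it sees, then re-raises.
class ReporterMiddleware
  def initialize(noticed)
    @noticed = noticed
  end

  def call(_job)
    yield
  rescue StandardError => e
    @noticed << e.class.name
    raise
  end
end

# Hypothetical stand-in for the rate-limit handling: swallows OverLimit and
# "reschedules" the job, unless the job has used up its attempts.
class RateLimitMiddleware
  def initialize(rescheduled, max_attempts:)
    @rescheduled = rescheduled
    @max_attempts = max_attempts
  end

  def call(job)
    yield
  rescue OverLimit
    raise if job[:attempts] >= @max_attempts
    @rescheduled << job[:id]
  end
end

# Runs the job body through the middlewares; the first entry is the outermost wrapper.
def run_chain(middlewares, job, &work)
  chain = middlewares.reverse.reduce(work) do |inner, middleware|
    -> { middleware.call(job, &inner) }
  end
  chain.call
end

noticed = []
rescheduled = []
reporter = ReporterMiddleware.new(noticed)
limiter = RateLimitMiddleware.new(rescheduled, max_attempts: 3)

# Reporter outermost: the limiter handles OverLimit first, so nothing is reported.
run_chain([reporter, limiter], { id: 1, attempts: 0 }) { raise OverLimit }

# Reporter innermost: it sees OverLimit before the limiter can rescue it.
run_chain([limiter, reporter], { id: 2, attempts: 0 }) { raise OverLimit }
```

In this model, the first call leaves noticed empty and the second adds an entry to it, even though both jobs end up rescheduled. With the reporter outermost, an OverLimit raised by a job that has exhausted its attempts still propagates through the reporter, which matches the behavior requested here.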
Expected Behavior
When Sidekiq::OverLimit is raised by a rate limiter, unless Sidekiq's own middleware re-raises the exception, it should not show up in newrelic as an error.
Steps to Reproduce
To reproduce, you need to create contention for a sidekiq concurrent rate limiter. You could create a concurrent Sidekiq::Limiter with concurrency 1 (i.e., a mutex) and a short wait_timeout. Have two processes/threads run concurrently, trying to acquire the lock and, with it, sleep for longer than the timeout. The one that loses the race to acquire the lock should raise Sidekiq::OverLimit.
Your Environment
newrelic_rpm 9.16.1
ruby 3.3.6
rails 8.0.1
sidekiq-ent 7.3.4
For Maintainers Only or Hero Triaging this bug
Suggested Priority (P1,P2,P3,P4,P5):
Suggested T-Shirt size (S, M, L, XL, Unknown):