Retry if driver throws a JobQueueDriverError connectionError #77
base: main
Conversation
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##             main      #77      +/-   ##
==========================================
- Coverage   92.20%   91.75%   -0.45%
==========================================
  Files          23       24       +1
  Lines        1296     1347      +51
==========================================
+ Hits         1195     1236      +41
- Misses        101      111      +10

View full report in Codecov by Sentry.
```swift
    return try await operation()
} catch let error as JobQueueDriverError where error.code == .connectionError {
    logger.debug("\(message()) failed")
    if self.options.driverRetryStrategy.shouldRetry(attempt: attempt, error: error) {
```
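For context, a minimal sketch of what a retry wrapper around this catch block could look like. `JobQueueDriverError`, `.connectionError`, and `options.driverRetryStrategy.shouldRetry(attempt:error:)` are taken from the diff above and assumed to be in scope; the function signature, backoff arithmetic, and cap are illustrative assumptions, not the actual implementation.

```swift
import Foundation
import Logging

// Sketch only: retry an operation when the driver reports a connection error.
// The backoff numbers here are assumptions, not the values used by the PR.
func withExponentialBackoff<Value>(
    _ message: @autoclosure () -> String,
    options: JobQueueOptions,
    logger: Logger,
    operation: () async throws -> Value
) async throws -> Value {
    var attempt = 0
    while true {
        do {
            return try await operation()
        } catch let error as JobQueueDriverError where error.code == .connectionError {
            logger.debug("\(message()) failed")
            guard options.driverRetryStrategy.shouldRetry(attempt: attempt, error: error) else {
                // Out of retries: propagate the error, which ends the job queue handler.
                throw error
            }
            // Exponential backoff with a cap (illustrative values).
            let delay = min(pow(2.0, Double(attempt)) * 0.25, 30.0)
            try await Task.sleep(for: .seconds(delay))
            attempt += 1
        }
    }
}
```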
Should this call still be made since the default maxAttempt is set to the maximum int value? There are at most two cases here: either a job was popped off a queue and we lose connection to the driver and will retry until connected, or the job lost its connection while polling.
For the first case, I am wondering if we should have a background task running that finds jobs in the 'processing' state that do not exist in a queue? Or should we by default move jobs in that state back to their specific queue?
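To make the idea concrete, here is a minimal sketch of moving jobs left in the `processing` state back onto their queue. The driver protocol, status enum, and method names below are hypothetical stand-ins for illustration only; they are not the actual driver API.

```swift
import Logging

// Hypothetical driver surface, used only for this sketch.
enum SketchJobStatus { case pending, processing, failed }

protocol SketchJobQueueDriver {
    associatedtype JobID
    func jobIDs(withStatus status: SketchJobStatus) async throws -> [JobID]
    func requeue(_ id: JobID) async throws
}

// Re-queue jobs stuck in the `processing` state, e.g. once at startup or
// from a periodic background task.
func requeueOrphanedJobs<Driver: SketchJobQueueDriver>(
    driver: Driver,
    logger: Logger
) async throws {
    for jobID in try await driver.jobIDs(withStatus: .processing) {
        logger.info("Re-queuing job left in processing state", metadata: ["jobID": "\(jobID)"])
        try await driver.requeue(jobID)
    }
}
```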
So if we hit the retry limit the error is propagated further up, the job queue handler exits, and we'll have to restart the queue process to continue processing jobs. The default is set to .max as the alternative is exiting the process.
If the default is set to a lower number and we exit the handler, then the cleanup at start can fix up any jobs left in the processing state.
> So if we hit the retry limit the error is propagated further up, the job queue handler exits, and we'll have to restart the queue process to continue processing jobs. The default is set to .max as the alternative is exiting the process. If the default is set to a lower number and we exit the handler, then the cleanup at start can fix up any jobs left in the processing state.
By default all the drivers are set up to do nothing on boot. I think this should be documented.
There is a lot of documentation to add. We have made a lot of changes since the last release.
> There is a lot of documentation to add. We have made a lot of changes since the last release.

Indeed! I will help with the documentation too.
Also, I forgot to mention this earlier. How will this work with the Postgres driver? PostgresNIO seems to keep retrying after a connection is lost. I am not that familiar with the Redis driver, but I suppose it'll be the same since the connection pool logic seems very similar between the two?
Yeah, PostgresNIO will retry connections ad infinitum, so in theory this isn't an issue when using the Postgres driver.
Redis is different in that it will eventually throw an error, and it has different errors for when an open connection was closed and when a connection couldn't be made.
Without this change the error would be propagated up and end the job queue handler, and eventually the application.
We could move the retry to the drivers instead. I'm already asking the drivers to recognise connection errors.
I'm going to put this on hold while I think about it. I might push this functionality down to the drivers where needed.
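As an illustration of "asking the drivers to recognise connection errors", here is a sketch of a driver wrapping its underlying client error into the common error the retry logic checks for. The underlying error type is a stand-in, and the `JobQueueDriverError` initializer used here is an assumption.

```swift
// Stand-in for an underlying client error, e.g. what a Redis client throws
// when an open connection is lost; the real error types differ.
struct UnderlyingConnectionFailure: Error {}

// Wrap a driver call so a connection failure surfaces as a JobQueueDriverError
// with code .connectionError, which the retry logic above looks for.
func recognisingConnectionErrors<Value>(
    _ body: () async throws -> Value
) async throws -> Value {
    do {
        return try await body()
    } catch is UnderlyingConnectionFailure {
        throw JobQueueDriverError(code: .connectionError)  // assumed initializer
    }
}
```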
This adds withExponentialBackoff, which retries an operation with exponential backoff if it throws a JobQueueDriverError with code set to .connectionError. The retry strategy used by withExponentialBackoff is configured through JobQueueOptions (see options.driverRetryStrategy in the diff above).
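A minimal sketch of a strategy matching the shouldRetry(attempt:error:) call in the diff. The concrete type name and cap-based logic are illustrative assumptions; the conversation above notes that the shipped default number of attempts is .max, i.e. retry indefinitely.

```swift
// Sketch only: a retry strategy with a configurable attempt cap.
struct CappedAttemptsRetryStrategy {
    var maxAttempts: Int = .max  // default in the PR is .max, i.e. never give up

    func shouldRetry(attempt: Int, error: Error) -> Bool {
        attempt < maxAttempts
    }
}
```

Assuming JobQueueOptions exposes driverRetryStrategy as a settable property, a caller could plug in something like CappedAttemptsRetryStrategy(maxAttempts: 10) to make the handler give up and rely on the cleanup-at-start path instead.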