Commit 35deb3b

mpscholten and claude committed
Fix job worker silently exiting on transient fetchNextJob error
After the job worker redesign (820cf00), runJobLoop exits without retrying when fetchNextJob throws a transient error (pool exhaustion, connection timeout). Since the NOTIFY signal was already consumed from the TBQueue, nothing triggers a new worker spawn, so the job sits orphaned until the 60-second poller picks it up.

The old MVar-based workers were persistent and always looped back to takeMVar after any outcome. The new on-demand workers are ephemeral, so exiting means the job is lost until the poller runs.

Add runJobLoop call to the error branch so the worker retries after the 1-second backoff, matching how the poller handles errors.

Fixes amitaibu/ihp-sensors#18

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
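The lost-signal scenario described in the message can be modelled in a few lines. This is only a sketch under stated assumptions, not the code in IHP/Job/Runner.hs: `dispatchOnce` and `workerThatGivesUp` are hypothetical names, the queue carries plain `()` signals, and the `stm` package that ships with GHC is assumed to be available.

```haskell
import Control.Concurrent.STM
    (TBQueue, atomically, isEmptyTBQueue, newTBQueueIO, readTBQueue, writeTBQueue)

-- | Hypothetical dispatcher step: block until a signal arrives, consume it,
-- then run an ephemeral worker. The signal is gone whether or not the
-- worker manages to fetch and run the job.
dispatchOnce :: TBQueue () -> IO Bool -> IO Bool
dispatchOnce signals worker = do
    atomically (readTBQueue signals)
    worker

-- | Worker modelling the pre-fix behaviour: its fetch failed, so it returns
-- without retrying. False means "no job was run".
workerThatGivesUp :: IO Bool
workerThatGivesUp = pure False

main :: IO ()
main = do
    signals <- newTBQueueIO 16
    atomically (writeTBQueue signals ()) -- one NOTIFY for one new job
    jobRan <- dispatchOnce signals workerThatGivesUp
    noSignalsLeft <- atomically (isEmptyTBQueue signals)
    print (jobRan, noSignalsLeft)
```

Running it prints `(False,True)`: no job ran and no signal remains to spawn another worker, which is the state in which only a periodic poller can recover the job.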
1 parent ba7fa69 commit 35deb3b

1 file changed: +1 -0 lines changed


ihp/IHP/Job/Runner.hs

Lines changed: 1 addition & 0 deletions
@@ -171,6 +171,7 @@ jobWorkerFetchAndRunLoop JobWorkerArgs { .. } = do
                 Left exception -> do
                     Log.error ("Job worker: Failed to fetch next job: " <> tshow exception)
                     Concurrent.threadDelay 1000000 -- 1s backoff to avoid tight error loops
+                    runJobLoop -- retry after transient error
                 Right (Just job) -> do
                     Log.info ("Starting job: " <> tshow job)
