Jobs stuck in Enqueued state when Postgres Listener connection is disrupted using EnableLongPolling #410

@ian-a-anderson

Description

Version Info

Hangfire.NetCore version: 1.8.21
Hangfire.PostgreSql version: 1.20.12

Issue Details

Earlier this year we switched to PostgreSqlStorageOptions.EnableLongPolling = true in our Hangfire configuration, and it has been an awesome performance improvement. However, in our non-prod environments we have noticed that if the underlying connection that PostgreSqlJobQueue opens via ListenForNotificationsAsync is disrupted, subsequently enqueued jobs become stuck in the Enqueued state until their invisibility timeout passes and they are re-processed. Steps to repro:

  1. Configure Hangfire.PostgreSql with PostgreSqlStorageOptions.EnableLongPolling = true.
  2. Configure a server and job queue for processing (e.g., email).
  3. Enqueue a job and note that it processes immediately (thanks to the polling/listener notification mechanism).
  4. Terminate the listener connection(s) from pgAdmin with:
     SELECT pg_terminate_backend(pid)
     FROM pg_stat_activity
     WHERE pid <> pg_backend_pid() AND query like 'LISTEN new_job'
  5. Note that the listener connection(s) do not automatically reconnect after this disconnect (the following query returns no rows):
     SELECT *
     FROM pg_stat_activity
     WHERE pid <> pg_backend_pid() AND query like 'LISTEN new_job'
  6. Enqueue another job (either via code or via Hangfire.Console) and note that it sticks in the Enqueued state (screenshot omitted).
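
For reference, steps 1 and 2 might look roughly like the sketch below. It is a simplified illustration rather than our exact configuration: the HangfireSetup class, the AddHangfireWithLongPolling method name, and the connectionString parameter are placeholders invented for this example, and the "email" queue is the one mentioned in step 2.

```csharp
using Hangfire;
using Hangfire.PostgreSql;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical helper class; the names here are illustrative only.
public static class HangfireSetup
{
    public static void AddHangfireWithLongPolling(
        this IServiceCollection services, string connectionString)
    {
        // Step 1: enable the LISTEN/NOTIFY based long polling.
        var storageOptions = new PostgreSqlStorageOptions
        {
            EnableLongPolling = true,
        };

        services.AddHangfire(config =>
            config.UsePostgreSqlStorage(
                c => c.UseNpgsqlConnection(connectionString),
                storageOptions));

        // Step 2: a server that processes the "email" queue.
        services.AddHangfireServer(options =>
        {
            options.Queues = new[] { "email" };
        });
    }
}
```

With this in place, any job enqueued to the email queue should be picked up right away while the listener connection is healthy, which is the baseline behavior described in step 3.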

Question

Would it be possible to add graceful reconnection to the job queue listener connections in this scenario, to avoid the stuck jobs? I can't quite tell why, but over a longer period the listener connections do appear to reconnect, seemingly as a result of subsequent job Enqueue activity. That doesn't seem like a reliable enough recovery mechanism to avoid the stuck jobs, though, so we are looking for anything that could smooth out this experience.
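
To illustrate the kind of behavior we have in mind, here is a rough sketch of a listener loop that reconnects with backoff, written directly against Npgsql. This is not how Hangfire.PostgreSql is implemented, and we have not checked it against the library internals. ReconnectingListener, ListenWithReconnectAsync, the onNewJob callback, and the 1 to 30 second backoff are all assumptions made up for the example. Only the new_job channel name comes from the queries above. It needs a reachable PostgreSQL instance to run.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Npgsql;

// Hypothetical helper, not part of Hangfire.PostgreSql.
public static class ReconnectingListener
{
    public static async Task ListenWithReconnectAsync(
        string connectionString, Action onNewJob, CancellationToken ct)
    {
        var initialDelay = TimeSpan.FromSeconds(1);
        var maxDelay = TimeSpan.FromSeconds(30);
        var delay = initialDelay;

        while (!ct.IsCancellationRequested)
        {
            try
            {
                await using var connection = new NpgsqlConnection(connectionString);
                connection.Notification += (sender, args) => onNewJob();
                await connection.OpenAsync(ct);

                await using (var command = new NpgsqlCommand("LISTEN new_job", connection))
                {
                    await command.ExecuteNonQueryAsync(ct);
                }

                // Notifications sent while we were disconnected are not replayed,
                // so signal once after every (re)connect to trigger a queue check.
                onNewJob();
                delay = initialDelay;

                // Block until a notification arrives; throws if the backend is terminated.
                while (true)
                {
                    await connection.WaitAsync(ct);
                }
            }
            catch (OperationCanceledException) when (ct.IsCancellationRequested)
            {
                return; // normal shutdown
            }
            catch (Exception)
            {
                // Connection lost or could not be opened; retry after the backoff below.
            }

            try
            {
                await Task.Delay(delay, ct);
            }
            catch (OperationCanceledException)
            {
                return;
            }

            // Exponential backoff, capped at maxDelay.
            delay = TimeSpan.FromTicks(Math.Min(delay.Ticks * 2, maxDelay.Ticks));
        }
    }
}
```

The point of the extra onNewJob() call after each reconnect is that PostgreSQL only delivers NOTIFY messages to sessions that are listening at the time, so any job enqueued during the gap would otherwise wait for the fallback poll or the invisibility timeout.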
