Prometheus middleware dramatiq_delayed_messages_inprogress not tracking completed messages correctly #672
Description
The Prometheus middleware exports a gauge metric, dramatiq_delayed_messages_inprogress, that is documented to track "The number of delayed messages in memory." I noticed that something is wrong with this metric because in my system it never goes down.
My setup:
dramatiq: 1.15.0
broker: redis
workers: 4
processes per worker: 2
threads per process: 1
Looking over the code, I can see that the middleware listens to the before_delay_message and before_process_message hooks to track when a message is first delayed and when it is later processed, keyed by message ID. Looking at the actual worker implementation, however, it looks like messages get requeued in the broker when their eta arrives instead of being processed directly on the same worker process. (I'm also unsure whether a message retains the same message ID when it is requeued.)
All this seems to create a situation where the process that first marked a message as in progress is unlikely to be the process that observes its completion, so the gauge is incremented in one process and never decremented. Additionally, this appears to leak memory (albeit slowly and not much), as the set of delayed message IDs grows over time.
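To illustrate the suspected failure mode, here is a minimal, self-contained sketch. It does not use dramatiq itself: `WorkerProcessModel` and `simulate` are hypothetical names, and the two hook methods only model my reading of the middleware (a per-process set of delayed message IDs plus a gauge), not its actual code. It also assumes the worst case, where every delayed message is picked up by a different process after being requeued.

```python
from dataclasses import dataclass, field


@dataclass
class WorkerProcessModel:
    """Hypothetical stand-in for the per-process state held by the middleware."""

    delayed_ids: set = field(default_factory=set)
    inprogress_gauge: int = 0

    def before_delay_message(self, message_id: str) -> None:
        # The message is held until its eta: remember its ID and bump the gauge.
        self.delayed_ids.add(message_id)
        self.inprogress_gauge += 1

    def before_process_message(self, message_id: str) -> None:
        # Only a process that recorded the delay itself can decrement the gauge.
        if message_id in self.delayed_ids:
            self.delayed_ids.remove(message_id)
            self.inprogress_gauge -= 1


def simulate(num_messages: int) -> tuple:
    """Delay every message on process A, then process it on process B."""
    proc_a, proc_b = WorkerProcessModel(), WorkerProcessModel()
    for i in range(num_messages):
        message_id = f"msg-{i}"
        proc_a.before_delay_message(message_id)    # delayed on A
        proc_b.before_process_message(message_id)  # requeued, picked up by B
    return proc_a, proc_b


proc_a, proc_b = simulate(1000)
print(proc_a.inprogress_gauge, len(proc_a.delayed_ids))  # 1000 1000
print(proc_b.inprogress_gauge, len(proc_b.delayed_ids))  # 0 0
```

In this model, process A's gauge and ID set only ever grow, which would match both the metric never going down and the slow memory growth. If the message ID changes on requeue, the lookup would miss even when the same process handles the message, so the outcome would be the same.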
Would be happy to contribute/discuss a patch to fix this metric if desired!