TL;DR
When using Gunicorn with `preload_app = True`, `setup` must be called in Gunicorn's `post_fork` callback. Otherwise, events pushed into `Client.queue` will never be seen by `Consumer` and won't be sent to PostHog.
Gunicorn
Gunicorn is a commonly used web server for Python projects that uses a pre-fork worker model to achieve concurrency. One of Gunicorn's settings, `preload_app = True`, tells Gunicorn to load application code before forking worker processes.
From the Gunicorn docs:
> `preload_app`
>
> Command line: `--preload`
>
> Default: `False`
>
> Load application code before the worker processes are forked.
>
> By preloading an application you can save some RAM resources as well as speed up server boot times. Although, if you defer application loading to each worker process, you can reload your application code easily by restarting workers.
The operating system uses a copy-on-write model when forking processes, so this setting can significantly reduce memory usage in certain scenarios.
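For reference, preloading is enabled with a single setting in a Gunicorn config file (or `--preload` on the command line); the worker count here is just an example:

```python
# gunicorn.conf.py -- minimal example of enabling preloading
preload_app = True
workers = 4  # example worker count
```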
posthog-python
NOTE: We're going to get into how `fork()` works in Python, which isn't an area of expertise for me, so please forgive me if anything here is wrong.
`Client.capture` calls `Client._enqueue`, which calls `self.queue.put`. `Client.queue` is an instance of `queue.Queue`, a thread-safe queue provided by the Python standard library.
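As a simplified sketch of that pattern (not the library's actual code), the client is essentially a thread-safe queue feeding a background consumer thread:

```python
# Simplified sketch of the Client/Consumer pattern (not the real code).
import queue
import threading

q = queue.Queue(maxsize=10000)  # stands in for Client.queue

def consume():
    while True:
        event = q.get()  # blocks until an event is enqueued
        # ... batch events and send them to the PostHog API here ...
        q.task_done()

# Stands in for Consumer.start(): a daemon thread draining the queue.
threading.Thread(target=consume, daemon=True).start()

q.put({"event": "user signed up"})  # what Client._enqueue boils down to
```

Within a single process this works fine; the problem only appears once `fork()` enters the picture.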
When a `Client` instance is created before Gunicorn forks its worker processes, the `fork()` call shares the memory of the `Client` instance, including `Client.queue`, between the master and worker processes. When a worker process writes to `Client.queue`, copy-on-write kicks in and the worker process gets its own copy of the queue containing the event that was just added. That event is never added to `Client.queue` in the master process.
`Client.__init__` creates zero or more `Consumer` instances and stores them in `Client.consumers`. `Consumer.start` is called for each instance, and no other changes are made to `Consumer` until `Client.shutdown` is called. If `Client.__init__` is called before Gunicorn forks and no writes to `Client.consumers` happen after the fork, then `Client.consumers` shares memory across the master and worker processes, which means `Consumer.queue` will always be the `queue.Queue` instance created in the master process. But because writing to `Client.queue` gives each worker process its own copy of the queue, the `queue.Queue` instance created in the master process will always be empty. No events will ever be sent to PostHog.
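To make this concrete, here's a minimal reproduction of the mechanism, assuming a Unix-like OS where `os.fork` is available. It stands in for what Gunicorn does and isn't posthog-python code:

```python
# A queue.Queue created before fork() never sees events enqueued by the
# child: the child's put() lands in its copy-on-write copy of the queue.
import os
import queue

q = queue.Queue()  # created in the "master" process, like Client.queue

pid = os.fork()
if pid == 0:
    # Worker process: the event goes into the worker's copy of the queue.
    q.put("captured event")
    print("worker sees:", q.qsize())  # -> 1
    os._exit(0)
else:
    os.waitpid(pid, 0)
    # Master process: its queue (the one the Consumer reads) stays empty.
    print("master sees:", q.qsize())  # -> 0
```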
Solutions
I have a few ideas for potential solutions, with varying degrees of confidence:
- Initialize the default client by calling `setup()` inside Gunicorn's `post_fork` callback. This ensures that every worker process gets its own set of `Consumer` instances and everything behaves as expected; a sketch follows this list. You could also use `post_fork` to initialize a custom `Client` instance, but I haven't tried that myself.
- Replace `preload_app = True` in Gunicorn with `preload_app = False`. `preload_app = True` is useful in certain situations, but isn't necessary in every situation. As an example, as far as I can tell, PostHog/posthog uses Gunicorn but doesn't use `preload_app = True` (the cloud deployment may be different, though). This is definitely the easiest option in situations where it works, but I don't like it as a solution because nobody wants to waste RAM.
- Replace `queue.Queue` with `multiprocessing.JoinableQueue`. `multiprocessing.JoinableQueue` is a process-safe version of `queue.Queue` that uses pipes to communicate between processes. In the spirit of full disclosure, I haven't tried this change myself, so I'm not sure it will work; a rough sketch also follows this list. This change may replace one set of problems with a different set of problems, and I recommend reading Pipes and Queues to get an idea of what that might look like. One specific thing of note is this warning from "Pipes and Queues":

  > Warning: If a process is killed using Process.terminate() or os.kill() while it is trying to use a Queue, then the data in the queue is likely to become corrupted. This may cause any other process to get an exception when it tries to use the queue later on.

  This is particularly relevant here because Gunicorn has ways of killing workers if it thinks they aren't working correctly, or after a certain number of requests as specified by the `max_requests` and `max_requests_jitter` settings. We could get around that by using `multiprocessing.Manager`, but that's how we get to the idea that maybe this is just replacing one set of problems with another.
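Here's a sketch of option 1 as a Gunicorn config file. It assumes the module-level `setup()` this issue describes and uses a placeholder API key; treat it as a starting point rather than a tested recipe:

```python
# gunicorn.conf.py -- sketch of option 1 (untested; placeholder values).
import posthog

preload_app = True  # keep the memory/boot-time benefits of preloading

def post_fork(server, worker):
    # Gunicorn runs this in each worker after fork(), so the default
    # client -- and with it Client.queue and the Consumer threads -- is
    # created inside the worker process rather than inherited from the
    # master.
    posthog.project_api_key = "phc_placeholder"  # hypothetical key
    posthog.setup()
```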
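And a rough sketch of what option 3 relies on: a `multiprocessing.JoinableQueue` crosses the process boundary through a pipe, which a plain `queue.Queue` cannot do after a fork. This is a standalone demonstration, not a patch to posthog-python:

```python
# Demonstrates that multiprocessing.JoinableQueue is visible across
# processes, unlike queue.Queue after a fork.
import multiprocessing

def worker(q):
    # Runs in a child process; the event travels through a pipe
    # instead of staying in process-local memory.
    q.put("captured event")
    q.join()  # block until the consumer calls task_done()

if __name__ == "__main__":
    q = multiprocessing.JoinableQueue()
    p = multiprocessing.Process(target=worker, args=(q,))
    p.start()
    print(q.get())  # -> "captured event"
    q.task_done()
    p.join()
```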