Description
I’ve identified an unbounded memory growth issue when using enqueue=True with sinks that cannot keep up with the log volume (e.g., slow disk I/O, network sinks, or blocked stdout).
Because the internal queue is unbounded, a slow consumer causes log records to buffer in memory indefinitely. In long-running processes, this leads to massive RSS growth and eventual OOM, even if the application logic itself is memory-efficient.
Minimal Reproduction
I isolated this by simulating a slow sink (50ms latency) against a fast producer.
Environment:
- Python 3.14.0 / Loguru 0.7.3 / WSL2
- psutil used for RSS sampling
import time

import psutil
from loguru import logger

logger.remove()


class SlowSink:
    def write(self, message):
        time.sleep(0.05)  # Simulate slow I/O


def run(enqueue: bool) -> None:
    logger.add(SlowSink(), level="SUCCESS", enqueue=enqueue)
    proc = psutil.Process()
    # 5MB payload to make the leak obvious
    payload = "X" * (5 * 1024 * 1024)
    print(f"=== Running with enqueue={enqueue} ===")
    for i in range(200):
        logger.success(payload)
        if i % 50 == 0:
            rss = proc.memory_info().rss / (1024 * 1024)
            print(f"i={i:4d} rss={rss:7.1f} MB")
    print("Logging done, sleeping 5s to drain...")
    time.sleep(5)
    rss = proc.memory_info().rss / (1024 * 1024)
    print(f"Final rss={rss:7.1f} MB\n")


run(enqueue=False)
logger.remove()
run(enqueue=True)

Output
=== Running with enqueue=False ===
i= 0 rss= 23.5 MB
i= 50 rss= 28.4 MB
i= 100 rss= 28.4 MB
Final rss= 28.4 MB
=== Running with enqueue=True ===
i= 0 rss= 44.3 MB
i= 50 rss= 74.5 MB
i= 100 rss= 69.5 MB
Final rss= 69.5 MB
With enqueue=False, the producer is blocked by the sink, so backpressure is implicit; memory stays flat (~28 MB).
With enqueue=True, the producer pushes 5 MB records into the queue faster than the worker thread can drain them. RSS spikes and stays high because the queue holds references to the pending records, and CPython's allocator high-water mark keeps the memory mapped even after the queue eventually drains.
Root Cause
I ran tracemalloc on the reproduction script, which pinpointed the allocation site:
.../multiprocessing/queues.py:0: size=2064 KiB, count=348
It seems logger.add(..., enqueue=True) relies on multiprocessing.SimpleQueue (even for threads). This queue has no maxsize or capacity limit.
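For reference, the snapshot above can be reproduced with tracemalloc's standard API; the sketch below shows the general approach rather than my exact script:

import tracemalloc

tracemalloc.start()

# ... run the enqueue=True reproduction from above ...

snapshot = tracemalloc.take_snapshot()
# Grouping by filename is what yields lines like "multiprocessing/queues.py:0: size=..., count=..."
for stat in snapshot.statistics("filename")[:5]:
    print(stat)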
Suggested Fixes
Right now, enqueue=True is dangerous in production if there's any risk of the sink stalling (e.g., logging to a database or a generic HTTP endpoint).
- Documentation: At a minimum, the docs should explicitly warn that enqueue=True lacks backpressure and can OOM the process if the sink is slow.
- Bounded Queue: Ideally, enqueue should accept a queue_size int (see the sketch after this list). If the queue is full, there needs to be a strategy: block the caller or drop the record with a warning to stderr.
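To make the second suggestion concrete, here is roughly how I imagine the call site; queue_size and on_full are hypothetical names, and neither exists in loguru today:

# Hypothetical API sketch -- queue_size and on_full do not exist in loguru today.
logger.add(
    SlowSink(),
    level="SUCCESS",
    enqueue=True,
    queue_size=1000,   # cap the number of pending records
    on_full="block",   # or "drop": shed the record and warn on stderr
)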
Is there an existing workaround to enforce a queue limit without rewriting the threading logic manually?
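For context, the manual workaround I'd like to avoid looks roughly like this: a wrapper sink that owns a bounded queue.Queue drained by its own thread (class and parameter names are mine, and this variant drops records on overflow):

import queue
import threading


class BoundedAsyncSink:
    """Non-blocking wrapper around a slow sink, backed by a bounded queue."""

    def __init__(self, sink, maxsize=1000):
        self._sink = sink
        self._queue = queue.Queue(maxsize=maxsize)
        threading.Thread(target=self._drain, daemon=True).start()

    def write(self, message):
        try:
            self._queue.put_nowait(message)  # never block the producer
        except queue.Full:
            pass  # drop the record; a drop counter could be reported instead

    def _drain(self):
        while True:
            self._sink.write(self._queue.get())


# Used in place of enqueue=True:
# logger.add(BoundedAsyncSink(SlowSink()), level="SUCCESS")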
Transparency: Findings are human-generated; text drafting assisted by AI.