Skip to content

Make SDK safe for working with multiprocessing with the forkserver start method #3937

Open
@n1ngu

Description

@n1ngu

How do you use Sentry?

Sentry Saas (sentry.io)

Version

2.19.2

Steps to Reproduce

On python 3.12 and a non-MacOS, non-Windows platform

import logging
import concurrent.futures
import sentry_sdk.integrations.logging

sentry_sdk.init(
    integrations=[
        sentry_sdk.integrations.logging.LoggingIntegration(event_level=logging.WARNING),
    ]
)
logging.captureWarnings(True)
logging.basicConfig(level=logging.DEBUG)
sentry_sdk.capture_message("hello")
pool = concurrent.futures.ProcessPoolExecutor()
pool.submit(sentry_sdk.capture_message, "world")

See the warning show up.

DeprecationWarning: This process is multi-threaded, use of fork() may lead to deadlocks in the child

Expected Result

  1. First and foremost, no fundamentally wrong use of os.fork() is introduced by Sentry SDK without the users knowing.
  2. If an application multiprocessing context uses the "forkserver" method, some safeguards are placed so that sentry won't start threads in the spawned fork server process
  3. No warning is triggered

Actual Result

Capturing warnings in sentry has recently surfaced this issue to me. I don't quite understand the subtleties of the fork method problems so please bear with me and refer to other sources for technical details

A conservative (safe but slow) workaround is to issue

multiprocessing.set_start_method("spawn")

in the application before any multiprocessing pool is started or use multiprocessing.get_context("spawn") for libraries used alongside sentry, or let such contexts be injected in your libraries in general.

AFAIU using the future "forkserver" default (on python 3.14) will not solve this problem because sentry is usually initialized very soon in the application configuration and I can't imagine a pool initialization procedure that will not initialize sentry in the fork server process in my applications. E.g. I am calling sentry_sdk.init on a sitecustomize.py module. I may understand this wrong, hopefully.

It could be argued that Sentry can't do anything about it because it is up to the application to opt-out of the default fork method, but maybe this would be worth being documented? I mean it would be reckless to configure a global multiprocessing.set_start_method from a library like sentry, and sentry does not directly use a process pool so it can neither take care to use a safe context internally, but sentry starts threads in the main process making any default use of process pools unsafe to applications that would be single-threaded otherwise.

Suffice to say, I have detected no actual deadlocks or other issues due to this (this has been a very concealed issue in posix platforms for years now)

related #291

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    Status

    No status

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions