Sentry background worker is chronically blocking async event loop when many exceptions are raised #2824

Open
@cpvandehey

Description

How do you use Sentry?

Self-hosted/on-premise

Version

1.40.6

Steps to Reproduce

Hello! And thanks for reading my ticket :)

The Python Sentry client is a synchronous library that has been retrofitted to the async model: it spins off separate threads to avoid disrupting the event loop thread (see the background worker (1) for thread usage).

Under healthy conditions, the Sentry client doesn’t need to make many web requests. However, if conditions become rocky and exceptions are raised frequently (caught or uncaught), the Sentry client can severely inhibit the app's event loop, assuming a high sample rate. This is due to the OS thread context switching that effectively pauses the event loop while other threads run (i.e., the background worker (1)). Running sync code on side threads is not a recommended pattern because of the cost of switching threads, but it can be useful for quickly retrofitting sync code.

Relevant flow, in short:
Every time an exception is raised (caught or uncaught) in my code and the event is sampled, a web request is made immediately to send the data to Sentry. Since Sentry’s background worker is thread-based (1), this triggers a thread context switch followed by a synchronous web request. When an application hits many exceptions in a short period of time, this becomes a context-switching nightmare.
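
To make the flow above concrete, here is a toy model of the pattern I am describing. It is not the SDK's actual code: `ImmediateThreadWorker` and the `send` callable are hypothetical names standing in for the background worker and the blocking HTTP request.

```python
import queue
import threading


class ImmediateThreadWorker:
    """Toy model: every submitted event is handed to a worker thread at once."""

    def __init__(self, send):
        self._send = send  # hypothetical blocking call, stands in for the HTTP request
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, event):
        # Called from the event loop thread; each put() wakes the worker thread.
        self._queue.put(event)

    def _run(self):
        while True:
            event = self._queue.get()
            if event is None:  # sentinel: shut down
                break
            self._send(event)  # one blocking request per event

    def close(self):
        self._queue.put(None)
        self._thread.join()
```

With a high exception rate, `submit` is called constantly, so the worker thread is woken for every single event and competes with the event loop thread for CPU time.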

Suggestion:
In an ideal world, Sentry would make its background worker async by running it as a task (1) instead of a thread, and its transport layer (2) would use aiohttp. I don't think this is highly complex, but I could be wrong.
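
A minimal sketch of what a task-based worker could look like, assuming Python 3.8+. The class `AsyncWorker` and the `send` coroutine are hypothetical; in a real version `send` would wrap an async HTTP client such as aiohttp, which I have left out so the sketch only needs the standard library.

```python
import asyncio


class AsyncWorker:
    """Sketch: an asyncio task drains a bounded queue on the event loop itself."""

    def __init__(self, send, maxsize=100):
        self._send = send  # hypothetical coroutine function, e.g. an async HTTP POST
        self._maxsize = maxsize
        self._queue = None
        self._task = None

    def start(self):
        # Must be called from inside a running event loop.
        self._queue = asyncio.Queue(maxsize=self._maxsize)
        self._task = asyncio.get_running_loop().create_task(self._run())

    def submit(self, event):
        # Never blocks the caller; returns False if the event was dropped.
        try:
            self._queue.put_nowait(event)
            return True
        except asyncio.QueueFull:
            return False

    async def _run(self):
        while True:
            event = await self._queue.get()
            try:
                await self._send(event)
            except Exception:
                pass  # a failed send must not kill the worker task
            finally:
                self._queue.task_done()

    async def close(self):
        await self._queue.join()  # wait until every queued event was handled
        self._task.cancel()
        try:
            await self._task
        except asyncio.CancelledError:
            pass
```

Everything here runs on the event loop thread, so no OS thread switch is involved; the trade-off is that `send` has to be genuinely non-blocking.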

An immediate workaround could be to give users more control over the background worker. If Sentry’s background worker made web requests to send data at configurable intervals, it would behave far more efficiently in event loop apps. At the moment, the background worker always sends exception data immediately. In my opinion, as long as Sentry flushes data at app exit, a 60-second timer for sending data would alleviate most of the symptoms described above without losing data (although delivery could be up to 60 seconds slower).
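
Roughly what I have in mind for the interval-based workaround, as a sketch only. `IntervalBatchWorker`, `send_batch`, and the `interval` parameter are hypothetical names, not existing SDK options.

```python
import atexit
import threading


class IntervalBatchWorker:
    """Sketch: buffer events and send them in one batch every `interval` seconds."""

    def __init__(self, send_batch, interval=60.0):
        self._send_batch = send_batch  # hypothetical blocking call taking a list of events
        self._interval = interval
        self._buffer = []
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()
        atexit.register(self.close)  # flush whatever is left at app exit

    def submit(self, event):
        # Only appends under a lock; the worker thread is not woken up.
        with self._lock:
            self._buffer.append(event)

    def flush(self):
        with self._lock:
            batch, self._buffer = self._buffer, []
        if batch:
            self._send_batch(batch)

    def _run(self):
        # Event.wait returns False on timeout and True once close() sets the flag.
        while not self._stop.wait(self._interval):
            self.flush()

    def close(self):
        self._stop.set()
        self._thread.join()
        self.flush()  # final flush so buffered events are not lost
```

The worker thread wakes once per interval instead of once per event, which is the whole point: the number of thread switches no longer scales with the exception rate.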

(1) - class BackgroundWorker(object):

(2) - response = self._pool.request(

Expected Result

I expect to have less thread context switching when using sentry.

Actual Result

I see a lot of thread context switching when there are high exception rates.
