
Sentry SDK serializes stack frames before determining whether the error was already handled earlier #4365

@jvineet

Description

How do you use Sentry?

Self-hosted/on-premise

Version

"2.25.1"

Steps to Reproduce

We noticed that many of our FastAPI transactions in Sentry were showing an unexpectedly long latency tail after the typical request-response cycle for a route. On further investigation, we realized the extra latency was coming from how the SDK was capturing the error event in those cases. After setting up spans around Sentry's own internal event-capture and serialization functions, we saw that while sentry-sdk only sent an error event to the Sentry server once, as expected, it would still serialize the stack frames for each subsequent event capture before checking whether the event had already been sent. This wastes CPU cycles and needlessly slows down the response by a noticeable amount. Can the logic be updated to first check whether the event needs to be sent before serializing stack frames?

The issue is even more stark in a test/prod environment running a real application, where a traceback can easily have 25-30 stack frames to serialize; we are seeing latency increases of 30 ms or more from these unnecessary extra serializations.
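To make the ask concrete, here is a minimal, self-contained sketch of the ordering we have in mind. The helper names (_is_duplicate_event, _serialize_stack_frames, _send_to_sentry) are hypothetical stand-ins for the SDK's internal dedupe check, frame serialization, and transport, not real sentry-sdk APIs:

# Hypothetical sketch only -- not sentry-sdk code. The helpers below stand in
# for the SDK's internal dedupe check, frame serialization, and transport.
import traceback

_seen_exceptions = set()

def _is_duplicate_event(exc):
    # Stand-in for the "was this exception already captured?" check.
    key = id(exc)
    if key in _seen_exceptions:
        return True
    _seen_exceptions.add(key)
    return False

def _serialize_stack_frames(exc):
    # Stand-in for the expensive stack frame / local variable serialization.
    return traceback.format_exception(type(exc), exc, exc.__traceback__)

def _send_to_sentry(payload):
    # Stand-in for the transport; a real SDK would send to the DSN here.
    return payload

def capture_event(exc):
    # Cheap duplicate check first: if this exception was already captured,
    # skip the expensive serialization entirely.
    if _is_duplicate_event(exc):
        return None
    payload = _serialize_stack_frames(exc)  # only pay this cost when sending
    return _send_to_sentry(payload)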

Here is a real-world example:
(screenshot of a production trace showing the extra latency)

Below is a simple setup that reproduces the issue with FastAPI, though it is not limited to FastAPI:

# main.py

import logging

import fastapi

import sentry_sdk
from sentry_sdk import serializer
from sentry_sdk.integrations import asgi

# Wrap Sentry's own capture/serialization internals in tracing spans so their
# duration shows up in the transaction.
sentry_sdk.capture_event = sentry_sdk.trace(sentry_sdk.capture_event)
serializer.serialize = sentry_sdk.trace(serializer.serialize)
asgi._capture_exception = sentry_sdk.trace(asgi._capture_exception)

sentry_sdk.init(dsn="<your sentry dsn>", traces_sample_rate=1)

app = fastapi.FastAPI()


@app.get("/test")
def test() -> str:
    raise ValueError("Check the linked trace")


@app.exception_handler(ValueError)
def exception_handler(request: fastapi.Request, exc: Exception):
    # Logging the error here makes sentry-sdk serialize stack frames, capture
    # the exception, send it to Sentry, and mark the error as already "handled".
    logging.exception("Log Exception")

    # Re-raise so the exception bubbles up to outer middleware/layers.
    raise exc

Run with uvicorn

$ uvicorn main:app --port 8000
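
Then trigger the error path by requesting the endpoint (matching the route and port above):

$ curl http://localhost:8000/test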

Expected Result

We expect stack frame serialization to happen only once, and capture_event() to send the exception to Sentry only once for the raised exception.

Actual Result

capture_event() does send the event to Sentry only once, as expected. However, stack frame serialization happens 3 separate times, the last two of which occur even though capture_event ultimately determines it has already sent the exception earlier.

(screenshot of the trace)

In this example trace, the normal request-response cycle, including the first error capture, would have taken 8-9 ms, but the overall transaction ends up taking 21 ms, mostly because of the 2 subsequent stack frame serializations for the same error event.

The following snapshot expands one of the three auto-grouped spans to show that these spans are indeed coming from the serialization step for every stack frame in the exception.

(screenshot of the expanded span group)

The transaction is only linked to one exception, so these are not three separate error events either.

(screenshot showing the transaction linked to a single exception)
