chore(relay): specify spool.envelopes.max_backpressure_memory_percent configuration for handling relay's failing healthcheck #3635

Open
wants to merge 3 commits into
base: master

Conversation

aldy505
Collaborator

@aldy505 aldy505 commented Mar 26, 2025

Although a fix is being rolled out, not every Relay instance will pick it up right away, so we still need to provide a workaround for people to try. Refer to this specific issue comment: #3330 (comment)

Legal Boilerplate

Look, I get it. The entity doing business as "Sentry" was incorporated in the State of Delaware in 2015 as Functional Software, Inc. and is gonna need some rights from me in order to utilize my contributions in this here PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Sentry can use, modify, copy, and redistribute my contributions, under Sentry's choice of terms.

@aldy505 aldy505 requested review from BYK and aminvakil March 26, 2025 01:54
@iambriccardo
Member

To add more context, which might be helpful for others experiencing the same issue, Relay has two memory settings:

  • max_memory_percent – This is the threshold at which health check failures will occur, and Relay will drop any incoming envelopes.
  • max_backpressure_memory_percent – This threshold is used by Relay's internal buffer (only when configured to spool to disk) to determine when the buffer should stop producing events for downstream services. This effectively stops events from passing through Relay, but they will accumulate in memory and be routinely spooled to disk. This mechanism increases Relay’s resilience in case of upstream failures or backlogs in internal service queues.

When both are set to 100% (1.0), the system will continue accepting data and the buffer will forward it without issues, but this will eventually cause the process to run out of memory (OOM).

As noted before, the buffer will stop producing events downstream only if max_backpressure_memory_percent is reached and a path is specified. Specifying a path instructs Relay to spool to a SQLite database, which will be created at that location. If no path is provided, the max_backpressure_memory_percent will not be read by Relay.
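
As a rough illustration of how the two settings fit together, here is a minimal sketch of a Relay config fragment. It is not the final change in this PR. The nesting of max_backpressure_memory_percent under spool.envelopes follows the PR title, and both the path and the 0.9 threshold are placeholder values picked only for illustration.

```yaml
health:
  # Above this fraction of memory, health checks fail and Relay drops
  # incoming envelopes. 1.0 is the value used in the proposed workaround.
  max_memory_percent: 1.0
spool:
  envelopes:
    # Placeholder path. Setting a path tells Relay to spool to a SQLite
    # database created at that location; without it, the threshold below
    # is not read.
    path: "/your/path"
    # Illustrative value (an assumption, not a recommendation): the memory
    # fraction at which the buffer stops producing events downstream.
    max_backpressure_memory_percent: 0.9
```

Keeping the backpressure threshold below max_memory_percent is intended to avoid the situation described above, where both sit at 1.0 and the process keeps forwarding until it runs out of memory.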

Member

@BYK BYK left a comment

Rejecting for my comment and mostly for @iambriccardo's comment.

@@ -14,10 +14,14 @@ processing:

# In some cases, relay might fail to find out the actual machine memory
# therefore it makes the healthcheck fail and events can't be submitted.
# See https://github.com/getsentry/self-hosted/issues/3330
# As a workaround, uncomment the following line:
Member

Following lines? We should also clearly mark where those lines start and end?

Co-authored-by: Riccardo Busetti <[email protected]>
Collaborator

@aminvakil aminvakil left a comment

What is the default path? Is there one?

# As a workaround, uncomment the following line:
#
# health:
#   max_memory_percent: 1.0
# spool:
#   envelopes:
#     path: "/your/path"
Collaborator

Can we have an example of this path?

Collaborator Author

It should just be a path on the Docker volume. Or, since relay doesn't have a persisted volume, it's probably best to use a directory like /tmp/foo?

Collaborator

I'm quite unfamiliar with this change, so could you please bear with me and tell me, is it going to be more than 4GB?

I've seen odd behavior when /tmp gets filled with files larger than 4GB.

If that's not the case, then correct, /tmp/foo seems right to me.

Member

Why not mount a volume to relay for this?

Collaborator Author

According to the docs, the default maximum size of the on-disk spool is 500MB, so we should be fine without another volume. https://docs.sentry.io/product/relay/options/#spooling
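
If the spool does end up under /tmp, the cap could also be written out explicitly rather than left implicit. A sketch, assuming the spool.envelopes.max_disk_size option from the linked options page and reusing the /tmp/foo path floated earlier in this thread:

```yaml
spool:
  envelopes:
    # Path suggested in this thread as an example, not a vetted default.
    path: "/tmp/foo"
    # 500MB expressed in bytes (500 * 1024 * 1024), matching the
    # documented default; stated here only to make the limit visible.
    max_disk_size: 524288000
```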

@aldy505 aldy505 requested review from BYK and aminvakil March 28, 2025 03:16