feat(seer): Suppress re-triage of skipped issues in night shift by trevor-e · Pull Request #114915 · getsentry/sentry

trevor-e · 2026-05-05T20:49:45Z

Persist SKIP verdicts from night-shift triage to Redis with a 3.5-day TTL, then exclude those group ids from candidate selection on subsequent nightly runs. This stops the agent from repeatedly re-evaluating issues it already classified as not worth fixing. The TTL exists because new information may arrive within a few days (better tag distribution, a new recommended event, etc.), so we do eventually want to re-run triage on those issues.

The TTL is padded past 3 days so nightly-run jitter cannot expire a key right at the boundary, guaranteeing suppression for the next 3 runs.
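The padding argument can be checked with a small sketch. This is not the PR's implementation: `FakeTTLStore` is an in-memory stand-in for Redis, the key format is hypothetical, and the explicit `store`/`now` parameters are assumptions added so the example runs without a Redis server. Only the names `mark_skipped` and `recently_skipped` and the 3.5-day TTL come from the PR.

```python
from datetime import timedelta

SKIP_TTL = timedelta(days=3, hours=12)  # 3.5 days, per the PR description


class FakeTTLStore:
    """In-memory stand-in for a Redis key with expiry (assumption, not the real client)."""

    def __init__(self) -> None:
        self._expires_at: dict[str, float] = {}

    def setex(self, key: str, ttl: timedelta, now: float) -> None:
        # Record the absolute time (in seconds) at which the key expires.
        self._expires_at[key] = now + ttl.total_seconds()

    def exists(self, key: str, now: float) -> bool:
        expires_at = self._expires_at.get(key)
        return expires_at is not None and now < expires_at


def _key(group_id: int) -> str:
    return f"night_shift:skipped:{group_id}"  # hypothetical key format


def mark_skipped(store: FakeTTLStore, group_id: int, now: float) -> None:
    """Remember that this group received a SKIP verdict."""
    store.setex(_key(group_id), SKIP_TTL, now)


def recently_skipped(store: FakeTTLStore, group_ids: list[int], now: float) -> set[int]:
    """Return the subset of group_ids that ARE still marked as skipped."""
    return {gid for gid in group_ids if store.exists(_key(gid), now)}
```

With a skip recorded at hour 0, a third nightly run that drifts to hour 74 still sees the key, because it only expires at hour 84. A TTL of exactly 72 hours would have let that run re-triage the issue.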

Persist SKIP verdicts to a Redis cache keyed by group id with a 3.5-day TTL, then exclude those ids from candidate selection on subsequent nightly runs. Stops the agent from repeatedly re-evaluating the same issues it already classified as not worth fixing, saving compute and quota. The TTL is padded past 3 days so nightly-run jitter cannot expire a key right at the boundary; this guarantees the next 3 runs suppress the issue.
Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>

The old name suggested filtering out recently-skipped ids, but the function actually returns the subset that ARE recently skipped. Rename so the name matches the return value.
Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>


sentry · 2026-05-05T21:31:41Z

+    for v in triage_response.verdicts:
+        if v.group_id in groups_by_id and v.action == TriageAction.SKIP:
+            mark_skipped(v.group_id)


Bug: A Redis connection failure in mark_skipped after the main agent logic will cause an unhandled exception, discarding all previously computed triage results.
Severity: MEDIUM

Suggested Fix

Wrap the mark_skipped call in its own try/except block to catch potential Redis connection errors. Log the error for observability but do not re-raise it, allowing the function to return the successfully computed triage results. This ensures that failures in the caching optimization do not cause the loss of primary results.
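A minimal sketch of that suggestion follows. It assumes a hypothetical helper `persist_skips` that takes `mark_skipped` as a parameter so the example runs without Redis. The `Verdict` dataclass and the `TriageAction` values are illustrative stand-ins for the real types. Catching a broad `Exception` is also an assumption; the real code might prefer to catch only the Redis client's error classes.

```python
import logging
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Iterable

logger = logging.getLogger(__name__)


class TriageAction(Enum):  # illustrative stand-in for the real enum
    SKIP = "skip"
    AUTOFIX = "autofix"


@dataclass
class Verdict:  # illustrative stand-in for the real verdict type
    group_id: int
    action: TriageAction


def persist_skips(
    verdicts: Iterable[Verdict],
    groups_by_id: dict[int, object],
    mark_skipped: Callable[[int], None],
) -> int:
    """Best-effort persistence of SKIP verdicts; returns how many were stored."""
    stored = 0
    for v in verdicts:
        if v.group_id in groups_by_id and v.action == TriageAction.SKIP:
            try:
                mark_skipped(v.group_id)
                stored += 1
            except Exception:
                # The skip cache is only an optimization: log the failure
                # and keep going so the computed triage results survive.
                logger.exception(
                    "night_shift.skip_cache.write_failed",
                    extra={"group_id": v.group_id},
                )
    return stored
```

Putting the `try`/`except` inside the loop means one failed write does not stop the remaining SKIP verdicts from being recorded.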

Prompt for AI Agent

Review the code at the location below. A potential bug has been identified by an AI agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not valid.

Location: src/sentry/tasks/seer/night_shift/agentic_triage.py#L195-L197

Potential issue: The `mark_skipped` function is called outside the `try/except` block that wraps the expensive Seer agent interactions. If a Redis connection error occurs during this call, the exception is not handled locally. It propagates up to the `run_night_shift_execution` function, which then marks the entire run as failed and discards all the triage results (e.g., `AUTOFIX`, `ROOT_CAUSE_ONLY`) that were successfully generated by the agent. This wastes significant LLM computation due to a failure in a non-critical optimization step.


chromy

Lgtm

github-actions Bot added the Scope: Backend label May 5, 2026

trevor-e and others added 3 commits May 5, 2026 17:07

ref(seer): Replace magic 86400 with timedelta in skip_cache test

3a40536

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>

test(seer): Use real skip cache instead of patch in night shift test

124b7c0

Mark the group via mark_skipped() before the run so the test exercises the real read path through Redis instead of stubbing recently_skipped.
Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>

trevor-e marked this pull request as ready for review May 5, 2026 21:23

trevor-e requested a review from a team as a code owner May 5, 2026 21:23

sentry Bot reviewed May 5, 2026


chromy approved these changes May 6, 2026


trevor-e wants to merge 4 commits into master from telkins/night-shift-skip-cache
