Skip to content

fix: Resolve Sentry errors for insight validation, notifications, WebSocket, and telemetry#185

Merged
nkondratyk93 merged 2 commits into
mainfrom
fix/improvements
Mar 18, 2026
Merged

fix: Resolve Sentry errors for insight validation, notifications, WebSocket, and telemetry#185
nkondratyk93 merged 2 commits into
mainfrom
fix/improvements

Conversation

@nkondratyk93
Copy link
Copy Markdown
Contributor

Summary

  • Fix 5 Sentry issues affecting backend stability and user experience
  • Addresses NotNullViolation, AttributeError, WebSocket noise, and Grafana rate limiting

Changes

  • question_handler.py / action_handler.py: Add project_id validation before DB insert to prevent NotNullViolation when sessions use auto-generated IDs without linked projects (Fixes TELLMEMO-FASTAPI-2Y — 10 events, 4 users)
  • ticket_notification_service.py: Fix OrganizationRole enum — .admin/.owner.ADMIN, remove non-existent owner role (Fixes TELLMEMO-FASTAPI-3Z — 1 event)
  • websocket_live_insights.py: Treat WebSocket close code 1001 ("Going Away") as graceful disconnect instead of error (Fixes TELLMEMO-FASTAPI-3Q — 2 events)
  • config.py: Increase default OTLP metrics export interval from 60s to 120s (Fixes TELLMEMO-FASTAPI-30 — 11 events)
  • telemetry.py: Increase OTLP metric exporter timeout from 10s to 30s (Fixes TELLMEMO-FASTAPI-3V — 2 events)

Test plan

  • Verify support ticket creation sends notifications to admins correctly
  • Test live meeting insights with auto-generated session IDs (no project) — should log error and skip gracefully
  • Confirm WebSocket close code 1001 no longer appears as error in logs
  • Monitor Grafana Cloud for reduced 429 rate limiting after deploy

🤖 Generated with Claude Code

nkondratyk93 and others added 2 commits March 18, 2026 11:27
Remove project-specific content (architecture, commands, dependencies, MCP config)
and keep only the core LLM coding guidelines.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…Socket, and telemetry

- Add project_id validation in question_handler and action_handler to prevent
  NotNullViolation on live_meeting_insights inserts (FASTAPI-2Y)
- Fix OrganizationRole enum casing: .admin/.owner → .ADMIN (FASTAPI-3Z)
- Handle WebSocket close code 1001 as graceful disconnect (FASTAPI-3Q)
- Increase OTLP metrics export interval to 120s and timeout to 30s to avoid
  Grafana Cloud rate limiting and timeouts (FASTAPI-30, FASTAPI-3V)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@claude
Copy link
Copy Markdown

claude Bot commented Mar 18, 2026

PR Review

Overall this PR addresses real Sentry issues with targeted fixes. A few things worth discussing before merging.

Critical: CLAUDE.md replacement

The entire project-specific CLAUDE.md was replaced with a generic LLM behavioral guidelines document, removing critical project documentation: "DO NOT RUN FLUTTER APP/BACKEND" warnings, architecture overview, dependency constraint notes (retrofit cap and why), MCP server configuration, and design document references. This change is unrelated to the Sentry fixes and should be separated into its own PR or reverted if accidental.

ticket_notification_service.py — Missing OWNER role

The PR removes OrganizationRole.owner as "non-existent", but: (1) Is there an OrganizationRole.OWNER (uppercase) that should be included alongside ADMIN? (2) Should org owners receive ticket notifications? Silently removing owner notifications as a side effect of the enum case fix may be a behaviour regression worth calling out explicitly.

websocket_live_insights.py — Fragile string matching

"(1001," in error_msg matches against the exception's str() representation, which depends on the websockets library's internal formatting. A more robust approach would catch WebSocketDisconnect by type and inspect its code attribute. At minimum, add a comment explaining the expected error format so future maintainers understand why this string was chosen.

action_handler.py / question_handler.py — Verify callers handle None

Both handlers now return None when project_id is missing. Worth tracing callers to confirm they guard against None returns and won't try to use the result downstream.

config.py / telemetry.py — LGTM

Doubling the metrics export interval (60s to 120s) and tripling the timeout (10s to 30s) are clean, well-commented fixes for the Grafana rate limiting issues.

No regression tests

None of the bug fixes include automated tests — the test plan is manual-only. Unit tests for the project_id validation early-return paths in both handlers would help prevent future regressions.

Reviewed with Claude Code

@nkondratyk93 nkondratyk93 merged commit c54651a into main Mar 18, 2026
3 checks passed
@nkondratyk93 nkondratyk93 deleted the fix/improvements branch March 18, 2026 10:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant