aminghadersohi
Contributor

SUMMARY

Fixes the issue where Slack API rate limit errors (HTTP 429) were being swallowed or masked by generic error messages in the slack-sdk library's retry handler. This change improves observability and reliability of Slack integrations by adding explicit error detection and detailed logging.

Root Cause: The RateLimitErrorRetryHandler in the slack-sdk library can re-raise errors without proper context when certain conditions are met. When these errors bubble up to generic exception handlers, they get logged as "Failed to send a request to Slack API server" instead of clearly indicating a rate limit issue.
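
To illustrate the failure mode described above, here is a minimal, self-contained sketch of a bounded retry loop. It is not the slack-sdk implementation: `RateLimited` and `call_with_retries` are hypothetical names used only for illustration. The point is that once the retry budget is spent, the 429 escapes to the caller, which must recognise it rather than log a generic failure.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error representing an HTTP 429 response."""

    def __init__(self, retry_after: float) -> None:
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after


def call_with_retries(
    func: Callable[[], T],
    max_retry_count: int = 4,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func`, retrying on RateLimited up to `max_retry_count` times."""
    attempt = 0
    while True:
        try:
            return func()
        except RateLimited as ex:
            if attempt >= max_retry_count:
                # Retry budget exhausted: the 429 propagates to the caller.
                raise
            attempt += 1
            # Wait for the server-suggested delay before the next attempt.
            sleep(ex.retry_after)
```

With `max_retry_count=4`, a call that is rate limited every time is attempted five times (one initial call plus four retries) before the error reaches the caller.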

Changes:

  • Increased retry count from 2 to 4 to provide more buffer before failure
  • Added explicit error handling for HTTP 429 rate limit responses in get_channels()
  • Enhanced logging to include:
    • Retry-After header value from Slack API
    • Indication that retry handler may have failed or exhausted retries
    • Full error context for debugging
  • Added client initialization logging for debugging
  • Fixed an edge case where ex.response could be a string instead of a response object, by guarding attribute access with a hasattr() check
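
The error-handling changes listed above could look roughly like the following sketch. It is an assumption-laden illustration, not the code in superset/utils/slack.py: `StubResponse`, `StubSlackError`, `rate_limit_retry_after`, and `log_slack_error` are hypothetical names, and the stub classes stand in for the SDK's error and response types so the example runs without slack-sdk installed.

```python
import logging
from typing import Any, Optional

logger = logging.getLogger(__name__)


class StubResponse:
    """Hypothetical stand-in for an SDK response object."""

    def __init__(self, status_code: int, headers: dict) -> None:
        self.status_code = status_code
        self.headers = headers


class StubSlackError(Exception):
    """Hypothetical stand-in for an SDK error carrying a `response` attribute."""

    def __init__(self, message: str, response: Any) -> None:
        super().__init__(message)
        self.response = response


def rate_limit_retry_after(ex: Exception) -> Optional[str]:
    """Return the Retry-After value if `ex` wraps an HTTP 429, else None."""
    response = getattr(ex, "response", None)
    # The response may be a plain string, so only inspect it when it
    # actually exposes a status code.
    if not hasattr(response, "status_code"):
        return None
    if response.status_code != 429:
        return None
    headers = getattr(response, "headers", None) or {}
    return headers.get("Retry-After", "unknown")


def log_slack_error(ex: Exception) -> str:
    """Log a rate-limit-specific message for 429s, a generic one otherwise."""
    retry_after = rate_limit_retry_after(ex)
    if retry_after is not None:
        message = (
            "Slack API rate limit exceeded (HTTP 429). "
            f"Retry-After: {retry_after}. "
            "The retry handler may have failed or exhausted its retries."
        )
    else:
        message = f"Failed to send a request to Slack API server: {ex}"
    logger.error(message)
    return message
```

Checking for `status_code` with `hasattr()` before reading it is what keeps the string-response case from raising an `AttributeError` inside the error handler itself.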

Impact: Rate limit errors will now be clearly visible in monitoring systems like Datadog with actionable information (Retry-After values, retry counts), making it much easier to diagnose and respond to Slack API rate limiting issues.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

N/A - Backend logging improvement

TESTING INSTRUCTIONS

  1. All existing unit tests pass: pytest tests/unit_tests/utils/slack_test.py
  2. Pre-commit hooks pass: pre-commit run --files superset/utils/slack.py
  3. To manually test rate limiting:
    • Configure Slack integration in Superset
    • Trigger multiple rapid Slack API calls (e.g., by editing multiple alert configurations)
    • Observe error logs - they should now clearly indicate "Slack API rate limit exceeded (HTTP 429)" with Retry-After information instead of generic errors
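
As an alternative to triggering real rate limits, the manual check above could be approximated with a mocked client. The sketch below is hypothetical: `FakeRateLimitError`, `ListHandler`, `simulate_rate_limit`, and this simplified `get_channels` are stand-ins, and it assumes the real function calls `conversations_list()` on the client. Whether the real function re-raises or wraps the error is not shown here.

```python
import logging
from unittest import mock

logger = logging.getLogger("slack_rate_limit_sketch")


class FakeRateLimitError(Exception):
    """Hypothetical stand-in for the SDK error raised on HTTP 429."""

    def __init__(self, retry_after: str) -> None:
        super().__init__("ratelimited")
        self.response = mock.Mock(
            status_code=429, headers={"Retry-After": retry_after}
        )


def get_channels(client) -> list:
    """Simplified stand-in for the real function; logs 429s explicitly."""
    try:
        return client.conversations_list()["channels"]
    except FakeRateLimitError as ex:
        retry_after = ex.response.headers.get("Retry-After", "unknown")
        logger.error(
            "Slack API rate limit exceeded (HTTP 429). Retry-After: %s",
            retry_after,
        )
        raise


class ListHandler(logging.Handler):
    """Collects formatted log messages so a test can assert on them."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: list = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


def simulate_rate_limit() -> list:
    """Call get_channels() against a mocked client that always returns 429."""
    client = mock.Mock()
    client.conversations_list.side_effect = FakeRateLimitError("30")
    handler = ListHandler()
    logger.addHandler(handler)
    try:
        get_channels(client)
    except FakeRateLimitError:
        pass  # expected: the error is logged, then re-raised
    finally:
        logger.removeHandler(handler)
    return handler.messages
```

A unit test along these lines would also help with the low patch coverage on the new error-handling branch.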

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

Addresses the issue where Slack API rate limit errors were being
swallowed or masked by generic error messages. This change improves
observability and reliability of Slack integrations.

Changes:
- Increased retry count from 2 to 4 to provide more buffer before failure
- Added explicit error handling for HTTP 429 rate limit responses
- Enhanced logging with Retry-After header and retry handler status
- Added client initialization logging for debugging
- Fixed edge case where ex.response could be a string instead of object

This will make rate limit errors clearly visible in monitoring systems
like Datadog, making it easier to diagnose and respond to Slack API
rate limiting issues.

@korbit-ai korbit-ai bot left a comment

I've completed my review and didn't find any issues.

Files scanned: superset/utils/slack.py

codecov bot commented Oct 9, 2025

Codecov Report

❌ Patch coverage is 10.52632% with 17 lines in your changes missing coverage. Please review.
✅ Project coverage is 71.85%. Comparing base (ff80d4f) to head (c517358).
⚠️ Report is 3 commits behind head on master.

Files with missing lines    Patch %    Lines
superset/utils/slack.py     10.52%     17 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master   #35588       +/-   ##
===========================================
+ Coverage        0   71.85%   +71.85%     
===========================================
  Files           0      589      +589     
  Lines           0    43608    +43608     
  Branches        0     4718     +4718     
===========================================
+ Hits            0    31335    +31335     
- Misses          0    11035    +11035     
- Partials        0     1238     +1238     
Flag        Coverage Δ
hive        46.26% <0.00%> (?)
mysql       70.88% <10.52%> (?)
postgres    70.94% <10.52%> (?)
presto      49.97% <0.00%> (?)
python      71.82% <10.52%> (?)
sqlite      70.53% <10.52%> (?)
unit        100.00% <ø> (?)

@dosubot dosubot bot added the logging label Oct 10, 2025

Labels

logging, size/M
