@tharani694 tharani694 commented Oct 12, 2025

SUMMARY

Transient failures in async query tasks currently cause jobs to fail immediately, which leads to a poor user experience in the SQL Lab and Explore views.

This PR adds retry logic to the two backend Celery tasks that handle asynchronous queries, load_chart_data_into_cache and load_explore_json_into_cache. Until now, both failed on the first transient error, such as a database operational error or a network timeout. With this change, those errors are retried automatically, which improves reliability.

Changes Made:

Added retry logic to both tasks:

  • Retries on OperationalError, ConnectionError, Timeout
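  • Defaults: max_retries=3, retry_backoff=True, retry_backoff_max=60 (seconds), read from the ASYNC_TASK_MAX_RETRIES, ASYNC_TASK_RETRY_BACKOFF, and ASYNC_TASK_RETRY_BACKOFF_MAX config keys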
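
To make the retry settings above concrete, here is a minimal sketch of the delay schedule they imply. It assumes Celery's documented exponential backoff (first retry after 1 second, doubling on each retry, capped at retry_backoff_max) and ignores jitter, which Celery applies by default and which randomizes each delay downward. The helper name `backoff_delays` is hypothetical and is not part of this PR.

```python
def backoff_delays(max_retries: int = 3, backoff_max: int = 60, factor: int = 1) -> list[int]:
    """Upper bound, in seconds, of the delay before each retry (jitter ignored)."""
    return [min(backoff_max, factor * 2 ** attempt) for attempt in range(max_retries)]


# With the defaults from this PR: three retries, capped at 60 seconds.
print(backoff_delays())   # [1, 2, 4]

# With more retries the cap starts to matter.
print(backoff_delays(8))  # [1, 2, 4, 8, 16, 32, 60, 60]
```

With the default of three retries the cap of 60 seconds is never reached, so retry_backoff_max only matters if ASYNC_TASK_MAX_RETRIES is raised.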

Bound both Celery tasks (bind=True) so they can read the retry count (self.request.retries) for logging
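
The sketch below illustrates the retry semantics described here without depending on Celery: only a fixed set of exception types is retried, the current retry count is passed to the function (playing the role of self.request.retries) so it can be logged, and any other exception propagates immediately. The names `run_with_retries`, `RETRYABLE`, and `flaky` are hypothetical, and the built-in ConnectionError and TimeoutError are stand-ins for the exception types listed in this PR.

```python
import logging
from typing import Callable, Tuple, Type, TypeVar

logger = logging.getLogger(__name__)
T = TypeVar("T")

# Stand-ins for the retryable exception types named in the PR description.
RETRYABLE: Tuple[Type[Exception], ...] = (ConnectionError, TimeoutError)


def run_with_retries(func: Callable[[int], T], max_retries: int = 3) -> T:
    """Call func(retries) until it succeeds or max_retries retries are used up."""
    retries = 0
    while True:
        try:
            return func(retries)
        except RETRYABLE as ex:
            if retries >= max_retries:
                raise  # retries exhausted: surface the last error
            logger.warning("Retry %d/%d after %r", retries + 1, max_retries, ex)
            retries += 1


def flaky(retries: int) -> str:
    """Fail on the first two attempts, then succeed."""
    if retries < 2:
        raise ConnectionError("transient")
    return f"succeeded after {retries} retries"


print(run_with_retries(flaky))  # succeeded after 2 retries
```

Note that max_retries=3 means up to four attempts in total: the initial call plus three retries.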

Preserved existing functionality:

  • Soft timeouts (SoftTimeLimitExceeded)
  • SupersetVizException handling
  • Cache creation and update logic

Type annotations added for self to satisfy mypy and pre-commit
No frontend or documentation changes included
Backend-only PR: frontend pre-commit checks intentionally skipped.

TESTING

  • Passed Python pre-commit hooks (except frontend hooks).
  • Verified cache update and async_query_manager updates work correctly.
  • Manually tested in a local Celery + Superset environment: simulated transient DB/network errors and confirmed that the tasks were retried and then completed successfully.
  • Passed type checks with mypy.
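
One way to reproduce the transient-error simulation in an automated check is a test double that fails a fixed number of times before returning a result. The sketch below uses unittest.mock for this. It is not the test setup used for this PR: `call_with_retries` and `fake_query` are hypothetical stand-ins for the retried task and the query it runs.

```python
from unittest.mock import Mock


def call_with_retries(func, max_retries=3):
    """Minimal stand-in for an auto-retried task: retry on ConnectionError only."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except ConnectionError:
            if attempt == max_retries:
                raise  # retries exhausted


# Fails twice with a transient error, then returns a result.
fake_query = Mock(side_effect=[
    ConnectionError("connection reset"),
    ConnectionError("connection reset"),
    {"rows": 3},
])

result = call_with_retries(fake_query)
print(result)                 # {'rows': 3}
print(fake_query.call_count)  # 3
```

The call count confirms that the failure was retried rather than surfaced on the first error.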

Related Issue: #30351
– Addresses retry mechanisms for asynchronous tasks in SQL Lab.

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@korbit-ai korbit-ai bot left a comment

Review by Korbit AI

Korbit automatically attempts to detect when you fix issues in new commits.
| Category | Issue | Status |
| --- | --- | --- |
| Functionality | Incorrect task name in decorator | ✅ Fix detected |

Files scanned:

  • superset/tasks/async_queries.py
  • superset/config.py

Comment on lines 133 to 142
@celery_app.task(
    # Does not match the function name below; likely the "Incorrect task name in decorator" issue listed above.
    name="load_chart_data_into_cache",
    soft_time_limit=query_timeout,
    bind=True,
    autoretry_for=RETRYABLE_EXCEPTIONS,
    retry_backoff=current_app.config.get("ASYNC_TASK_RETRY_BACKOFF", True),
    retry_backoff_max=current_app.config.get("ASYNC_TASK_RETRY_BACKOFF_MAX", 60),
    max_retries=current_app.config.get("ASYNC_TASK_MAX_RETRIES", 3),
)
def load_explore_json_into_cache(  # pylint: disable=too-many-locals

This comment was marked as resolved.

@dosubot dosubot bot added the global:async-query Related to Async Queries feature label Oct 13, 2025

Labels

global:async-query Related to Async Queries feature size/M
