Description
A critical bug exists in django-celery-beat v2.8.1: the crontab optimization feature permanently excludes periodic tasks from execution if their scheduled hour falls outside a window of ±2 hours around the server hour at scheduler startup.
Introduced in commit: 87c0597
The _get_crontab_exclude_query() method creates a static time window that never updates, so tasks scheduled more than 2 hours from the server time at startup are never loaded or executed.
Environment
- django-celery-beat version: 2.8.1
- Python version: 3.9+ (uses zoneinfo)
- Django version: (any supported version)
- Celery version: (compatible with django-celery-beat 2.8.1)
Steps to Reproduce
- Start celery beat with DatabaseScheduler at time T (e.g., 10:00 AM)
- Create a crontab task scheduled to run at T+4 hours (e.g., 2:00 PM)
- Wait for the scheduled time
- Observe that the task never executes
- Check logs - the task is not loaded into the scheduler's task list
Expected Behavior
All enabled crontab tasks should be evaluated for execution regardless of when the scheduler started, and tasks should run at their scheduled times.
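As a rough illustration of the expected behavior, the sketch below compares a window frozen at startup with one recomputed on every scheduler sync. It is plain Python with no Django or Celery involved; `window` and `hours_ever_loaded` are hypothetical helper names, and the one-sync-per-hour loop is a simplifying assumption, not how the scheduler actually ticks.

```python
def window(server_hour):
    """Hours kept by the filter: server hour +/- 2, plus hour 4 (cleanup task)."""
    return {(server_hour + offset) % 24 for offset in range(-2, 3)} | {4}


def hours_ever_loaded(start_hour, hours_running, recompute):
    """Collect every crontab hour that becomes loadable while beat is running.

    recompute=False models a window frozen at startup;
    recompute=True models a window rebuilt on each (hourly, simplified) sync.
    """
    frozen = window(start_hour)
    loaded = set()
    for elapsed in range(hours_running + 1):
        current_hour = (start_hour + elapsed) % 24
        loaded |= window(current_hour) if recompute else frozen
    return loaded


static = hours_ever_loaded(10, 24, recompute=False)
dynamic = hours_ever_loaded(10, 24, recompute=True)

print(sorted(static))   # [4, 8, 9, 10, 11, 12]
print(len(dynamic))     # 24
```

With a frozen window, a beat process started at 10:00 only ever sees six hour values; with a recomputed window, every hour of the day becomes loadable within 24 hours of runtime.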
Actual Behavior
Tasks whose crontab hour falls outside the initial ±2-hour window (calculated at startup) are permanently excluded from the scheduler and never execute.
Root Cause
The optimization in enabled_models_qs() calls _get_crontab_exclude_query() which calculates a static time window:
def _get_crontab_exclude_query(self):
    """
    Build a query to exclude crontab tasks based on their hour value,
    adjusted for timezone differences relative to the server.
    """
    # This calculation happens ONCE at startup and never updates
    server_time = aware_now()  # Static time calculation
    server_hour = server_time.hour
    # Creates a static 2-hour window that never changes
    hours_to_include = [
        (server_hour + offset) % 24 for offset in range(-2, 3)
    ]
    hours_to_include += [4]  # celery's default cleanup task
    # ... rest of method

The enabled_models_qs() method calls this optimization:
def enabled_models_qs(self):
    next_schedule_sync = now() + datetime.timedelta(
        seconds=SCHEDULE_SYNC_MAX_INTERVAL
    )
    exclude_clock_tasks_query = Q(
        clocked__isnull=False,
        clocked__clocked_time__gt=next_schedule_sync
    )
    # This creates a static filter that never updates
    exclude_cron_tasks_query = self._get_crontab_exclude_query()
    exclude_query = exclude_clock_tasks_query | exclude_cron_tasks_query
    return self.Model.objects.enabled().exclude(exclude_query)

Problem Analysis
The time window calculation should happen dynamically on each scheduler tick, but instead it's calculated once and cached. This means:
- Scheduler starts at 10:00 AM → creates window for 8:00 AM - 12:00 PM
- Task scheduled for 2:00 PM is excluded from initial load
- Window never updates, so 2:00 PM task is permanently excluded
- Task never executes even when 2:00 PM arrives
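The arithmetic behind this sequence can be checked in isolation. The sketch below copies the list comprehension from the Root Cause snippet into a standalone function; `hours_to_include` as a function name and the hard-coded hours are assumptions for illustration, and no database or scheduler is involved.

```python
def hours_to_include(server_hour):
    """Hour values the crontab filter keeps for a given server hour."""
    hours = [(server_hour + offset) % 24 for offset in range(-2, 3)]
    hours += [4]  # celery's default cleanup task
    return hours


startup_window = hours_to_include(10)  # beat started at 10:00
task_hour = 14                         # crontab task scheduled for 2:00 PM

print(startup_window)               # [8, 9, 10, 11, 12, 4]
print(task_hour in startup_window)  # False
```

A task at hour 14 would only enter the window once it is recomputed at server hour 12 or later, which is why a window that is never refreshed leaves the task out for the lifetime of the process.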
Workaround
Temporarily disable the crontab optimization by modifying enabled_models_qs():
def enabled_models_qs(self):
    next_schedule_sync = now() + datetime.timedelta(
        seconds=SCHEDULE_SYNC_MAX_INTERVAL
    )
    exclude_clock_tasks_query = Q(
        clocked__isnull=False,
        clocked__clocked_time__gt=next_schedule_sync
    )
    # Skip crontab optimization to avoid the bug
    return self.Model.objects.enabled().exclude(exclude_clock_tasks_query)

Impact
This bug affects any production deployment with:
- Crontab tasks scheduled more than 2 hours from scheduler startup time
- Long-running beat processes that don't restart frequently
- Tasks with varied schedule times throughout the day
The optimization was introduced to improve database query performance but inadvertently breaks core scheduling functionality.