Problem
Celery's internal loggers (celery.beat, celery.worker) are silenced on prod because regluit/settings/common.py sets:
LOGGING = {
    'version': 1,
    'disable_existing_loggers': True,
    ...
}
…and the loggers block doesn't redefine them. Result: /var/log/celery/beat.log has been empty since 2024-09-25 even though beat is firing ~4,400 tasks/day (verified via w1.log on 2026-04-30). The misleading silence in beat.log triggered a false-alarm investigation under #1138 — losing time and shaking confidence in the operational picture.
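The silencing can be reproduced outside Django with a minimal sketch. It assumes, as described above, that the Celery loggers already exist when the config is applied; the `console` handler and the `regluit` logger entry are placeholders for illustration, not the real contents of common.py.

```python
import logging
import logging.config

# Stand-ins for loggers that already exist by the time LOGGING is applied.
beat_logger = logging.getLogger("celery.beat")
worker_logger = logging.getLogger("celery.worker")

# Only 'version' and 'disable_existing_loggers' mirror the real settings;
# the handler and the 'regluit' entry are hypothetical.
LOGGING = {
    "version": 1,
    "disable_existing_loggers": True,
    "handlers": {"console": {"class": "logging.StreamHandler"}},
    "loggers": {"regluit": {"handlers": ["console"], "level": "INFO"}},
}
logging.config.dictConfig(LOGGING)

# Neither logger (nor an ancestor) is named in 'loggers', so both are disabled.
print(beat_logger.disabled, worker_logger.disabled)  # True True
```

A disabled logger drops every record regardless of level, which matches the empty beat.log.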
Why fix this
- Beat liveness becomes observable at the system that's actually responsible for it (its own log), instead of having to grep the worker log for indirect evidence
- Future failures will surface promptly rather than hiding behind a quirk of logging config
- Cheap, low-risk: pure logging config change, doesn't touch beat/worker behavior
Proposed change
In regluit/settings/common.py, add explicit logger entries:
LOGGING = {
    ...
    'loggers': {
        # ...existing loggers...
        'celery': {
            'handlers': ['file'],  # or a dedicated celery handler
            'level': 'INFO',
            'propagate': False,
        },
        'celery.beat': {
            'handlers': ['file'],
            'level': 'INFO',
            'propagate': False,
        },
        'celery.worker': {
            'handlers': ['file'],
            'level': 'INFO',
            'propagate': False,
        },
    },
}
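Verify by tailing beat.log after a deploy and service restart. At INFO level, expect a beat startup line right away and Sending due task entries whenever jobs fire. The scheduler's periodic wake-up ticks (at most max_interval apart, default 5 min) are logged at DEBUG, so they only show up if the level is lowered.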
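Before deploying, the effect of naming the loggers can be checked locally with a sketch like the one below. It is an illustration under assumptions: an in-memory buffer stands in for the real `file` handler, and only the `celery` parent entry is included, since naming a parent is enough to keep its existing children enabled.

```python
import io
import logging
import logging.config

# Created before the config is applied, like Celery's own loggers.
beat_logger = logging.getLogger("celery.beat")

buffer = io.StringIO()  # hypothetical stand-in for the real log file

LOGGING = {
    "version": 1,
    "disable_existing_loggers": True,
    "handlers": {
        "file": {"class": "logging.StreamHandler", "stream": buffer},
    },
    "loggers": {
        "celery": {"handlers": ["file"], "level": "INFO", "propagate": False},
    },
}
logging.config.dictConfig(LOGGING)

# The record propagates from celery.beat up to the 'celery' handler.
beat_logger.info("Scheduler: Sending due task example")
print(beat_logger.disabled)      # False
print(buffer.getvalue().strip())
```

The explicit celery.beat and celery.worker entries in the proposal are still useful if they should later get their own handler or level.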
Bundled cleanup: CELERYBEAT_OPTS quoting bug
While we're touching the celery config, fix this in EbookFoundation/regluit-provisioning (/etc/default/celerybeat):
CELERYBEAT_OPTS="--schedule=/var/run/celery/celerybeat-schedule --concurrency=2"
Two issues:
- systemd's ExecStart expands this inside double-quotes, so the whole string is passed as one argument; the schedule file ends up literally named celerybeat-schedule --concurrency=2 (with embedded spaces). Cosmetic but ugly.
- --concurrency=2 is a worker flag, not a beat flag, so it's silently ignored.
Fix: drop the --concurrency=2 and let systemd pass --schedule as a single clean arg, or split CELERYBEAT_OPTS into separate vars and unquote in the unit.
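The difference between the quoted and the split form can be illustrated with a small sketch. It uses Python's `shlex` to approximate word splitting and is not systemd's own parser; the variable names are made up for the example.

```python
import shlex

opts = "--schedule=/var/run/celery/celerybeat-schedule --concurrency=2"

# Quoted expansion: the whole string arrives as a single argv entry,
# so everything after '=' becomes the schedule filename.
as_one_arg = [opts]
schedule_path = as_one_arg[0].split("=", 1)[1]

# Unquoted expansion, approximated here with shell-style word splitting.
as_split_args = shlex.split(opts)

# After the fix only the beat-relevant flag is left, so quoting no longer matters.
fixed = shlex.split("--schedule=/var/run/celery/celerybeat-schedule")

print(schedule_path)
print(as_split_args)
print(fixed)
```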
Liveness watchdog (optional follow-on)
Once beat.log is informative again, the watchdog from #1138 becomes simpler:
# Alert if beat.log hasn't been written to in >10 min
[ $(($(date +%s) - $(stat -c %Y /var/log/celery/beat.log))) -lt 600 ] && echo OK || echo STALE
Worth adding as a host-level cron once observability is restored.
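If the check ends up in Python rather than shell, the same mtime test could look like the sketch below. The function name `beat_log_status` and its parameters are hypothetical, and treating a missing file as STALE is an assumption that the shell one-liner does not make.

```python
import os
import time


def beat_log_status(path, max_age_seconds=600, now=None):
    """Return "OK" if path was modified less than max_age_seconds ago, else "STALE"."""
    now = time.time() if now is None else now
    try:
        age = now - os.stat(path).st_mtime
    except FileNotFoundError:
        # Assumption: a missing log is reported the same way as an old one.
        return "STALE"
    return "OK" if age < max_age_seconds else "STALE"


print(beat_log_status("/var/log/celery/beat.log"))
```

The optional `now` argument is only there so the threshold can be tested without waiting ten minutes.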
Related
w1.log until this lands; falsification command: