Protect AdminEmailHandler from bot-amplified error-email floods #1127

@rdhyee

Background

On 2026-04-23, test.unglue.it triggered a flood of roughly 7,700 emails to production admins over several hours. The root cause was a missing template on a stale test deploy: bots hitting /accounts/register/ generated 500s, and each 500 produced an admin email. See EbookFoundation/security-private#11 for the full incident post-mortem.

The specific template bug has been fixed on test, but the incident exposed two structural amplification vectors and one detection gap that need patching so this class of incident doesn't recur:

Problem

  1. AdminEmailHandler has no rate limit. A single broken view plus bot traffic produces one admin email per failed request, which adds up to thousands per day while bots are hammering the site. Django's default AdminEmailHandler in django.utils.log has no built-in dedupe or throttling.

  2. Staging environments share ADMINS with production. settings/common.py defines ADMINS once and non-production environments inherit it. Any broken view on test.unglue.it or dj42.unglue.it during development will fire emails to the same addresses that monitor production. A broken staging environment becomes a DoS vector against the production admin mailbox.

  3. Admin email addresses can silently stop reaching the intended recipient. In this incident, one of the ADMINS addresses no longer had an active mailbox. Errors kept being sent to an address nobody was reading, so there was no chance of anyone noticing the flood while it was happening. (Tracked separately; the address is already being updated in the same PR as part of the fix.)

Proposal

In regluit (code side)

  • Add a RateLimitFilter utility (small, per-process, dedupe-by-signature within a configurable window) and attach it to the mail_admins handler in LOGGING.
  • Update the ADMINS address that had gone stale.
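
A minimal sketch of what such a filter might look like follows. It is not the code from the PR: the signature scheme (exception type plus innermost traceback frame), the `max_signatures` pruning bound, the injectable `clock` parameter, and the module path `regluit.utils.log` in the `LOGGING` fragment are all assumptions made for illustration.

```python
import logging
import threading
import time
import traceback


class RateLimitFilter(logging.Filter):
    """Let through one record per error signature per window (per process)."""

    def __init__(self, window_seconds=60, max_signatures=1000, clock=time.monotonic):
        super().__init__()
        self.window_seconds = window_seconds
        self.max_signatures = max_signatures  # assumed bound to cap memory use
        self._clock = clock                   # injectable so tests can fake time
        self._last_sent = {}                  # signature -> time last let through
        self._lock = threading.Lock()         # filters run outside the handler lock

    def _signature(self, record):
        # Assumed scheme: exception type plus innermost traceback frame;
        # otherwise fall back to logger name and the unformatted message.
        if record.exc_info and record.exc_info[0] is not None:
            exc_type, _, tb = record.exc_info
            frames = traceback.extract_tb(tb)
            where = (frames[-1].filename, frames[-1].lineno) if frames else None
            return (exc_type.__name__, where)
        return (record.name, str(record.msg))

    def filter(self, record):
        now = self._clock()
        sig = self._signature(record)
        with self._lock:
            last = self._last_sent.get(sig)
            if last is not None and now - last < self.window_seconds:
                return False  # same signature seen inside the window: suppress
            if len(self._last_sent) >= self.max_signatures:
                # Drop entries whose window has already expired.
                self._last_sent = {
                    s: t for s, t in self._last_sent.items()
                    if now - t < self.window_seconds
                }
            self._last_sent[sig] = now
            return True


# Hypothetical LOGGING fragment; "regluit.utils.log" is an assumed module path.
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "filters": {
        "require_debug_false": {"()": "django.utils.log.RequireDebugFalse"},
        "rate_limit": {"()": "regluit.utils.log.RateLimitFilter", "window_seconds": 60},
    },
    "handlers": {
        "mail_admins": {
            "level": "ERROR",
            "filters": ["require_debug_false", "rate_limit"],
            "class": "django.utils.log.AdminEmailHandler",
        },
    },
}
```

Because the state lives in a per-process dict, each worker process would still send its own first email for a given signature; that is the per-process trade-off noted under "Scope deferred".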

In regluit-provisioning (env side)

  • Add a disable_admin_emails group_var (default false).
  • Update roles/regluit_prod/templates/prod.py.j2 to render ADMINS = [] when disable_admin_emails is true, overriding the settings/common.py value on non-prod environments.
  • Set disable_admin_emails: true in group_vars/dj42/vars.yml and group_vars/test/vars.yml.
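
The intended effect of the flag can be sketched in plain Python. This is only an illustration of the resolution logic, not the actual template: `resolve_admins` and the placeholder address are hypothetical names. It relies on the fact that Django's `mail_admins` returns without sending anything when `ADMINS` is empty, so an empty list effectively silences `AdminEmailHandler`.

```python
def resolve_admins(common_admins, disable_admin_emails=False):
    """Return the ADMINS value an environment should end up with.

    common_admins        -- the list inherited from settings/common.py
    disable_admin_emails -- the per-environment group_var (default false)
    """
    # An empty ADMINS means mail_admins() sends nothing for this environment.
    return [] if disable_admin_emails else list(common_admins)


# Placeholder value standing in for the real ADMINS in settings/common.py.
COMMON_ADMINS = [("Ops", "ops@example.org")]

prod_admins = resolve_admins(COMMON_ADMINS)                             # flag unset on prod
test_admins = resolve_admins(COMMON_ADMINS, disable_admin_emails=True)  # test / dj42
```

Defaulting the flag to false keeps production behaviour unchanged unless an environment explicitly opts out.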

Outcome

After both PRs merge:

  • Production still gets admin emails for real errors, but the rate limit filter prevents any single error signature from flooding the inbox (first occurrence → email; subsequent in a 60-second window → suppressed).
  • test.unglue.it and dj42.unglue.it no longer send admin emails at all. Staging errors stay on staging.
  • The ADMINS address is one that actually reaches a human.

Principle

Production alerting should fire only on production problems. Any pathway by which a broken staging environment can page production admins is a latent DoS vector against the people responsible for prod.

Related

Scope deferred

  • Sentry (or equivalent structured error aggregation): the proper replacement for email as an error channel. Needs a separate evaluation and does not block this work.
  • Cross-worker rate-limit coordination via Redis: the per-process limiter proposed here is roughly an 80% solution; cross-worker coordination is a potential follow-up if bot traffic scales up.
