Zombie check_items processes accumulating on server #7

@rdhyee

Problem

There are 30+ orphaned check_items processes on doab-check.ebookfoundation.org, some dating back to 2025. The cron job launches a new check_items every hour, but the old ones never exit.

ubuntu     37287  0.0  2.9  88576 59896 ?  S  2025   0:07 ...check_items
ubuntu    138774  0.0  1.9  64720 38828 ?  S  2025   0:05 ...check_items
ubuntu    145906  0.0  3.2  92088 65572 ?  S  2025   0:05 ...check_items
...
ubuntu   2858393  0.0  1.6  62032 33252 ?  S  Jan20  0:10 ...check_items
ubuntu   2859463  0.0  1.9  68288 39204 ?  S  Jan20  0:11 ...check_items

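Each process holds 30-65MB of resident memory (the sixth column of the listing above, in KiB). On a small droplet, 30+ of them add up to roughly 1-2GB, which caused an OOM kill when we tried to run a new check interactively. The processes are in state S (sleeping), so strictly speaking they are hung rather than defunct zombies, and they still hold their memory.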
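The memory estimate can be checked directly on the server. The following is a minimal sketch, assuming the listing above is `ps aux` output (RSS in KiB in the sixth column); `sum_rss_mib` is a hypothetical helper name, not something from the doab-check repository.

```shell
#!/bin/sh
# Hypothetical helper: read `ps aux`-style lines on stdin, add up the
# RSS column (field 6, in KiB) and print the total in whole MiB.
sum_rss_mib() {
    awk '{ total += $6 } END { printf "%d\n", total / 1024 }'
}

# Possible use on the server. The [c] keeps grep from matching itself.
#   ps aux | grep '[c]heck_items' | sum_rss_mib
```

Fed the five sample lines shown in this issue, the helper reports 231 MiB, so 30+ processes of similar size land in the 1-2GB range quoted above.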

Root Cause

The cron job runs every hour:

5 * * * * /home/ubuntu/doab-check/scripts/doab_check.sh >> .../cron_check.log

However, check_items appears to hang on some links: before the timeout fix in PR #5, HTTP requests had no timeouts, so a single request could presumably block indefinitely. These hung processes never complete, and cron starts a new one each hour regardless.

PR #5 (now deployed) adds timeouts to all HTTP requests, which should prevent new hangs. The processes that were already hung are unaffected, so they still need to be cleaned up.
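Independently of the HTTP timeouts, the cron side could be guarded so that a stuck run cannot pile up again. The following is only a sketch of one possible approach, not what the project has adopted: it assumes `flock` (util-linux) and `timeout` (GNU coreutils) are available on the server, and `run_guarded`, the lock file path and the 55m limit are all hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command only if no earlier run still holds
# the lock, and stop it once a hard time limit is reached.
run_guarded() {
    lockfile=$1   # lock file path (assumed, e.g. /tmp/check_items.lock)
    limit=$2      # duration understood by timeout(1), e.g. 55m
    shift 2
    # flock -n : give up at once (exit status 1) if the lock is already held.
    # timeout  : send SIGTERM after $limit (exit status 124), and SIGKILL
    #            30 seconds later if the command is still alive.
    flock -n "$lockfile" timeout -k 30 "$limit" "$@"
}

# Possible use from the script that cron calls (values are assumptions):
#   run_guarded /tmp/check_items.lock 55m /home/ubuntu/doab-check/scripts/doab_check.sh
```

With a guard like this, an hourly start that finds the previous run still active exits immediately instead of adding another process, and a run that exceeds the limit is terminated rather than left sleeping.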

Immediate Fix

Kill the accumulated zombies:

pkill -f "manage.py check_items"
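`pkill` sends SIGTERM by default and matches every process whose command line contains the pattern, including a check that is legitimately running. A slightly more cautious variant is sketched below, assuming the procps versions of `pgrep` and `pkill` found on Ubuntu; `kill_matching` and the 10-second grace period are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list the matching processes, ask them to exit with
# SIGTERM, then send SIGKILL to any that are still alive after a grace period.
kill_matching() {
    pattern=$1
    grace=${2:-10}   # seconds to wait before escalating (assumed default)

    # Show PID and full command line of everything that will be signalled.
    pgrep -af "$pattern" || { echo "no matching processes"; return 0; }

    # A process may exit between the two calls, so a non-zero status is ignored.
    pkill -TERM -f "$pattern" || true
    sleep "$grace"

    # Escalate only if something survived the SIGTERM.
    if pgrep -f "$pattern" >/dev/null; then
        pkill -KILL -f "$pattern" || true
    fi
    return 0
}

# Possible use on the server:
#   kill_matching "manage.py check_items"
```

Run it from a script file or an interactive shell rather than through `sh -c '...'`, because `-f` matches full command lines and would otherwise also match the invoking shell.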

Longer-term Fixes
