|
| 1 | +--- |
| 2 | +name: cron-doctor |
| 3 | +description: "Diagnose and validate cron expressions before they ship. Catches the five silent death-traps: impossible dates that never fire, OR-semantics that fire too often, midnight spikes, uneven step drift, and leap-year February 29." |
| 4 | +category: devops |
| 5 | +risk: safe |
| 6 | +source: community |
| 7 | +source_repo: takeaseatventure/devops-skills |
| 8 | +source_type: community |
| 9 | +date_added: "2026-06-26" |
| 10 | +author: takeaseat |
| 11 | +tags: [cron, crontab, scheduling, devops, debugging, kubernetes, validation] |
| 12 | +tools: [claude, cursor, codex, gemini, opencode] |
| 13 | +license: "MIT" |
| 14 | +license_source: "https://github.com/takeaseatventure/devops-skills/blob/main/LICENSE" |
| 15 | +--- |
| 16 | + |
| 17 | +# cron-doctor |
| 18 | + |
| 19 | +## Overview |
| 20 | + |
| 21 | +Cron is deceptively error-prone. The failure mode is **silent** — a syntactically |
| 22 | +valid expression that simply never fires, or fires far more often than intended. |
| 23 | +`0 0 30 2 *` parses cleanly and then sits dead forever (February has no 30th). |
| 24 | +`0 0 1,15 * 1` looks like "1st and 15th if Monday" but actually means "1st, 15th, |
| 25 | +**OR** every Monday": about 6 fires a month instead of a few per year. |
| 26 | + |
| 27 | +This skill teaches an agent to catch those before they reach production. It comes |
| 28 | +with a zero-dependency validation engine (`scripts/cron-engine.js`, no install |
| 29 | +needed) that parses, describes, deep-validates, and computes next fire times. |
| 30 | + |
| 31 | +## When to Use This Skill |
| 32 | + |
| 33 | +- Use when a user writes, edits, reviews, or deploys a cron expression — in a |
| 34 | + crontab, a Kubernetes `CronJob`, a GitHub Actions `schedule`, an Airflow DAG, |
| 35 | + a Celery beat schedule, a systemd timer, or any scheduled task. |
| 36 | +- Use when debugging a job that "didn't fire" or "fired at the wrong time." |
| 37 | +- Use when a user asks "what does this cron expression mean?" or "when will this |
| 38 | + run next?" or "how often does this run per year?" |
| 39 | +- Use when reviewing a CI/CD pipeline or infrastructure config that contains a |
| 40 | + `schedule` field. |
| 41 | +- Use when a user pastes a 5-field cron expression and asks for a sanity check. |
| 42 | + |
| 43 | +## How It Works |
| 44 | + |
| 45 | +### Step 1: Parse the expression |
| 46 | + |
| 47 | +Split on whitespace into 5 fields: minute, hour, day-of-month, month, day-of-week. |
| 48 | +Confirm valid ranges: |
| 49 | + |
| 50 | +| Field | Position | Range | Notes | |
| 51 | +|-------|----------|-------|-------| |
| 52 | +| minute | 1 | 0–59 | | |
| 53 | +| hour | 2 | 0–23 | | |
| 54 | +| day-of-month | 3 | 1–31 | | |
| 55 | +| month | 4 | 1–12 | names (JAN–DEC) accepted | |
| 56 | +| day-of-week | 5 | 0–7 | 0 and 7 both = Sunday; names (SUN–SAT) accepted | |
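
The range check in the table above can be sketched in a few lines of Node.js. This is an illustrative helper, not the bundled engine: `checkRanges` and `FIELDS` are hypothetical names, and the sketch assumes numeric syntax only (`*`, single values, ranges, comma lists, and `/step`), so month and weekday names are reported as unsupported.

```javascript
// Hypothetical helper, not the bundled engine: split a 5-field expression
// and range-check every number it contains.
const FIELDS = [
  { name: 'minute', min: 0, max: 59 },
  { name: 'hour', min: 0, max: 23 },
  { name: 'day-of-month', min: 1, max: 31 },
  { name: 'month', min: 1, max: 12 },
  { name: 'day-of-week', min: 0, max: 7 }, // 0 and 7 both mean Sunday
];

function checkRanges(expression) {
  const parts = expression.trim().split(/\s+/);
  if (parts.length !== 5) {
    return [`expected 5 fields, got ${parts.length}`];
  }
  const errors = [];
  parts.forEach((part, i) => {
    const { name, min, max } = FIELDS[i];
    // Numeric syntax only; names such as JAN or MON are out of scope here.
    if (!/^(\*|\d+(-\d+)?)(\/\d+)?(,(\*|\d+(-\d+)?)(\/\d+)?)*$/.test(part)) {
      errors.push(`${name}: unsupported syntax "${part}"`);
      return;
    }
    for (const item of part.split(',')) {
      const [range, step] = item.split('/');
      if (step !== undefined && Number(step) === 0) {
        errors.push(`${name}: step must be at least 1`);
      }
      if (range === '*') continue;
      for (const n of range.split('-').map(Number)) {
        if (n < min || n > max) {
          errors.push(`${name}: ${n} is outside ${min}-${max}`);
        }
      }
    }
  });
  return errors;
}

console.log(checkRanges('0 0 30 2 *'));   // no range errors: the syntax is fine
console.log(checkRanges('60 24 0 13 8')); // one error per field
```

Note that `0 0 30 2 *` passes this check cleanly, which is exactly why a range check alone is not enough.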
| 57 | + |
| 58 | +### Step 2: Describe it in plain English |
| 59 | + |
| 60 | +State what the user *thinks* it does vs. what it *actually* does. Be explicit |
| 61 | +about OR-vs-AND semantics for day-of-month + day-of-week (see death-trap #2). |
| 62 | + |
| 63 | +### Step 3: Run the trap checklist |
| 64 | + |
| 65 | +Check the five death-traps below and flag any that apply. |
| 66 | + |
| 67 | +### Step 4: Calculate next runs and annual fire count |
| 68 | + |
| 69 | +Compute the next 5 fire times as concrete dates so the user can verify the |
| 70 | +schedule behaves as expected. Estimate annual fire count — a schedule that fires |
| 71 | +365×/year vs. 12×/year is a ~30× cost and load difference. |
| 72 | + |
| 73 | +## The Five Cron Death-Traps |
| 74 | + |
| 75 | +These are the bugs that pass crontab's install-time syntax check but break in production. |
| 76 | + |
| 77 | +### 1. Impossible dates — the "never fires" bug |
| 78 | + |
| 79 | +``` |
| 80 | +0 0 30 2 * |
| 81 | +``` |
| 82 | + |
| 83 | +**Valid syntax. Never fires.** February has no 30th. This schedule is a dead job |
| 84 | +that silently sits forever. The same applies to day 31 in any 30-day month: |
| 85 | +`0 0 31 4 *`, `0 0 31 6 *`, `0 0 31 9 *`, `0 0 31 11 *`. |
| 86 | + |
| 87 | +**Fix:** use `0 0 28-31 * *` and check for end-of-month in the script, or use `L` |
| 88 | +(last day) syntax if your scheduler supports it. |
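
A minimal sketch of both halves of this trap, assuming the day-of-month and month fields have already been expanded into arrays of numbers and that day-of-week is `*` (with a restricted day-of-week, the OR rule in trap #2 lets the job fire anyway). `canEverFire` and `isLastDayOfMonth` are hypothetical names, not part of the bundled engine.

```javascript
// Longest possible length of each month (February counted as 29 for leap years).
const MAX_DAYS = [31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31];

// True if at least one (day, month) pair exists on some calendar.
function canEverFire(days, months) {
  return months.some((m) => days.some((d) => d <= MAX_DAYS[m - 1]));
}

// Guard for the "0 0 28-31 * *" workaround: run the job only when
// tomorrow is the 1st. Date normalizes day overflow into the next month.
function isLastDayOfMonth(date) {
  const tomorrow = new Date(date.getFullYear(), date.getMonth(), date.getDate() + 1);
  return tomorrow.getDate() === 1;
}

console.log(canEverFire([30], [2]));               // false: February never has a 30th
console.log(canEverFire([31], [4, 6, 9, 11]));     // false: all four are 30-day months
console.log(canEverFire([29], [2]));               // true: leap years only (trap #5)
console.log(isLastDayOfMonth(new Date(2026, 1, 28))); // true: 2026 is not a leap year
```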
| 89 | + |
| 90 | +### 2. OR-semantics — the "fires too often" bug |
| 91 | + |
| 92 | +``` |
| 93 | +0 0 1,15 * 1 |
| 94 | +``` |
| 95 | + |
| 96 | +**Does NOT mean** "midnight on the 1st and 15th if it's Monday." |
| 97 | +**Does mean** "midnight on the 1st, the 15th, **OR** every Monday." That's about 6 |
| 98 | +fires a month instead of the few per year the AND reading implies. |
| 99 | + |
| 100 | +This is the single most misunderstood cron rule. When **both** day-of-month AND |
| 101 | +day-of-week are restricted (neither is `*`), cron uses OR logic, not AND. |
| 102 | + |
| 103 | +**Fix:** if you need "1st and 15th only if Monday," run every Monday and check |
| 104 | +the day of the month in the command (in a crontab, `%` must be escaped as `\%`): |
| 105 | + |
| 106 | +```bash |
| 107 | +0 0 * * 1 d=$(date +\%d); [ "$d" = "01" ] || [ "$d" = "15" ] && your-command |
| 108 | +``` |
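
To make the OR rule concrete, here is a small sketch that lists which days of a month match a day-of-month / day-of-week pair. It assumes each field is either `*` or a comma list of numbers; `dayMatches` and `firesInMonth` are hypothetical names, not the bundled engine's API.

```javascript
// Day matching under standard cron rules: AND when either day field is "*",
// OR when both day-of-month and day-of-week are restricted.
function dayMatches(date, domField, dowField) {
  const domOk = domField === '*' ||
    domField.split(',').map(Number).includes(date.getDate());
  // Cron accepts 7 for Sunday; Date.getDay() uses 0, so fold 7 onto 0.
  const dowOk = dowField === '*' ||
    dowField.split(',').map((n) => Number(n) % 7).includes(date.getDay());
  const bothRestricted = domField !== '*' && dowField !== '*';
  return bothRestricted ? domOk || dowOk : domOk && dowOk;
}

// Days of the given month (monthIndex is 0-based) on which the job fires.
function firesInMonth(year, monthIndex, domField, dowField) {
  const fires = [];
  const daysInMonth = new Date(year, monthIndex + 1, 0).getDate();
  for (let d = 1; d <= daysInMonth; d++) {
    // Noon avoids any edge cases around local midnight.
    if (dayMatches(new Date(year, monthIndex, d, 12), domField, dowField)) {
      fires.push(d);
    }
  }
  return fires;
}

console.log(firesInMonth(2026, 6, '1,15', '1')); // July 2026: [1, 6, 13, 15, 20, 27]
console.log(firesInMonth(2026, 6, '1,15', '*')); // [1, 15]
```

In July 2026 neither the 1st nor the 15th is a Monday, so the AND reading would fire zero times while the real schedule fires six.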
| 109 | + |
| 110 | +### 3. Midnight spike — the "everything at once" bug |
| 111 | + |
| 112 | +``` |
| 113 | +0 0 * * * |
| 114 | +``` |
| 115 | + |
| 116 | +Every job scheduled at `0 0` competes for resources at exactly 00:00. Database |
| 117 | +backups, log rotations, cert renewals, report generation — all fire simultaneously. |
| 118 | +This causes load spikes, connection-pool exhaustion, and cascading timeouts. |
| 119 | + |
| 120 | +**Fix:** stagger jobs across off-peak hours and odd minutes, e.g. `17 2 * * *` or |
| 121 | +`43 3 * * *` instead of `0 0`. Jitter is your friend. |
| 122 | + |
| 123 | +### 4. Uneven steps — the "drift" bug |
| 124 | + |
| 125 | +``` |
| 126 | +*/7 * * * * |
| 127 | +``` |
| 128 | + |
| 129 | +**Does NOT mean** "every 7 minutes evenly." It means "every 7 minutes starting at |
| 130 | +0, then resets at 60." So: 0, 7, 14, 21, 28, 35, 42, 49, 56 — then 0 again |
| 131 | +(a 4-minute gap). The intervals drift: 7,7,7,7,7,7,7,7,**4**. |
| 132 | + |
| 133 | +**Fix:** 60 is not divisible by 7. Use step values that divide 60 evenly: `*/5`, |
| 134 | +`*/10`, `*/15`, `*/20`, `*/30`. If you truly need every-7-minutes, use a loop with |
| 135 | +`sleep 420`. |
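
The drift is easy to verify by expanding the step and measuring the gaps. A minimal sketch, assuming a `*/step` minute field; `minuteGaps` is a hypothetical helper, not part of the bundled engine.

```javascript
// Expand "*/step" in the minute field and return the gap (in minutes)
// from each fire to the next, wrapping into the following hour.
function minuteGaps(step) {
  const minutes = [];
  for (let m = 0; m < 60; m += step) minutes.push(m);
  return minutes.map((m, i) => {
    const next = i + 1 < minutes.length ? minutes[i + 1] : minutes[0] + 60;
    return next - m;
  });
}

console.log(minuteGaps(7));  // [7, 7, 7, 7, 7, 7, 7, 7, 4]
console.log(minuteGaps(15)); // [15, 15, 15, 15]
```

A step is safe when every gap in the result is equal, which happens only when the step divides 60.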
| 136 | + |
| 137 | +### 5. Leap-year February 29 — the "annual surprise" |
| 138 | + |
| 139 | +``` |
| 140 | +0 0 29 2 * |
| 141 | +``` |
| 142 | + |
| 143 | +Fires only on leap years — February 29, 2024 / 2028 / 2032… If someone writes this |
| 144 | +expecting "end of February," they'll be confused for 3 out of every 4 years. |
| 145 | + |
| 146 | +**Fix:** use `0 0 28 2 *` and handle the 29th case in the script if needed. |
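
When reporting this trap, it helps to show the user the concrete years involved. A short sketch using the Gregorian leap-year rule; `isLeapYear` and `nextFeb29Years` are hypothetical names, not the bundled engine's API.

```javascript
// Gregorian rule: divisible by 4, except centuries not divisible by 400.
function isLeapYear(year) {
  return (year % 4 === 0 && year % 100 !== 0) || year % 400 === 0;
}

// The next `count` years, starting at fromYear, in which "0 0 29 2 *" fires.
function nextFeb29Years(fromYear, count) {
  const years = [];
  for (let y = fromYear; years.length < count; y++) {
    if (isLeapYear(y)) years.push(y);
  }
  return years;
}

console.log(nextFeb29Years(2025, 3)); // [2028, 2032, 2036]
console.log(nextFeb29Years(2097, 1)); // [2104], because 2100 is not a leap year
```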
| 147 | + |
| 148 | +## Using the validation script |
| 149 | + |
| 150 | +This skill ships a zero-dependency engine at `scripts/cron-engine.js` (Node.js, no |
| 151 | +`npm install` needed). You can use it programmatically or from the CLI: |
| 152 | + |
| 153 | +```javascript |
| 154 | +// Programmatic — Node.js, zero dependencies |
| 155 | +const { describe, validate, nextRuns, formatNextRuns } = require('./scripts/cron-engine.js'); |
| 156 | + |
| 157 | +// Parse + describe -> returns { text, error, parsed } |
| 158 | +const d = describe('0 0 30 2 *'); |
| 159 | +console.log(d.text); // "At 00:00, on day-of-month 30 in FEB" |
| 160 | + |
| 161 | +// Deep validation -> catches the traps |
| 162 | +const result = validate('0 0 30 2 *'); |
| 163 | +console.log(result.valid); // true (syntax is valid) |
| 164 | +console.log(result.observations); // includes the "never fires" insight |
| 165 | +console.log(result.suggestions); // e.g. "Midnight is a common spike..." |
| 166 | + |
| 167 | +// Next 5 fire times -> returns Date[] |
| 168 | +const runs = nextRuns('0 9 * * 1-5', new Date(), 5); |
| 169 | +console.log(formatNextRuns(runs, new Date())); // [{ date, relative, formatted }, ...] |
| 170 | +``` |
| 171 | + |
| 172 | +```bash |
| 173 | +# CLI (via the bundled wrapper) |
| 174 | +node scripts/cli.js describe "*/5 * * * *" |
| 175 | +node scripts/cli.js validate "0 0 30 2 *" |
| 176 | +node scripts/cli.js next "0 9 * * 1-5" 5 |
| 177 | +``` |
| 178 | + |
| 179 | +## Common cron presets |
| 180 | + |
| 181 | +| Expression | Description | Use case | |
| 182 | +|-----------|-------------|----------| |
| 183 | +| `*/5 * * * *` | Every 5 minutes | Health checks, polling | |
| 184 | +| `0 * * * *` | Every hour | Hourly aggregation | |
| 185 | +| `0 */2 * * *` | Every 2 hours | Semi-frequent sync | |
| 186 | +| `0 9 * * 1-5` | 9am Mon–Fri | Business-hours task | |
| 187 | +| `0 2 * * *` | 2am daily | Off-peak batch (avoid midnight) | |
| 188 | +| `0 0 * * 0` | Midnight Sunday | Weekly maintenance | |
| 189 | +| `0 0 1 * *` | Midnight 1st of month | Monthly report | |
| 190 | +| `0 0 1 1 *` | Midnight Jan 1st | Annual task | |
| 191 | + |
| 192 | +## Best Practices |
| 193 | + |
| 194 | +- ✅ Always provide the plain-English description AND run the trap checklist. |
| 195 | +- ✅ Stagger midnight jobs to avoid the spike. |
| 196 | +- ✅ Prefer step values that divide 60 evenly (`*/5`, `*/15`, `*/30`). |
| 197 | +- ✅ Add a comment above every crontab line explaining intent. |
| 198 | +- ✅ Set an explicit timezone (`CRON_TZ`) on schedulers that support it. |
| 199 | +- ❌ Don't trust crontab's install-time check: it only validates syntax, not semantics. |
| 200 | +- ❌ Don't restrict both day-of-month and day-of-week without confirming OR-logic. |
| 201 | +- ❌ Don't schedule everything at `0 0`. |
| 202 | + |
| 203 | +## Common Pitfalls |
| 204 | + |
| 205 | +- **Problem:** "My cron job isn't running." |
| 206 | + **Solution:** Check for an impossible date (trap #1) and confirm the daemon is |
| 207 | + running (`service cron status` / `systemctl status crond`). Verify the file |
| 208 | + ends with a newline and has correct ownership. |
| 209 | + |
| 210 | +- **Problem:** "My job runs far more often than expected." |
| 211 | + **Solution:** You hit OR-semantics (trap #2). If both day-of-month and |
| 212 | + day-of-week are set, cron ORs them. Move one to `*` or guard in-script. |
| 213 | + |
| 214 | +- **Problem:** "Intervals are uneven — sometimes 7 min, sometimes 4." |
| 215 | + **Solution:** Step value doesn't divide 60 evenly (trap #4). Use a divisor of 60. |
| 216 | + |
| 217 | +- **Problem:** "My job works locally but not in the cluster." |
| 218 | + **Solution:** Timezone mismatch. GitHub Actions `schedule` defaults to UTC, and a Kubernetes |
| 219 | + `CronJob` uses the kube-controller-manager's time zone (usually UTC) unless `spec.timeZone` is set. |
| 220 | + |
| 221 | +## Limitations |
| 222 | + |
| 223 | +- This skill targets standard 5-field cron as implemented by Vixie cron, Kubernetes |
| 224 | + `CronJob`, GitHub Actions `schedule`, and most libraries. It does **not** validate |
| 225 | + systemd `OnCalendar=` timer syntax, Quartz 6/7-field expressions with seconds/years, |
| 226 | + or non-standard `@reboot` / `L` / `#` extensions; flag those with a note instead. |
| 227 | +- Estimated annual fire counts assume a non-leap reference year; February 29 |
| 228 | + schedules (trap #5) are flagged explicitly. |
| 229 | +- This skill does not replace environment-specific validation, testing, or expert |
| 230 | + review. Stop and ask for clarification if required inputs, permissions, or |
| 231 | + safety boundaries are missing. |
| 232 | + |
| 233 | +## Related Skills |
| 234 | + |
| 235 | +- `docker-expert` — when the cron job runs inside a container and the issue is the |
| 236 | + container/entrypoint rather than the schedule. |
| 237 | +- `kubernetes-deployment` — when validating a `CronJob` manifest's `spec.schedule` |
| 238 | + field alongside the broader resource config. |
| 239 | + |
| 240 | +## Security & Safety Notes |
| 241 | + |
| 242 | +This skill is read-only and `risk: safe`. The validation script performs no file |
| 243 | +writes, network calls, or mutations — it only parses and computes. It is safe to |
| 244 | +run against any cron expression without preconditions. |