You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
fix: agent lifecycle reliability — stale claim, stderr, heartbeat, worktree
Three root causes for "agent died without output" failures:
1. Stale claim used task.created_at instead of actual claim time — tasks
created hours ago were immediately marked stale when freshly claimed.
Added claimed_at field to Task model, set on claim, used for timeout.
Default timeout increased from 10m to 15m.
2. Agent stderr redirected to /dev/null — auth failures, MCP config
errors, and CLI crashes were invisible. Now captured in .stderr.log.
3. Heartbeat file touched after spawn instead of before — race window
where running agent had no heartbeat and looked dead to watchdog.
Now touched before adapter.spawn() call.
4. Worktree creation failure didn't block spawn — agent launched into
nonexistent directory and crashed silently. Now raises SpawnError
so retry logic handles it properly.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
0 commit comments