Skip to content

Fix test_task_announcements flake with immediate heartbeat#216

Merged
chrisguidry merged 1 commit intomainfrom
immediate-heartbeat
Dec 9, 2025
Merged

Fix test_task_announcements flake with immediate heartbeat#216
chrisguidry merged 1 commit intomainfrom
immediate-heartbeat

Conversation

@chrisguidry
Copy link
Copy Markdown
Owner

Workers now send their first heartbeat immediately on startup instead of
waiting for the first interval. This fixes a race condition where
task_workers() could be called before any heartbeat was sent on heavily
loaded CI machines.

Changes:

  • Reorder _heartbeat() to send heartbeat before sleeping
  • Update test_worker_publishes_depth_gauges to allow multiple heartbeats
  • Clean up temp worker heartbeat data in key_leak_checker fixture

Flake: https://github.com/chrisguidry/docket/actions/runs/20075820763/job/57589737809

🤖 Generated with Claude Code

Workers now send their first heartbeat immediately on startup instead of
waiting for the first interval. This fixes a race condition where
`task_workers()` could be called before any heartbeat was sent on heavily
loaded CI machines.

Changes:
- Reorder `_heartbeat()` to send heartbeat before sleeping
- Update `test_worker_publishes_depth_gauges` to allow multiple heartbeats
- Clean up temp worker heartbeat data in `key_leak_checker` fixture

Flake: https://github.com/chrisguidry/docket/actions/runs/20075820763/job/57589737809

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@claude
Copy link
Copy Markdown
Contributor

claude bot commented Dec 9, 2025

Claude finished @chrisguidry's task —— View job


Code Review for PR #216

Review Checklist

  • Examine the heartbeat timing change in worker.py
  • Review test adjustments in test_instrumentation.py
  • Check cleanup additions in conftest.py
  • Verify the fix addresses the race condition
  • Check for potential issues or edge cases
  • Confirm test coverage remains at 100%

Summary

This PR effectively fixes a race condition where task_workers() could be called before any heartbeat was sent, particularly on heavily loaded CI machines. The solution is elegant and correct.

Detailed Review

✅ Strengths

  1. Correct Fix: Moving the sleep to the end of the heartbeat loop ensures the first heartbeat is sent immediately upon worker startup. This eliminates the race condition where task_workers() might be called before any heartbeat exists.

  2. Minimal Change: The fix is surgical - just reordering two lines in src/docket/worker.py:915-976. This minimizes risk while solving the problem.

  3. Test Adjustments:

    • The change from assert_called_once_with to assert_called_with in test_worker_publishes_depth_gauges correctly accounts for multiple heartbeats now being possible within the test window
    • The cleanup in conftest.py properly removes temporary worker heartbeat data that could pollute other tests
  4. Backward Compatibility: The change doesn't alter the heartbeat interval behavior - workers still heartbeat at the configured interval, just starting immediately instead of after the first interval.

✅ Code Quality

  • The code follows existing patterns and conventions
  • No new dependencies or complexity introduced
  • Clear and focused changes

✅ Performance

  • No performance concerns - the heartbeat still runs at the same interval
  • Actually improves startup time for worker availability detection

✅ Security

  • No security implications in these changes

✅ Test Coverage

  • The existing tests continue to pass with the adjustments
  • Test coverage should remain at 100% as no new code paths were introduced
  • The fix directly addresses the flaky test issue shown in the linked CI run

💭 Minor Suggestions (Optional)

  1. Consider adding a comment: While the change is clear, a brief comment above the heartbeat method explaining why the heartbeat is sent first could help future maintainers:

    async def _heartbeat(self) -> None:
        # Send heartbeat immediately on startup, then at regular intervals
        while True:
  2. Test robustness: The test now allows multiple heartbeats which is correct, but you might want to assert that at least one heartbeat occurred to ensure the immediate heartbeat behavior is working.

Conclusion

This is a well-crafted fix that solves the race condition cleanly. The immediate heartbeat on startup is the right approach, and the test adjustments properly account for the new behavior. The cleanup in conftest.py is a nice addition to prevent test pollution.

Approved ✅ - Ready to merge.

@github-actions
Copy link
Copy Markdown

github-actions bot commented Dec 9, 2025

📚 Documentation has been built for this PR!

You can download the documentation directly here:
https://github.com/chrisguidry/docket/actions/runs/20076999992/artifacts/4815985762

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Dec 9, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (2507be2) to head (dd032fa).

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff            @@
##              main      #216   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           48        48           
  Lines         7740      7745    +5     
  Branches       402       403    +1     
=========================================
+ Hits          7740      7745    +5     
Flag Coverage Δ
python-3.10 100.00% <100.00%> (ø)
python-3.11 98.95% <75.00%> (+<0.01%) ⬆️
python-3.12 100.00% <100.00%> (ø)
python-3.13 100.00% <100.00%> (ø)
python-3.14 100.00% <100.00%> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
src/docket/worker.py 100.00% <100.00%> (ø)
tests/conftest.py 100.00% <100.00%> (ø)
tests/test_instrumentation.py 100.00% <100.00%> (ø)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@chrisguidry chrisguidry merged commit 433df9b into main Dec 9, 2025
31 checks passed
@chrisguidry chrisguidry deleted the immediate-heartbeat branch December 9, 2025 20:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants