Fix race condition in task replacement where duplicate executions occur #151

Merged
chrisguidry merged 1 commit into main from fix-task-replacement-race-149 on Aug 14, 2025

Conversation

@chrisguidry
Owner

Fix race condition in task replacement where duplicate executions occur

Previously, docket.replace() could not cancel tasks that had already been moved from the queue to the stream by the scheduler, causing duplicate execution when the old task ran alongside the replacement. This happened because:

  1. Scheduler moves due tasks from queue → stream, assigning Redis message IDs
  2. replace() only checked the queue, missing tasks already in the stream
  3. Both old and new tasks would execute
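
The three steps above can be reproduced with a small in-memory model. This is only a sketch of the failure mode, not docket's code: `ToyDocket`, `naive_replace`, and the other names are hypothetical, and a plain dict and list stand in for the Redis queue and stream.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDocket:
    """Hypothetical stand-in: `queue` holds future tasks, `stream` holds ready ones."""

    queue: dict[str, str] = field(default_factory=dict)  # task key -> payload
    stream: list[tuple[str, str, str]] = field(default_factory=list)  # (message id, key, payload)
    next_id: int = 0

    def schedule(self, key: str, payload: str) -> None:
        self.queue[key] = payload

    def move_due_tasks(self) -> None:
        """Step 1: move every queued task to the stream, assigning a message ID."""
        for key, payload in list(self.queue.items()):
            self.next_id += 1
            self.stream.append((f"{self.next_id}-0", key, payload))
            del self.queue[key]

    def naive_replace(self, key: str, payload: str) -> None:
        """Step 2: only the queue is checked, so a task already in the stream survives."""
        self.queue.pop(key, None)
        self.schedule(key, payload)

    def run_all(self) -> list[str]:
        """Drain the stream and return the payloads that were executed."""
        executed = [payload for _, _, payload in self.stream]
        self.stream.clear()
        return executed


def demonstrate_race() -> list[str]:
    docket = ToyDocket()
    docket.schedule("report", "old")
    docket.move_due_tasks()                # the old task is now in the stream
    docket.naive_replace("report", "new")  # the replacement cannot see it
    docket.move_due_tasks()                # the replacement becomes due as well
    return docket.run_all()                # step 3: both payloads are executed
```

In this model, `demonstrate_race()` returns both `"old"` and `"new"`, which is the duplicate execution described above.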

This implements atomic task scheduling and cancellation using Lua scripts that track stream message IDs. The solution:

  • Atomic scheduling: A new Lua script performs the task existence check, cancellation of the task being replaced, and scheduling in a single operation
  • Stream message ID tracking: When a task moves to the stream, its message ID is stored in the known task metadata
  • Atomic cancellation: A task can be deleted from the stream using its stored message ID, or from the queue if it is still scheduled
  • Race-free replacement: Uses a Redis lock per task key together with atomic operations
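
The approach in the list above can be sketched with an in-memory model. This is an illustration under stated assumptions, not the Lua scripts from this PR: `TrackedDocket` and its methods are hypothetical names, dicts stand in for the Redis queue, stream, and known task metadata, and a single `threading.Lock` stands in for both the atomicity of a Lua script and the per-key Redis lock.

```python
import threading
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TrackedDocket:
    """Hypothetical model that remembers each task's stream message ID."""

    queue: dict[str, str] = field(default_factory=dict)  # key -> payload (not yet due)
    stream: dict[str, tuple[str, str]] = field(default_factory=dict)  # message id -> (key, payload)
    known: dict[str, Optional[str]] = field(default_factory=dict)  # key -> message id, None while queued
    next_id: int = 0
    _lock: threading.Lock = field(default_factory=threading.Lock, repr=False, compare=False)

    def schedule(self, key: str, payload: str, replace: bool = False) -> bool:
        """Existence check, optional cancellation, and scheduling as one step."""
        with self._lock:  # assumption: stands in for one atomic script invocation
            if key in self.known:
                if not replace:
                    return False  # already scheduled, leave it alone
                self._cancel_unlocked(key)
            self.queue[key] = payload
            self.known[key] = None
            return True

    def move_due_tasks(self) -> None:
        """Move queued tasks to the stream and record their message IDs."""
        with self._lock:
            for key, payload in list(self.queue.items()):
                self.next_id += 1
                message_id = f"{self.next_id}-0"
                self.stream[message_id] = (key, payload)
                self.known[key] = message_id  # remember where the task went
                del self.queue[key]

    def cancel(self, key: str) -> None:
        """Idempotent: cancelling an unknown key does nothing."""
        with self._lock:
            self._cancel_unlocked(key)

    def _cancel_unlocked(self, key: str) -> None:
        message_id = self.known.pop(key, None)
        if message_id is not None:
            self.stream.pop(message_id, None)  # task had already reached the stream
        self.queue.pop(key, None)              # task was still waiting in the queue

    def run_all(self) -> list[str]:
        """Drain the stream and return the payloads that were executed."""
        with self._lock:
            executed = [payload for _, payload in self.stream.values()]
            for key, _ in self.stream.values():
                self.known.pop(key, None)
            self.stream.clear()
            return executed


def demonstrate_fix() -> list[str]:
    docket = TrackedDocket()
    docket.schedule("report", "old")
    docket.move_due_tasks()                         # old task reaches the stream
    docket.schedule("report", "new", replace=True)  # removes it via its message ID
    docket.move_due_tasks()
    return docket.run_all()
```

Here `demonstrate_fix()` executes only the replacement, because cancellation can reach the old task through its recorded message ID. In a real Redis deployment the lock and the combined check-and-schedule step would have to be enforced server side, which is what the description attributes to the Lua scripts.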

The tests cover race condition scenarios and idempotent operations.

Closes #149

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
@github-actions

📚 Documentation has been built for this PR!

You can download the documentation directly here:
https://github.com/chrisguidry/docket/actions/runs/16966511465/artifacts/3765225764

@codecov-commenter

codecov-commenter commented Aug 14, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (7a36609) to head (1f2d99e).

@@            Coverage Diff            @@
##              main      #151   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           31        31           
  Lines         4382      4411   +29     
  Branches       246       244    -2     
=========================================
+ Hits          4382      4411   +29     

| Flag | Coverage Δ |
|---|---|
| python-3.12 | 100.00% <100.00%> (ø) |
| python-3.13 | 100.00% <100.00%> (ø) |

| Files with missing lines | Coverage Δ |
|---|---|
| src/docket/docket.py | 100.00% <100.00%> (ø) |
| src/docket/worker.py | 100.00% <ø> (ø) |
| tests/test_fundamentals.py | 100.00% <100.00%> (ø) |
| tests/test_worker.py | 100.00% <100.00%> (ø) |

@abrookins
Collaborator

Snap!

Collaborator

@jakekaplan jakekaplan left a comment

🔥🔥🔥

@chrisguidry chrisguidry merged commit 9dc405c into main Aug 14, 2025
16 checks passed
@chrisguidry chrisguidry deleted the fix-task-replacement-race-149 branch August 14, 2025 16:12

class _schedule_task(Protocol):
    async def __call__(
        self, keys: list[str], args: list[str | float | bytes]
Collaborator

This ... is a cool idea
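
For readers unfamiliar with the pattern in the snippet above, a `Protocol` with an async `__call__` gives a type checker the call signature of an otherwise untyped callable, such as a registered script object. A minimal sketch follows. The names `ScheduleScript` and `fake_schedule_script` are hypothetical, and the `str` return type is an assumption, since the original signature is cut off before its return annotation.

```python
from __future__ import annotations

import asyncio
from typing import Protocol


class ScheduleScript(Protocol):
    """Shape of a callable that takes Redis-style keys and args."""

    async def __call__(
        self, keys: list[str], args: list[str | float | bytes]
    ) -> str: ...  # return type is assumed for this sketch


async def fake_schedule_script(
    keys: list[str], args: list[str | float | bytes]
) -> str:
    """Stand-in for a real script callable; it only reports what it received."""
    return f"scheduled {keys[0]} with {len(args)} args"


def run_example() -> str:
    # The annotation lets a type checker verify the keyword names and types.
    script: ScheduleScript = fake_schedule_script
    return asyncio.run(script(keys=["report"], args=["report", 1.5, b"payload"]))
```

Any async callable with matching parameters satisfies the protocol structurally, so no inheritance or wrapper class is needed.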

chrisguidry added a commit that referenced this pull request Aug 18, 2025
…replacement (#154)"

This reverts commit 85f293c.

Revert "Fix WRONGTYPE error and add memory leak detection tests (#153)"

This reverts commit c2b8187.

Revert "Fix race condition in task replacement where duplicate executions occur (#151)"

This reverts commit 9dc405c.
@chrisguidry chrisguidry mentioned this pull request Aug 18, 2025
chrisguidry added a commit that referenced this pull request Aug 18, 2025
These fixes seem to get at least one of my systems into a state where
it's infinitely looping on perpetual tasks. This may have been due to
some of the intermediate problems in 0.9.0 and 0.9.1, but I don't want
to take that chance. We'll come back and revisit this.

This reverts commit 85f293c.

Revert "Fix WRONGTYPE error and add memory leak detection tests (#153)"

This reverts commit c2b8187.

Revert "Fix race condition in task replacement where duplicate
executions occur (#151)"

This reverts commit 9dc405c.
Development

Successfully merging this pull request may close these issues.

Race condition in task replacement causes duplicate execution

4 participants