fix: resolve race condition in compound trigger evaluation #138
Fixes two race conditions in compound trigger evaluation:

1. **Never-firing race** (transactional): When two child triggers fire concurrently in separate transactions, each only sees its own uncommitted insert due to READ COMMITTED isolation. Neither sees enough firings to trigger the parent. Fix: Use PostgreSQL advisory locks to serialize concurrent evaluations for the same compound trigger.
2. **Double-firing race** (autocommit): When both transactions see all firings, both delete and both fire the parent. Fix: Use DELETE ... RETURNING to make clearing a claim operation. Only the worker that successfully deletes the expected firings proceeds; others bail out.

Based on the fix in PrefectHQ/nebula#10716.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
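The claim semantics described for the double-firing fix can be illustrated with a minimal in-memory sketch. This is not the PR's code: `FiringStore`, `delete_returning`, and `evaluate` are hypothetical names, and a `threading.Lock` stands in for the atomicity the database gives a single DELETE ... RETURNING statement.

```python
import threading
from uuid import uuid4


class FiringStore:
    """Hypothetical in-memory stand-in for the child-firings table."""

    def __init__(self, firing_ids):
        self._rows = set(firing_ids)
        # Assumption: the lock models the atomicity of one DELETE statement.
        self._lock = threading.Lock()

    def delete_returning(self, firing_ids):
        """Delete the requested rows and return the ones actually removed."""
        with self._lock:
            deleted = self._rows & set(firing_ids)
            self._rows -= deleted
            return deleted


def evaluate(store, expected, fired):
    """Fire the parent only if this worker claimed every expected firing."""
    deleted = store.delete_returning(expected)
    if deleted == expected:
        fired.append(threading.current_thread().name)
    # Otherwise another worker already claimed the rows, so bail out.


expected = {uuid4(), uuid4()}
store = FiringStore(expected)
fired = []

workers = [
    threading.Thread(target=evaluate, args=(store, expected, fired), name=f"worker-{i}")
    for i in range(2)
]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

Whichever worker reaches the store first removes both rows; the other gets an empty set back and skips firing, so `fired` ends up with exactly one entry.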
Code Review by Qodo
1. Missing future annotations import
|
| async def acquire_composite_trigger_lock( | ||
| session: AsyncSession, | ||
| trigger: CompositeTrigger, | ||
| ) -> None: |
1. Missing future annotations import 📘 Rule violation ✓ Correctness
• `src/prefect/server/events/models/composite_trigger_child_firing.py` contains type annotations but does not include `from __future__ import annotations` as the first import.
• This violates the typing import standard required for `src/` files and can lead to poorer type-checking performance and forward-reference issues.
• Add the future import at the top of the file before any other imports.
Agent prompt
## Issue description
`src/prefect/server/events/models/composite_trigger_child_firing.py` uses type annotations but is missing `from __future__ import annotations` as the first import, violating the required typing standard.
## Issue Context
The file contains annotated function signatures (e.g., `session: AsyncSession`, `trigger: CompositeTrigger`) and return types, but it begins with regular imports.
## Fix Focus Areas
- src/prefect/server/events/models/composite_trigger_child_firing.py[1-20]
| import logging | ||
|
|
||
| logger: "logging.Logger" = get_logger(__name__) | ||
| logger = logging.getLogger(__name__) |
2. Nonstandard logger initialization 📘 Rule violation ✧ Quality
• `src/prefect/server/events/triggers.py` initializes `logger` with `logging.getLogger(__name__)` and without the required type annotation.
• This violates the standardized logger initialization pattern and can break consistency/type-safety expectations across the codebase.
• Replace it with the required `logger: "logging.Logger" = get_logger("module_name")` pattern.
Agent prompt
## Issue description
`src/prefect/server/events/triggers.py` uses `logging.getLogger(__name__)` for `logger`, violating the required standardized logger initialization and typing pattern.
## Issue Context
The module already imports `get_logger` but does not use it for `logger` initialization.
## Fix Focus Areas
- src/prefect/server/events/triggers.py[27-27]
- src/prefect/server/events/triggers.py[71-76]
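A rough sketch of the pattern the rule asks for follows. The `get_logger` defined here is a hypothetical stand-in, written under the assumption that the project's helper returns a child of a shared root logger; the real helper may behave differently, and the module name passed in is only illustrative.

```python
import logging


def get_logger(name: str) -> logging.Logger:
    """Hypothetical stand-in for the project's helper (assumed behavior)."""
    # Assumption: loggers are namespaced under one shared root logger so that
    # handlers and levels configured on the root apply to every module.
    return logging.getLogger("prefect").getChild(name)


# Annotated initialization, matching the shape of the required pattern.
logger: "logging.Logger" = get_logger("server.events.triggers")
```

Under that assumption, the difference from `logging.getLogger(__name__)` is that the logger name is always rooted at the shared namespace, regardless of how the module was imported.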
| if dialect.name == "postgresql": | ||
| # Use the trigger's UUID as the lock key | ||
| # pg_advisory_xact_lock takes a bigint, so we use the UUID's int representation | ||
| # truncated to fit (collision is extremely unlikely and benign) | ||
| lock_key = hash(str(trigger.id)) % (2**63) | ||
| await session.execute( | ||
| sa.text("SELECT pg_advisory_xact_lock(:key)"), {"key": lock_key} | ||
| ) |
3. Unstable advisory lock key 🐞 Bug ⛯ Reliability
• The advisory lock key is computed with Python’s built-in hash() on a str, which is salted per process unless PYTHONHASHSEED is fixed; different workers/processes will compute different keys for the same trigger id.
• In multi-worker mode this undermines cross-process serialization, so the original race can still occur (only intra-process serialization is achieved).
• This is especially important because Prefect explicitly supports multi-worker server mode against Postgres.
Agent prompt
### Issue description
`acquire_composite_trigger_lock` uses `hash(str(trigger.id))` to derive a Postgres advisory lock key. Python’s `hash()` is process-randomized, so different workers will compute different keys for the same trigger UUID, defeating cross-process serialization.
### Issue Context
This lock is intended to serialize concurrent composite-trigger evaluations across workers.
### Fix Focus Areas
- src/prefect/server/events/models/composite_trigger_child_firing.py[38-45]
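One possible way to derive a key that is identical in every process is to hash the UUID's bytes with a deterministic digest and read the result as a signed 64-bit integer, which fits the `bigint` argument of `pg_advisory_xact_lock`. The function name `stable_lock_key` and the choice of BLAKE2b are assumptions made for this sketch, not the fix the PR adopted.

```python
import hashlib
from uuid import UUID


def stable_lock_key(trigger_id: UUID) -> int:
    """Map a UUID to a signed 64-bit integer, identically in every process."""
    # hashlib digests do not depend on PYTHONHASHSEED, unlike hash() on a str.
    digest = hashlib.blake2b(trigger_id.bytes, digest_size=8).digest()
    # Postgres bigint is signed, so interpret the 8 bytes as a signed integer.
    return int.from_bytes(digest, "big", signed=True)


trigger_id = UUID("12345678-1234-5678-1234-567812345678")
key = stable_lock_key(trigger_id)
```

Collisions between distinct triggers remain possible, as with any 128-bit to 64-bit mapping, but they only cause two unrelated triggers to serialize briefly.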
| ) -> set[UUID]: | ||
| """ | ||
| Delete the specified child firings and return the IDs that were actually deleted. | ||
|
|
||
| Returns the set of child_firing_ids that were successfully deleted. Callers can | ||
| compare this to the expected firing_ids to detect races and avoid double-firing | ||
| composite triggers. | ||
| """ | ||
| result = await session.execute( | ||
| sa.delete(db.CompositeTriggerChildFiring) | ||
| .filter( | ||
| db.CompositeTriggerChildFiring.automation_id == trigger.automation.id, | ||
| db.CompositeTriggerChildFiring.parent_trigger_id == trigger.id, | ||
| db.CompositeTriggerChildFiring.child_firing_id.in_(firing_ids), | ||
| ) | ||
| .returning(db.CompositeTriggerChildFiring.child_trigger_id) | ||
| ) |
4. Wrong returning column 🐞 Bug ✧ Quality
• `clear_child_firings` claims to return deleted `child_firing_id`s, but the DELETE..RETURNING currently returns `child_trigger_id`.
• This makes the returned IDs (and the debug log field `deleted_firing_ids`) misleading, reducing debuggability and risking future misuse if callers rely on the returned values.
• The current race-detection length check may “work by accident” (one row per child trigger) but the values are still wrong for the API promised by the docstring/log fields.
Agent prompt
### Issue description
`clear_child_firings` documents returning `child_firing_ids` and the caller logs them as such, but the DELETE statement returns `child_trigger_id`.
### Issue Context
The returned IDs are used for debug logging and may be used by future callers.
### Fix Focus Areas
- src/prefect/server/events/models/composite_trigger_child_firing.py[140-155]
- src/prefect/server/events/triggers.py[390-408]
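To see why the returned column matters, here is a small in-memory sketch that mimics DELETE ... RETURNING over a list of rows. `ChildFiringRow` and `delete_returning` are hypothetical helpers for illustration only; the real code goes through SQLAlchemy against the database.

```python
from dataclasses import dataclass
from uuid import UUID, uuid4


@dataclass(frozen=True)
class ChildFiringRow:
    """Hypothetical stand-in for a row of the child-firings table."""

    child_trigger_id: UUID
    child_firing_id: UUID


def delete_returning(rows, firing_ids, column):
    """Remove rows matching firing_ids in place; return `column` of each deleted row."""
    deleted = [r for r in rows if r.child_firing_id in firing_ids]
    rows[:] = [r for r in rows if r.child_firing_id not in firing_ids]
    return {getattr(r, column) for r in deleted}


table = [ChildFiringRow(uuid4(), uuid4()), ChildFiringRow(uuid4(), uuid4())]
expected = {r.child_firing_id for r in table}

# Returning the firing IDs lets the caller compare values directly.
right = delete_returning(list(table), expected, "child_firing_id")

# Returning the trigger IDs gives the same count but unrelated values.
wrong = delete_returning(list(table), expected, "child_trigger_id")
```

With the firing column, `right == expected` holds; with the trigger column, only the lengths agree, which is the "works by accident" situation the review describes.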
Benchmark PR from qodo-benchmark#543