Description
Implement SQL timer task cleanup for workflow and user timers
status: draft / WIP todo
Context
We're adding timer cleanup on workflow deletion in the Cassandra/NoSQL persistence layer. When a workflow is deleted via DeleteHistoryEventTask, we now clean up:
- WorkflowTimerTaskInfo records (system-level timer tasks like workflow timeouts)
- User timer records (user-created timers via StartTimer API)
This cleanup is currently only implemented for NoSQL (Cassandra, DynamoDB, MongoDB). The SQL persistence layer still has TODO stubs that need to be implemented.
What needs to be done
1. Schema Changes
Add timer_task_id column to the timer_info table in SQL schema (similar to what we did for Cassandra in v0.46):
ALTER TABLE timer_info ADD COLUMN timer_task_id BIGINT;
2. Update SQL Persistence Layer
Update common/persistence/sql/sql_execution_store.go:
- Implement the DeleteTimerTask() method (currently a TODO stub)
- It should delete from the timer_tasks table where shard_id, task_id, and visibility_timestamp match
- Reference the Cassandra implementation in common/persistence/nosql/nosqlplugin/cassandra/workflow.go:523-536
3. Update Serialization
- Add timer_task_id to the SQL INSERT/UPDATE statements for timer_info
- Add timer_task_id to the SQL SELECT parsing for timer_info
4. Implementation References
The NoSQL implementation can serve as a reference:
- Timer cleanup logic: service/history/task/timer_task_executor_base.go:357-412
- Cassandra DeleteTimerTask: common/persistence/nosql/nosqlplugin/cassandra/workflow.go:523-536
- TaskID syncing: service/history/shard/context.go:1324-1367
Success Criteria
- SQL schema migration created and tested
- DeleteTimerTask() method implemented in SQL store
- Timer task cleanup works for SQL-backed deployments
- Tests added/updated for SQL persistence layer
- No behavioral changes to workflow execution, only cleanup improvements
Notes
- This is a best-effort cleanup operation (errors are logged but not returned)
- The TaskID=0 initial value is fine; it gets assigned by the shard layer and synced back
- Timer tasks are already cleaned up when processed, so this just ensures orphaned records don't accumulate