implement SQL timer task cleanup for workflow and user timers #7569

@davidporter-id-au

Implement SQL timer task cleanup for workflow and user timers

status: draft / WIP todo

Context

Timer cleanup on workflow deletion has been added to the Cassandra/NoSQL persistence layer. When a workflow is deleted via DeleteHistoryEventTask, we now clean up:

  1. WorkflowTimerTaskInfo records (system-level timer tasks like workflow timeouts)
  2. User timer records (user-created timers via StartTimer API)

This cleanup is currently only implemented for NoSQL (Cassandra, DynamoDB, MongoDB). The SQL persistence layer still has TODO stubs that need to be implemented.

What needs to be done

1. Schema Changes

Add a timer_task_id column to the timer_info table in the SQL schema (similar to what we did for Cassandra in v0.46):

ALTER TABLE timer_info ADD COLUMN timer_task_id BIGINT;

2. Update SQL Persistence Layer

Update common/persistence/sql/sql_execution_store.go:

  • Implement DeleteTimerTask() method (currently a TODO stub)
    • Should delete from the timer_tasks table where shard_id, task_id, and visibility_timestamp match
    • Reference the Cassandra implementation in common/persistence/nosql/nosqlplugin/cassandra/workflow.go:523-536

3. Update Serialization

  • Add timer_task_id to the SQL INSERT/UPDATE statements for timer_info
  • Add timer_task_id to the SQL SELECT parsing for timer_info

4. Implementation References

The NoSQL implementation can serve as a reference:

  • Timer cleanup logic: service/history/task/timer_task_executor_base.go:357-412
  • Cassandra DeleteTimerTask: common/persistence/nosql/nosqlplugin/cassandra/workflow.go:523-536
  • TaskID syncing: service/history/shard/context.go:1324-1367

Success Criteria

  • SQL schema migration created and tested
  • DeleteTimerTask() method implemented in SQL store
  • Timer task cleanup works for SQL-backed deployments
  • Tests added/updated for SQL persistence layer
  • No behavioral changes to workflow execution, only cleanup improvements

Notes

  • This is a best-effort cleanup operation (errors are logged but not returned)
  • The TaskID=0 initial value is fine; it gets assigned by the shard layer and synced back
  • Timer tasks are already cleaned up when processed, so this just ensures orphaned records don't accumulate
