Description
Expected Behavior
When a Workflow Task (WFT) is scheduled, it will wait indefinitely (modulo workflow timeout parameters) for a worker to come online, without timing out on its own. This works as expected when the workflow has no sticky worker set.
Actual Behavior
Even after ShutdownWorker has been called to signal a graceful worker shutdown, if no workers are available when a WFT is scheduled (for example, after a timer fires), a WFT failure on the now-shut-down sticky queue may be observed. After the WFT failure, the workflow history will show a WFT being scheduled on the normal queue, which will wait indefinitely for a worker to become available, as usual.
The bug is mostly cosmetic: if other workers are available on the normal queue, they will immediately receive the pending WFT. Still, a WFT failure should not be observable in a workflow's history if all workers were shut down gracefully.
Steps to Reproduce the Problem
- Start a single worker for a target task queue.
- Start a workflow that creates a timer firing after >10s.
- Kill the worker after the workflow's "Timer Started" event has been written (but before the "Timer Fired" event).
- Observe that, following the "Timer Fired" event, a failed WFT that appears to have been destined for the shutdown sticky queue will be observable in workflow history.
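
The final observation step could be automated with a small check over the event types of an exported history. This is only a sketch: the function name, the event-type labels (`TimerFired`, `WorkflowTaskFailed`, `WorkflowTaskTimedOut`), and the sample history are assumptions for illustration and were not captured from a real run. In particular, whether the unwanted event appears as a failure or a timeout is a guess, so both are matched.

```python
from typing import Optional, Sequence

# Assumed event-type labels; adjust them to match how the history was exported.
TIMER_FIRED = "TimerFired"
WFT_FAILURES = {"WorkflowTaskFailed", "WorkflowTaskTimedOut"}


def first_wft_failure_after_timer(event_types: Sequence[str]) -> Optional[int]:
    """Return the index of the first WFT failure seen after a TimerFired event, or None."""
    timer_fired = False
    for index, event_type in enumerate(event_types):
        if event_type == TIMER_FIRED:
            timer_fired = True
        elif timer_fired and event_type in WFT_FAILURES:
            return index
    return None


# Hypothetical history matching the report (illustrative only).
buggy_history = [
    "WorkflowExecutionStarted",
    "WorkflowTaskScheduled",
    "WorkflowTaskStarted",
    "WorkflowTaskCompleted",
    "TimerStarted",
    "TimerFired",
    "WorkflowTaskScheduled",  # dispatched to the shut-down sticky queue
    "WorkflowTaskFailed",     # the event that should not appear
    "WorkflowTaskScheduled",  # rescheduled on the normal queue
]

# Hypothetical history for the expected behavior: the WFT simply stays scheduled.
expected_history = buggy_history[:6] + ["WorkflowTaskScheduled"]
```

Calling `first_wft_failure_after_timer(buggy_history)` returns the position of the unwanted event, while the same call on `expected_history` returns `None`, which is what a fixed server should produce for this scenario.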