Support pre-removal of tasks to handle a future stall? #7249
Description
(Follow-on from #6383 [closed by #7248] which adds future final-incomplete tasks to n=0 for visibility.)
If I set a future task to (e.g.) failed or expired, such that it is in a final-incomplete state, the flow will eventually stall there.
One way of dealing with such a stall is to cylc remove the final-incomplete task - i.e., I say we need not continue the flow at that point, and I accept the downstream consequences of cutting it off there. In fact that is likely what I want to do if I've manually set the task to a final-incomplete state (especially expired, which literally means there is no need to run the task anymore).
If I know in advance that the stall is coming, and I know that I will just remove the blocking task when it does come, I should be able to handle it in advance in the same way (without waiting till midnight Saturday or whatever).
Currently I can't do that, because removing the task ahead of the flow erases its history, so it will respawn when the flow gets there.
What to do about this?
We could have a cylc remove option to remove a task from n=0 without erasing its history, so that the scheduler knows it has been removed and should NOT respawn it in the same flow.
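To make the difference concrete, here is a toy model of the idea. It is only a sketch: `ToyScheduler`, its `history` dict and the `keep_history` argument are all hypothetical names for illustration, not Cylc's real task pool, database or CLI options. It compares removal that erases history (the current behaviour described above) with removal that records the task as removed for that flow.

```python
from dataclasses import dataclass, field


@dataclass
class ToyScheduler:
    """Toy model only; NOT Cylc's real task pool or history database."""

    pool: set = field(default_factory=set)       # tasks currently in n=0
    history: dict = field(default_factory=dict)  # (task, flow) -> last recorded state

    def set_state(self, task, flow, state):
        # Manually put a (possibly future) task into n=0 in a given state.
        self.pool.add(task)
        self.history[(task, flow)] = state

    def remove(self, task, flow, keep_history=False):
        # keep_history is a hypothetical option: record the removal
        # instead of erasing what the scheduler knows about the task.
        self.pool.discard(task)
        if keep_history:
            self.history[(task, flow)] = "removed"
        else:
            self.history.pop((task, flow), None)

    def spawn(self, task, flow):
        # The flow has reached the task: spawn it only if this flow
        # has no record of it. Returns True if the task was spawned.
        if (task, flow) in self.history:
            return False
        self.pool.add(task)
        self.history[(task, flow)] = "waiting"
        return True


# Current behaviour: history is erased, so the task respawns.
current = ToyScheduler()
current.set_state("1/foo", 1, "expired")
current.remove("1/foo", 1)
respawned_now = current.spawn("1/foo", 1)

# Proposed behaviour: the removal is remembered, so no respawn in flow 1.
proposed = ToyScheduler()
proposed.set_state("1/foo", 1, "expired")
proposed.remove("1/foo", 1, keep_history=True)
respawned_proposed = proposed.spawn("1/foo", 1)
```

In this sketch the removal record is keyed by flow, so a new flow could still run the task.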
This is admittedly somewhat dangerous - pre-removing the final-incomplete task from n=0 may remove it from view in the UIs, which (at least until the current n-window extent covers it) means no visibility of the fact that the workflow may just shut down early when the flow reaches that task.
Regardless of that, IMO it is inarguable that I should be able to handle an impending stall before it happens, if I know it is impending and I know how I want to handle it, so we need some way to manage this sort of scenario.
A safe way to do it might be something like this: cylc remove --on-stall would leave the final-incomplete task in n=0 for now, but automatically remove it if and when it causes a stall. Possibly with some kind of attribute that marks the task as pre-removed?
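A rough sketch of how the on-stall idea could behave, again as a toy model rather than anything in Cylc: `ToyPool`, `pre_removed`, `remove_on_stall` and `handle_stall` are hypothetical names, and the stall test here (only failed or expired tasks left in the pool) is a deliberate simplification of the real stall logic.

```python
from dataclasses import dataclass, field

# Simplifying assumption: treat these states as final-incomplete.
FINAL_INCOMPLETE = {"failed", "expired"}


@dataclass
class ToyPool:
    """Toy n=0 pool; NOT Cylc's real scheduler logic."""

    states: dict = field(default_factory=dict)    # task id -> state
    pre_removed: set = field(default_factory=set)  # hypothetical "pre-removed" marker

    def remove_on_stall(self, task):
        # Mark the task but leave it in the pool, so it stays visible.
        if self.states.get(task) not in FINAL_INCOMPLETE:
            raise ValueError(f"{task} is not in a final-incomplete state")
        self.pre_removed.add(task)

    def is_stalled(self):
        # Stalled: the pool is non-empty and nothing in it can still run.
        return bool(self.states) and all(
            state in FINAL_INCOMPLETE for state in self.states.values()
        )

    def handle_stall(self):
        # On stall, drop every task that was marked in advance.
        if not self.is_stalled():
            return []
        removed = sorted(t for t in self.states if t in self.pre_removed)
        for task in removed:
            del self.states[task]
            self.pre_removed.discard(task)
        return removed


pool = ToyPool(states={"1/a": "running", "1/b": "expired"})
pool.remove_on_stall("1/b")

early = pool.handle_stall()  # 1/a is still running, so no stall yet
pool.states.pop("1/a")       # 1/a completes and leaves the pool
late = pool.handle_stall()   # now only 1/b is left: stall, so it is removed
```

The point of the marker is that the task stays in n=0 (and so in the UIs) until the stall actually happens, which avoids the visibility problem of removing it up front.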