Group trigger: re-use already-completed outputs of active group start tasks. #6910
Rebased to 8.5.x and labelled as a bug fix (which it really is). @oliver-sanders - this was easier than anticipated because rather than respawning on the already-completed outputs we can just set the corresponding prerequisites on in-group downstream tasks (thus naturally avoiding off-group effects).
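To illustrate the idea of setting prerequisites rather than respawning, here is a minimal sketch. It is not the code in this PR: the `Task` class, the `satisfy_in_group_prereqs` function, and the prerequisite layout are all hypothetical stand-ins, assumed only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical stand-in for a task; not Cylc's real task class."""
    name: str
    # Outputs this task has already completed, e.g. {"started"}.
    completed_outputs: set = field(default_factory=set)
    # Prerequisites as {(upstream_task, output): is_satisfied}.
    prerequisites: dict = field(default_factory=dict)


def satisfy_in_group_prereqs(start_task, tasks, group):
    """Satisfy in-group prerequisites that match outputs the start task
    has already completed. Tasks outside `group` are not touched."""
    for name in group:
        task = tasks[name]
        for output in start_task.completed_outputs:
            key = (start_task.name, output)
            if key in task.prerequisites:
                task.prerequisites[key] = True


# "a" is a running group-start task that has already emitted "started".
a = Task("a", completed_outputs={"started"})
b = Task("b", prerequisites={("a", "started"): False})  # in the group
c = Task("c", prerequisites={("a", "started"): False})  # outside the group
tasks = {"a": a, "b": b, "c": c}

satisfy_in_group_prereqs(a, tasks, group={"b"})
```

In this sketch only `b` has its prerequisite satisfied; `c` depends on the same output but sits outside the group, so it is left alone, which is the "no off-group effects" property described above.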
I like what you did with the test.
cylc/flow/scripts/trigger.py
Outdated
How live (preparing, submitted, or running) tasks are handled:
* Live in-group tasks are killed and removed, to rerun in the triggered flow.
* Live group-start tasks are left alone; they don't need to be retriggered.
Technical and correct, but I don't think users are going to make much sense of this.
I would recommend explaining group trigger in terms of its behaviours rather than the details of its implementation.
Here's how I explained it in the changes section:
- Prerequisites on any tasks that are outside of the group of tasks being triggered are automatically satisfied.
- Any tasks which have already run within the group will be automatically removed (i.e. cylc remove) to allow them to be re-run without intervention.
- Any preparing, submitted or running tasks within the group will also be removed if necessary to allow the tasks to re-run in order.
This avoids using the terms "group-start", "in-group", "flow" and "run history" but is much more likely to be read and understood.
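The three behaviours listed above can be sketched as a small planning function. This is an illustrative model only, not Cylc's implementation: `plan_group_trigger`, its arguments, and the rule that live group-start tasks are left alone (taken from the earlier help text under review) are assumptions for the sake of the example.

```python
def plan_group_trigger(upstreams, group, finished, live):
    """Work out what a group trigger would do (hypothetical model).

    upstreams: {task: set of tasks it depends on}
    group:     tasks being triggered together
    finished:  tasks that have already run to completion
    live:      tasks that are preparing, submitted, or running
    """
    group = set(group)
    # Prerequisites on tasks outside the group get satisfied automatically.
    satisfy = {
        (up, task)
        for task in group
        for up in upstreams.get(task, set())
        if up not in group
    }
    # Group-start tasks have no upstream tasks inside the group.
    start_tasks = {
        task for task in group
        if not (upstreams.get(task, set()) & group)
    }
    # Already-run tasks are removed so they can re-run; live tasks are
    # removed too, unless they start the group (assumed to be left alone).
    remove = (group & set(finished)) | ((group & set(live)) - start_tasks)
    return satisfy, remove


# Graph x => a => b => c, triggering a, b and c as a group.
upstreams = {"a": {"x"}, "b": {"a"}, "c": {"b"}}
satisfy, remove = plan_group_trigger(
    upstreams, group={"a", "b", "c"}, finished={"c"}, live={"a", "b"}
)
```

Here the dependence of `a` on the off-group task `x` is satisfied, `b` (live, in-group) and `c` (already run) are removed for re-run, and the live group-start task `a` is not removed.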
I'll put a general note on approaches to documentation on Element, but for now:
I don't really agree that terms like "in-group" are technical in this context: i.e., in a section entitled "triggering a group of tasks at once", and where "group" here is perfectly compatible with normal colloquial usage of the word.
And we should be using "flow" to describe (intuitively, not technically) workflow activity flowing through the graph. Without a good name for concepts like this we end up having to repeat verbose descriptions all over the place.
That said, in this case maybe I have overcooked it, and I do quite like your description, so I'm considering a re-do ...
One to take off issue.
I don't really agree that terms like "in-group" are technical [...] in a section entitled "triggering a group of tasks at once"
I agree! These aren't especially technical terms so long as they are contextualised in a section as you say.
But if we can explain the behaviours more simply (without using these terms), we should.
And we should be using "flow" to describe (intuitively, not technically) workflow activity flowing through the graph.
I understand why you want to use the term "flow" and convey the flow model in documentation.
However, the "flow" term is only really necessary to explain concurrent flows. I don't think it is helpful outside of this context; we can explain single-flow behaviour more clearly without this term/model (just like we did at Cylc 7).
Concurrent flows are an advanced feature; the terminology and models associated with them should be constrained to the documentation for concurrent flows (we asked for this at proposal time). The default single-flow cases can and should be more simply explained without invoking this model.
I understand why you want to use the term "flow" and convey the flow model in documentation. ...
That's really not what I'm doing. I certainly agree that our recent intervention improvements have relegated concurrent flows to an advanced feature that won't need to be used much. By "flow" I simply mean when I trigger a task (or set its prerequisites or whatever) activity "flows on from" that intervention through the graph. That's very much an intuitive concept, and IMO it's the most natural way to explain what Cylc does. Even with the new group trigger feature, which erases run history so that a "concurrent flow" is not needed, the activity that results from the trigger still "flows" through the graph in this way and as such is most simply described as triggering a flow.
By "flow" I simply mean when I trigger a task (or set its prerequisites or whatever) activity "flows on from" that intervention through the graph
I understand this, and I do get where you are coming from. Of course "flow on" is reasonable, but there shouldn't be any need to talk about "flows" outside of concurrent flows IMO.
Outside of concurrent flows, I don't think the "flow" model / terminology is particularly helpful and can make docs confusing for users. We can explain Cylc things clearly without introducing this term (e.g. we don't use it once in the tutorial).
oliver-sanders
left a comment
Code makes sense, tested as working, 👍.
@wxtim - if you approve today, don't merge - I will try to dumb down the CLI help a bit first 😁
OK, I rewrote the CLI help from scratch, but based on your description @oliver-sanders, and IMO managed to pare it down to the essentials and improve clarity. Annotated to explain my thinking:
Refers to "outside of the group", but "group" is now defined in the first line (and it applies to both single and multiple tasks).
Note that "removed if necessary to allow rerun" covers all bases, but I explicitly mention that some may be killed (because killing tasks is important):
(In the interests of describing things simply to users, we probably don't need to give those details explicitly)
(This also covers the previous separate mention of triggering single tasks when paused, because group is defined as "one or more" tasks in the top line).
This is a standalone point, so I moved it down the page, out of the way of the above bits which are all more or less inter-related.
(N.B. two approvals already, but we need to agree on the trigger CLI help text).
Trigger a group of one or more tasks, respecting dependencies among them.

Prerequisites on tasks outside of the group will be satisfied automatically.

Tasks will be removed if necessary to allow re-run without intervention, so
triggered tasks that are preparing, submitted, or running may be killed.

Tasks that lead into a group will run immediately even if the workflow is
paused; activity will flow on from them once the workflow is resumed.

Triggering an unqueued task queues it; triggering a queued task runs it.

How flow numbers are assigned to triggered tasks:
Active tasks (n=0) already have assigned flows; inactive tasks (n>0) do not.
Close #6858
Re-spawn in-group children of already-completed group start task outputs - because triggering is event-driven and group trigger removes in-group tasks and their already-completed prerequisites.
Check List
CONTRIBUTING.md and added my name as a Code Contributor. setup.cfg (and conda-environment.yml if present). ?.?.x branch.