Skip to content

id_match: extend task matching to inactive tasks for globs and families#6920

Merged
hjoliver merged 3 commits intocylc:masterfrom
oliver-sanders:command-task-matching
Sep 3, 2025
Merged

id_match: extend task matching to inactive tasks for globs and families#6920
hjoliver merged 3 commits intocylc:masterfrom
oliver-sanders:command-task-matching

Conversation

@oliver-sanders
Copy link
Copy Markdown
Member

@oliver-sanders oliver-sanders commented Aug 13, 2025

  • Implement a single task ID matching interface which will satisfy all currently documented use cases / requirements.
  • Closes Command task matching requirements #5827. Allow IDs including families or globs to match against inactive tasks.
  • Closes cylc release: only need to match in the tasks-to-hold list #5750. Only match tasks to release against the "pool" of tasks which are actually held.
  • Addresses task globbing: add cli/graphql interface #4357 & cylc show for past and future tasks #5677 Abstract task matching from task pool objects. This simplifies its logic, but also unlocks the ability to add a command / GraphQL interface for listing IDs that match provided patterns. The remaining work required for this is outlined in load the workflow config cylc-uiserver#720.
  • Additionally:
    • Standardise task ID parsing, validation and standardisation and perform it upfront in cylc.flow.commands to ensure validation is not required in internal interfaces and sanitisation errors cannot occur there.
    • Add a subclass TaskTokens(Tokens) type which enforces the presence of cycle and task fields. This makes it easier to use Tokens objects in typed code.
    • Fix a __hash__ computation error for Tokens objects which caused erroneous inequality.

Check List

  • I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • Contains logically grouped changes (else tidy your branch by rebase).
  • Does not contain off-topic changes (use other PRs for other changes).
  • Applied any dependency changes to both setup.cfg (and conda-environment.yml if present).
  • Tests are included (or explain why tests are not needed).
  • Changelog entry included if this is a change that can affect users
  • Cylc-Doc pull request opened if required at Command task matching cylc-doc#867
  • If this is a bug fix, PR should be raised against the relevant ?.?.x branch.

@oliver-sanders oliver-sanders added this to the 8.6.0 milestone Aug 13, 2025
@oliver-sanders oliver-sanders self-assigned this Aug 13, 2025
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bulk of the work is in this file (closes #5827), most of the rest of this PR is just interface changes.

Comment on lines +319 to +325
def is_tasks(ids: Iterable[str]) -> 'Set[TaskTokens]':
"""Ensure all IDs are task IDs and standardise them.

Examples:
# All legal
>>> is_tasks(['1/foo', '1/bar', '*/baz', '*/*'])
* Parses IDs.
* Filters out job ids and ensures at least the cycle point is provided.
* Standardises the cycle point format.
* Defaults the namespace to "root" unless provided.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On this PR, all parsing, validating and standardising of IDs is done upfront at validate time.

There is no longer any re-parsing of IDs so errors can only crop up at validate time, not in the lower-level code.

@oliver-sanders
Copy link
Copy Markdown
Member Author

I couldn't implement the task matching for Cylc show (#5677) in this PR because it requires UIS changes, however, I have adapted the interface into something that can work independently of the task pool which unblocks this work, pending cylc/cylc-uiserver#720.

@oliver-sanders oliver-sanders force-pushed the command-task-matching branch 2 times, most recently from c846b0f to 11f0fb4 Compare August 14, 2025 13:45
@oliver-sanders
Copy link
Copy Markdown
Member Author

Implementing #5750 revealed a small logical flaw in the group-trigger logic which we had been getting away with up to this point.

Trigger has always overridden the held state, it still does for group-start tasks, but not necessarily for in-group tasks.

This commit extends group trigger to release any tasks which could prevent the group from running without manual intervention: 11f0fb4

full explanation for why hold has become an issue on this branch

This issue is pre-existing for the situation where inactive tasks were previously held.

However, the issue arose fresh on this branch for active tasks which were removed as part of the trigger operation.

The reason for this is that we were previously getting away with an "informal" hold by resetting the task state (but not backing this up in the DB):

if itask.state_reset(is_held=True):

As a result of #5750, only "formally" held tasks can be released. For internal logic reasons, this meant that restart/reload cleared the held status of these tasks (breakage detected by integration tests).

This issue was fixed by "formally" holding these tasks via the task pool, however, this broke the hack that had allowed us to get away without explicitly releasing these tasks for group trigger (because the held status was cleared by the remove).

itask.tokens['cycle'],
itask.tokens['task'],
itask.tokens['job'],
itask.submit_num,
Copy link
Copy Markdown
Member Author

@oliver-sanders oliver-sanders Aug 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated fix spotted by MyPy as the result of type changes on this branch:

  • TaskProxy.tokens['job'] is always None (because it's a task proxy not a job proxy).
  • This caused the DB check to retrieve the latest job, which is probably what we want.
  • Made this explicit by passing through the actual job number.

@oliver-sanders oliver-sanders marked this pull request as ready for review August 14, 2025 15:21
@hjoliver
Copy link
Copy Markdown
Member

(Resolved some minor conflicts via GitHub UI, so the test could run).

@hjoliver
Copy link
Copy Markdown
Member

hjoliver commented Aug 26, 2025

This seems to work really well. 🚀

Need to update the ID matching help in:

  • scripts/cylc.py
  • option_parsers.py

@oliver-sanders oliver-sanders force-pushed the command-task-matching branch 2 times, most recently from 6682e45 to 532ceb3 Compare August 27, 2025 12:23
@oliver-sanders
Copy link
Copy Markdown
Member Author

(rebased)

@oliver-sanders
Copy link
Copy Markdown
Member Author

(rebased to pick up test fix)

@oliver-sanders oliver-sanders force-pushed the command-task-matching branch 2 times, most recently from 86c4b93 to 733c4b2 Compare August 29, 2025 09:24
@hjoliver
Copy link
Copy Markdown
Member

hjoliver commented Sep 1, 2025

(I'll likely approve this tomorrow, just need to spend a little more time on it).

# * n=1 a2 succeeded
# * n=0 b2 failed
# * n=1 c2 waiting
# * cycle 2
Copy link
Copy Markdown
Member

@wxtim wxtim Sep 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Point 3 would help the reader with the expected results below - Cycle 2 is ambiguous.

Copy link
Copy Markdown
Member

@hjoliver hjoliver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯

Copy link
Copy Markdown
Member

@wxtim wxtim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Read code changes.
  • Read tests (Made some suggestions, but none which ought to block merge).
  • Manually tested. Not found any problems caused by this PR, although independently re-discovered other bugs.
  • Manually tested whilst reading https://github.com/cylc/cylc-doc/pull/867/files
  • Saved my Gradmother from the ravenous bugblatter beast of Traal.

* Implement a single task ID matching interface which will satisfy all
  currently documented use cases / requirements.
* Closes cylc#5827.
  Allow IDs including families or globs to match against inactive tasks.
* Closes cylc#5750.
  Only match tasks to release against the "pool" of tasks which are
  actually held.
* Addresses cylc#4357 & cylc#5677
  Abstract task matching from task pool objects. This simplifies its
  logic, but also unlocks the ability to add a command / GraphQL
  interface for listing IDs that match provided patterns.
  The remaining work required for this is outlined in
  cylc/cylc-uiserver#720.
* Additionally:
  * Standardise task ID parsing, validation and standardisation and
    perform it upfront in `cylc.flow.commands` to ensure validation is
    not required in internal interfaces and sanitisation errors cannot
    occur there.
  * Add a subclass `TaskTokens(Tokens)` type which enforces the presence
    of cycle and task fields. This makes it easier to use `Tokens`
    objects in typed code.
  * Fix a `__hash__` computation error for `Tokens` objects which caused
    erroneous inequality.
* The trigger command has always overridden the held status.
* This still holds for group-start tasks, however:
  * In-group tasks which were previously held are not released (so will
    not run when their deps are met).
  * Tasks which are removed as part of group-trigger will get held as a
    side-effect of removal.
* This small modification releases in-group active tasks, as well as non
  start-tasks which were removed in order to ensure that trigger
  continues to trump held.
* `TaskProxy.tokens['job']` is always `None`.
* This caused the DB check to retrieve the latest job, which is probably
  what we want.
* Made this explicit by passing through the actual job number.
@oliver-sanders
Copy link
Copy Markdown
Member Author

(added more comments and resolved some above comments)

@hjoliver
Copy link
Copy Markdown
Member

hjoliver commented Sep 3, 2025

Two approvals, no important unresolved comments, unrelated test failures (swarm build mamba solve failed, due to slow download?) ... merging.

@hjoliver hjoliver merged commit b5bb6c2 into cylc:master Sep 3, 2025
18 of 21 checks passed
@oliver-sanders oliver-sanders deleted the command-task-matching branch September 3, 2025 10:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Command task matching requirements cylc release: only need to match in the tasks-to-hold list

3 participants