fix(lint): skip correct_meta_outputs for dynamic path() values #4226
pinin4fjords wants to merge 3 commits into
`correct_meta_outputs` does a static string comparison between main.nf
output paths and meta.yml entries. Modules that compute an output glob
in the `script:` block via `def reads_glob = ...` and then reference
`path(reads_glob)` or `path("${reads_glob}")` in the `output:` block
were flagged as mismatches because the lint extracts the literal
variable name (or GString) rather than resolving it.
Detect output channels whose `path(...)` argument is a bare Groovy
identifier or a pure `${var}` GString and drop those channels from both
sides of the comparison. Static globs and partial GStrings (`*.bam`,
`${prefix}_fastqc.html`) are unaffected.
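
As a rough illustration of the detection rule described above, the following Python sketch classifies a `path(...)` argument as "dynamic" only when the whole argument is a bare identifier or a pure `"${var}"` GString. The pattern and the helper name `is_dynamic_path_arg` are assumptions made for this example; they are not taken from the PR and the real pattern may differ.

```python
import re

# Hypothetical pattern (an assumption, not the PR's actual regex):
# the whole argument must be either a bare identifier or a double-quoted
# string that contains nothing but a single ${var} interpolation.
DYNAMIC_PATH_ARG = re.compile(
    r'^\s*(?:[A-Za-z_]\w*|"\$\{\s*[A-Za-z_]\w*\s*\}")\s*$'
)


def is_dynamic_path_arg(arg: str) -> bool:
    """Return True if a path() argument is only a variable reference."""
    return DYNAMIC_PATH_ARG.fullmatch(arg) is not None


# Variable references are treated as dynamic.
print(is_dynamic_path_arg("reads_glob"))                # True
print(is_dynamic_path_arg('"${reads_glob}"'))           # True
# Static globs and partial GStrings keep being validated literally.
print(is_dynamic_path_arg('"*.bam"'))                   # False
print(is_dynamic_path_arg('"${prefix}_fastqc.html"'))   # False
```

Because the pattern is anchored at both ends, any literal text next to the interpolation makes the argument fall back to the ordinary literal comparison.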
Use case: nf-core/modules/trimgalore where the `reads` glob has to
be conditional on `meta.single_end` to distinguish PE intermediate
files from SE final outputs (negative lookbehind required, not
expressible as a static glob).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
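
The trimgalore use case can be illustrated with a small Python sketch. It uses `fnmatch` as a stand-in for shell globbing (an approximation, not an exact model of bash) and made-up file names with a `sample` prefix. It assumes the read-number suffix is `_1` or `_2`; the regular expression is only there to show what a negative lookbehind can express that a single static glob cannot.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical file names for illustration only.
files = [
    "sample_trimmed.fq.gz",    # single-end final output
    "sample_1_trimmed.fq.gz",  # paired-end intermediate
    "sample_2_trimmed.fq.gz",  # paired-end intermediate
]

# A static glob cannot exclude the paired-end intermediates.
static_glob = "*_trimmed.fq.gz"
glob_hits = [f for f in files if fnmatchcase(f, static_glob)]

# A regex with a negative lookbehind can: "_trimmed.fq.gz" must not be
# preceded by "_1" or "_2".
se_final = re.compile(r"(?<!_[12])_trimmed\.fq\.gz$")
regex_hits = [f for f in files if se_final.search(f)]

print(glob_hits)   # all three files
print(regex_hits)  # ['sample_trimmed.fq.gz']
```

The same limitation is why the module computes the glob conditionally on `meta.single_end` instead of declaring one static pattern.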
- Rename DYNAMIC_PATH_RE to _DYNAMIC_PATH_RE (private)
- Split _channels_with_dynamic_paths into a thin set comprehension plus _has_dynamic_path so the early-return correctly skips remaining elements once a channel is flagged
- Use next(iter(entry)) + entry[key] (avoids items() view allocation)
- Hoist test-only import to module top
- Drop redundant assertion message and what-only docstring
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`NFCoreComponent.outputs` is declared as `list[str]` (the subworkflow shape) so mypy rejects passing it where a `dict` is expected, even though for modules it's always a dict. Widen the parameter to `dict | list` and short-circuit on list (subworkflows have no `_keyword` info). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
An alternative would be to emit both globs as separate outputs; the module user can then mix the outputs together into a new channel as and when needed. Advantages: retains parsability, doesn't require moving or renaming files.
This is what I would have done. There would also be a defensive file test in the code to make sure one or the other is present if it's either-or:
ls ${prefix}_trimmed.fq.gz ${prefix}_{1,2}_val_{1,2}.fq.gz >/dev/null 2>&1 || { echo "ERROR: No matching FASTQ files found." >&2; exit 1; }
OK, I'm not going to do that for this specific case: it would require a bunch of rewiring in subworkflows, and Felix's Rust Trimgalore rewrite (next week) negates the need. But fair enough on the general pattern.
Closing - the trimgalore-specific motivation has dropped (the producing module change is no longer being pursued). If a future module case reintroduces the need for dynamic path() arguments to pass correct_meta_outputs, this can be reopened.
RFC: should `correct_meta_outputs` allow dynamic `path()` arguments?
Today
`correct_meta_outputs` does literal string comparison between `path(...)` arguments in main.nf and the keys in meta.yml. This means the only legal output declaration is a single static glob (or partial GString) that meta.yml can mirror exactly.
That works for almost every module, but it forces output globs to be expressible as one static pattern. For modules where the right glob is conditional on input meta, e.g.:
there is no static glob that works without either matching files the channel shouldn't carry or silently dropping legitimate ones (bash globs have no negative lookbehind). The triggering example is nf-core/modules#11317 (trimgalore, distinguishing PE intermediate `*_<N>_trimmed.fq.gz` from SE final `*_trimmed.fq.gz`).
Proposal
When the entire `path()` argument is just a variable reference, `path(reads_glob)` or `path("${reads_glob}")`, exclude that channel from the meta.yml comparison rather than failing it. Recognised via:
Anchored, so partial GStrings (`${prefix}.bam`, `${prefix}_fastqc.html`, `${prefix}*.fq.gz`) and static globs are unaffected and keep validating literally. `val(...)` and `eval(...)` are unaffected regardless of their argument shape.
What's lost
For channels declared this way, lint can no longer verify meta.yml documents the right pattern — meta.yml could be wrong and we wouldn't catch it.
Why not "resolve and check the set"
I considered parsing the `script:` block to expand `reads_glob` to its possible values and comparing each against meta.yml. Two problems:
So a proper fix would also need a meta.yml schema RFC (alternatives per channel position, or per-mode output blocks). That's bigger than this PR.
Question for review
`path()` from `correct_meta_outputs`" an acceptable position pending a schema-level fix?
If the answer is the second, this PR should be closed and nf-core/modules#11317 replaced with nf-core/modules#11315 (subdir+mv).
Test plan
- `path(...)` entries with bare-identifier or pure-`${var}` arguments are flagged; `val(...)`/`eval(...)` are unaffected.
- `tests/modules/lint/test_meta_yml.py` passes (10/10; the 1 skipped is a pre-existing dev failure).
- `correct_meta_outputs` with this patch.
🤖 Generated with Claude Code