Skip to content

🚧 update thermorawfileparser#26

Merged
ypriverol merged 5 commits intobigbio:mainfrom
biosustain:updt_thermorawfileparser
Mar 20, 2026
Merged

🚧 update thermorawfileparser#26
ypriverol merged 5 commits intobigbio:mainfrom
biosustain:updt_thermorawfileparser

Conversation

@enryH
Copy link
Contributor

@enryH enryH commented Mar 13, 2026

Pull Request

Update thermorawfileparser to match strict syntax and other updates to nf-core standards

  • adding in when, prefix and suffix
  • use prefix, params and suffix in scripts

Description

Checklist

  • Module follows nf-core standards
  • main.nf includes process definition
  • meta.yml includes complete documentation
  • environment.yml specifies dependencies
  • Tests are included
  • Code is formatted (prettier)
  • CI checks pass

Module Type

  • New module
  • Module update
  • Bug fix
  • Documentation

Related Issues

Closes #

Summary by CodeRabbit

  • New Features

    • Outputs now include additional spectra formats (including gzipped variants), dynamic file naming, and improved conversion logs.
    • Structured tool-version reporting is bundled with conversion outputs.
    • Conversion runs only when explicitly enabled or left unspecified.
  • Chores

    • Simplified staging/retry behavior for more predictable execution.
    • Removed module environment name declaration.
  • Documentation

    • Module metadata and input/output schema clarified; authors/maintainers updated.
  • Tests

    • Tests updated to expect renamed outputs and new version reporting.

@coderabbitai
Copy link

coderabbitai bot commented Mar 13, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

ThermoRawFileParser Nextflow module: process tag/input switched to ${meta.id} and path(raw); conditional stage-in removed; added when guard; outputs renamed/expanded to spectra including gz variants; added versions_thermorawfileparser emit; script/stub compute prefix/suffix for filenames.

Changes

Cohort / File(s) Summary
Process implementation
modules/bigbio/thermorawfileparser/main.nf
Changed process tag/input to ${meta.id} / path(raw); removed stageInMode conditional; added `when: task.ext.when == null
Conda environment
modules/bigbio/thermorawfileparser/environment.yml
Removed name key; retained channels and bioconda::thermorawfileparser=1.4.5 dependency.
Module metadata
modules/bigbio/thermorawfileparser/meta.yml
Restructured inputs to meta + raw; replaced convert_files output with structured spectra (expanded extensions incl. gz); added versions_thermorawfileparser output using an eval version capture; updated tool metadata and authors/maintainers.
Tests
modules/bigbio/thermorawfileparser/tests/main.nf.test
Updated assertions to reference process.out.versions_thermorawfileparser and process.out.spectra[...]; adjusted stub input (mzml_id: 'test_sample') and added a full process.out snapshot assertion.

Sequence Diagram(s)

(Skipped — changes are focused on process IO/metadata adjustments and do not introduce a multi-component sequential flow requiring visualization.)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • refine #18 — related output-pattern changes adding parquet and gzip variants and corresponding test updates.
  • First iteration for thermorawfileparser #2 — overlaps edits to the ThermoRawFileParser Nextflow process (inputs/outputs, staging, version capture, stub/script naming).

Suggested labels

Review effort 3/5

Suggested reviewers

  • jpfeuffer
  • daichengxin

Poem

🐰 I nibble prefixes, suffixes bright,

gzip tucked in for a cozy night,
Guards that say when let processes run,
Versions whisper softly when all is done,
Hoppity pipelines — conversion done right!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name Status Explanation Resolution
Title check ❓ Inconclusive The title uses vague language ('update') and includes a construction emoji without specifying what changes were made to thermorawfileparser. Replace with a more descriptive title like 'Update thermorawfileparser process with when guard, version output, and gzip support' to clearly convey the main changes.
✅ Passed checks (2 passed)
Check name Status Explanation
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@ypriverol ypriverol marked this pull request as ready for review March 13, 2026 14:12
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@modules/bigbio/thermorawfileparser/main.nf`:
- Line 19: Remove the trailing whitespace present in modules' main.nf (the
top-level pipeline file main.nf) that is causing CI to fail: open main.nf and
delete the extra space characters at the end of the blank/short line near the
beginning so the line contains only the newline (no spaces or tabs), then save
and re-run CI.
- Around line 27-33: The computed prefix and suffix (variables prefix, suffix,
using task.ext.prefix and args) are never applied to the produced filenames or
command, and suffix detection misses -f=<n> forms and gzipped outputs; update
the process to (1) accept both "-f N"/"-f=N" and "--format N"/"--format=N" when
resolving suffix in the suffix logic, (2) use prefix and suffix when
constructing the output filename in the command invocation (the same symbol
names prefix and suffix so wiring is obvious), and (3) update the output
declaration/glob to reference the generated filename pattern and include the
optional ".gz" variant (or use the computed suffix variable) so gzip scenarios
are matched by outputs.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 39748274-c88f-453f-bf09-3fbe7a7281c7

📥 Commits

Reviewing files that changed from the base of the PR and between 6aa32c0 and 9254973.

⛔ Files ignored due to path filters (1)
  • modules/bigbio/thermorawfileparser/tests/main.nf.test.snap is excluded by !**/*.snap
📒 Files selected for processing (1)
  • modules/bigbio/thermorawfileparser/main.nf

Comment on lines +27 to +33
def prefix = task.ext.prefix ?: "${meta.id}"
def suffix = args.contains("--format 0") || args.contains("-f 0") ? "mgf" :
args.contains("--format 1") || args.contains("-f 1") ? "mzML" :
args.contains("--format 2") || args.contains("-f 2") ? "mzML" :
args.contains("--format 3") || args.contains("-f 3") ? "parquet" :
"mzML"
suffix = args.contains("--gzip")? "${suffix}.gz" : "${suffix}"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Dynamic prefix/suffix handling is currently disconnected from produced outputs.

At Line 27-33, prefix/suffix are computed but never used by the actual command (Line 36) or output declaration (Line 16). This makes the new logic effectively dead and can break gzip scenarios (*.gz not matched by current output glob). Also, format detection here doesn’t parse documented -f=<n> forms from modules/bigbio/thermorawfileparser/meta.yml:8-16, so suffix resolution can drift from real tool behavior.

Suggested parser hardening for format/gzip detection
-    def suffix = args.contains("--format 0") || args.contains("-f 0") ? "mgf" :
-                args.contains("--format 1") || args.contains("-f 1") ? "mzML" :
-                args.contains("--format 2") || args.contains("-f 2") ? "mzML" :
-                args.contains("--format 3") || args.contains("-f 3") ? "parquet" :
-                "mzML"
-    suffix = args.contains("--gzip")? "${suffix}.gz" : "${suffix}"
+    def suffix = (args =~ /(?:^|\s)(?:--format|-f)(?:=|\s+)0(?:\s|$)/).find() ? "mgf" :
+                (args =~ /(?:^|\s)(?:--format|-f)(?:=|\s+)1(?:\s|$)/).find() ? "mzML" :
+                (args =~ /(?:^|\s)(?:--format|-f)(?:=|\s+)2(?:\s|$)/).find() ? "mzML" :
+                (args =~ /(?:^|\s)(?:--format|-f)(?:=|\s+)3(?:\s|$)/).find() ? "parquet" :
+                "mzML"
+    def gzip = (args =~ /(?:^|\s)--gzip(?:\s|$)/).find()
+    suffix = gzip ? "${suffix}.gz" : suffix

You should also wire prefix/suffix into the produced filename contract (or remove them) and make output patterns include gzipped variants when gzip is enabled.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modules/bigbio/thermorawfileparser/main.nf` around lines 27 - 33, The
computed prefix and suffix (variables prefix, suffix, using task.ext.prefix and
args) are never applied to the produced filenames or command, and suffix
detection misses -f=<n> forms and gzipped outputs; update the process to (1)
accept both "-f N"/"-f=N" and "--format N"/"--format=N" when resolving suffix in
the suffix logic, (2) use prefix and suffix when constructing the output
filename in the command invocation (the same symbol names prefix and suffix so
wiring is obvious), and (3) update the output declaration/glob to reference the
generated filename pattern and include the optional ".gz" variant (or use the
computed suffix variable) so gzip scenarios are matched by outputs.

@enryH enryH changed the title 🚧 adding in when, prefix and suffix 🚧 update thermorawfileparser Mar 16, 2026
@ypriverol ypriverol requested a review from Copilot March 17, 2026 07:44
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates the bigbio/thermorawfileparser Nextflow module toward nf-core “strict syntax” alignment, and refreshes nf-test snapshots to reflect updated tooling metadata.

Changes:

  • Added a when: gate (task.ext.when) to allow conditional execution of the process.
  • Introduced (currently unused) prefix / suffix variables intended for dynamic output naming and gzip handling.
  • Updated nf-test snapshot metadata (versions / versions_stub) to newer nf-test / nextflow versions and formatting.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File Description
modules/bigbio/thermorawfileparser/main.nf Adds when: gating and begins implementing prefix/suffix + format/gzip detection logic.
modules/bigbio/thermorawfileparser/tests/main.nf.test.snap Updates stored snapshot structure/metadata for versions outputs.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment on lines 25 to 26
// Default to indexed mzML format (-f=2) if not specified in args
def formatArg = args.contains('-f=') ? '' : '-f=2'
Comment on lines +20 to +21
when:
task.ext.when == null || task.ext.when
- update meta.yml
- align with nf-core/thermorawfileparser configuration
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
modules/bigbio/thermorawfileparser/main.nf (2)

44-52: Duplicated prefix/suffix logic in stub.

The format detection logic is repeated verbatim from the script section (lines 27-33). While Nextflow's script/stub separation makes sharing code difficult, consider extracting this to a shared function or ext directive if this pattern grows more complex.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modules/bigbio/thermorawfileparser/main.nf` around lines 44 - 52, The
suffix/prefix/formatArg determination (using args, formatArg, prefix, suffix and
task.ext / meta.id) is duplicated between the stub and script; extract this
logic into a single reusable helper (e.g., an ext helper function like
getFormatSuffix(args) or a shared ext properties block on task.ext) and call
that helper from both places instead of repeating the conditional chain,
ensuring the helper implements the same checks for "-f=" and "--format N" and
"--gzip" and falls back to "${meta.id}" for prefix when needed.

58-61: Stub creates versions.yml but it's not declared as an output.

The stub generates versions.yml, but versions are now captured via the eval() topic channel (line 17). This file won't be collected and may be dead code. Consider removing it or adding a corresponding output declaration if it's needed for compatibility.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modules/bigbio/thermorawfileparser/main.nf` around lines 58 - 61, The stub
writes a versions.yml file that isn't declared as an output and is redundant
because versions are emitted via the eval() topic channel; either remove the
heredoc that creates "versions.yml" (the block referencing "${task.process}":
ThermoRawFileParser: $(ThermoRawFileParser.sh --version)) or, if you need to
keep it for compatibility, add "versions.yml" to the process outputs so Nextflow
will collect it (update the process outputs in the same process definition that
contains task.process and ThermoRawFileParser).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@modules/bigbio/thermorawfileparser/main.nf`:
- Line 16: The tests still reference the old emitted channel name convert_files;
update the assertions in modules/bigbio/thermorawfileparser/tests/main.nf.test
to use the renamed channel spectra by replacing process.out.convert_files[0][1]
with process.out.spectra[0][1] at both assertion locations (previously on lines
~26 and ~49), ensuring the test uses the tuple val(meta), path(...), emit:
spectra emission defined in main.nf.

In `@modules/bigbio/thermorawfileparser/meta.yml`:
- Line 19: The tool_dev_url value in meta.yml contains a stray trailing
double-quote that will corrupt parsing; open
modules/bigbio/thermorawfileparser/meta.yml and remove the extra quote at the
end of the URL value for the tool_dev_url key (ensure the line reads
tool_dev_url: https://github.com/compomics/ThermoRawFileParser without the
trailing ").

---

Nitpick comments:
In `@modules/bigbio/thermorawfileparser/main.nf`:
- Around line 44-52: The suffix/prefix/formatArg determination (using args,
formatArg, prefix, suffix and task.ext / meta.id) is duplicated between the stub
and script; extract this logic into a single reusable helper (e.g., an ext
helper function like getFormatSuffix(args) or a shared ext properties block on
task.ext) and call that helper from both places instead of repeating the
conditional chain, ensuring the helper implements the same checks for "-f=" and
"--format N" and "--gzip" and falls back to "${meta.id}" for prefix when needed.
- Around line 58-61: The stub writes a versions.yml file that isn't declared as
an output and is redundant because versions are emitted via the eval() topic
channel; either remove the heredoc that creates "versions.yml" (the block
referencing "${task.process}": ThermoRawFileParser: $(ThermoRawFileParser.sh
--version)) or, if you need to keep it for compatibility, add "versions.yml" to
the process outputs so Nextflow will collect it (update the process outputs in
the same process definition that contains task.process and ThermoRawFileParser).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 235291df-b05f-481f-bd2e-3a7210b46920

📥 Commits

Reviewing files that changed from the base of the PR and between 9254973 and 1d7d5c3.

📒 Files selected for processing (3)
  • modules/bigbio/thermorawfileparser/environment.yml
  • modules/bigbio/thermorawfileparser/main.nf
  • modules/bigbio/thermorawfileparser/meta.yml
💤 Files with no reviewable changes (1)
  • modules/bigbio/thermorawfileparser/environment.yml

- files are not staged locally for some reason (probably IP blocked at the University?)
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
modules/bigbio/thermorawfileparser/tests/main.nf.test (1)

38-49: ⚠️ Potential issue | 🔴 Critical

Test will fail: meta.id doesn't match expected output filename.

The stub uses meta.id as the prefix (per main.nf line 51: def prefix = task.ext.prefix ?: "${meta.id}"). Here meta.id is 'test', so the stub will create test.mzML, but line 49 asserts 'test_sample.mzML'.

Either fix the input to use id: 'test_sample' or update the assertion to expect 'test.mzML'.

🐛 Proposed fix (option 1: fix the input)
 input[0] = [
-    [ id: 'test', mzml_id: 'test_sample' ],
+    [ id: 'test_sample', mzml_id: 'test_sample' ],
     file(params.test_data['proteomics']['msspectra']['ups1_50amol_r3'], checkIfExists: false)
 ]
🐛 Proposed fix (option 2: fix the assertion)
-            assert new File(process.out.spectra[0][1]).name == 'test_sample.mzML'
+            assert new File(process.out.spectra[0][1]).name == 'test.mzML'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modules/bigbio/thermorawfileparser/tests/main.nf.test` around lines 38 - 49,
The test input uses meta.id 'test' but the assertion expects filename
'test_sample.mzML', causing mismatch because main.nf uses def prefix =
task.ext.prefix ?: "${meta.id}" (so output becomes test.mzML); fix by making the
input id match the expected filename (change id to 'test_sample' in the input
block) or alternatively update the assertion to expect 'test.mzML' (adjust the
assert new File(process.out.spectra[0][1]).name accordingly) — edit either the
input id or the expected filename to make meta.id and the assertion consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@modules/bigbio/thermorawfileparser/tests/main.nf.test`:
- Around line 38-49: The test input uses meta.id 'test' but the assertion
expects filename 'test_sample.mzML', causing mismatch because main.nf uses def
prefix = task.ext.prefix ?: "${meta.id}" (so output becomes test.mzML); fix by
making the input id match the expected filename (change id to 'test_sample' in
the input block) or alternatively update the assertion to expect 'test.mzML'
(adjust the assert new File(process.out.spectra[0][1]).name accordingly) — edit
either the input id or the expected filename to make meta.id and the assertion
consistent.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ae5a4db4-50a0-47c7-a79b-75a4927005c2

📥 Commits

Reviewing files that changed from the base of the PR and between 1d7d5c3 and 47ebbd0.

⛔ Files ignored due to path filters (1)
  • modules/bigbio/thermorawfileparser/tests/main.nf.test.snap is excluded by !**/*.snap
📒 Files selected for processing (2)
  • modules/bigbio/thermorawfileparser/main.nf
  • modules/bigbio/thermorawfileparser/tests/main.nf.test
✅ Files skipped from review due to trivial changes (1)
  • modules/bigbio/thermorawfileparser/main.nf

enryH added 2 commits March 20, 2026 13:02
- switched to id in module tag
- quantms uses mzml_id, but there is no id otherwise (so can be swapped out)
@ypriverol ypriverol self-requested a review March 20, 2026 13:18
@ypriverol ypriverol merged commit 0318e0a into bigbio:main Mar 20, 2026
8 checks passed
@coderabbitai coderabbitai bot mentioned this pull request Mar 23, 2026
11 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants