Update the qpx module container version #38

Merged
ypriverol merged 11 commits into main from dev on May 7, 2026

@ypriverol ypriverol commented May 7, 2026

Pull Request

Description

Checklist

  • Module follows nf-core standards
  • main.nf includes process definition
  • meta.yml includes complete documentation
  • environment.yml specifies dependencies
  • Tests are included
  • Code is formatted (prettier)
  • CI checks pass

Module Type

  • New module
  • Module update
  • Bug fix
  • Documentation

Related Issues

Closes #

Summary by CodeRabbit

  • New Features

    • Added QPX export module for converting DIA-NN proteomics results to QPX Parquet datasets and MuData format
  • Updates

    • Updated PRIDEPY to version 0.0.15 with refreshed dependencies and container image versions

@codacy-production
Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues


coderabbitai Bot commented May 7, 2026

Walkthrough

This PR introduces a new QPX_EXPORT Nextflow module for converting DIA-NN proteomics results to QPX Parquet datasets and MuData files, while also bumping the PRIDEPY module to version 0.0.15. The QPX module includes process definition, metadata contracts, Conda dependencies, comprehensive test coverage, and test data configuration. PRIDEPY updates are minimal version bumps across environment and container images.

Changes

PRIDEPY Version Update

| Layer / File(s) | Summary |
| --- | --- |
| Conda Dependency (modules/bigbio/pridepy/environment.yml) | PRIDEPY version updated from 0.0.14 to 0.0.15 in Conda dependencies. |
| Container Images (modules/bigbio/pridepy/main.nf) | Singularity and Docker container image tags and stub version reporting updated to 0.0.15. |

QPX Module Addition

| Layer / File(s) | Summary |
| --- | --- |
| Module Contracts (modules/bigbio/qpx/meta.yml) | Module metadata defines inputs (DIANN report, PG matrix, SDRF, DIANN log, project accession), outputs (qpx_dataset, mudata H5MU, versions.yml), tool URLs, license, and authors. |
| Conda Dependencies (modules/bigbio/qpx/environment.yml) | Conda environment specifies channels and installs qpx via pip. |
| Process Implementation (modules/bigbio/qpx/main.nf) | QPX_EXPORT process executes qpxc convert diann command with conditional Singularity/Docker container selection, builds CLI arguments, runs embedded Python to convert output to MuData, and provides stub behavior. |
| Test Configuration (modules/bigbio/qpx/tests/nextflow.config) | Test process configuration sets ext.args to empty string and defines test parameters for qpx_version and matrix_qvalue. |
| Test Cases & Test Data (modules/bigbio/qpx/tests/main.nf.test, tests/config/test_data.config) | Two test cases validate full execution and stub mode; global test data configuration adds QPX DIA-NN results reference pointing to existing test dataset. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • bigbio/nf-modules#31: Introduces the PRIDEPY module that is now being updated to version 0.0.15 in this PR.

Suggested labels

Review effort 3/5

Suggested reviewers

  • daichengxin
  • jpfeuffer
  • timosachsenberg


🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ⚠️ Warning | The title 'Update the qpx module container version' is misleading because the PR primarily adds a new QPX module with environment, process, metadata, and tests; container version updates are only a secondary aspect. | Revise the title to reflect the main change, such as 'Add QPX export module for DIA-NN results conversion' or 'Add qpx module with DIA-NN to QPX/MuData export functionality'. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@modules/bigbio/pridepy/environment.yml`:
- Line 6: The environment.yml currently pins bioconda::pridepy=0.0.15 which
doesn't exist in Bioconda; update modules/bigbio/pridepy/environment.yml to
remove the bioconda:: prefix and install pridepy via pip (or change to an
available Bioconda version), e.g., move "pridepy=0.0.15" into the pip section or
replace with a Bioconda-available version (e.g., v0.0.12) so conda environment
resolution succeeds; target symbols: pridepy and environment.yml.

In `@modules/bigbio/pridepy/main.nf`:
- Around line 7-9: The Docker/BioContainers tag
'biocontainers/pridepy:0.0.15--pyhdfd78af_0' referenced in the container
expression (the ternary using workflow.containerEngine == 'singularity' &&
!task.ext.singularity_pull_docker_container) is invalid; replace the Docker
branch of that ternary with the correct BioContainers tag for pridepy 0.0.15
(use the build hash found on BioContainers/Quay/Docker Hub for that release) or
point both branches to the working Singularity URL; update the string after the
colon so the expression uses a valid image name.

In `@modules/bigbio/qpx/environment.yml`:
- Around line 5-7: The Conda environment has an unpinned pip dependency "qpx"
causing non-deterministic installs; update the dependencies block to pin qpx to
the confirmed PyPI release (qpx==1.0.1) or, if the project requires features
from the container's qpx==1.0.2, investigate and document why the container has
a non-PyPI version and either publish that version or adjust the environment to
match; modify the "qpx" entry in the dependencies -> pip list to include the
chosen pin and add a brief comment in the file explaining any deviation if you
opt to investigate/publish instead.

In `@modules/bigbio/qpx/meta.yml`:
- Around line 35-38: The diann_log metadata currently uses a too-narrow glob
("*.log") so files like report.log.txt are not matched; update the diann_log
entry's pattern value (the pattern field under diann_log) to a broader glob such
as "*log*" or a multi-extension glob like "*.{log,txt}" so that files like
report.log.txt are captured by the metadata.

In `@modules/bigbio/qpx/tests/main.nf.test`:
- Around line 18-33: The ZIP download/extraction needs timeouts and a zip-slip
guard: replace the plain new URL(zipUrl).openStream() used inside
zipFile.withOutputStream and the direct zip.getInputStream(entry) reads with
URLConnection/HttpURLConnection that sets connect and read timeouts before
opening the stream, and ensure streams are closed; when extracting entries from
new java.util.zip.ZipFile(...) validate each entry.name by resolving the
destination File (extractFile) to its canonical path and verifying it startsWith
the canonical path of resultsDir (skip and log any entries that don't), continue
to create parent dirs and write only after the canonical-path check to prevent
path traversal and hangs.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 46e7461f-0e15-4a22-8a26-83a4852fbafd

📥 Commits

Reviewing files that changed from the base of the PR and between f8c10a9 and 67bd04a.

⛔ Files ignored due to path filters (2)
  • modules/bigbio/pridepy/tests/main.nf.test.snap is excluded by !**/*.snap
  • modules/bigbio/qpx/tests/main.nf.test.snap is excluded by !**/*.snap
📒 Files selected for processing (8)
  • modules/bigbio/pridepy/environment.yml
  • modules/bigbio/pridepy/main.nf
  • modules/bigbio/qpx/environment.yml
  • modules/bigbio/qpx/main.nf
  • modules/bigbio/qpx/meta.yml
  • modules/bigbio/qpx/tests/main.nf.test
  • modules/bigbio/qpx/tests/nextflow.config
  • tests/config/test_data.config

   - defaults
 dependencies:
-  - bioconda::pridepy=0.0.14
+  - bioconda::pridepy=0.0.15

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

Is bioconda::pridepy 0.0.15 available on Bioconda or Anaconda?

💡 Result:

No, bioconda::pridepy 0.0.15 is not available on Bioconda or Anaconda. pridepy 0.0.15 exists on PyPI [1][2], but searches of Anaconda.org/bioconda show no pridepy package [3][4][5][6][7], and Bioconda recipe docs and GitHub do not list a pridepy recipe [8][9][10][11][12][13]. A Biocontainer exists on quay.io [14], but no conda package in bioconda channel. GitHub repo latest release is v0.0.12 [15], though PyPI has 0.0.15.


Remove bioconda:: channel prefix or update the package source.

The package pridepy=0.0.15 does not exist in the Bioconda channel. It is only available on PyPI. Using bioconda::pridepy=0.0.15 will cause conda environment resolution to fail at runtime. Either specify the package with the pip package manager instead, or use a version that exists in Bioconda (if available). The latest GitHub release is v0.0.12.
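
One way to read the suggested fix is as a mechanical rewrite of the pin: drop the channel prefix and express the same version as a pip requirement. The sketch below assumes the package is to be installed from PyPI through a pip section; `conda_pin_to_pip` is a hypothetical helper name, not part of this repository or of conda.

```python
def conda_pin_to_pip(spec: str) -> str:
    """Rewrite a conda pin such as 'channel::name=version' as 'name==version'.

    Hypothetical helper for illustration only; it handles plain version pins,
    not build strings or version ranges.
    """
    # Drop an optional 'channel::' prefix (e.g. 'bioconda::').
    _, _, package = spec.rpartition("::")
    # Split the package name from its version pin, if there is one.
    name, sep, version = package.partition("=")
    # Conda also accepts '==' pins, so strip any leftover '='.
    version = version.lstrip("=")
    return f"{name}=={version}" if sep else name


print(conda_pin_to_pip("bioconda::pridepy=0.0.15"))
```

The resulting string is what would go under the `pip:` list of environment.yml if the maintainers choose the PyPI route rather than a Bioconda-available version.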

Comment on lines 7 to +9
     container "${workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container
-        ? 'https://depot.galaxyproject.org/singularity/pridepy:0.0.14--pyhdfd78af_0'
-        : 'biocontainers/pridepy:0.0.14--pyhdfd78af_0'}"
+        ? 'https://depot.galaxyproject.org/singularity/pridepy:0.0.15--pyhdfd78af_0'
+        : 'biocontainers/pridepy:0.0.15--pyhdfd78af_0'}"

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check Singularity depot URL availability
curl -sI "https://depot.galaxyproject.org/singularity/pridepy:0.0.15--pyhdfd78af_0" \
  | head -5

# Check Docker Hub / BioContainers tag availability
curl -s "https://registry.hub.docker.com/v2/repositories/biocontainers/pridepy/tags/0.0.15--pyhdfd78af_0/" \
  | python3 -c "import sys,json; d=json.load(sys.stdin); print(d.get('name','NOT FOUND'))"

Repository: bigbio/nf-modules

Length of output: 214


Fix Docker image reference: biocontainers/pridepy:0.0.15--pyhdfd78af_0 does not exist.

The Singularity image exists (HTTP 200), but the Docker/BioContainers tag does not exist in the registry. The workflow will fail when attempting to pull the Docker image. Verify the correct build hash for the 0.0.15 release on BioContainers and update line 9 accordingly.

Comment on lines +5 to +7
+dependencies:
+  - pip:
+      - qpx

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail
python - <<'PY'
import json, urllib.request
data = json.load(urllib.request.urlopen("https://pypi.org/pypi/qpx/json"))
print("latest:", data["info"]["version"])
print("1.0.2 exists:", "1.0.2" in data.get("releases", {}))
PY

Repository: bigbio/nf-modules

Length of output: 2643


🏁 Script executed:

# Check if the file exists and verify its current state
cat -n modules/bigbio/qpx/environment.yml

Repository: bigbio/nf-modules

Length of output: 197


🌐 Web query:

qpx python package pypi version 1.0.2

💡 Result:

For the Python package named “qpx” on PyPI (BigBio’s QPX: Quantitative Proteomics eXchange), version 1.0.2 does not appear to be published. The PyPI project page shows a version history with 1.0.1 (released 2026-04-03) and 1.0.0 (released 2026-04-01), but no 1.0.2 entry is listed.[1] What you can do: 1) Check whether you meant a different “qpx” package name (there are other unrelated “qpx” identifiers online).[2] 2) If you have a requirement for exactly 1.0.2, you may need to install from the package’s GitHub or request that the maintainer publish that release, since it is not present in PyPI’s listed versions.[1][3]

Citations:


🏁 Script executed:

# Check the container specification in main.nf
cat -n modules/bigbio/qpx/main.nf | head -30

Repository: bigbio/nf-modules

Length of output: 1176


Pin qpx in the Conda environment for reproducibility.

Line 7 uses an unpinned pip dependency, so Conda executions can drift from the tested container behavior and produce non-deterministic outcomes.

However, the proposed pin qpx==1.0.2 cannot be used—that version does not exist on PyPI. The latest available version is 1.0.1 (released 2026-04-03). Consider pinning to qpx==1.0.1 or investigating why the container version (1.0.2) differs from PyPI's published releases.
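
The decision described here (use the wanted version if PyPI publishes it, otherwise fall back to the latest published one) can be sketched as a small pure function. It assumes the response shape of `https://pypi.org/pypi/qpx/json` used by the script above, with `info.version` and `releases` keys; `choose_pin` is a hypothetical name, and the sample dictionary only mirrors the versions this review reports rather than a live response.

```python
def choose_pin(pypi_json: dict, wanted: str) -> str:
    """Return `wanted` if it is a published release, else the latest version.

    `pypi_json` is assumed to have the shape of PyPI's JSON API response.
    """
    releases = pypi_json.get("releases", {})
    if wanted in releases:
        return wanted
    # Fall back to the version PyPI reports as current.
    return pypi_json["info"]["version"]


# Sample data reflecting the versions reported in this review (not fetched live).
sample = {"info": {"version": "1.0.1"}, "releases": {"1.0.0": [], "1.0.1": []}}
print("qpx==" + choose_pin(sample, "1.0.2"))
```

Falling back silently is a design choice made for brevity; a stricter variant could raise an error so that a mismatch between the container and PyPI is noticed rather than papered over.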

Comment on lines +35 to +38
+  - diann_log:
+      type: file
+      description: DIA-NN summary log for version detection
+      pattern: "*.log"

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Broaden the DIANN log pattern to match actual filenames.

Line 38 currently restricts logs to *.log, but the module test input uses report.log.txt. Aligning the pattern avoids metadata drift and user confusion.
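
The mismatch can be illustrated with Python's `fnmatch` module, used here only as an approximation of glob matching for these simple patterns; it is an assumption that the `pattern` field in meta.yml is read with equivalent semantics. `matches` is a hypothetical wrapper.

```python
from fnmatch import fnmatchcase


def matches(pattern: str, filename: str) -> bool:
    """Case-sensitive glob match of a file name against a pattern."""
    return fnmatchcase(filename, pattern)


# The file name comes from the module test input mentioned above.
for pattern in ("*.log", "*.log*"):
    print(pattern, matches(pattern, "report.log.txt"))
```

Under these assumptions the narrow pattern rejects report.log.txt while the broadened one accepts both report.log.txt and a plain report.log.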

Proposed fix
   - diann_log:
       type: file
       description: DIA-NN summary log for version detection
-      pattern: "*.log"
+      pattern: "*.log*"

Comment on lines +18 to +33
+            zipFile.withOutputStream { out ->
+                out << new URL(zipUrl).openStream()
+            }
+            def resultsDir = file("test_qpx_data")
+            resultsDir.mkdirs()
+            new java.util.zip.ZipFile(zipFile).withCloseable { zip ->
+                zip.entries().each { entry ->
+                    if (!entry.isDirectory()) {
+                        def extractFile = new File(resultsDir, entry.name)
+                        extractFile.parentFile.mkdirs()
+                        extractFile.withOutputStream { out ->
+                            out << zip.getInputStream(entry)
+                        }
+                    }
+                }
+            }

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Harden ZIP download/extraction (timeouts + Zip Slip guard).

Line 19 has no connect/read timeout, and Line 26 trusts archive entry paths without canonical-path validation. That can hang CI and permits path traversal from crafted ZIPs.
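
For readers less familiar with the two hardening steps, here is the same idea as a minimal Python sketch rather than the Groovy used in the test file. The function names `download` and `safe_extract` are hypothetical; the timeout value is an arbitrary assumption, and the guard resolves each target path and refuses anything that does not end up under the destination directory.

```python
import shutil
import urllib.request
import zipfile
from pathlib import Path


def download(url, dest, timeout=30.0):
    """Fetch `url` into `dest`; the timeout bounds blocking network calls."""
    with urllib.request.urlopen(url, timeout=timeout) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)


def safe_extract(archive, dest_dir):
    """Extract regular files from a ZIP, rejecting entries that escape dest_dir."""
    root = Path(dest_dir).resolve()
    written = []
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            # Resolve '..' segments and absolute names before any write happens.
            target = (root / info.filename).resolve()
            if root not in target.parents:
                raise ValueError(f"Blocked zip entry outside target dir: {info.filename}")
            target.parent.mkdir(parents=True, exist_ok=True)
            with zf.open(info) as src, open(target, "wb") as out:
                shutil.copyfileobj(src, out)
            written.append(target)
    return written
```

Raising on the first offending entry mirrors the proposed Groovy fix below; skipping and logging such entries, as the agent prompt suggests, would be an equally reasonable policy for test data.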

Proposed fix
-            zipFile.withOutputStream { out ->
-                out << new URL(zipUrl).openStream()
-            }
+            def connection = new URL(zipUrl).openConnection()
+            connection.setConnectTimeout(30_000)
+            connection.setReadTimeout(120_000)
+            connection.inputStream.withCloseable { input ->
+                zipFile.withOutputStream { out -> out << input }
+            }
             def resultsDir = file("test_qpx_data")
             resultsDir.mkdirs()
             new java.util.zip.ZipFile(zipFile).withCloseable { zip ->
+                def canonicalRoot = resultsDir.canonicalFile
                 zip.entries().each { entry ->
                     if (!entry.isDirectory()) {
-                        def extractFile = new File(resultsDir, entry.name)
-                        extractFile.parentFile.mkdirs()
-                        extractFile.withOutputStream { out ->
-                            out << zip.getInputStream(entry)
-                        }
+                        def extractFile = new File(resultsDir, entry.name).canonicalFile
+                        if (!extractFile.path.startsWith(canonicalRoot.path + File.separator)) {
+                            throw new SecurityException("Blocked zip entry outside target dir: ${entry.name}")
+                        }
+                        extractFile.parentFile.mkdirs()
+                        zip.getInputStream(entry).withCloseable { input ->
+                            extractFile.withOutputStream { out -> out << input }
+                        }
                     }
                 }
             }

@ypriverol ypriverol merged commit 17a895b into main May 7, 2026
8 checks passed