feat: add wiffconverter container with convert CLI#18
Conversation
Introduces a new container at `wiffconverter-0.10/` that wraps `sciex/wiffconverter:0.10` and exposes a single `convert` CLI on PATH for SCIEX `.wiff` (+ companion `.wiff.scan`) → indexed mzML conversion in one step. Designed for use by quantmsdiann to ingest AbSciex data natively without a separate indexing pass. Why a wrapper, not the upstream entrypoint: - `OneOmics.WiffConverter.exe` does not always propagate failure via its exit code, so the wrapper validates the output ends `</indexedmzML>` and contains an `<indexList>` element. - The output must be `indexedmzML` for downstream tools (DIA-NN, OpenMS), so the wrapper always passes `--index` — there is no opt-out flag. - A `convert --input X --output Y --mode centroid|profile` interface matches the DIA-NN / Relink containers' style (binary on PATH, flag-driven) and replaces the original positional `wiff-to-mzml` prototype. - On failure the wrapper prints a framed banner with input/output/mode and the last 40 lines of the underlying converter log, and retains the log file at `<output>.log` so the operator can diagnose without re-running. On success the log is auto-deleted unless `--log <path>` was passed. Validated end-to-end on PXD073289 / OA_5 (936 MB `.wiff.scan`): 192,076 spectra, 1.4 GB indexed mzML output, deterministic across runs. Failure path validated against a corrupt `.wiff` (`OpenMcdf` signature exception) — banner + log retention behave as designed. README documents the `convert` CLI, the always-`--index` guarantee, and the failure-banner behavior. The CI workflow change required to add a `build-wiffconverter` job is left for a separate commit (token used here lacks `workflow` scope). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 56 minutes and 15 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughAdds a new WiffConverter container (v0.10): README docs, a Dockerfile extending Changes
Sequence DiagramsequenceDiagram
actor User
participant CLI as convert CLI
participant FS as Filesystem
participant Mono as Mono Runtime
participant OneOmics as OneOmics.WiffConverter.exe
User->>CLI: run convert --input file.wiff --output out.mzML --mode centroid
CLI->>CLI: parse args & validate presence of file.wiff and file.wiff.scan
CLI->>FS: verify input files exist
alt inputs missing
CLI->>FS: write diagnostics log
CLI->>User: print diagnostics & exit 2
else inputs present
CLI->>Mono: mono OneOmics.WiffConverter.exe --index --overwrite -centroid ...
Mono->>OneOmics: execute conversion
OneOmics->>FS: write indexed mzML output
CLI->>FS: read output file
CLI->>CLI: validate output (non-empty, contains <indexList>, ends with </indexedmzML>)
alt validation fails
CLI->>FS: tail last 40 log lines
CLI->>User: print failure details & exit 2
else validation succeeds
CLI->>FS: remove auto log if none provided
CLI->>User: print success summary (size, indexed=yes)
end
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested labels
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
wiffconverter-0.10/convert (1)
75-81: Consider creating the output directory before invoking mono.If
--outputpoints at a path whose parent directory doesn't exist (e.g. a typo, or a Nextflow staging path that hasn't been created), mono will fail with a generic error and the user has to dig through the framed log to figure out what happened. A one-linemkdir -pis cheap and turns this into a non-issue.♻️ Suggested addition before the mono invocation
+mkdir -p "$(dirname "$OUTPUT")" + # Make sure mono's stdout/stderr is captured to the log; tee it for live progress. status=0🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@wiffconverter-0.10/convert` around lines 75 - 81, Before invoking mono (the block running /usr/bin/mono "$WIFF_BIN" ... > >(tee "$LOG") 2>&1), ensure the directory for the OUTPUT path exists by creating its parent directory (use the OUTPUT variable to derive the parent) with mkdir -p; add this one-line check/creation immediately before the mono call so that missing parent directories are created and mono fails with a clear error only for real runtime issues.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@README.md`:
- Around line 161-167: The README references a stale wrapper name "wiff-to-mzml"
that doesn't exist in the image; update the WiffConverter Container bullets to
reflect the actual installed CLI (/usr/local/bin/convert) and its invocation
name "convert", and ensure this matches the usage example elsewhere in README.md
(the install step that places the wrapper at /usr/local/bin/convert and any code
blocks invoking convert).
- Around line 269-272: The README currently claims "Builds and pushes
WiffConverter Docker and Singularity containers" but the workflow
`.github/workflows/quantms-containers.yml` contains only `build-diann` and
`build-relink` (and `sync-openms` depends on those), so update README.md to
reflect reality by either removing the WiffConverter line or marking the third
item ("Builds and pushes WiffConverter Docker and Singularity containers") as
"(planned)"/"(to be added)" until the `build-wiffconverter` job is added; ensure
the text mentions the actual workflow jobs `build-diann`, `build-relink`, and
`sync-openms` so the docs and workflow are consistent.
In `@wiffconverter-0.10/convert`:
- Around line 75-102: The current process substitution using >(...) with tee is
racy because the tee subshell may still be flushing when we check $status and
call fail(); replace it with a real pipe and capture mono's exit code via
PIPESTATUS (e.g., /usr/bin/mono ... 2>&1 | tee "$LOG"; status=${PIPESTATUS[0]})
so tee runs synchronously and $status reflects mono's exit, or alternatively
capture the PID of the tee subprocess when using the process substitution and
call wait on that PID before checking $status and invoking fail(); adjust the
mono invocation and the status-checking logic (the status variable, mono command
block, and the if [[ $status -ne 0 ]] ... fail() path) accordingly.
---
Nitpick comments:
In `@wiffconverter-0.10/convert`:
- Around line 75-81: Before invoking mono (the block running /usr/bin/mono
"$WIFF_BIN" ... > >(tee "$LOG") 2>&1), ensure the directory for the OUTPUT path
exists by creating its parent directory (use the OUTPUT variable to derive the
parent) with mkdir -p; add this one-line check/creation immediately before the
mono call so that missing parent directories are created and mono fails with a
clear error only for real runtime issues.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: bd2d1684-a5ac-450f-8b6d-1cd6d7210b79
📒 Files selected for processing (3)
README.mdwiffconverter-0.10/Dockerfilewiffconverter-0.10/convert
| 1. Builds and pushes DIA-NN Docker and Singularity containers (all versions) | ||
| 2. Builds and pushes Relink Docker and Singularity containers | ||
| 3. Syncs OpenMS containers from the official repository to BigBio | ||
| 3. Builds and pushes WiffConverter Docker and Singularity containers | ||
| 4. Syncs OpenMS containers from the official repository to BigBio |
There was a problem hiding this comment.
CI/CD documentation is ahead of the actual workflow.
Per the workflow file (.github/workflows/quantms-containers.yml), the only build jobs are build-diann and build-relink, and sync-openms depends only on [build-diann, build-relink]. There is no build-wiffconverter job, so step 3 ("Builds and pushes WiffConverter Docker and Singularity containers") is not yet true. The PR description acknowledges this is intentional (workflow scope on the push token), with the workflow change deferred to a follow-up commit.
Until that follow-up lands, consider adjusting this list to avoid documenting behavior that doesn't exist yet — e.g. mark step 3 as "(planned)" or drop it from this PR and add it together with the workflow change.
📝 Suggested wording until the workflow job is added
1. Builds and pushes DIA-NN Docker and Singularity containers (all versions)
2. Builds and pushes Relink Docker and Singularity containers
-3. Builds and pushes WiffConverter Docker and Singularity containers
-4. Syncs OpenMS containers from the official repository to BigBio
+3. Syncs OpenMS containers from the official repository to BigBio
+
+_WiffConverter (`wiffconverter-0.10/`) is built manually for now; an automated
+`build-wiffconverter` job will be added in a follow-up PR._📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| 1. Builds and pushes DIA-NN Docker and Singularity containers (all versions) | |
| 2. Builds and pushes Relink Docker and Singularity containers | |
| 3. Syncs OpenMS containers from the official repository to BigBio | |
| 3. Builds and pushes WiffConverter Docker and Singularity containers | |
| 4. Syncs OpenMS containers from the official repository to BigBio | |
| 1. Builds and pushes DIA-NN Docker and Singularity containers (all versions) | |
| 2. Builds and pushes Relink Docker and Singularity containers | |
| 3. Syncs OpenMS containers from the official repository to BigBio | |
| _WiffConverter (`wiffconverter-0.10/`) is built manually for now; an automated | |
| `build-wiffconverter` job will be added in a follow-up PR._ |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@README.md` around lines 269 - 272, The README currently claims "Builds and
pushes WiffConverter Docker and Singularity containers" but the workflow
`.github/workflows/quantms-containers.yml` contains only `build-diann` and
`build-relink` (and `sync-openms` depends on those), so update README.md to
reflect reality by either removing the WiffConverter line or marking the third
item ("Builds and pushes WiffConverter Docker and Singularity containers") as
"(planned)"/"(to be added)" until the `build-wiffconverter` job is added; ensure
the text mentions the actual workflow jobs `build-diann`, `build-relink`, and
`sync-openms` so the docs and workflow are consistent.
|
Related with issue bigbio/quantmsdiann#71 |
- convert: replace `> >(tee ...)` process substitution with a regular pipe + PIPESTATUS so the script waits for tee to flush before fail() tails the log or success removes it (no orphan logs, no truncated failure banners). - README: the WiffConverter container ships a `convert` wrapper, not `wiff-to-mzml`.
fix: avoid tee race in convert and correct README wrapper name
There was a problem hiding this comment.
🧹 Nitpick comments (2)
wiffconverter-0.10/convert (2)
69-86: Optional: clean up partial output/log on interrupt.If the user
Ctrl-C's during a long conversion (or the container is killed), a partially-written$OUTPUTplus a$LOG(when auto-named) are left on disk. A small trap before launching mono keeps the workspace tidy and mirrors the success-path log cleanup.♻️ Suggested trap
KEEP_LOG=1 if [[ -z "$LOG" ]]; then LOG="${OUTPUT}.log" KEEP_LOG=0 # only retain on failure fi + +# On interrupt/termination, drop the half-written mzML; keep the log for diagnosis. +cleanup_on_signal() { + rm -f "$OUTPUT" + exit 130 +} +trap cleanup_on_signal INT TERM🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@wiffconverter-0.10/convert` around lines 69 - 86, Add a trap before launching mono that removes the partially-written "$OUTPUT" and auto-named "$LOG" if the script is interrupted or killed: set a trap for INT TERM EXIT that checks KEEP_LOG==0 and then unlinks "$OUTPUT" and "$LOG" if they exist; ensure the trap is cleared (trap - INT TERM EXIT) immediately after the mono pipeline completes successfully (before the existing success-path log cleanup) so successful runs keep their outputs/logs. Reference the variables LOG, OUTPUT, KEEP_LOG and the mono/tee pipeline where PIPESTATUS is captured to locate where to add and clear the trap.
120-123: Relax the<indexListmatch to tolerate whitespace and attribute-less forms.
grep -q "<indexList "requires a literal space immediately after the tag name. mzML in practice uses<indexList count="…">, but the check would miss perfectly legal variants such as<indexList\t…>or a hypothetical<indexList>(without attributes). A small character-class makes it robust without weakening intent.♻️ Proposed fix
-if ! grep -q "<indexList " "$OUTPUT"; then +if ! grep -qE '<indexList[[:space:]>]' "$OUTPUT"; then fail "output is missing <indexList> element" fi🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@wiffconverter-0.10/convert` around lines 120 - 123, The current sanity check in the convert script uses grep -q "<indexList " which only matches a literal space; replace it with a whitespace-tolerant pattern and use extended regex (e.g. change the check to use grep -qE '<indexList([[:space:]]|>)' or equivalent) so it matches `<indexList>` and `<indexList` followed by any whitespace or attributes; update the grep invocation in the block containing the indexList check accordingly (the line to change is the grep in the convert script).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@wiffconverter-0.10/convert`:
- Around line 69-86: Add a trap before launching mono that removes the
partially-written "$OUTPUT" and auto-named "$LOG" if the script is interrupted
or killed: set a trap for INT TERM EXIT that checks KEEP_LOG==0 and then unlinks
"$OUTPUT" and "$LOG" if they exist; ensure the trap is cleared (trap - INT TERM
EXIT) immediately after the mono pipeline completes successfully (before the
existing success-path log cleanup) so successful runs keep their outputs/logs.
Reference the variables LOG, OUTPUT, KEEP_LOG and the mono/tee pipeline where
PIPESTATUS is captured to locate where to add and clear the trap.
- Around line 120-123: The current sanity check in the convert script uses grep
-q "<indexList " which only matches a literal space; replace it with a
whitespace-tolerant pattern and use extended regex (e.g. change the check to use
grep -qE '<indexList([[:space:]]|>)' or equivalent) so it matches `<indexList>`
and `<indexList` followed by any whitespace or attributes; update the grep
invocation in the block containing the indexList check accordingly (the line to
change is the grep in the convert script).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: fc026b3e-e45f-4b40-816c-ff37b4cf932a
📒 Files selected for processing (2)
README.mdwiffconverter-0.10/convert
🚧 Files skipped from review as they are similar to previous changes (1)
- README.md
Wires wiffconverter-0.10 through the existing pipeline mirroring the relink job: - adds `wiffconverter-*/**` to PR path filters - adds `wiffconverter_0_10` paths-filter rule + `CHG_WC010` env var - emits `wiffconverter_matrix` / `has_wiffconverter` from detect-changes - new `build-wiffconverter` job runs after build-relink, gated to the bigbio org (matches relink), pushes to ghcr.io/bigbio/wiffconverter and the wiffconverter-sif Singularity tag, with `:latest` on release - adds build-wiffconverter to sync-openms `needs:`
Summary
wiffconverter-0.10/container wrapssciex/wiffconverter:0.10and adds aconvertCLI onPATHfor SCIEX.wiff→ indexed mzML in one step.indexedmzML(--indexis hard-wired); on failure prints a framed banner with the last 40 lines of the converter log and retains<output>.logfor diagnosis.convert --input X --output Y --mode centroid|profile.--indexguarantee + failure-banner behavior.Why a wrapper
OneOmics.WiffConverter.exedoes not always propagate failure via exit code → wrapper validates the output ends</indexedmzML>and has an<indexList>before reporting success.<indexList>for random access, so non-indexed output is always a bug here..wiff.scan, locked output, unsupported acquisition) are diagnosable from the console without re-running.Validation
Tested against PXD073289 (
OA_5.wiff+ 936 MB.wiff.scan):</indexedmzML>, has<indexList>, success banner printed, log auto-deleted. Result deterministic across two independent runs..wiff.scan): one-line message, exit 64/66..wiff): framed banner + last 40 lines ofOpenMcdf.CFFileFormatExceptiontraceback, log retained at<output>.log, exit 2.CI follow-up (separate commit needed)
This branch deliberately omits the
build-wiffconverterjob in.github/workflows/quantms-containers.yml— the token used to push this branch lacksworkflowscope. The diff is ready locally and will be applied in a follow-up commit so a maintainer withworkflowpermissions can land it.Test plan
docker build -t ghcr.io/bigbio/wiffconverter:0.10 wiffconverter-0.10/).wiff→ confirmconvert: success ... indexed=yesand</indexedmzML>tailworkflowscope)🤖 Generated with Claude Code
Summary by CodeRabbit
Documentation
New Features
Bug Fixes / Reliability
Chores