Skip to content

Conversation

@FelixMoelder
Copy link
Contributor

@FelixMoelder FelixMoelder commented Dec 12, 2025

This allows to pass samtools parameters to the wrapper allowing to e.g. specify the cram version created by samtools.
Doing so might be necessary as some tools like GATK and Picard do not yet support cram 3.1 as input files.

QC

While the contributions guidelines are more extensive, please particularly ensure that:

  • test.py was updated to call any added or updated example rules in a Snakefile
  • input: and output: file paths in the rules can be chosen arbitrarily
  • wherever possible, command line arguments are inferred and set automatically (e.g. based on file extensions in input: or output:)
  • temporary files are either written to a unique hidden folder in the working directory, or (better) stored where the Python function tempfile.gettempdir() points to
  • the meta.yaml contains a link to the documentation of the respective tool or command under url:
  • conda environments use a minimal amount of channels and packages, in recommended ordering

Summary by CodeRabbit

  • New Features
    • CRAM output now respects and merges user-provided samtools options with defaults, allowing more flexible samtools configuration during duplicate-marking.
    • This change only augments parameter handling for CRAM processing; no other behavioral changes introduced.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 12, 2025

📝 Walkthrough

Walkthrough

Adds passing of user-provided samtools options into the Picard markduplicates wrapper and a test Snakefile param that sets a samtools option for CRAM output.

Changes

Cohort / File(s) Summary
Test configuration
bio/picard/markduplicates/test/Snakefile
Adds samtools_opts="--output-fmt-option version=3.0" to the markduplicates_cram rule parameters.
Wrapper update
bio/picard/markduplicates/wrapper.py
Builds samtools_opts by concatenating snakemake.params.get("samtools_opts", "") with the wrapper's get_samtools_opts(snakemake) result, so user options are included alongside existing samtools options.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Review focus:
    • Confirm concatenation handles empty or whitespace-only user input correctly (no duplicate spaces).
    • Verify final samtools_opts formatting is valid for samtools and for CRAM output.

Suggested reviewers

  • johanneskoester

Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)
Check name Status Explanation Resolution
Description check ❓ Inconclusive The description explains the purpose (passing samtools parameters) and use case (CRAM version compatibility), but the QC checklist items are all unchecked, suggesting incomplete validation against contribution guidelines. Verify and check off the QC checklist items, particularly confirming test.py was updated and contribution guidelines were followed, to ensure full compliance.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: samtools_opts for picard markduplicates' is concise, follows conventional commit style, and accurately summarizes the main change of adding samtools_opts parameter support.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/picard_samtools_opts

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
bio/picard/markduplicates/wrapper.py (1)

19-19: Consider cleaner string joining to avoid potential leading spaces.

The current concatenation could produce a leading space when samtools_opts is empty (e.g., "" + " " + "..."). While this won't cause functional issues in shell commands, a cleaner approach would be to filter out empty strings.

Apply this diff for cleaner string handling:

-samtools_opts = snakemake.params.get("samtools_opts", "") + " " + get_samtools_opts(snakemake)
+samtools_opts = " ".join(filter(None, [snakemake.params.get("samtools_opts", ""), get_samtools_opts(snakemake)]))
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ab5eca0 and 3c77daa.

📒 Files selected for processing (2)
  • bio/picard/markduplicates/test/Snakefile (1 hunks)
  • bio/picard/markduplicates/wrapper.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

⚙️ CodeRabbit configuration file

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

  • bio/picard/markduplicates/wrapper.py
**/wrapper.py

⚙️ CodeRabbit configuration file

Do not complain about use of undefined variable called snakemake.

Files:

  • bio/picard/markduplicates/wrapper.py
🧠 Learnings (2)
📚 Learning: 2024-11-21T10:23:03.427Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/wrapper.py:5-12
Timestamp: 2024-11-21T10:23:03.427Z
Learning: In the Snakemake wrappers project, avoid suggesting extensive error handling or temporary file management in simple wrapper scripts when it may be unnecessary, to prevent overcomplicating the code.

Applied to files:

  • bio/picard/markduplicates/wrapper.py
📚 Learning: 2025-05-15T07:35:14.369Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3728
File: bio/bwameth/memx/wrapper.py:32-32
Timestamp: 2025-05-15T07:35:14.369Z
Learning: In the bwameth wrapper, when `snakemake.threads` equals 1, `samtools_threads` is intentionally set to 0. This design allows the wrapper to skip adding the `--threads` parameter to samtools, prioritizing the single available thread for the primary alignment task.

Applied to files:

  • bio/picard/markduplicates/wrapper.py
🪛 GitHub Actions: Code quality
bio/picard/markduplicates/wrapper.py

[error] 1-1: Black check failed. 1 file would be reformatted. Run 'black' to format code.

🪛 Ruff (0.14.8)
bio/picard/markduplicates/wrapper.py

19-19: Undefined name snakemake

(F821)


19-19: Undefined name snakemake

(F821)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: docs
  • GitHub Check: testing
  • GitHub Check: Summary
🔇 Additional comments (2)
bio/picard/markduplicates/test/Snakefile (1)

59-59: LGTM! Appropriate test for the new functionality.

The test correctly exercises the new samtools_opts parameter with a valid samtools option for controlling CRAM output version. This aligns with the PR's objective of ensuring compatibility with tools that don't support CRAM 3.1.

bio/picard/markduplicates/wrapper.py (1)

1-64: > Likely an incorrect or invalid review comment.

Remove debug print statement from wrapper.py
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
bio/picard/markduplicates/wrapper.py (1)

19-21: Consider stripping to avoid extra spaces.

The unconditional space separator can produce a leading space when samtools_opts is empty or a trailing space when get_samtools_opts(snakemake) returns an empty string. While harmless, stripping the result improves cleanliness.

Apply this diff to strip extra whitespace:

-samtools_opts = (
-    snakemake.params.get("samtools_opts", "") + " " + get_samtools_opts(snakemake)
-)
+samtools_opts = (
+    snakemake.params.get("samtools_opts", "") + " " + get_samtools_opts(snakemake)
+).strip()
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3c77daa and 21cbbdb.

📒 Files selected for processing (1)
  • bio/picard/markduplicates/wrapper.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

⚙️ CodeRabbit configuration file

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

  • bio/picard/markduplicates/wrapper.py
**/wrapper.py

⚙️ CodeRabbit configuration file

Do not complain about use of undefined variable called snakemake.

Files:

  • bio/picard/markduplicates/wrapper.py
🧠 Learnings (8)
📓 Common learnings
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3496
File: bio/mtnucratio/test/Snakefile:2-6
Timestamp: 2024-11-26T08:31:00.099Z
Learning: In test files for Snakemake wrappers, such as `bio/mtnucratio/test/Snakefile`, hard-coded input and output paths are acceptable as examples and do not need to use wildcards to make paths flexible.
📚 Learning: 2024-11-21T10:23:03.427Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3478
File: bio/varlociraptor/estimate-alignment-properties/wrapper.py:5-12
Timestamp: 2024-11-21T10:23:03.427Z
Learning: In the Snakemake wrappers project, avoid suggesting extensive error handling or temporary file management in simple wrapper scripts when it may be unnecessary, to prevent overcomplicating the code.

Applied to files:

  • bio/picard/markduplicates/wrapper.py
📚 Learning: 2024-10-08T17:41:54.542Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3123
File: utils/datavzrd/wrapper.py:31-32
Timestamp: 2024-10-08T17:41:54.542Z
Learning: In `wrapper.py` scripts, do not flag the use of an undefined variable called `snakemake`.

Applied to files:

  • bio/picard/markduplicates/wrapper.py
📚 Learning: 2025-05-15T07:35:14.369Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3728
File: bio/bwameth/memx/wrapper.py:32-32
Timestamp: 2025-05-15T07:35:14.369Z
Learning: In the bwameth wrapper, when `snakemake.threads` equals 1, `samtools_threads` is intentionally set to 0. This design allows the wrapper to skip adding the `--threads` parameter to samtools, prioritizing the single available thread for the primary alignment task.

Applied to files:

  • bio/picard/markduplicates/wrapper.py
📚 Learning: 2025-05-15T09:46:23.189Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3728
File: bio/bwameth/memx/wrapper.py:55-55
Timestamp: 2025-05-15T09:46:23.189Z
Learning: In the `bio/bwameth/memx/wrapper.py` file, string interpolation for variables like `{samtools_threads}` is intentionally delayed using string placeholders that are resolved later with a single `format(**locals())` call rather than using f-strings for immediate interpolation.

Applied to files:

  • bio/picard/markduplicates/wrapper.py
📚 Learning: 2024-11-26T08:35:42.140Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3497
File: bio/sexdeterrmine/wrapper.py:23-26
Timestamp: 2024-11-26T08:35:42.140Z
Learning: In the `bio/sexdeterrmine/wrapper.py` file (Python), we rely on Samtools to handle input validation for the depth file provided by the user, so additional file existence checks are not necessary.

Applied to files:

  • bio/picard/markduplicates/wrapper.py
📚 Learning: 2024-10-08T17:41:54.542Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-wrappers PR: 3123
File: utils/datavzrd/wrapper.py:31-32
Timestamp: 2024-10-08T17:41:54.542Z
Learning: The `snakemake` variable is inserted via a preamble during execution in `wrapper.py` scripts, so it doesn't need to be explicitly defined.

Applied to files:

  • bio/picard/markduplicates/wrapper.py
📚 Learning: 2024-11-26T08:30:23.818Z
Learnt from: tdayris
Repo: snakemake/snakemake-wrappers PR: 3496
File: bio/mtnucratio/wrapper.py:27-28
Timestamp: 2024-11-26T08:30:23.818Z
Learning: In Snakemake wrappers (e.g., `wrapper.py` files), it's unnecessary to verify the availability of tools like `mtnucratio` within the code, because Snakemake with Conda ensures that the required tools are installed and available.

Applied to files:

  • bio/picard/markduplicates/wrapper.py
🪛 Ruff (0.14.8)
bio/picard/markduplicates/wrapper.py

20-20: Undefined name snakemake

(F821)


20-20: Undefined name snakemake

(F821)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: testing
  • GitHub Check: docs
  • GitHub Check: Summary
🔇 Additional comments (1)
bio/picard/markduplicates/wrapper.py (1)

19-21: [rewritten comment]
[classification tag]

Copy link
Contributor

@fxwiegand fxwiegand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also update the docs with the additional parameters:

params:
- java_opts: allows for additional arguments to be passed to the java compiler, e.g. "-XX:ParallelGCThreads=10" (not for `-XmX` or `-Djava.io.tmpdir`, since they are handled automatically).
- extra: allows for additional program arguments.
- embed_ref: allows to embed the fasta reference into the cram
- withmatecigar: allows to run `MarkDuplicatesWithMateCigar <https://broadinstitute.github.io/picard/command-line-overview.html#MarkDuplicatesWithMateCigar>`_ instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants