Add variantextractor module #10894

Open
sbreuch wants to merge 3 commits into nf-core:master from sbreuch:new-module-variant-extractor

sbreuch commented Mar 13, 2026

PR checklist

Closes #9996

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool, have you followed the module conventions in the contribution docs?
  • If necessary, include test data in your PR.
  • Remove all TODO statements.
  • Broadcast software version numbers to topic: versions - See version_topics
  • Follow the naming conventions.
  • Follow the parameters requirements.
  • Follow the input/output options guidelines.
  • Add a resource label
  • Use BioConda and BioContainers if possible to fulfil software requirements.
  • Ensure that the test works with either Docker / Singularity. Conda CI tests can be quite flaky:
    • For modules:
      • nf-core modules test variantextractor --profile docker
      • nf-core modules test variantextractor --profile singularity
      • nf-core modules test variantextractor --profile conda

Description

Add new module for variant-extractor (EUCANCan), a Python library for
deterministic extraction and homogenization of SNVs, indels and structural
variants (SVs) from VCF files.

The tool has no CLI, so the module wraps the Python API directly via an inline
python3 script that uses pysam and VariantExtractor.

Related: nf-core/variantbenchmarking#264

Comment on lines +27 to +46
import pysam
from variant_extractor import VariantExtractor

vcf_in = pysam.VariantFile("${vcf}")
header = str(vcf_in.header).rstrip("\\n")
vcf_in.close()

extractor = VariantExtractor(
    "${vcf}",
    pass_only=${pass_only},
    ensure_pairs=${ensure_pairs}
)

with open("${prefix}.extracted.vcf", "w") as out:
    out.write(header + "\\n")
    for variant_record in extractor:
        out.write(str(variant_record) + "\\n")

extractor.close()
PYEOF
Member

Please use a template for this instead of an inline script. There are a few examples in the modules repository if you search for "template", such as the ANNDATA_BARCODES module.

You might need to remove the def before the variable declarations so that the Python script can access the Groovy variable values.
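
To illustrate the idea, here is a rough sketch in plain Python. It is only an analogy: Python's string.Template happens to use the same ${name} placeholder syntax, so it can mimic how placeholders in a template file get filled from process variables before the script runs. It is not how Nextflow implements templates. The template text, the file name in the comment, and the example values are hypothetical, and the boolean values are assumed to already be valid Python literals.

```python
from string import Template

# Hypothetical body of a template file (e.g. templates/variantextractor.py).
# The ${...} placeholders stand for variables defined in the process.
TEMPLATE_TEXT = '''#!/usr/bin/env python3
from variant_extractor import VariantExtractor

extractor = VariantExtractor("${vcf}", pass_only=${pass_only}, ensure_pairs=${ensure_pairs})
with open("${prefix}.extracted.vcf", "w") as out:
    for variant_record in extractor:
        out.write(str(variant_record) + "\\n")
extractor.close()
'''


def render(template_text, variables):
    """Fill every ${name} placeholder; raises KeyError if one is not available."""
    return Template(template_text).substitute(variables)


# Stand-in for the variables visible to the template (illustrative values only).
process_vars = {
    "vcf": "sample.vcf.gz",
    "prefix": "sample",
    "pass_only": "True",      # assumed to be a Python literal already
    "ensure_pairs": "True",   # assumed to be a Python literal already
}

rendered = render(TEMPLATE_TEXT, process_vars)
print(rendered)

# A variable that is not visible to the template cannot be resolved. This is
# the analogue of a value that is out of scope for the template.
try:
    render(TEMPLATE_TEXT, {"vcf": "sample.vcf.gz"})
except KeyError as missing:
    print(f"unresolved placeholder: {missing}")
```

The point of the sketch is that the script body lives in its own file and only sees the names handed to it, which is why the variable declarations in the process matter.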

Development

Successfully merging this pull request may close these issues.

new module: VARIANT-EXTRACTOR

2 participants