[DRAFT] Feature/ploidy support #2091

Draft
zhliUU wants to merge 3 commits into nf-core:dev from zhliUU:feature/ploidy-support

zhliUU commented Jan 12, 2026

[DRAFT]: Add per-sample ploidy support for HaplotypeCaller

Summary

  • Add ploidy column to input schema (integer, default: 2, minimum: 1)
  • Pass ploidy to HaplotypeCaller via --sample-ploidy in ext.args
  • Add test profiles and CSV files for haploid (ploidy=1) and triploid (ploidy=3)
  • Skip VCFtools for polyploid samples (ploidy > 2) since VCFtools TsTv does not support polyploid genotypes
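The rules in this summary can be sketched in a few lines of Python. This is only an illustration of the intended behaviour, not the PR's code (the real logic lives in the Nextflow config and the JSON schema); the function names `haplotypecaller_args` and `run_vcftools` are hypothetical.

```python
DEFAULT_PLOIDY = 2  # schema default stated in the summary


def haplotypecaller_args(ploidy: int = DEFAULT_PLOIDY) -> str:
    """Return the extra HaplotypeCaller argument for a sample's ploidy."""
    # Reject bools explicitly, since bool is a subclass of int in Python.
    if isinstance(ploidy, bool) or not isinstance(ploidy, int) or ploidy < 1:
        raise ValueError(f"ploidy must be an integer >= 1, got {ploidy!r}")
    return f"--sample-ploidy {ploidy}"


def run_vcftools(ploidy: int = DEFAULT_PLOIDY) -> bool:
    """VCFtools is skipped only for polyploid samples (ploidy > 2)."""
    return ploidy <= 2


if __name__ == "__main__":
    for p in (1, 2, 3):
        print(p, haplotypecaller_args(p), run_vcftools(p))
```

Haploid and diploid samples keep VCFtools enabled; only ploidy 3 and above disables it.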

Discussion: Per-sample vs global ploidy

This PR introduces per-sample ploidy via the input CSV. This differs from existing global parameters like ascat_ploidy and cf_ploidy.

Per-sample ploidy (CSV input):

  • Allows mixed ploidy samples in one run
  • More accurate for heterogeneous datasets
  • Currently implemented for: HaplotypeCaller

Global ploidy (parameters):

  • Simpler for uniform datasets
  • Currently used, mainly by power users, for: ASCAT (ascat_ploidy, which overrides ASCAT's own ploidy optimization) and ControlFREEC (cf_ploidy, declared with "type": "string" in the parameter schema)

Questions for maintainers:

  1. Should we keep both approaches (per-sample CSV + global parameter)?
  2. Should per-sample ploidy be extended to other tools (FreeBayes, CNVkit, TIDDIT)?
  3. How should per-sample ploidy interact with existing global parameters?

Changes

File                                        Description
assets/schema_input.json                    Add ploidy field (integer, default: 2, min: 1)
conf/modules/haplotypecaller.config         Pass --sample-ploidy to GATK4_HAPLOTYPECALLER
conf/modules/modules.config                 Skip VCFTOOLS for ploidy > 2
conf/test/tools_germline_haploid.config     Test profile for haploid samples
conf/test/tools_germline_triploid.config    Test profile for triploid samples
nextflow.config                             Register new test profiles
tests/csv/3.0/recalibrated_germline_*.csv   Test input CSVs with ploidy column
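
To make the schema constraint concrete, here is a rough Python sketch of how a sample sheet with the new column could be read. Only the `ploidy` column comes from this PR; the other column names, the sample values, and the assumption that an empty cell falls back to the default of 2 are placeholders for illustration.

```python
import csv
import io

# Hypothetical sample sheet: only the `ploidy` column comes from this PR;
# the remaining columns and values are made-up placeholders.
SAMPLESHEET = """patient,sample,cram,ploidy
p1,s1,s1.cram,1
p2,s2,s2.cram,
p3,s3,s3.cram,3
"""


def read_ploidies(text: str) -> dict[str, int]:
    """Map sample name to ploidy, using the default of 2 for empty cells."""
    ploidies = {}
    for row in csv.DictReader(io.StringIO(text)):
        raw = (row.get("ploidy") or "").strip()
        # int() raises ValueError for non-integer input such as "1.5".
        ploidy = int(raw) if raw else 2
        if ploidy < 1:
            raise ValueError(f"{row['sample']}: ploidy must be >= 1, got {ploidy}")
        ploidies[row["sample"]] = ploidy
    return ploidies


if __name__ == "__main__":
    print(read_ploidies(SAMPLESHEET))
```

Existing sample sheets without a ploidy column go through the same path and end up with ploidy 2 for every sample, which is the backwards-compatible case.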

Test plan

  • Test haploid sample (ploidy=1):

    nextflow run . -profile test,tools_germline_haploid,docker \
      --outdir results_haploid \
      --tools haplotypecaller \
      --skip_tools haplotypecaller_filter \
      -resume
  • Test triploid sample (ploidy=3):

    nextflow run . -profile test,tools_germline_triploid,docker \
      --outdir results_triploid \
      --tools haplotypecaller \
      --skip_tools haplotypecaller_filter \
      -resume
  • Verify VCFtools is skipped for triploid samples

  • Verify default ploidy=2 works for existing pipelines (backwards compatible: nextflow run . -profile test,docker --outdir <OUTDIR>).
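
One way to check the results of the runs above is to count alleles in the GT field of the output VCF: haploid calls should carry one allele, triploid calls three. The sketch below is an assumption about how that check could look; `genotype_ploidies` is a hypothetical helper and the two records are invented toy data, not pipeline output.

```python
import re


def genotype_ploidies(vcf_text: str) -> set[int]:
    """Return the set of allele counts seen in the first sample's GT field."""
    seen = set()
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        fmt_keys = fields[8].split(":")   # FORMAT column, e.g. GT:DP
        sample = fields[9].split(":")     # first sample column
        gt = sample[fmt_keys.index("GT")]
        # Alleles are separated by "/" (unphased) or "|" (phased).
        seen.add(len(re.split(r"[/|]", gt)))
    return seen


# Invented toy records for illustration only.
HEADER = "\t".join(
    ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT", "s1"]
)
TRIPLOID_VCF = "\n".join([
    "##fileformat=VCFv4.2",
    HEADER,
    "\t".join(["chr1", "100", ".", "A", "G", "50", ".", ".", "GT:DP", "0/1/1:30"]),
    "\t".join(["chr1", "200", ".", "C", "T", "60", ".", ".", "GT:DP", "1/1/1:25"]),
])

if __name__ == "__main__":
    print(genotype_ploidies(TRIPLOID_VCF))
```

A result containing more than one value, or a value other than the ploidy given in the CSV, would suggest that --sample-ploidy did not reach HaplotypeCaller for that sample.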

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/sarek branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core pipelines lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

FriederikeHanssen and others added 3 commits December 11, 2025 15:38
- Add ploidy column to input schema (integer, default: 2, min: 1)
- Pass ploidy to HaplotypeCaller via --sample-ploidy in ext.args
- Default ploidy is 2 for diploid organisms (human, mouse, etc.)
- Add test profiles and CSV files for haploid (ploidy=1) and triploid (ploidy=3)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
VCFtools TsTv calculations do not support polyploid genotypes.
Skip VCFTOOLS processes when sample ploidy exceeds 2.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
zhliUU marked this pull request as draft January 12, 2026 13:53