Releases: PacificBiosciences/HiFi-human-WGS-WDL
v3.3.1
What's Changed
Version Bumps
The previous workflow release (v3.3.0) noted an upgrade to StarPhase 2.0.1, but used an incorrect image hash (for 2.0.0). This release corrects the hash.
Full Changelog: v3.3.0...v3.3.1
v3.3.0
What's Changed
Breaking Changes
- refactor: rename Boolean gpu to Boolean use_gpu for clarity
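Inputs files written for earlier releases will still carry the old name. A minimal sketch of renaming the key in Python, assuming the inputs are a flat JSON object whose keys look like <workflow>.<input>; the helper name and the humanwgs_singleton prefix are illustrative assumptions, not taken from the release notes:

```python
import json


def rename_gpu_input(inputs: dict) -> dict:
    """Return a copy of an inputs mapping with '<prefix>.gpu' renamed to '<prefix>.use_gpu'.

    Hypothetical helper: the '<workflow>.<input>' key layout is an assumption.
    """
    migrated = {}
    for key, value in inputs.items():
        # Split off the last dotted component so only an input named exactly 'gpu' matches.
        prefix, _, name = key.rpartition(".")
        if name == "gpu":
            key = f"{prefix}.use_gpu" if prefix else "use_gpu"
        migrated[key] = value
    return migrated


# Illustrative inputs; the workflow prefix and sample_id key are placeholders.
old_inputs = {"humanwgs_singleton.gpu": True, "humanwgs_singleton.sample_id": "HG002"}
new_inputs = rename_gpu_input(old_inputs)
print(json.dumps(new_inputs, indent=2))
```

Matching on the last dotted component, rather than a substring, avoids touching unrelated inputs whose names merely contain "gpu".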
New Features
- feature: add option Boolean use_parabricks_deepvariant to use Parabricks 4.7.0-1 DeepVariant for small variant calling (equivalent to DeepVariant 1.9.0)
  - Parabricks DeepVariant requires 4 NVIDIA GPUs, 48 threads, and 192 GiB of memory, and replaces all three steps of standard Google DeepVariant.
  - In our hands on HPC with 4x A100, the runtime is ~20 minutes, plus 10 minutes of VCF post-processing.
  - The version of Parabricks DeepVariant packaged in the clara-parabricks v4.7.0-1 image is equivalent to standard Google DeepVariant v1.9.0; however, small differences in the output are likely and expected.
  - We have only tested this version on HPC so far.
- docs: add documentation for Parabricks DeepVariant
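Before enabling the Parabricks option, it can help to confirm that a node meets the stated requirements. A minimal sketch in Python, using only the figures quoted above (4 GPUs, 48 threads, 192 GiB); the NodeResources class and the function name are hypothetical and not part of the workflow:

```python
from dataclasses import dataclass
from typing import List

# Requirements quoted in the release notes for Parabricks DeepVariant.
REQUIRED_GPUS = 4
REQUIRED_THREADS = 48
REQUIRED_MEM_GIB = 192


@dataclass
class NodeResources:
    """Hypothetical description of one compute node."""
    gpus: int
    threads: int
    mem_gib: int


def parabricks_shortfalls(node: NodeResources) -> List[str]:
    """List the requirements the node fails to meet; an empty list means it qualifies."""
    problems = []
    if node.gpus < REQUIRED_GPUS:
        problems.append(f"needs {REQUIRED_GPUS} GPUs, has {node.gpus}")
    if node.threads < REQUIRED_THREADS:
        problems.append(f"needs {REQUIRED_THREADS} threads, has {node.threads}")
    if node.mem_gib < REQUIRED_MEM_GIB:
        problems.append(f"needs {REQUIRED_MEM_GIB} GiB, has {node.mem_gib}")
    return problems


# Illustrative node: enough GPUs and memory, too few threads.
node = NodeResources(gpus=4, threads=32, mem_gib=256)
print(parabricks_shortfalls(node))
```

This only checks counts; it does not verify that the GPUs are NVIDIA devices or that the scheduler will grant them to a single task.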
Version Bumps
- bump: update HiPhase to 1.6.0
- bump: update pbmm2 to 26.1.0
- bump: update pb-StarPhase to 2.0.1
- bump: update Mitorsaw to 0.2.7
- bump: update Paraphase to 3.5.0
- bump: update MethBat to 0.17.0
- bump: update DeepVariant to 1.10.0
Refactors
- refactor: Update all workflows, tasks, and subworkflows for sprocket compatibility
  - reformat meta and parameter_meta
  - symlink most inputs into working directory
  - quote most filenames
- refactor: increase resource allocation for tasks that are not IO-bound
- refactor: remove GitHub Actions; we're now testing internally
- refactor: removing wdl-common submodule to reduce complexity
Full Changelog: v3.2.1...v3.3.0
v3.2.1
What's Changed
- Reverting a previous change to how DeepVariant was called that resulted in increased memory usage.
Thanks @yttria-aniseia and @brentp for reporting and confirming the fix.
Full Changelog: v3.2.0...v3.2.1
v3.2.0
We recommend that everyone using v3.* upgrade to this release to improve phasing accuracy.
Change Log
This minor release updates TRGT to v5.0.0, and moves the TRGT task downstream of variant/read phasing. The TRGT genotypes are still in-phase, relative to the phased small variants, structural variants, and haplotagged reads, but the resulting phase blocks are more accurate and slightly shorter.
In addition, the TRGT coverage dropout script has been replaced and the new script should be both faster and more accurate, with runtimes of ~2 minutes for 1M sites. There's more documentation about the new format on the wiki.
Finally, BAM merging will be skipped upstream of DeepVariant, which should result in a slight decrease in overall wall time.
What's Changed
- bump: Update TRGT to v5.0.0 by @williamrowell in #272
Full Changelog: v3.1.1...v3.2.0
v3.1.1
v3.1.0
What's Changed
Breaking Changes
- new static resource bundle v3.1.0 available from 10.5281/zenodo.17086906
  - updated recommended TRGT repeat catalog: adotto_strchive_20250827.hg38.bed.gz
  - new ref_map key: methbat_region_tsv with recommended value cpgIslandExt.sorted.hg38.tsv
- renamed outputs:
  - stat_num_reads -> stat_read_count
  - stat_mapped_percent -> stat_mapped_read_percent
  - stat_mean_depth -> stat_depth_mean
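Downstream scripts that read the old stat names need updating. A minimal sketch in Python of translating a record produced by an earlier release, assuming the stats have been loaded into a plain dict; the function name and the example values are illustrative:

```python
# Old -> new output names, as listed under "renamed outputs" in this release.
RENAMED_STATS = {
    "stat_num_reads": "stat_read_count",
    "stat_mapped_percent": "stat_mapped_read_percent",
    "stat_mean_depth": "stat_depth_mean",
}


def upgrade_stat_names(record: dict) -> dict:
    """Return a copy of record with pre-v3.1.0 stat names replaced by the new ones."""
    # Keys that were not renamed pass through unchanged.
    return {RENAMED_STATS.get(key, key): value for key, value in record.items()}


# Illustrative record; the values are placeholders, not real results.
old_record = {"stat_num_reads": "5000000", "stat_mean_depth": "31.2", "sample_id": "HG002"}
new_record = upgrade_stat_names(old_record)
print(new_record)
```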
New Inputs
- optional sample input: Array[File]? fail_reads
- new ref_map key: methbat_region_tsv
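A ref_map written for an earlier release will lack the new key. A minimal sketch in Python of checking for it, assuming the ref_map is a two-column, tab-separated key/value file; that layout, the function name, and the example rows are assumptions for illustration:

```python
import csv
import io


def missing_ref_map_keys(ref_map_text, required=("methbat_region_tsv",)):
    """Return the required keys absent from a ref_map given as tab-separated key/value lines."""
    reader = csv.reader(io.StringIO(ref_map_text), delimiter="\t")
    # The first column of each non-empty row is treated as the key.
    present = {row[0] for row in reader if row}
    return [key for key in required if key not in present]


# Illustrative ref_map contents; the 'name' row and the path are placeholders.
example = "name\tGRCh38\nmethbat_region_tsv\t/path/to/cpgIslandExt.sorted.hg38.tsv\n"
print(missing_ref_map_keys(example))
```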
New Outputs
- from new methbat task:
  - File? or Array[File?] methbat_profile
  - String or Array[String] stat_methbat_methylated_count
  - String or Array[String] stat_methbat_unmethylated_count
  - String or Array[String] stat_methbat_asm_count
- sawfish_call task:
  - File or Array[File] sv_copynum_summary
- bam_stats task:
  - String or Array[String] stat_gap_compressed_identity_mean
  - String or Array[String] stat_gap_compressed_identity_median
Change Log
- new MethBat task using v0.15.0 #257
  - added new methbat_region_tsv input in ref_map
  - generates methbat_profile output with methylation status of defined CpG islands
- TRGT v4.0.0 #245
  - added new optional fail_reads input, only used for TRGT genotyping; requires flagging regions in TRGT bed file for which to include fail_reads, as described in our TRGT documentation #253
  - including fail_reads helps with sensitivity for some repeat expansions, notably FXN, RFC1, and FGF14
- Sawfish v2.1.1 #255
  - added new copy-number summary output, useful for evaluating chromosome copy-number
- Paraphase v3.3.4 #247
- Mitorsaw v0.2.4 #256
- added read length N50 and gap-compressed identity mean/median to output stats #254
- renamed some output stats for more consistent naming (e.g., <metric_name>_<count/mean/median>) #254
- modified plotting/aggregation Python scripts to use less memory #250
Full Changelog: v3.0.2...v3.1.0
v3.0.2
What's Changed
- bump: Update Sawfish to 2.0.3 and Sawshark to 0.3.0 by @williamrowell in #239
- Docs/update-zenodo by @williamrowell in #241
- Co-authored-by: Helena hexylena@galaxians.org
- bump: Update mitorsaw to v0.2.3 by @williamrowell in #242
Full Changelog: v3.0.1...v3.0.2
v3.0.1
The family workflow in v3.0.0 had a critical bug where VCFs, rather than gVCFs, were provided as input to GLnexus. Please upgrade to this release.
What's Changed
- fix: Add pbtk to the image manifest. by @williamrowell in #230
- fix: GLnexus inputs should be gVCFs, not VCFs (#231) by @williamrowell in #232
Full Changelog: v3.0.0...v3.0.1
v3.0.0
Major tool changes/additions
- replace pbsv and HiFiCNV with Sawfish v2.0.1
- annotate ALU/L1/SVA insertions/deletions with Sawshark v0.2.0 (for pbsv compatibility)
- add Mitorsaw v0.2.1 for mitochondrial variant calling (including heteroplasmy)
- new resource bundle https://zenodo.org/records/15750792
Mechanistic updates
- input BAMs are broken into chunks for parallel pbmm2 alignment, reducing the wall-time significantly in the cloud or on HPCs with high availability (e.g., >5h to ~1h on our HPC)
- inferred sex (based on chrY coverage) is used for all tasks that use allosome karyotype (Sawfish, TRGT, pedigree)
- new msg and msg_file outputs:
  - expose warning messages that would otherwise be buried in task logs, e.g., input is already aligned, input lacks basemods, input has kinetics tags, or reported sex does not match inferred sex
Tool version bumps
- update DeepVariant to v1.9.0
  - note: the task runtime.gpu syntax was slightly modified, which could affect HPC+GPU users; update to the newest miniwdl-slurm or contact support@pacb.com if you have any questions
- update HiPhase to v1.5.0
- update StarPhase to v1.4.1
- update TRGT to v3.0.0
- update Paraphase to v3.3.2
  - modified so that a Paraphase task failure no longer causes the workflow to fail
Tertiary analysis updates
- update CoLoRSdb VCFs to v1.2.0
- update loss of function constraint lookups to gnomAD v4.1
- update ClinVar Annotations
- pedigree files are generated using native WDL functions (reduces complexity, improves consistency across platforms)