|
1 | 1 | Name: "1000 Genomes Phase 3 Reanalysis with DRAGEN 3.5, 3.7, 4.0, 4.2, and 4.4" |
2 | 2 | Description: | |
3 | | - # Description |
| 3 | + <b> Overview </b><p> |
4 | 4 |
|
5 | | - ## Overivew |
6 | | -
|
7 | | - This dataset contains alignment files and small variant (includes single nucleotide variants (SNV) and indels), copy number variant (CNV), short tandem repeat (*i.e.*, repeat expansion; STR), structural variant (SV) and other variant call files from the [1000 Genomes Project (1KGP) Phase 3 dataset](https://www.internationalgenome.org/) (3,202 individuals, 602 trios) using Illumina DRAGEN v3.5.7b, v3.7.6, v4.0.3, v4.2.7, and v4.4.7 software. |
| 5 | + This dataset contains alignment files and small variant (includes single nucleotide variants (SNV) and indels), copy number variant (CNV), short tandem repeat (i.e., repeat expansion; STR), structural variant (SV) and other variant call files from the [1000 Genomes Project (1KGP) Phase 3 dataset](https://www.internationalgenome.org/) (3,202 individuals, 602 trios) using Illumina DRAGEN v3.5.7b, v3.7.6, v4.0.3, v4.2.7, and v4.4.7 software. |
8 | 6 | All DRAGEN analyses were performed in the cloud using the [Illumina Connected Analytics](https://www.illumina.com/products/by-type/informatics-products/connected-analytics.html) bioinformatics platform powered by Amazon Web Services (see ['Data solution empowering population genomics'](https://www.illumina.com/science/genomics-research/articles/data-solution-empowering-population-genomics-research.html) for more information). |
9 | 7 | The v3.7.6, v4.2.7, and v4.4.7 datasets include results from trio small variant, *de novo* structural variant, and *de novo* copy number variant calls on 602 trio families composed of members from the 1KGP Phase 3 dataset. |
10 | 8 | Trio repeat expansion calling was included in the v3.7.6 dataset only. |
11 | 9 | Joint cohort analysis was also performed on the entire 1KGP sample dataset for the v3.7.6, v4.0.3, v4.2.7, and v4.4.7 re-analyses using [DRAGEN Iterative gVCF Genotyper](https://www.illumina.com/products/by-type/informatics-products/dragen-secondary-analysis/iterative-GVCF-genotyper.html) v3.8.3, v4.2.0, v4.2.7, and v4.4.7, respectively (see ['Genotyping variants at population scale using DRAGEN gVCF Genotyper'](https://www.illumina.com/science/genomics-research/articles/gVCF-Genotyper.html) and ['Population Genotyping'](https://help.dragen.illumina.com/product-guide/dragen-v4.4/dragen-dna-pipeline/iterative-gvcf-genotyper)). |
12 | 10 |
|
13 | | - ## DRAGEN Versions |
| 11 | + <b> DRAGEN Versions </b><p> |
14 | 12 |
|
15 | | - ### v3.7 |
| 13 | + ##### v3.7 |
16 | 14 |
|
17 | 15 | [User Guide](https://support.illumina.com/content/dam/illumina-support/documents/documentation/software_documentation/dragen-bio-it/Illumina-DRAGEN-Bio-IT-Platform-User-Guide-1000000141465-00.pdf) | [Release Notes](https://www.illumina.com/content/dam/illumina-support/documents/downloads/software/dragen/release-notes/Illumina-DRAGEN-Bio-IT-Platform-3.7-Release-Notes-1000000142362-v00.pdf) |
18 | 16 |
|
19 | 17 | Improvements and new features in the v3.7.6 individual samples analyses include *CYP2D6* variant calling (see '[Overcoming high homology to detect variation in CYP21A2 with whole-genome sequencing in DRAGEN](https://www.illumina.com/science/genomics-research/articles/CYP21A2.html)') and joint detection and use of graph-based hg19 and hg38 reference hash tables (see ['DRAGEN Wins at PrecisionFDA Truth Challenge V2 Showcase Accuracy Gains from Alt-aware Mapping and Graph Reference Genomes'](https://www.illumina.com/science/genomics-research/dragen-wins-precisionfda-challenge-showcase-accuracy-gains.html) and ['Demystifying the versions of GRCh38/hg38 reference genomes, how they are used in DRAGEN and their impact on accuracy'](https://www.illumina.com/science/genomics-research/articles/dragen-demystifying-reference-genomes.html) for details). |
20 | 18 |
|
21 | | - ### v4.0 |
| 19 | + ##### v4.0 |
22 | 20 |
|
23 | 21 | [User Guide](https://support-docs.illumina.com/SW/DRAGEN_v40/Content/SW/FrontPages/DRAGEN.htm) | [Release Notes](https://support.illumina.com/content/dam/illumina-support/documents/downloads/software/dragen/release-notes/200024449_01_DRAGEN-4.0-Customer-Release-Notes.pdf) |
24 | 22 |
|
25 | 23 | The DRAGEN v4.0.3 dataset features improved small variant calling accuracy from newly integrated [machine learning functionality](https://support-docs.illumina.com/SW/dragen_v42/Content/SW/DRAGEN/ml_for_vc.htm?Highlight=dragen-ml) and an updated graph-based reference for difficult-to-map regions (see ['DRAGEN Sets New Standard for Data Accuracy in PrecisionFDA Benchmark Data. Optimizing Variant Calling Performance with Illumina Machine Learning and DRAGEN Graph'](https://www.illumina.com/science/genomics-research/articles/dragen-shines-again-precisionfda-truth-challenge-v2.html)); accuracy and runtime improvements in the SV caller; new targeted callers including *CYP2B6*, *GBA*, *SMN* and a Star Allele PGx caller; and an expanded catalog for use with the ExpansionHunter STR caller. |
26 | 24 |
|
27 | | - ### v4.2 |
| 25 | + ##### v4.2 |
28 | 26 |
|
29 | 27 | [User Guide](https://support-docs.illumina.com/SW/dragen_v42/Content/SW/FrontPages/DRAGEN.htm) | [Release Notes](https://support.illumina.com/content/dam/illumina-support/documents/downloads/software/dragen/release-notes/200040845_02_DRAGEN-4.2-Customer-Release-Notes.pdf) |
30 | 28 |
|
31 | 29 | DRAGEN v4.2.7 offers significant accuracy improvements in small variant, CNV, and SV calling, includes new targeted callers (*HBA*, *LPA*, *RH*, *CYP21A2*, *SMN* silent carrier variant), and supports Star Allele calling for five additional pharmacogenes (*BCHE*, *ABCG2*, *NAT2*, *F5*, and *UGT2B17*). |
32 | 30 | These are further improved by upgraded machine learning models. |
33 | 31 | See [DRAGEN 4.2: Enhanced machine learning, new targeted callers, and more](https://developer.illumina.com/news-updates/dragen-4-2-enhanced-machine-learning-new-targeted-callers-and-more) for further details on these and other enhancements. |
34 | 32 |
|
35 | | - ### v4.4 |
| 33 | + ##### v4.4 |
36 | 34 |
|
37 | 35 | [User Guide](https://help.dragen.illumina.com/product-guide/dragen-v4.4) | [Release Notes](https://www.illumina.com/content/dam/illumina-support/documents/downloads/software/dragen/release-notes/200068065_00_DRAGEN-4_4_4-Customer-Release-Notes.pdf) |
38 | 36 |
|
39 | 37 | DRAGEN v4.4.7 boosts the speed and accuracy of all callers via the official release of an optimized pangenome graph reference ('[The quest for accuracy gains in the dark regions of the genomes: Presenting the DRAGEN multigenome mapper and pangenome reference updates in version 4.3](https://www.illumina.com/science/genomics-research/articles/second-gen-multigenome-mapping.html)'). |
40 | 38 | In particular, SV calling accuracy is substantially increased via the implementation of a multigenome mapper capable of exploiting the power of a pangenome reference. |
41 | 39 | Runtime is further reduced by supporting AWS F2 EC2 instances (see [Enabling Rapid Genomic and Multiomic Data Analysis with Illumina DRAGEN™ v4.4 on Amazon EC2 F2 Instances](https://aws.amazon.com/blogs/hpc/enabling-rapid-genomic-and-multiomic-data-analysis-with-illumina-dragen-v4-4-on-amazon-ec2-f2-instances/)). |
42 | 40 |
|
43 | | - ## Annotation |
| 41 | + <b> Annotation </b><p> |
44 | 42 |
|
45 | 43 | Starting with the v4.0.3 reanalysis, annotation using Illumina Connected Annotations (also known as Illumina Annotation Engine or Nirvana) was included as part of the analysis (see [Illumina Connected Annotations documentation](https://help.dragen.illumina.com/product-guide/dragen-v4.4/nirvana) for more information). |
46 | 44 | For the v4.0.3, v4.2.7, and v4.4.7 datasets, annotation was performed on the merged small variant VCF generated by the DRAGEN Iterative gVCF Genotyper for the entire 1KGP cohort. |
|