Commit fa30371

docs: Add section on detecting coverage dropouts and interpreting dropout columns in TRGT documentation
1 parent 7de156c commit fa30371

1 file changed

Lines changed: 23 additions & 0 deletions

docs/trgt.md

@@ -11,3 +11,26 @@ chr4 39348424 39348483 ID=CANVAS_RFC1;MOTIFS=AAGGG,ACAGG,AGGGC,
chr9 69037270 69037304 ID=FRDA_FXN;MOTIFS=A,GAA;STRUC=<TR>;INCLUDE_FAIL_READS
chr13 102161574 102161726 ID=SCA27B_FGF14;MOTIFS=GAA,GAAGGA,GAAGAAGAAGAAGCA,AAGGAG;STRUC=<TR>;INCLUDE_FAIL_READS
```

## Detecting coverage dropouts

Dropouts in coverage at TRGT catalog loci may indicate the presence of large expansions that are not fully spanned by HiFi reads. To detect such dropouts, we run a script (`find_trgt_dropouts.py`) after TRGT genotyping that compares the observed coverage at each locus with a fixed threshold (2 reads per expected haplotype) and reports abnormal coverage in `*.trgt.dropouts.txt`. The columns in this file are as follows:

- chrom
- start
- end
- trid
- expected_ploidy
- hap1_count
- hap2_count
- unphased_count
- fail_read_count
- dropout (FullDropout, HaplotypeDropout, PhasingDropout)

The `fail_read_count` column indicates the number of `fail_reads` that aligned to the locus. This can help interpret dropouts at loci where `fail_reads` were included for genotyping.
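
As a rough illustration of working with this file, the sketch below reads dropout rows and selects the ones that have `fail_reads` at the locus. It assumes the file is tab-separated with the columns in the order listed above, and it skips a header row if one is present; neither detail is confirmed here. `DropoutRecord`, `read_dropouts`, and `with_fail_reads` are hypothetical helper names, not part of TRGT or `find_trgt_dropouts.py`, and the read counts in the sample row are invented.

```python
import csv
import io
from typing import Iterator, List, NamedTuple


class DropoutRecord(NamedTuple):
    """One row of *.trgt.dropouts.txt, in the documented column order."""
    chrom: str
    start: int
    end: int
    trid: str
    expected_ploidy: int
    hap1_count: int
    hap2_count: int
    unphased_count: int
    fail_read_count: int
    dropout: str


def read_dropouts(handle) -> Iterator[DropoutRecord]:
    """Yield one record per data row (assumes tab-separated columns)."""
    for row in csv.reader(handle, delimiter="\t"):
        # Assumption: skip blank lines, comment lines, and an optional header.
        if not row or row[0].startswith("#") or row[0] == "chrom":
            continue
        chrom, start, end, trid, ploidy, hap1, hap2, unphased, fail, dropout = row
        yield DropoutRecord(chrom, int(start), int(end), trid, int(ploidy),
                            int(hap1), int(hap2), int(unphased), int(fail), dropout)


def with_fail_reads(records) -> List[DropoutRecord]:
    """Keep dropout rows where at least one fail read aligned to the locus."""
    return [r for r in records if r.fail_read_count > 0]


# Invented example row; only the locus coordinates and ID come from the catalog.
sample = "chr9\t69037270\t69037304\tFRDA_FXN\t2\t0\t1\t0\t5\tFullDropout\n"
for record in with_fail_reads(read_dropouts(io.StringIO(sample))):
    print(record.trid, record.dropout, record.fail_read_count)
```

For a real file, pass an open file handle instead of the `io.StringIO` sample.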

The `dropout` column can be interpreted as follows:

- FullDropout: the total HiFi read count (`hap1_count + hap2_count + unphased_count`) is less than `expected_ploidy * 2`
- HaplotypeDropout: one expected haplotype has fewer than 2 HiFi reads
- PhasingDropout: the total HiFi read count is at least `expected_ploidy * 2`, but both haplotypes have fewer than 2 reads
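
The rules above can be sketched as a small classifier. This is a minimal illustration and not the source of `find_trgt_dropouts.py`. The function name `classify_dropout`, the order in which the rules are checked, and the choice to evaluate only FullDropout when `expected_ploidy` is below 2 are all assumptions made for the example.

```python
from typing import Optional

MIN_READS_PER_HAPLOTYPE = 2  # fixed threshold stated in the text


def classify_dropout(expected_ploidy: int, hap1_count: int,
                     hap2_count: int, unphased_count: int) -> Optional[str]:
    """Return a dropout label, or None when coverage looks normal."""
    total = hap1_count + hap2_count + unphased_count

    # FullDropout: too few HiFi reads overall for the expected ploidy.
    if total < expected_ploidy * MIN_READS_PER_HAPLOTYPE:
        return "FullDropout"

    # Assumption: haplotype-level checks only apply to diploid loci.
    if expected_ploidy < 2:
        return None

    low = [count < MIN_READS_PER_HAPLOTYPE for count in (hap1_count, hap2_count)]

    # PhasingDropout: enough reads in total, but neither haplotype is supported.
    if all(low):
        return "PhasingDropout"

    # HaplotypeDropout: exactly one haplotype falls below the threshold.
    if any(low):
        return "HaplotypeDropout"

    return None


print(classify_dropout(2, 1, 1, 0))   # FullDropout (2 reads < 4)
print(classify_dropout(2, 10, 0, 1))  # HaplotypeDropout
print(classify_dropout(2, 1, 0, 12))  # PhasingDropout
print(classify_dropout(2, 8, 9, 1))   # None
```

Checking FullDropout first keeps the three labels mutually exclusive, which matches the "at least `expected_ploidy * 2`" condition in the PhasingDropout definition.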
