-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathATACseqQC.Rmd
43 lines (36 loc) · 2.31 KB
/
ATACseqQC.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
---
title: "ATAC-Seq Data Analysis with Human Data - Perform ATACseq quality control using the `{ATACseqQC}`"
author: "Anni Liu"
date: '`r format(Sys.Date(), "%B %d, %Y")`'
output:
html_document:
code_folding: show
---
```{r, shorcut, include=FALSE}
## RStudio keyboard shortcut
# Cursor at the beginning of a command line: Ctrl+A
# Cursor at the end of a command line: Ctrl+E
# Clear all the code from your console: Ctrl+L
# Create a pipe operator %>%: Ctrl+Shift+M (Windows) or Cmd+Shift+M (Mac)
# Create an assignment operator <-: Alt+- (Windows) or Option+-(Mac)
# Knit a document (knitr): Ctrl+Shift+K (Windows) or Cmd+Shift+K (Mac)
# Comment or uncomment current selection: Ctrl+Shift+C (Windows) or Cmd+Shift+C (Mac)
```
# Perform ATACseq quality control using the `{ATACseqQC}`
`{ATACseqQC}` includes two useful metrics - PCR Bottleneck Coefficients (PBC1 and PBC2).
* The `PCRbottleneckCoefficient_1` is calculated as the number of positions in genome with exactly 1 read mapped uniquely compared to the number of positions with at least 1 read. Values less than 0.7 indicate severe bottlenecking, between 0.7 and 0.9 indicate moderate bottlenecking, greater than 0.9 indicate no bottlenecking. Notice that if the sequencing depth is very high, we may get a lower value of this metric.
* The `PCRbottleneckCoefficient_2` is calculated as the number of positions in genome with exactly 1 read mapped uniquely compared to the number of positions with exactly 2 reads mapped uniquely. Values less than 1 indicate severe bottlenecking, between 1 and 3 indicate moderate bottlenecking, greater than 3 indicate no bottlenecking.
```{r}
# BiocManager::install("ATACseqQC")
library(ATACseqQC)
atac_qc <- bamQC("./Sorted_ATAC_female_lung_bowtie2_chr1X.bam")
# Get access to the names of all metrics
names(atac_qc)
# [1] "totalQNAMEs" *"duplicateRate" "mitochondriaRate"
# [4] "properPairRate" "unmappedRate" "hasUnmappedMateRate"
# [7] "notPassingQualityControlsRate" *"nonRedundantFraction" *"PCRbottleneckCoefficient_1"
# [10] *"PCRbottleneckCoefficient_2" *"MAPQ" *"idxstats"
# Extract 2 PCR Bottleneck Coefficients
atac_qc$PCRbottleneckCoefficient_1 # 0.8553605
atac_qc$PCRbottleneckCoefficient_2 # 7.395941
```