PDF file [link to the slide]
Example Data [link to the data]
Note1: The HapMap III raw data were processed following the plinkQC vignette (https://cran.r-project.org/web/packages/plinkQC/vignettes/HapMap.pdf). To make the data management process easier to demonstrate, an additional data manipulation step was applied.
Note2: Traits are simulated with a generalized linear model (GLM) in R.
Note3: For the criteria on sample relatedness, see: Data quality control in genetic case-control association studies. Nat Protoc. 5, 1564–1573 (2010). DOI: 10.1038/nprot.2010.116
source /opt/miniconda3/bin/activate /opt/miniconda3/envs/qy/
Perform quality control on "rawdata" and run PCA to assess population stratification. '/home/train2019/qy/' is the path to 'rawdata'; '/home/train2019/$(whoami)/qy/' is the path where the output, named 'qcdata' here, is stored. You can change the output filename ('qcdata') to whatever you like.
plink --bfile /home/train2019/qy/rawdata --geno 0.1 --maf 0.01 --mind 0.05 --hwe 0.000001 --make-bed --pca --out /home/train2019/$(whoami)/qy/qcdata
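In the command above, --geno 0.1 removes variants with a missing call rate above 10%, --mind 0.05 removes samples with more than 5% missing genotypes, --maf 0.01 removes variants with minor allele frequency below 1%, and --hwe 0.000001 removes variants whose Hardy-Weinberg test p-value falls below that threshold. With --pca, PLINK 1.9 also writes 'qcdata.eigenvec' and 'qcdata.eigenval' next to the filtered fileset. One way to see how much was filtered is to count rows before and after, since a .fam file has one line per sample and a .bim file one line per variant. The helper below is only a sketch; the function name plink_counts is made up for illustration and is not part of PLINK.

```shell
# Hypothetical helper (not a PLINK command): report how many samples
# (.fam rows) and variants (.bim rows) a PLINK binary fileset contains.
plink_counts() {
    prefix=$1
    # Stop early if the fileset is incomplete.
    if [ ! -f "${prefix}.fam" ] || [ ! -f "${prefix}.bim" ]; then
        echo "missing ${prefix}.fam or ${prefix}.bim" >&2
        return 1
    fi
    # awk prints the total number of lines read from each file.
    samples=$(awk 'END {print NR}' "${prefix}.fam")
    variants=$(awk 'END {print NR}' "${prefix}.bim")
    echo "${prefix}: ${samples} samples, ${variants} variants"
}

# Example calls, assuming the paths used in this tutorial:
# plink_counts /home/train2019/qy/rawdata
# plink_counts "/home/train2019/$(whoami)/qy/qcdata"
```

Comparing the two lines of output shows how many samples and variants the QC thresholds removed. The same numbers are also reported in 'qcdata.log'.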