PRScope

Outline

PRScope automatically generates all the polygenic scores (PGS) associated with selected ontology IDs (e.g., Experimental Factor Ontology, EFO) for a given genotype dataset.

Inputs

Ontology IDs defining the PGS of interest
Genotype data to calculate the PGS

Output

A dataset (multi-PGS matrix) containing:
Subject IDs from the genotype data
Values of all selected PGS

The setup is optimized for a minimal-effort "vanilla use case" but supports advanced configurations.

Repo Contents

config: config files for parameter specification.
input: genotype files, reference files, and PRS trait specification.
main: pipeline and scripts for PRS calculation, biotype identification, and tools.
output: intermediate and final results are saved here.

System Requirements

Hardware Requirements

PRScope requires only a standard computer with enough RAM (at least 4 GB) to support the in-memory operations. For best performance, we suggest a computer with higher specifications:

RAM: 16+ GB
CPU: 4+ cores, 3.3+ GHz/core

The runtimes below were measured on a computer with the recommended specs (16 GB RAM, 4 cores at 3.3 GHz each) and a 100 Mbps internet connection.
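As a rough pre-flight check, the thresholds above can be compared against the current machine. The following is a minimal sketch, not part of PRScope: the function names are hypothetical, RAM is detected through Linux `sysconf` values, and clock speed is not checked.

```python
import os

MIN_RAM_GB = 4           # minimum stated in this README
RECOMMENDED_RAM_GB = 16  # recommended in this README
RECOMMENDED_CORES = 4    # recommended in this README


def classify_machine(ram_gb, cores):
    """Classify a machine against the README's RAM and core thresholds."""
    if ram_gb < MIN_RAM_GB:
        return "below minimum"
    if ram_gb >= RECOMMENDED_RAM_GB and cores >= RECOMMENDED_CORES:
        return "recommended"
    return "minimum"


def detect_machine():
    """Return (RAM in decimal GB, logical core count) using Linux sysconf names."""
    ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    return ram_bytes / 1e9, os.cpu_count() or 1


if __name__ == "__main__":
    try:
        ram_gb, cores = detect_machine()
        print(f"{ram_gb:.1f} GB RAM, {cores} cores: {classify_machine(ram_gb, cores)}")
    except (AttributeError, ValueError, OSError):
        print("Could not detect hardware on this platform.")
```

Reported RAM is usually slightly below the nominal size, so a result of "minimum" on a nominal 16 GB machine is not necessarily a problem.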

Software Requirements

PRScope requires the following:

conda
Python
R
Snakemake
PLINK^[1]
PRSice^[2] or LDpred-2^[3]
liftOverPlink^[4]

Only conda must be installed manually. All other dependencies are managed via the Conda environment.

PRScope has been tested on Ubuntu 22.04.5 and requires a Linux system.

Setup

1. Setting Up

a. Cloning the Repository

git clone https://github.com/transbioZI/PRScope

b. Download the required files from the provided link

Reference files for PRScope

tar -xf reference.tar.gz

Place the extracted files into the folder input/reference/.

The folder should now include:
- ldpred2_ref/
- eur_hg38.phase3.bed
- eur_hg38.phase3.bim
- eur_hg38.phase3.fam
- eur_hg38.phase3.frq
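To confirm that the reference files landed in the right place, the listing above can be checked programmatically. This is a minimal sketch with a hypothetical helper (not shipped with PRScope); it only tests that each listed entry exists and does not validate file contents.

```python
from pathlib import Path

# Entries taken from the folder listing in this README.
EXPECTED_REFERENCE_ENTRIES = [
    "ldpred2_ref",
    "eur_hg38.phase3.bed",
    "eur_hg38.phase3.bim",
    "eur_hg38.phase3.fam",
    "eur_hg38.phase3.frq",
]


def missing_reference_entries(reference_dir="input/reference"):
    """Return the expected entries that are not present in reference_dir."""
    root = Path(reference_dir)
    return [name for name in EXPECTED_REFERENCE_ENTRIES if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_reference_entries()
    if missing:
        print("Missing from input/reference/:", ", ".join(missing))
    else:
        print("All expected reference entries are present.")
```

Run it from the PRScope directory so that the relative path input/reference/ resolves correctly.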

Installation Instructions

1. Install Conda

Conda Installation Guide

2. Create the Conda Environment

conda create -c conda-forge -c bioconda -n snakemake snakemake python=3.12.1

Environment name: snakemake
Wait for installation to complete (~15 minutes)

3. Activate the Environment

conda activate snakemake

Running PRScope (please see Demo below)

cd PRScope
./run.sh

This command initiates the PRScope pipeline.

On the first run, wait for the conda environment installation to complete.
This may take up to an hour.

PRScope tested with

conda version : 23.1.0
snakemake version : 8.4.8
python : 3.12.1

R-sessionInfo()

R version 4.4.1 (2024-06-14)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 22.04.5 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0

locale:
LC_CTYPE=C.UTF-8, LC_NUMERIC=C, LC_TIME=C.UTF-8, LC_COLLATE=C.UTF-8, LC_MONETARY=C.UTF-8, LC_MESSAGES=C.UTF-8, LC_PAPER=C.UTF-8, LC_NAME=C, LC_ADDRESS=C, LC_TELEPHONE=C, LC_MEASUREMENT=C.UTF-8, LC_IDENTIFICATION=C

time zone: Europe/Berlin
tzcode source: system (glibc)

attached base packages:
parallel, stats, graphics, grDevices, utils, datasets, methods, base

other attached packages:
reshape2_1.4.4, cluster_2.1.7, xgboost_1.7.8.1, bigutilsr_0.3.4, reshape_0.8.9, ggpubr_0.6.0, doParallel_1.0.17, iterators_1.0.14, foreach_1.5.2, glmnet_4.1-8, Matrix_1.7-1, lubridate_1.9.4, forcats_1.0.0, purrr_1.0.2, readr_2.1.5, tidyr_1.3.1, tibble_3.2.1, tidyverse_2.0.0, data.table_1.16.4, dplyr_1.1.4, gwasrapidd_0.99.17, caret_7.0-1, lattice_0.22-6, ranger_0.17.0, stringr_1.5.1, ggplot2_3.5.1, fmsb_0.7.6, optparse_1.7.5, tidyselect_1.2.1, timeDate_4041.110, bigassertr_0.1.6, pROC_1.18.5, digest_0.6.37, rpart_4.1.23, timechange_0.3.0, lifecycle_1.0.4, survival_3.7-0, magrittr_2.0.3, compiler_4.4.1, rlang_1.1.4, tools_4.4.1, utf8_1.2.4, ggsignif_0.6.4, plyr_1.8.9, abind_1.4-8, withr_3.0.2, nnet_7.3-19, grid_4.4.1, stats4_4.4.1, fansi_1.0.6, colorspace_2.1-1, future_1.34.0, globals_0.16.3, scales_1.3.0, MASS_7.3-61, cli_3.6.3, generics_0.1.3, RSpectra_0.16-2, rstudioapi_0.17.1, future.apply_1.11.3, tzdb_0.4.0, getopt_1.20.4, splines_4.4.1, vctrs_0.6.5, hardhat_1.4.0, jsonlite_1.8.9, carData_3.0-5, car_3.1-3, hms_1.1.3, rstatix_0.7.2, Formula_1.2-5, listenv_0.9.1, gower_1.0.1, recipes_1.1.0, glue_1.8.0, parallelly_1.40.1, codetools_0.2-20, stringi_1.8.4, gtable_0.3.6, shape_1.4.6.1, munsell_0.5.1, pillar_1.9.0, ipred_0.9-15, lava_1.8.0, R6_2.5.1, backports_1.5.0, broom_1.0.7, class_7.3-22, Rcpp_1.0.13-1, nlme_3.1-166, prodlim_2024.06.25, ModelMetrics_1.2.2.2, pkgconfig_2.0.3

Pipeline Description

The following pipelines can be found in the main/snakefiles/ directory:

find_sumstats.snakefile – Selection of summary statistics for specified EFO IDs
qc_sumstats.snakefile – Quality control of the selected summary statistics
qc_genotype_with_liftover.snakefile – Quality control of genotype data with liftover
qc_genotype.snakefile – Quality control of genotype data
prs_calculation_prsice.snakefile – PRS calculation using PRSice
prs_calculation_ldpred.snakefile – PRS calculation using LDpred
ldsc_heritability_calculation.snakefile – Heritability calculation

Demo

a. Navigate to the repository path

cd PRScope

b. Folder Structure

config/ – For advanced parameter customization
input/ – The only folder requiring user modifications
main/ – Contains the main pipeline
output/ – Will contain output after pipeline execution

c. Preparing the Input

Navigate to input/
Edit efo_ids.txt:
- Default content: EFO_0003898
- Replace with your own EFO IDs as needed
In input/genotype/, you’ll find:
- EUR.bed
- EUR.bim
- EUR.fam
(Replace with your genotype data if desired)
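Before starting a run, it can help to check that efo_ids.txt contains well-formed IDs. The sketch below is a hypothetical helper, not part of PRScope. It assumes one ID per line and that an ontology ID consists of a letter prefix, an underscore, and digits (as in the default EFO_0003898); the file format PRScope actually accepts may differ.

```python
import re
from pathlib import Path

# Assumed ID shape: letter prefix, underscore, digits (e.g. EFO_0003898).
ONTOLOGY_ID = re.compile(r"^[A-Za-z]+_\d+$")


def read_ontology_ids(path="input/efo_ids.txt"):
    """Return (valid_ids, rejected_lines), assuming one ID per line."""
    valid, rejected = [], []
    for line in Path(path).read_text().splitlines():
        entry = line.strip()
        if not entry:
            continue  # skip blank lines
        (valid if ONTOLOGY_ID.match(entry) else rejected).append(entry)
    return valid, rejected


if __name__ == "__main__":
    ids_file = Path("input/efo_ids.txt")
    if ids_file.exists():
        valid, rejected = read_ontology_ids(ids_file)
        print(f"{len(valid)} valid ID(s); rejected lines: {rejected}")
    else:
        print("input/efo_ids.txt not found; run this from the PRScope directory.")
```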

Running PRScope

cd PRScope
./run.sh

Expected Outcome

output/gwas_list/gwas_search.txt - GWAS list meeting the criteria for use in PGS calculation.
output/qced_gwas/GCST* - GWAS in gwas_search.txt, downloaded and preprocessed, ready for PGS calculation.
output/qced_genotype/corrected[hg19,hg38] - EUR.FINAL.* - preprocessed version of the simulated genotype data in input/genotype/.
output/calculated_pgs_prsice/ - all calculated PGS for the GWAS listed in gwas_search.txt, saved as the data table pgs_datatable_prsice_100.tsv. The suffix _100 denotes the minimum number of SNPs used to calculate a PGS.
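Once the run has finished, the final data table can be inspected with a few lines of Python. This is a minimal sketch under stated assumptions: the file sits directly in output/calculated_pgs_prsice/, it is tab-separated with a header row, and its first column holds the subject IDs. The actual column layout may differ, and the helper name is hypothetical.

```python
import csv
from pathlib import Path

# Folder and file name as given in the expected-outcome list above.
PGS_TABLE = Path("output/calculated_pgs_prsice/pgs_datatable_prsice_100.tsv")


def summarize_pgs_table(path):
    """Return (n_subjects, pgs_columns) for a tab-separated table.

    Assumes a header row and that the first column holds subject IDs.
    """
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader, None)
        if header is None:  # empty file
            return 0, []
        n_subjects = sum(1 for row in reader if row)
    return n_subjects, header[1:]


if __name__ == "__main__":
    if PGS_TABLE.exists():
        n_subjects, pgs_columns = summarize_pgs_table(PGS_TABLE)
        print(f"{n_subjects} subjects x {len(pgs_columns)} PGS columns")
    else:
        print(f"{PGS_TABLE} not found; has the pipeline finished?")
```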

References

[1] HannahVMeyer. Meyer-Lab-cshl/plinkQC: plinkQC 0.3.2. (Zenodo, 2020). doi:10.5281/ZENODO.3934294.

[2] Choi, S. W. & O’Reilly, P. F. PRSice-2: Polygenic Risk Score software for biobank-scale data. GigaScience 8, (2019).

[3] Privé, F., Arbel, J. & Vilhjálmsson, B. J. LDpred2: better, faster, stronger. Bioinformatics 36, 5424–5431 (2021).

[4] https://github.com/sritchie73/liftOverPlink

Citation

If you use PRScope or the associated manuscript, please cite it according to the enclosed citation.bib.
