Post-hoc analysis for scRNA-seq and scATAC-seq Data Analysis DREAM Challenge
Clone the repo:
git clone https://github.com/Sage-Bionetworks-Challenges/Multi-seq-Data-Analysis-Post-Analysis
cd Multi-seq-Data-Analysis-Post-Analysis
Create a conda environment using Python 3.9:
conda create --name synapse python=3.9 -y
conda activate synapse
Install Python dependencies:
python -m pip install challengeutils==4.2.0
Check that synapseclient and challengeutils are installed:
synapse --version
challengeutils -v
Install R dependencies:
R -e 'source("install.R")'
Note: The task 2 analysis uses the bedr package, which has two prerequisites, bedops and tabix, that must be installed as well.
Set up Synapse credentials via the CLI, or manually store the credentials in ~/.synapseConfig (see the Synapse client documentation for details):
synapse login --rememberMe
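Before moving on, it can help to confirm that the command-line tools from the setup steps are on the PATH. The following is a minimal sketch, not part of the repository: the function name `missing_tools` is hypothetical, and the executable names passed to it (in particular `bedops` and `tabix`) are assumptions about how the prerequisites are installed on your system.

```shell
# Print each command from the argument list that is not found on PATH.
# Returns 0 when every command is present, 1 otherwise.
# (Hypothetical helper; not provided by this repository.)
missing_tools() {
  status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      status=1
    fi
  done
  return "$status"
}
```

For example, `missing_tools git conda synapse challengeutils Rscript bedops tabix` prints the names of any tools that still need to be installed and prints nothing when all of them are available.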
Download all final submission results and each individual test case's scores to the data/ folder:
Rscript submission/get_submissions.R
final_submissions_{task}.rds: Essential information about the final submissions, e.g. submission ID, team, and ranks
final_scores_{task}.rds: All test case scores from the final submissions, consisting of the test case name and the scores of the primary and secondary metrics
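A quick way to confirm that the download produced the files listed above is to check for them by name. This is a sketch under assumptions: it assumes `{task}` expands to `task1` and `task2` for these files, and the helper name `missing_outputs` is hypothetical.

```shell
# List expected .rds outputs that are absent from a directory (default: data).
# Returns 0 when all four files exist, 1 otherwise.
# (Hypothetical helper; assumes {task} is task1 or task2.)
missing_outputs() {
  dir=${1:-data}
  status=0
  for task in task1 task2; do
    for prefix in final_submissions final_scores; do
      file="$dir/${prefix}_${task}.rds"
      if [ ! -f "$file" ]; then
        echo "$file"
        status=1
      fi
    done
  done
  return "$status"
}
```

Running `missing_outputs data` after the download prints any file that was not created.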
Download output files (imputed gene expression / called peaks) of all final submissions to data/model_output/
# replace {task} with 'task1' or 'task2'
Rscript submission/get_predictions_{task}.R
Warning: For Task 1, the output (imputation) of each submission is large (~30 GB). Make sure enough disk space is available before downloading.
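Given the size of the Task 1 outputs, it may be worth checking free disk space before starting the download. The sketch below is one way to do that with POSIX `df`; the function name `has_free_space` is hypothetical, and the threshold you pass is your own estimate (roughly 30 GB times the number of final submissions, which is not stated here).

```shell
# Succeed if the filesystem holding DIR has at least NEED_GB gigabytes free.
# (Hypothetical helper; not provided by this repository.)
has_free_space() {
  dir=$1
  need_gb=$2
  # df -Pk prints POSIX-format output in 1024-byte blocks;
  # the fourth column of the second line is the available space.
  avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
  [ "$avail_kb" -ge $((need_gb * 1024 * 1024)) ]
}
```

For instance, `has_free_space data 300 && Rscript submission/get_predictions_task1.R` only starts the download when at least 300 GB are free; the value 300 is purely illustrative.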
Report statistics about submissions
Rscript -e 'rmarkdown::render("stats/get_submission_stats.rmd")'