Geneset tests

Usage

Run one of a selection of statistical tests for gene set enrichment on a matrix of gene weights.

Quick start

  1. clone the repo and its submodule: git clone --recurse-submodules https://github.com/perslab/19-BMI-brain-genesettests.git
  2. go to the directory: cd 19-BMI-brain-genesettests
  3. start R session: R
  4. install renv: install.packages("renv")
  5. install R package dependencies for the code: renv::restore() or renv::hydrate(). If this fails, install the packages manually using the R command install.packages(). See the full list of packages as well as the R version under 'R session info' below.
  6. quit R: quit("no")
  7. set the paths to the data, the statistical test and any other parameters in call_run_geneset_tests_celltypes_vs_BMI.sh
  8. run the analysis: bash call_run_geneset_tests_celltypes_vs_BMI.sh

Input

Takes a gene x annotation table of weights. The first column contains gene names; the remaining columns are named after cell types or similar annotations.
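As an illustration of this layout, the sketch below builds a toy table in R and writes it to disk. The gene names, cell type names, weights and the file name toy_weights.csv are all hypothetical, and CSV is an assumption here; check the output of the --help command under 'Arguments' for the file formats the script actually accepts.

```r
# Toy gene x annotation table of weights (all names and values are made up).
weights <- data.frame(
  gene = c("GENE1", "GENE2", "GENE3", "GENE4"),  # first column: gene names
  celltype_A = c(0.90, 0.10, 0.00, 0.35),        # one column per cell type
  celltype_B = c(0.05, 0.80, 0.20, 0.00),
  stringsAsFactors = FALSE
)

# Write to a temporary CSV file; the file name is an assumption for illustration.
out_file <- file.path(tempdir(), "toy_weights.csv")
write.csv(weights, out_file, row.names = FALSE, quote = FALSE)

# Read it back to confirm the layout: genes in column 1, annotations after.
check <- read.csv(out_file, stringsAsFactors = FALSE)
print(check)
```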

Output

A CSV table whose contents depend on the statistical test used.

Run cell type versus rare/mendelian variant tests

To reproduce the results from the paper, adjust the file path parameters in call_run_geneset_tests_celltypes_vs_BMI.sh and leave the other parameters as they are, then run: bash call_run_geneset_tests_celltypes_vs_BMI.sh

Run cell type versus WGCNA module tests

Adjust the file path parameters in call_run_geneset_tests_celltypes_vs_modules.sh and leave the other parameters as they are, then run: bash call_run_geneset_tests_celltypes_vs_modules.sh

Arguments

Rscript ./code/run_geneset_tests.R --help

Note

  • computing empirical p-values can be impractically slow, especially for the t.test and wilcox.test
  • the GSEA function uses the liger package, which computes p-values by permuting gene labels on the input weights. This is faster than the original GSEA algorithm but may introduce false positives when testing against genesets where some genes are co-expressed.
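To make the two notes above concrete, here is a minimal sketch of an empirical p-value obtained by permuting gene labels. It is a generic illustration, not the code in this repository and not the liger implementation: the simulated weights, the gene set, the test statistic (difference in mean weight) and the number of permutations are all assumptions chosen for the example.

```r
set.seed(1)  # make the permutations reproducible

# Hypothetical inputs: weights for 200 genes and a gene set of 20 of them.
weights <- rnorm(200)
names(weights) <- paste0("GENE", seq_along(weights))
geneset <- names(weights)[1:20]
weights[geneset] <- weights[geneset] + 1  # shift the set so there is a signal

# Test statistic: mean weight in the set minus mean weight outside it.
set_stat <- function(w, in_set) mean(w[in_set]) - mean(w[!in_set])

in_set <- names(weights) %in% geneset
observed <- set_stat(weights, in_set)

# Null distribution: shuffle which genes carry the set label.
n_perm <- 1000
null_stats <- replicate(n_perm, set_stat(weights, sample(in_set)))

# One-sided empirical p-value with the +1 correction so it is never zero.
p_emp <- (sum(null_stats >= observed) + 1) / (n_perm + 1)
print(p_emp)
```

The statistic is recomputed once per permutation, so the cost grows with the number of permutations times the number of cell type and gene set pairs tested, which is why a slower statistic quickly becomes impractical. Shuffling labels also treats genes as exchangeable, so any co-expression among the genes in a set is not reflected in the null distribution.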

R session info

R version 3.5.3 (2019-03-11)

Platform: x86_64-pc-linux-gnu (64-bit)

locale: LC_CTYPE=en_US.UTF-8, LC_NUMERIC=C, LC_TIME=en_US.UTF-8, LC_COLLATE=en_US.UTF-8, LC_MONETARY=en_US.UTF-8, LC_MESSAGES=en_US.UTF-8, LC_PAPER=en_US.UTF-8, LC_NAME=C, LC_ADDRESS=C, LC_TELEPHONE=C, LC_MEASUREMENT=en_US.UTF-8 and LC_IDENTIFICATION=C

attached base packages: parallel, stats, graphics, grDevices, datasets, utils, methods and base

other attached packages: pander(v.0.6.3), liger(v.1.0), here(v.0.1), optparse(v.1.6.2), Matrix(v.1.2-15), usethis(v.1.4.0), devtools(v.2.0.1), magrittr(v.1.5) and workflowr(v.1.4.0)

loaded via a namespace (and not attached): Rcpp(v.1.0.2), compiler(v.3.5.3), prettyunits(v.1.0.2), base64enc(v.0.1-3), remotes(v.2.0.2), tools(v.3.5.3), digest(v.0.6.20), pkgbuild(v.1.0.2), pkgload(v.1.0.2), evaluate(v.0.14), memoise(v.1.1.0), lattice(v.0.20-38), rlang(v.0.3.3), cli(v.1.1.0), xfun(v.0.8), withr(v.2.1.2), knitr(v.1.24), desc(v.1.2.0), fs(v.1.2.6), rprojroot(v.1.3-2), grid(v.3.5.3), getopt(v.1.20.3), glue(v.1.3.1), R6(v.2.4.0), processx(v.3.2.0), rmarkdown(v.1.14), sessioninfo(v.1.1.1), callr(v.3.0.0), backports(v.1.1.2), ps(v.1.2.1), htmltools(v.0.3.6), assertthat(v.0.2.1), renv(v.0.6.0-108) and crayon(v.1.3.4)

A workflowr project.