-`fast.ssgsea` is an R package [@R-core-team] for fast Single-Sample Gene Set Enrichment Analysis (ssGSEA) and Post-Translational Modification Signature Enrichment Analysis (PTM-SEA) [@barbie-systematic-2009; @krug-curated-2019].
+`fast.ssgsea` is an R package [@R-core-team] for fast gene permutation Gene Set Enrichment Analysis (GSEA) and Post-Translational Modification Signature Enrichment Analysis (PTM-SEA) [@subramanian-gene-2005; @krug-curated-2019].

-The primary function, `fast_ssgsea`, accepts a numeric matrix with genes or other molecules as rows and either samples, contrasts, or some other meaningful representation of the data as columns. A named list of gene sets (more generally, molecular signatures) is also required. Other arguments control the behavior of ssGSEA/PTM-SEA, and they are described in the function documentation.
+**NOTE:** Support for directional databases, such as PTMsigDB, is broken starting with version 0.1.0.9018. Until this is fixed, PTM-SEA is not supported.
+
+The primary function, `fast_ssgsea`, accepts a numeric matrix with genes or other molecules as rows and either samples, contrasts, or some other meaningful representation of the data as columns. A named list of gene sets (more generally, molecular signatures) is also required. Other arguments control the behavior of GSEA/PTM-SEA, and they are described in the function documentation.

The package also contains a `read_gmt` function, which reads a Gene Matrix Transposed (GMT) file to construct a named list of gene sets for use with `fast_ssgsea`.
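
For readers unfamiliar with the GMT layout, the sketch below shows how such a file maps onto a named list of gene sets. It assumes the usual tab-delimited convention (set name, description, then members) and uses a hypothetical helper, `parse_gmt_lines`, written in base R for illustration only; it is not the package's `read_gmt` and may differ from it in details such as duplicate handling.

```r
# Hypothetical helper (NOT the package's read_gmt): turn GMT-formatted lines
# into a named list of gene sets. Assumes tab-delimited fields:
# set name, description, then one field per gene.
parse_gmt_lines <- function(lines) {
  fields <- strsplit(lines, "\t", fixed = TRUE)
  # Drop the name and description fields; keep unique members
  gene_sets <- lapply(fields, function(x) unique(x[-(1:2)]))
  names(gene_sets) <- vapply(fields, `[`, character(1), 1L)
  gene_sets
}

# Two made-up GMT records used purely as example input
gmt_lines <- c(
  "SET_A\thttp://example.org/a\tGENE1\tGENE2\tGENE3",
  "SET_B\tNA\tGENE2\tGENE4"
)
gene_sets <- parse_gmt_lines(gmt_lines)
str(gene_sets)
```

With a real file, the same helper could be fed the output of `readLines()`.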
@@ -56,11 +57,11 @@ pak::pak("pnnl/fast.ssgsea")
### Simulate Data

-We will simulate a matrix with 10,000 genes as rows and 100 samples as columns. Then, we generate 20,000 gene sets by randomly sampling between 10 and 500 genes from the matrix row names.
+We will simulate a matrix with 10,000 genes as rows and one column. Then, we generate 20,000 gene sets by randomly sampling between 5 and 1,000 genes.
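
A minimal base R sketch of the simulation described in the added line. The diff does not show the actual code, so the seed, the standard normal values, and the `gene`/`set` naming scheme are assumptions made here for illustration.

```r
set.seed(0)  # arbitrary seed (assumption), only for reproducibility

n_genes <- 10000L
n_sets <- 20000L

# One-column matrix of simulated values (standard normal is an assumption);
# the row names act as gene identifiers
X <- matrix(rnorm(n_genes), ncol = 1L,
            dimnames = list(paste0("gene", seq_len(n_genes)), "sample1"))

# Draw a size between 5 and 1,000 for each set, then sample that many
# distinct genes from the matrix row names
set_sizes <- sample(5:1000, size = n_sets, replace = TRUE)
gene_sets <- lapply(set_sizes, function(k) sample(rownames(X), size = k))
names(gene_sets) <- paste0("set", seq_len(n_sets))
```

The resulting `X` and `gene_sets` have the shapes that `fast_ssgsea` is described as accepting: a numeric matrix with genes as rows and a named list of gene sets.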
-This shows the runtime of `fast_ssgsea` running on an AMD Ryzen 5 7600X CPU with a clock speed of 4.7 GHz.
+This shows the runtime of `fast_ssgsea` running on an AMD Ryzen 5 7600X CPU with a clock speed of 4.7 GHz. A total of 10,000 permutations were used to calculate p-values and normalized enrichment scores (NES).
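
To make the role of the permutations concrete, here is a naive base R sketch of gene permutation GSEA for a single gene set. It is not how `fast_ssgsea` is implemented (the package is described as using linear algebra for speed), and the function name `enrichment_score`, the toy data, and the p-value convention (counting only null scores with the same sign as the observed score, plus one) are assumptions made here; implementations differ on these details.

```r
# Enrichment score: maximum deviation from zero of the weighted running sum
enrichment_score <- function(stats, in_set, alpha = 1) {
  ord <- order(stats, decreasing = TRUE)
  hit <- in_set[ord]
  w <- abs(stats[ord])^alpha
  p_hit <- cumsum(ifelse(hit, w, 0)) / sum(w[hit])  # weighted hits
  p_miss <- cumsum(!hit) / sum(!hit)                # unweighted misses
  running <- p_hit - p_miss
  running[which.max(abs(running))]
}

set.seed(1)
stats <- rnorm(1000)  # toy gene-level statistics
# Toy gene set: the 25 genes with the largest statistics
in_set <- seq_along(stats) %in% order(stats, decreasing = TRUE)[1:25]
es <- enrichment_score(stats, in_set)

# Gene permutation: shuffle set membership, keeping the set size fixed
nperm <- 1000
es_null <- replicate(nperm, enrichment_score(stats, sample(in_set)))

# Compare only against null scores with the same sign as the observed score
same_sign <- sign(es_null) == sign(es)
p_value <- (sum(abs(es_null[same_sign]) >= abs(es)) + 1) / (sum(same_sign) + 1)
nes <- es / mean(abs(es_null[same_sign]))
```

Because each permutation repeats the full running-sum calculation, the cost of this naive loop grows with both the number of permutations and the number of gene sets, which is the cost the package aims to reduce.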
-The `fast.ssgsea` R package utilizes linear algebra and ideas from Fast Gene Set Enrichment Analysis [@korotkevich-fast-2021] to greatly reduce the runtime of gene permutation GSEA and PTM-SEA.
+The `fast.ssgsea` R package utilizes linear algebra and ideas from Fast Gene Set Enrichment Analysis [@korotkevich-fast-2021] to greatly reduce the runtime.

-Tests were performed on a desktop computer with an AMD Ryzen 5 7600X CPU (6 cores, 12 threads) at 4.7 GHz. Different combinations of the number of samples, gene sets, maximum gene set size, number of permutations, and value of the $\alpha$ parameter (the weighting exponent) were tested in a random order (3 replicates each) to minimize the influence of previous runs.
+Tests were performed on a desktop computer with an AMD Ryzen 5 7600X CPU (6 cores, 12 threads) at 4.7 GHz. Different combinations of the number of gene sets, maximum gene set size, number of permutations, and value of the $\alpha$ parameter (the weighting exponent) were tested in a random order (3 replicates each) to minimize the influence of previous runs.
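
For context on the weighting exponent, the standard enrichment score from @subramanian-gene-2005 can be written as follows. This is assumed here, not confirmed by the diff, to be the quantity `fast_ssgsea` computes. Genes are ranked by decreasing statistic $r_j$, $S$ is a gene set with $N_H$ members, and $N$ is the total number of genes:

$$
P_{\text{hit}}(S, i) = \sum_{g_j \in S,\; j \le i} \frac{|r_j|^{\alpha}}{N_R},
\qquad
N_R = \sum_{g_j \in S} |r_j|^{\alpha},
\qquad
P_{\text{miss}}(S, i) = \sum_{g_j \notin S,\; j \le i} \frac{1}{N - N_H}
$$

The enrichment score is the maximum deviation from zero of $P_{\text{hit}}(S, i) - P_{\text{miss}}(S, i)$ over $i$. With $\alpha = 0$ every hit contributes equally and the score reduces to a Kolmogorov-Smirnov-like statistic; larger values of $\alpha$ give more weight to genes with extreme statistics.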
```{r, echo=FALSE}
-fig_cap <- "Runtime of fast_ssgsea with A) 1,000or B) 10,000 permutations."
+fig_cap <- "Runtime of fast_ssgsea with A) 10,000, B) 100,000, or C) 1,000,000 permutations."
-author = {Barbie, David A. and Tamayo, Pablo and Boehm, Jesse S. and Kim, So Young and Moody, Susan E. and Dunn, Ian F. and Schinzel, Anna C. and Sandy, Peter and Meylan, Etienne and Scholl, Claudia and Fröhling, Stefan and Chan, Edmond M. and Sos, Martin L. and Michel, Kathrin and Mermel, Craig and Silver, Serena J. and Weir, Barbara A. and Reiling, Jan H. and Sheng, Qing and Gupta, Piyush B. and Wadlow, Raymond C. and Le, Hanh and Hoersch, Sebastian and Wittner, Ben S. and Ramaswamy, Sridhar and Livingston, David M. and Sabatini, David M. and Meyerson, Matthew and Thomas, Roman K. and Lander, Eric S. and Mesirov, Jill P. and Root, David E. and Gilliland, D. Gary and Jacks, Tyler and Hahn, William C.},
-month = nov,
-year = {2009},
-pages = {108--112},
+journal = {Proceedings of the National Academy of Sciences},
+author = {Subramanian, Aravind and Tamayo, Pablo and Mootha, Vamsi K. and Mukherjee, Sayan and Ebert, Benjamin L. and Gillette, Michael A. and Paulovich, Amanda and Pomeroy, Scott L. and Golub, Todd R. and Lander, Eric S. and Mesirov, Jill P.},
+month = oct,
+year = {2005},
+pages = {15545--15550},
}

@article{krug-curated-2019,
@@ -52,48 +52,3 @@ @Manual{R-core-team
year = {2024},
url = {https://www.R-project.org/},
}
-
-@inproceedings{openblas-1,
-author={Xianyi, Zhang and Qian, Wang and Yunquan, Zhang},
-booktitle={2012 IEEE 18th International Conference on Parallel and Distributed Systems},
-title={Model-driven Level 3 BLAS Performance Optimization on Loongson 3A Processor},
-year={2012},
-volume={},
-number={},
-pages={684-691},
-doi={10.1109/ICPADS.2012.97},
-}
-
-@inproceedings{openblas-2,
-author = {Wang, Qian and Zhang, Xianyi and Zhang, Yunquan and Yi, Qing},
-title = {AUGEM: automatically generate high performance dense linear algebra kernels on x86 CPUs},
-year = {2013},
-isbn = {9781450323789},
-publisher = {Association for Computing Machinery},
-address = {New York, NY, USA},
-url = {https://doi.org/10.1145/2503210.2503219},
-doi = {10.1145/2503210.2503219},
-booktitle = {Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis},
-articleno = {25},
-numpages = {12},
-location = {Denver, Colorado},
-series = {SC '13},
-}
-
-@article{blas,
-author = {Lawson, C. L. and Hanson, R. J. and Kincaid, D. R. and Krogh, F. T.},
-title = {Basic Linear Algebra Subprograms for {Fortran} Usage},
-year = {1979},
-issue_date = {Sept. 1979},
-publisher = {Association for Computing Machinery},