Commit 45c763d

Update benchmark results and README
1 parent 97f4e4f commit 45c763d

12 files changed: +215 -153 lines changed

README.Rmd

Lines changed: 18 additions & 4 deletions
@@ -119,16 +119,30 @@ print(sessionInfo(), locale = FALSE, tzone = FALSE)
119119

120120
## Performance
121121

122-
The `fast.ssgsea` R package utilizes linear algebra and ideas from Fast Gene Set Enrichment Analysis [@korotkevich-fast-2021] to greatly reduce the runtime.
122+
### fast-ssGSEA
123123

124-
Tests were performed on a desktop computer with an AMD Ryzen 5 7600X CPU (6 cores, 12 threads) at 4.7 GHz. Different combinations of the number of gene sets, maximum gene set size, number of permutations, and value of the $\alpha$ parameter (the weighting exponent) were tested in a random order (3 replicates each) to minimize the influence of previous runs.
124+
The `fast.ssgsea` R package utilizes linear algebra and ideas from Fast Gene Set Enrichment Analysis (FGSEA) [@korotkevich-fast-2021] to greatly reduce the runtime.
125+
126+
Tests were performed on a desktop computer with an AMD Ryzen 5 7600X CPU running at 4.7 GHz, single threaded. Different combinations of the number of gene sets, maximum gene set size, and the number of permutations were tested in a random order (3 replicates each) to minimize the influence of previous runs. The R scripts and data are available in the simulation/ folder.
125127

126128
```{r, echo=FALSE}
127-
fig_cap <- "Runtime of fast_ssgsea with A) 10,000, B) 100,000, or C) 1,000,000 permutations."
129+
fig1_cap <- "Runtime of fast_ssgsea with A) 10,000, B) 100,000, or C) 1,000,000 permutations."
128130
```
129131

130-
```{r, echo=FALSE, fig.cap=fig_cap}
132+
```{r, echo=FALSE, fig.cap=fig1_cap}
131133
knitr::include_graphics("./man/figures/README-figure-1.png")
132134
```
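
The randomized, replicated design described in the paragraph above might be set up along the following lines. This is only a sketch: the grid values and the `time_one_run` helper are hypothetical and are not taken from the scripts in simulation/, and the placeholder body would need to be replaced with the call actually being timed.

```r
# Hypothetical parameter values; the real ones live in simulation/.
grid <- expand.grid(
  n_sets       = c(1000, 10000),
  max_set_size = c(100, 500),
  nperm        = c(1e4, 1e5, 1e6),
  replicate    = 1:3  # 3 replicates of each combination
)

# Shuffle the rows so that the runs are executed in a random order.
set.seed(42)
grid <- grid[sample(nrow(grid)), ]

# Hypothetical helper: return the elapsed time of a single configuration.
time_one_run <- function(n_sets, max_set_size, nperm) {
  system.time({
    # Placeholder for the benchmarked call.
    invisible(NULL)
  })[["elapsed"]]
}

grid$elapsed <- mapply(time_one_run, grid$n_sets, grid$max_set_size, grid$nperm)
```

Shuffling the full grid, replicates included, spreads any warm-up or caching effects across all configurations instead of concentrating them in one.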
133135

136+
### FGSEA-simple
137+
138+
The same tests were also carried out using the simple implementation of FGSEA (`fgsea::fgseaSimple`). Like fast-ssGSEA, FGSEA-simple relies purely on the number of permutations to calculate p-values, which limits how small they can become. While FGSEA-simple is instead meant to be run with a smaller number of permutations and followed up with FGSEA-multilevel (the method capable of calculating arbitrarily small p-values), these results serve to illustrate the extreme difference in runtime between the two approaches. This difference is largely the result of changes to how the ES is defined.
139+
140+
```{r, echo=FALSE}
141+
fig2_cap <- "Runtime of fgsea::fgseaSimple with A) 10,000, B) 100,000, or C) 1,000,000 permutations."
142+
```
143+
144+
```{r, echo=FALSE, fig.cap=fig2_cap}
145+
knitr::include_graphics("./man/figures/README-figure-2.png")
146+
```
147+
134148
## References
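
The FGSEA-simple paragraph above notes that permutation-only p-values cannot become arbitrarily small. A rough illustration of that floor follows, assuming an add-one estimator of the form `(n_as_extreme + 1) / (n_same_sign + 1)`; the `n_as_extreme`, `n_same_sign`, and `p_value` columns in the README.md output of this commit are consistent with that form, but the exact estimator is an assumption here. The helper `min_perm_p` is hypothetical and belongs to neither package.

```r
# Hypothetical helper: smallest p-value attainable when no permuted ES is
# as extreme as the observed one (n_as_extreme = 0).
min_perm_p <- function(n_same_sign) {
  1 / (n_same_sign + 1)
}

nperm <- c(1e4, 1e5, 1e6)

# Assumption: roughly half of the permuted ES share the sign of the observed ES.
floor_p <- min_perm_p(nperm / 2)

data.frame(nperm = nperm, floor_p = floor_p)
```

Under these assumptions each tenfold increase in permutations lowers the floor by only about a factor of ten, which is why runtime at large permutation counts matters for the comparison.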

README.md

Lines changed: 53 additions & 26 deletions
@@ -9,6 +9,8 @@
99
- [Runtime and Results](#runtime-and-results)
1010
- [Session Information](#session-information)
1111
- [Performance](#performance)
12+
- [fast-ssGSEA](#fast-ssgsea)
13+
- [FGSEA-simple](#fgsea-simple)
1214
- [References](#references)
1315

1416
# fast.ssgsea
@@ -130,22 +132,22 @@ system.time({
130132
```
131133

132134
## user system elapsed
133-
## 4.137 0.962 4.650
135+
## 2.245 0.377 2.376
134136

135137
``` r
136138
str(res)
137139
```
138140

139141
## 'data.frame': 20000 obs. of 9 variables:
140142
## $ sample : Factor w/ 1 level "sample1": 1 1 1 1 1 1 1 1 1 1 ...
141-
## $ set : chr "set18791" "set16136" "set19084" "set2830" ...
142-
## $ set_size : int 138 801 841 163 706 749 450 87 161 761 ...
143-
## $ ES : num -1866 709 698 1584 759 ...
144-
## $ NES : num -5.3 4.65 4.68 4.76 4.68 ...
145-
## $ n_same_sign : int 49042 52788 52782 50951 52785 47193 47813 50722 48979 47243 ...
146-
## $ n_as_extreme: int 1 8 8 9 11 10 13 14 18 20 ...
147-
## $ p_value : num 4.08e-05 1.70e-04 1.71e-04 1.96e-04 2.27e-04 ...
148-
## $ adj_p_value : num 0.739 0.739 0.739 0.739 0.739 ...
143+
## $ set : chr "set18791" "set2830" "set19084" "set18223" ...
144+
## $ set_size : int 138 163 841 706 801 87 503 409 320 450 ...
145+
## $ ES : num -1866 1584 698 759 709 ...
146+
## $ NES : num -5.34 4.78 4.66 4.67 4.62 ...
147+
## $ n_same_sign : int 49235 51107 52907 52784 52813 50461 52351 51847 51728 47859 ...
148+
## $ n_as_extreme: int 1 3 8 9 12 12 16 19 19 19 ...
149+
## $ p_value : num 4.06e-05 7.83e-05 1.70e-04 1.89e-04 2.46e-04 ...
150+
## $ adj_p_value : num 0.783 0.783 0.836 0.836 0.836 ...
149151

150152
### Session Information
151153
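
As a quick consistency check on the updated `str(res)` output above, the first five `p_value` entries can be recomputed from the two count columns. The add-one formula below is inferred from those printed numbers, not from the package source, so treat it as an assumption.

```r
# Counts copied from the updated str(res) output (first five rows).
n_same_sign  <- c(49235, 51107, 52907, 52784, 52813)
n_as_extreme <- c(1, 3, 8, 9, 12)

# Assumed estimator: add one to both the numerator and the denominator.
p_value <- (n_as_extreme + 1) / (n_same_sign + 1)

# Round to three significant digits, as str() does when printing.
signif(p_value, 3)
```

The rounded values agree with the `p_value` column shown in the new output (4.06e-05, 7.83e-05, 1.70e-04, 1.89e-04, 2.46e-04).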

@@ -158,40 +160,42 @@ print(sessionInfo(), locale = FALSE, tzone = FALSE)
158160
## Running under: Linux Mint 22.1
159161
##
160162
## Matrix products: default
161-
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.12.0
162-
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0 LAPACK version 3.12.0
163+
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
164+
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
163165
##
164166
## attached base packages:
165167
## [1] stats graphics grDevices utils datasets methods base
166168
##
167169
## other attached packages:
168-
## [1] dqrng_0.4.1 fast.ssgsea_0.1.0.9022
170+
## [1] dqrng_0.4.1 fast.ssgsea_0.1.0.9025
169171
##
170172
## loaded via a namespace (and not attached):
171-
## [1] digest_0.6.37 RcppArmadillo_15.0.2-2 fastmap_1.2.0
172-
## [4] xfun_0.53 Matrix_1.7-4 lattice_0.22-7
173-
## [7] knitr_1.50 htmltools_0.5.8.1 rmarkdown_2.29
174-
## [10] cli_3.6.5 grid_4.5.2 data.table_1.17.8
175-
## [13] compiler_4.5.2 rstudioapi_0.17.1 tools_4.5.2
176-
## [16] evaluate_1.0.5 Rcpp_1.1.0 yaml_2.3.10
177-
## [19] rlang_1.1.6
173+
## [1] digest_0.6.37 RcppArmadillo_15.0.2-2 collapse_2.1.3
174+
## [4] fastmap_1.2.0 xfun_0.53 Matrix_1.7-4
175+
## [7] lattice_0.22-7 parallel_4.5.2 knitr_1.50
176+
## [10] htmltools_0.5.8.1 rmarkdown_2.29 cli_3.6.5
177+
## [13] grid_4.5.2 data.table_1.17.8 compiler_4.5.2
178+
## [16] rstudioapi_0.17.1 tools_4.5.2 evaluate_1.0.5
179+
## [19] Rcpp_1.1.0 yaml_2.3.10 rlang_1.1.6
178180

179181
## Performance
180182

183+
### fast-ssGSEA
184+
181185
The `fast.ssgsea` R package utilizes linear algebra and ideas from Fast
182-
Gene Set Enrichment Analysis ([Korotkevich et al.
186+
Gene Set Enrichment Analysis (FGSEA) ([Korotkevich et al.
183187
2021](#ref-korotkevich-fast-2021)) to greatly reduce the runtime.
184188

185189
Tests were performed on a desktop computer with an AMD Ryzen 5 7600X CPU
186-
(6 cores, 12 threads) at 4.7 GHz. Different combinations of the number
187-
of gene sets, maximum gene set size, number of permutations, and value
188-
of the $\alpha$ parameter (the weighting exponent) were tested in a
189-
random order (3 replicates each) to minimize the influence of previous
190-
runs.
190+
running at 4.7 GHz, single threaded. Different combinations of the
191+
number of gene sets, maximum gene set size, and the number of
192+
permutations were tested in a random order (3 replicates each) to
193+
minimize the influence of previous runs. The R scripts and data are
194+
available in the simulation/ folder.
191195

192196
<div class="figure" style="text-align: center">
193197

194-
<img src="./man/figures/README-figure-1.png" alt="Runtime of fast_ssgsea with A) 10,000, B) 100,000, or C) 1,000,000 permutations." width="648" />
198+
<img src="./man/figures/README-figure-1.png" alt="Runtime of fast_ssgsea with A) 10,000, B) 100,000, or C) 1,000,000 permutations." width="864" />
195199
<p class="caption">
196200

197201
Runtime of fast_ssgsea with A) 10,000, B) 100,000, or C) 1,000,000
@@ -200,6 +204,29 @@ permutations.
200204

201205
</div>
202206

207+
### FGSEA-simple
208+
209+
The same tests were also carried out using the simple implementation of
210+
FGSEA (`fgsea::fgseaSimple`). Like fast-ssGSEA, FGSEA-simple relies
211+
purely on the number of permutations to calculate p-values, which limits
212+
how small they can become. While FGSEA-simple is instead meant to be run
213+
with a smaller number of permutations and followed up with
214+
FGSEA-multilevel (the method capable of calculating arbitrarily small
215+
p-values), these results serve to illustrate the extreme difference in
216+
runtime between the two approaches. This difference is largely the
217+
result of changes to how the ES is defined.
218+
219+
<div class="figure" style="text-align: center">
220+
221+
<img src="./man/figures/README-figure-2.png" alt="Runtime of fgsea::fgseaSimple with A) 10,000, B) 100,000, or C) 1,000,000 permutations." width="864" />
222+
<p class="caption">
223+
224+
Runtime of fgsea::fgseaSimple with A) 10,000, B) 100,000, or C)
225+
1,000,000 permutations.
226+
</p>
227+
228+
</div>
229+
203230
## References
204231

205232
<div id="refs" class="references csl-bib-body hanging-indent"

man/figures/README-figure-1.png

-104 KB

man/figures/README-figure-2.png

110 KB
-408 Bytes
Binary file not shown.
-412 Bytes
Binary file not shown.
-419 Bytes
Binary file not shown.

simulation/figures/figure-1.pdf

-1.56 KB
Binary file not shown.

simulation/figures/figure-2.pdf

6.84 KB
Binary file not shown.
