```
├── loadings_runners/
├── loadings_test.go
├── loadings_test_results/
├── parglmVS_benchmark/
├── parglmVS_benchmark.sh
├── parglmVS_runners/
├── parglmVS_test.go
├── pcaEig_runners/
...
├── scores_runners/
├── scores_test.go
├── scores_test_results/
├── vasca_benchmark/
├── vasca_benchmark.sh
├── vasca_runners/
└── vasca_test.go
```
⚠️⚠️ *`parglmVS` is the function responsible for computing all the data structures in the permutation test. These permutations are random, and because the random number generators in the two languages differ, the test requires a very large number of permutations. As a result, the test takes a considerable amount of time, as discussed in issues [#51](https://github.com/danieeeld2/vASCA-R/issues/51) and [#52](https://github.com/danieeeld2/vASCA-R/issues/52), as well as in PR [#30](https://github.com/danieeeld2/vASCA-R/pull/30). At the end of that PR you can find a screenshot showing the execution time of this test and confirming that the function passes. To reactivate the test, open `parglmVS_test.go` and comment out the two lines that contain `t.Skip`.*
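The `t.Skip` lines can also be commented out from the command line. A hypothetical sketch, demonstrated on a stand-in file rather than the real test (GNU `sed` assumed):

```shell
# Hypothetical demo on a stand-in file (not the real test file):
# prefix every line containing t.Skip with a Go line comment.
cat > /tmp/parglmVS_test_demo.go <<'EOF'
func TestParglmVS(t *testing.T) {
	t.Skip("slow permutation test")
	t.Skip("see issues #51 and #52")
}
EOF
sed -i '/t\.Skip/ s|^|// |' /tmp/parglmVS_test_demo.go
cat /tmp/parglmVS_test_demo.go
```

The same `sed` expression, pointed at the real `parglmVS_test.go`, would comment out both skip calls in one step.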
In addition to the tests, we provide the scripts `vasca_benchmark.sh` and `parglmVS_benchmark.sh`, which generate comparative performance plots of the R and MATLAB/Octave implementations of these functions. The results are stored in the `<function>_benchmark/` folder. You can find more information about this benchmark in [Section 8](#8--performance-comparison).
## 4. 👷🏻 GitHub Workflows

The repository includes two GitHub workflows: `docker.yml` and `test.yml`.
## 7. Example Usage

We will provide an example of an execution pipeline in each language, using the Docker image `danieeeld2/r-vasca-testing:latest` (or `danieeeld2/r-vasca-testing:r-dependencies-installed` if you want the R dependencies preinstalled) and running everything from the root directory of the project. Start by running the image with a volume that includes the project, as instructed in [Section 5](#5--docker-image).

```bash
docker run -it --rm -v "$(pwd):/app" -w /app danieeeld2/r-vasca-testing:latest /bin/bash
```
```r
# ... (R pipeline example elided in this diff)
for (i in seq_len(vascao$nFactors)) {
  ...
}
```
## 8. 📈 Performance Comparison

In this repository, we have implemented several functions to carry out the VASCA pipeline in R. However, to evaluate the performance of the R implementation compared to its MATLAB/Octave counterpart, we focus on the two main scripts that form the core of the pipeline: `parglmVS` and `vasca`.
### 8.1. `parglmVS` Performance

The script `tests/parglmVS_benchmark.sh` automates the benchmarking of the `parglmVS` function implemented in both R and MATLAB/Octave. It runs the function across multiple models (`linear`, `interaction`, and `full`) and a range of permutation values, measuring execution times for each configuration. The results are saved to a CSV file and visualized in several comparative plots, generated automatically with R and stored in the `parglmVS_benchmark/` folder. *The datasets used are `X_DATA="../datasets/tests_datasets/X_test.csv"` and `F_DATA="../datasets/tests_datasets/F_test.csv"`.*
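Because the timings land in a CSV, they are easy to summarize from the shell. A minimal sketch, assuming a hypothetical `language,model,permutations,seconds` column layout (the real header produced by the script may differ):

```shell
# Hypothetical sketch: summarize a benchmark CSV like the one
# parglmVS_benchmark.sh produces (column layout assumed, not confirmed).
cat > /tmp/bench_demo.csv <<'EOF'
language,model,permutations,seconds
R,linear,100,1.0
R,linear,1000,3.0
Octave,linear,100,2.0
Octave,linear,1000,10.0
EOF
# Mean execution time per language, sorted for stable output
awk -F, 'NR>1 {sum[$1]+=$4; n[$1]++}
         END {for (l in sum) printf "%s %.1f\n", l, sum[l]/n[l]}' /tmp/bench_demo.csv | sort
# prints:
# Octave 6.0
# R 2.0
```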
<p align="center"><i>
Comparison of execution times between R and Octave for the <code>parglmVS</code> function. The left plot shows the raw execution times across different models and numbers of permutations, while the right plot displays the same results on a log-log scale for better visualization of performance differences at large scales.
</i></p>

<p align="center">
  <img src="./tests/parglmVS_benchmark/benchmark_linear.png" alt="Linear model benchmark" width="32%"/>
  <img src="./tests/parglmVS_benchmark/benchmark_interaction.png" alt="Interaction model benchmark" width="32%"/>
  <img src="./tests/parglmVS_benchmark/benchmark_full.png" alt="Full model benchmark" width="32%"/>
</p>

<p align="center"><i>
Execution time comparisons for the <code>parglmVS</code> function between R and Octave, separated by model type. Each plot shows how performance varies with the number of permutations for the linear, interaction, and full models, respectively.
</i></p>

<p align="center">
  <img src="./tests/parglmVS_benchmark/benchmark_models_by_permutations.png" alt="Bar chart by model and permutations" width="75%"/>
</p>

<p align="center"><i>
Bar chart summarizing the execution times of the <code>parglmVS</code> function across different models and permutation counts, grouped by language (R vs. Octave). This visualization highlights relative performance differences depending on model complexity and computational load.
</i></p>
As observed in the benchmark plots, the R implementation significantly outperforms the Octave version when computing the test structures for permutation testing across all three model types. This performance gap becomes increasingly pronounced as the number of permutations grows, highlighting the efficiency of the R-based approach in handling larger computational loads.
### 8.2. `vasca` Performance
The script `tests/vasca_benchmark.sh` automates the benchmarking process for the `vasca` function in both R and MATLAB/Octave. It evaluates performance across multiple datasets and two significance levels (`0.01` and `0.05`), measuring execution times for each configuration. The results are compiled into a CSV file and visualized through a variety of comparative plots, automatically generated with R and saved in the `vasca_benchmark/` directory. *The datasets used for this benchmark are four `.json` files located under `../datasets/tests_datasets/`, named `parglmVS_1.json` to `parglmVS_4.json`.*

<p align="center"><em>
The first image compares the execution time of the <code>vasca</code> function between R and Octave across different datasets and significance levels. Each point represents an individual execution time, with lines connecting the results for each language. The second image shows a heatmap visualizing the execution times of the <code>vasca</code> function, categorized by language and dataset, at various significance levels.
</em></p>

<p align="center"><em>
Comparison of the execution time of the <code>vasca</code> function in R and Octave across four datasets, with bars grouped by significance level (0.01 and 0.05).
</em></p>
Although Octave/MATLAB performs better in this case, the `vasca` function requires significantly less computational time than `parglmVS`, so the differences between Octave and R are not as pronounced here. When considering the entire pipeline, however, R offers better overall performance: `parglmVS`, the most computationally intensive part of the pipeline, is much better optimized in R, resulting in faster execution of the complete workflow.
Excerpt from `tests/parglmVS_benchmark.sh`:

```bash
    echo "Running R with Model=$model, Permutations=$perm..."
    TIME=$( { time Rscript $R_SCRIPT $X_DATA $F_DATA Model $model Permutations $perm >/dev/null 2>&1; } 2>&1 | grep real | awk '{print $2}' | sed 's/m/*60+/g' | sed 's/s//g' | bc)
    echo "R,$model,$perm,$TIME" >> $OUTPUT_FILE
    # Remove any CSV files generated by the R script
    find . -name "parglmVS_*.csv" -type f -delete
  done
done

# Benchmark for Octave (MATLAB compatible)
for model in "${MODELS[@]}"; do
  for perm in "${PERMS[@]}"; do
    echo "Running Octave with Model=$model, Permutations=$perm..."
    TIME=$( { time octave --no-gui -q $OCTAVE_SCRIPT $X_DATA $F_DATA Model $model Permutations $perm >/dev/null 2>&1; } 2>&1 | grep real | awk '{print $2}' | sed 's/m/*60+/g' | sed 's/s//g' | bc)
    echo "Octave,$model,$perm,$TIME" >> $OUTPUT_FILE
    # Remove any CSV files generated by the Octave script
    find . -name "parglmVS_*.csv" -type f -delete
  done
done

echo "Benchmark completed. Results in $OUTPUT_FILE"
```