paper/README.md (41 additions & 26 deletions)
@@ -5,39 +5,38 @@ Note that there is also a brief section on reproducibility in the appendix of th
## Computational Environment

In order to reproduce the results, you can either use the provided docker images or recreate the `renv` environment that is described in `paper/renv.lock`.
-To work with the renv environment, go into the `paper` directory, which will bootstrap the environment, and then run:
+To work with the renv environment, go into the `paper` directory and start an interactive R session, which will bootstrap the `renv` package.
+Then, restore the environment by running:

```r
renv::restore()
```

-Afterwards, you need to install torch:
+It will ask whether you want to proceed with installing the missing packages, which you have to confirm.
+Afterwards, you need to install torch via:

-```{r}
+```r
torch::install_torch()
```

We are providing two docker images, one for CPU and one for CUDA GPU, both of which have the packages from the `renv.lock` file installed.
-The images can be downloaded from Zenodo: https://doi.org/10.5281/zenodo.17130368.
-You can, for example, use the [zenodo_client](https://pypi.org/project/zenodo-client/) library to download the images:
+The images can be downloaded from Zenodo: [https://doi.org/10.5281/zenodo.17130368](https://doi.org/10.5281/zenodo.17130368), either via the web interface, or, for example, using `wget` or a similar tool:
The `Dockerfile`s used to create the images are available in the `./paper/envs` directory.

-If you have downloaded the images like shown above, you can load them into Docker, e.g. via the command below (or otherwise adjust the path accordingly).
+After downloading the images from Zenodo, you can register them with Docker as follows:
The CUDA image can be started with the command below, which requires the [nvidia extension](https://docs.nvidia.com/ai-enterprise/deployment/vmware/latest/docker.html).

```bash
-docker run -it --gpus all --rm -v ../:/mnt/data sebffischer/mlr3torch-jss:gpu
+docker run -it --gpus all --rm -v <parent-dir-to-paper>:/mnt/data sebffischer/mlr3torch-jss:gpu
cd /mnt/data/paper
```

-Note that the `.Rprofile` file ensures that when running R programs from the `paper` directory, the renv environment will be used unless the code is run in the docker container, where we are not relying on renv directly.
+Note that the `.Rprofile` file in `paper` ensures that when running R programs from the `paper` directory, the renv environment will be used unless the code is run in the docker container, where we are not relying on renv directly.

## Running the Benchmark
@@ -82,7 +81,19 @@ Also note that it's important to have enough RAM, otherwise the benchmarks will
However, there are many other factors, such as the exact hardware, that make it generally difficult to reproduce the runtime results.

To run the benchmarks locally, ensure that you are in the `paper` directory.
-To run the GPU benchmarks (using the CUDA docker image) on linux, run:
+There are three scripts:
+
+* `paper/benchmark/linux-gpu.R`, which creates the folder `paper/benchmark/registry-linux-gpu`
+* `paper/benchmark/linux-cpu.R`, which creates the folder `paper/benchmark/registry-linux-cpu`
+* `paper/benchmark/linux-gpu-optimizer.R`, which creates the folder `paper/benchmark/registry-linux-gpu-optimizer`
+
+**Important**: If one of the folders already exists and you want to re-run the benchmarks, you need to delete or move the folder; otherwise you will get an error.
+This is to ensure that the benchmark results are not accidentally overwritten.
+
+To run a benchmark, either start its script via Rscript or source it interactively.
+If you source it interactively and the registry folder already exists, it will ask you whether you want to delete it, which you have to confirm.
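For non-interactive runs via Rscript, it can help to move an old registry folder aside before starting. The following is only a minimal sketch: the helper name `backup_registry` and the `-backup-<timestamp>` suffix are illustrative assumptions and not part of the repository.

```bash
# Hypothetical helper (not part of the repository): move an existing
# registry folder aside so that a re-run does not stop with an error.
backup_registry() {
  registry="$1"
  if [ -d "$registry" ]; then
    # The timestamp suffix is an arbitrary naming choice for this sketch.
    backup="${registry}-backup-$(date +%Y%m%d-%H%M%S)"
    mv "$registry" "$backup"
    # Print the new location so the caller can keep track of it.
    echo "$backup"
  fi
}

# Possible usage from the `paper` directory:
#   backup_registry benchmark/registry-linux-gpu
#   Rscript benchmark/linux-gpu.R
```

If the folder does not exist, the helper does nothing, so it should be safe to call before every run.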
+
+Below is the command for the GPU benchmark which needs to be run within the CUDA docker image.

```bash
Rscript benchmark/linux-gpu.R
@@ -100,7 +111,7 @@ To run the benchmark that compares "ignite" with standard optimizers (using the
Rscript benchmark/linux-gpu-optimizer.R
```

-The results are stored in:
+The postprocessed results are stored in:

* `paper/benchmark/result-linux-gpu.rds`
* `paper/benchmark/result-linux-cpu.rds`
@@ -150,21 +161,18 @@ We provide the results of running this in `paper/paper_results`.
The results in the paper are those from the CPU docker image and they were fully reproducible when we re-ran them on the same machine.
There were some minor differences in results when re-running the code on a different machine (macOS with M1 CPU vs Linux with Intel CPU).

-The file `paper_code.R` contains some very minor differences to the paper we omitted in the paper for brevity.
+The file `paper_code.R` contains some very minor differences from the paper, which we omitted in the paper for brevity.
It was extracted from the tex manuscript almost fully programmatically but adjusted with the following modifications:

* Time measurements (`Sys.time()`)
* Deactivating knitr caching
* Activating caching for `mlr3torch`
* Changing the `mlr3` logging level to `warn` for cleaner output
-* Saving the ROC plot for postprocessing
+* Processing the ROC plot for better readability and saving it as `roc.png`, as well as printing it
* Adding a `sessionInfo()` call at the end

We also added some additional comments to make it easier to associate the code with the paper.

-The results we obtained via `knitr::spin()` are stored in `paper/paper_results/`
-The ROC plot is postprocessed using the `roc.R` script and we have also provided the resulting `roc.png` from the paper in the `paper/paper_results` directory.
-
### Possible Data Unavailability

The code shown in the paper downloads various datasets from standard resources.
@@ -175,18 +183,25 @@ In the unlikely but possible event that these datasets are not available anymore
in the Zenodo data.

-If one of the downloads (1) fails, download the `cache.tar.gz` file from zenodo, untar it and put it in the location where the cache is (put the `R` folder of the cache into `/root/.cache/R` and the `torch` folder into `/root/.cache/torch` when using the docker images).
+If one of the downloads (1) fails, do the following (before starting the docker container):

-If (2) fails, download `dogs-vs-cats.tar.gz` from Zenodo, untar it and put it into the `paper/data` subdirectory where you are running the `paper_code.R` (so the directory structure is `paper/data/dogs-vs-cats`).
+2. Unpack the file using `tar -xzf cache.tar.gz`, which creates a folder named `cache`
+3. Move this folder into the parent directory of `paper`
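The unpack and move steps can be combined by extracting directly into the parent directory of `paper`. This is a rough sketch under stated assumptions: the helper name `stage_cache` is hypothetical, and the layout check assumes that the archive contains `cache/R/mlr3torch` and `cache/torch`, as suggested by the paths used when copying the cache inside the container.

```bash
# Hypothetical helper (not part of the repository): unpack the cache
# archive into the directory that is later mounted into the container.
stage_cache() {
  archive="$1"      # path to the downloaded cache.tar.gz
  parent_dir="$2"   # parent directory of `paper`
  # Extracting with -C places the `cache` folder directly in parent_dir.
  tar -xzf "$archive" -C "$parent_dir" || return 1
  # Sanity check of the assumed layout; fails if a folder is missing.
  [ -d "$parent_dir/cache/R/mlr3torch" ] && [ -d "$parent_dir/cache/torch" ]
}

# Possible usage (the second argument is a placeholder):
#   stage_cache cache.tar.gz <parent-dir-to-paper>
```

A non-zero exit status indicates that either the extraction failed or the unpacked folder does not have the assumed layout.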

-To do this in the Docker image you can, e.g., put the files into the parent directory of the `paper` directory (which will be mounted) and then after starting the container, copy the files into the correct location.
-Assuming the unpacked cache files are in `/mnt/data/cache`, you can copy them into the correct location with:
+After starting the docker container with the correct mount instructions (as shown earlier), run:

```bash
cp -r /mnt/data/cache/R/mlr3torch /root/.cache/R
cp -r /mnt/data/cache/torch /root/.cache/torch
```

+If (2) fails, download `dogs-vs-cats.tar.gz` from Zenodo, untar it and put it into the `paper/data` subdirectory where you are running the `paper_code.R` (so the directory structure is `paper/data/dogs-vs-cats/`).
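Placing the dataset could look roughly like the sketch below. The helper name `unpack_dogs_vs_cats` is hypothetical, and the sketch assumes that the archive contains a top-level folder named `dogs-vs-cats`; if it unpacks differently, the final check fails and the files need to be moved manually.

```bash
# Hypothetical helper (not part of the repository): unpack the dataset
# archive into the `data` subdirectory of `paper`.
unpack_dogs_vs_cats() {
  archive="$1"     # path to the downloaded dogs-vs-cats.tar.gz
  paper_dir="$2"   # path to the `paper` directory
  mkdir -p "$paper_dir/data" || return 1
  tar -xzf "$archive" -C "$paper_dir/data" || return 1
  # Confirm the expected structure paper/data/dogs-vs-cats/.
  [ -d "$paper_dir/data/dogs-vs-cats" ]
}

# Possible usage from the parent directory of `paper`:
#   unpack_dogs_vs_cats dogs-vs-cats.tar.gz paper
```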
+
+
### Other errors

When reproducing the results with `knitr` in the docker container, we sometimes encountered issues with the weight downloads for the ResNet-18 model.