Commit a7af33d ("jss-4")
1 parent abd19db commit a7af33d

17 files changed, +229 -1554 lines changed

paper/.Rprofile (4 additions, 0 deletions)

@@ -0,0 +1,4 @@
+# in enroot, we need the /mnt/data/paper heuristics, because .dockerenv does not exist
+if (!(file.exists("/.dockerenv") || file.exists("/mnt/data/paper"))) {
+  source("renv/activate.R")
+}
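The two-path check added in this `.Rprofile` can be mirrored in shell for a quick sanity test. `check_renv` below is a hypothetical helper, not part of the repository; it only reproduces the heuristic (activate renv unless the Docker marker file or the enroot mount exists).

```shell
# Hypothetical mirror of the .Rprofile logic: renv is activated only when
# neither the Docker marker file nor the enroot mount point is present.
check_renv() {
  # $1: path of the Docker marker, $2: path of the enroot mount
  if [ -e "$1" ] || [ -e "$2" ]; then
    echo "container: skip renv"
  else
    echo "host: activate renv"
  fi
}

# same paths the .Rprofile inspects; output depends on where this runs
check_renv /.dockerenv /mnt/data/paper
```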

paper/README.md (41 additions, 26 deletions)

@@ -5,39 +5,38 @@ Note that there is also a brief section on reproducibility in the appendix of the
 ## Computational Environment
 
 In order to reproduce the results, you can either use the provided docker images or recreate the `renv` environment that is described in `paper/renv.lock`.
-To work with the renv environment, go into the `paper` directory, which will bootstrap the environment, and then run:
+To work with the renv environment, go into the `paper` directory and start an interactive R session, which will bootstrap the `renv` package.
+Then, restore the environment by running:
 
 ```r
 renv::restore()
 ```
 
-Afterwards, you need to install torch:
+It will ask you whether you want to proceed with installing the missing packages, which you have to confirm.
+Afterwards, you need to install torch via:
 
-```{r}
+```r
 torch::install_torch()
 ```
 
 We are providing two docker images, one for CPU and one for CUDA GPU that have the same packages from the `renv.lock` file installed.
-The images can be downloaded from Zenodo: https://doi.org/10.5281/zenodo.17130368.
-You can, for example, use the [zenodo_client](https://pypi.org/project/zenodo-client/) library to download the images:
+The images can be downloaded from Zenodo: [https://doi.org/10.5281/zenodo.17130368](https://doi.org/10.5281/zenodo.17130368), either via the web interface or, for example, using `wget` or a similar tool:
 
 ```bash
-# pip install zenodo-client
-export ZENODO_API_TOKEN=<your-token>
-zenodo_client download 17130368 IMAGE_CPU.tar.gz
+# Docker images
+wget https://zenodo.org/records/18466801/files/IMAGE_CPU.tar.gz
+wget https://zenodo.org/records/18466801/files/IMAGE_GPU.tar.gz
 ```
 
-By default, the downloaded files are stored in `~/.data/zenodo`.
-
 At the time of writing, the images are also hosted on dockerhub, but this is not a permanent storage:
-https://hub.docker.com/repository/docker/sebffischer/mlr3torch-jss/general
+[https://hub.docker.com/repository/docker/sebffischer/mlr3torch-jss/general](https://hub.docker.com/repository/docker/sebffischer/mlr3torch-jss/general)
 
 The `Dockerfile`s used to create the images are available in the `./paper/envs` directory.
 
-If you have downloaded the images like shown above, you can load them into Docker, e.g. via the command below (or otherwise adjust the path accordingly).
+After downloading the images from Zenodo, you can register them with docker as follows:
 
 ```bash
-docker load -i ~/.data/zenodo/17130368/v1/IMAGE_CPU.tar.gz
+docker load -i /path/to/IMAGE_CPU.tar.gz
 ```
 
 To start the CPU docker container, run:
@@ -52,11 +51,11 @@ cd /mnt/data/paper
 The CUDA image can be started with the command below, which requires the [nvidia extension](https://docs.nvidia.com/ai-enterprise/deployment/vmware/latest/docker.html).
 
 ```bash
-docker run -it --gpus all --rm -v ../:/mnt/data sebffischer/mlr3torch-jss:gpu
+docker run -it --gpus all --rm -v <parent-dir-to-paper>:/mnt/data sebffischer/mlr3torch-jss:gpu
 cd /mnt/data/paper
 ```
 
-Note that the `.Rprofile` file ensures that when running R programs from the `paper` directory, the renv environment will be used unless the code is run in the docker container, where we are not relying on renv directly.
+Note that the `.Rprofile` file in `paper` ensures that when running R programs from the `paper` directory, the renv environment will be used unless the code is run in the docker container, where we are not relying on renv directly.
 
 ## Running the Benchmark
 
@@ -82,7 +81,19 @@ Also note that it's important to have enough RAM, otherwise the benchmarks will
 However, there are many other factors, such as the exact hardware that make it generally difficult to reproduce the runtime results.
 
 To run the benchmarks locally, ensure that you are in the `paper` directory.
-To run the GPU benchmarks (using the CUDA docker image) on linux, run:
+There are three scripts:
+
+* `paper/benchmark/linux-gpu.R`, which creates the folder `paper/benchmark/registry-linux-gpu`
+* `paper/benchmark/linux-cpu.R`, which creates the folder `paper/benchmark/registry-linux-cpu`
+* `paper/benchmark/linux-gpu-optimizer.R`, which creates the folder `paper/benchmark/registry-linux-gpu-optimizer`
+
+**Important**: If one of the folders already exists and you want to re-run the benchmarks, you need to delete or move the folder, otherwise you will get an error.
+This is to ensure that the benchmark results are not accidentally overwritten.
+
+To run the benchmarks, either start them via Rscript or source them interactively.
+If you source a script interactively and the registry folder already exists, it will ask you whether you want to delete it, which you have to confirm.
+
+Below is the command for the GPU benchmark, which needs to be run within the CUDA docker image.
 
 ```bash
 Rscript benchmark/linux-gpu.R
@@ -100,7 +111,7 @@ To run the benchmark that compares "ignite" with standard optimizers (using the
 Rscript benchmark/linux-gpu-optimizer.R
 ```
 
-The results are stored in:
+The postprocessed results are stored in:
 
 * `paper/benchmark/result-linux-gpu.rds`
 * `paper/benchmark/result-linux-cpu.rds`
@@ -150,21 +161,18 @@ We provide the results of running this in `paper/paper_results`.
 The results in the paper are those from the CPU docker image and they were fully reproducible when we re-ran them on the same machine.
 There were some minor differences in results when re-running the code on a different machine (macOS with M1 CPU vs Linux with Intel CPU).
 
-The file `paper_code.R` contains some very minor differences to the paper we omitted in the paper for brevity.
+The file `paper_code.R` contains some very minor differences from the paper, which we omitted there for brevity.
 It was extracted from the tex manuscript almost fully programmatically but adjusted with the following modifications:
 
 * Time measurements (`Sys.time()`)
 * Deactivate knitr caching
 * Activating caching for `mlr3torch`
 * Changing the `mlr3` logging level to `warn` for cleaner output
-* Saving the ROC plot for postprocessing
+* Processing the ROC plot for better readability and saving it as `roc.png`, as well as printing it
 * Adding a `sessionInfo()` call at the end
 
 We also added some additional comments to make it easier to associate the code with the paper.
 
-The results we obtained via `knitr::spin()` are stored in `paper/paper_results/`
-The ROC plot is postprocessed using the `roc.R` script and we have also provided the resulting `roc.png` from the paper in the `paper/paper_results` directory.
-
 ### Possible Data Unavailability
 
 The code shown in the paper downloads various datasets from standard resources.
@@ -175,18 +183,25 @@ In the unlikely but possible event that these datasets are not available anymore
 
 in the Zenodo data.
 
-If one of the downloads (1) fails, download the `cache.tar.gz` file from zenodo, untar it and put it in the location where the cache is (put the `R` folder of the cache into `/root/.cache/R` and the `torch` folder into `/root/.cache/torch` when using the docker images).
+If one of the downloads (1) fails, do the following (before starting the docker container):
 
-If (2) fails, download `dogs-vs-cats.tar.gz` from Zenodo, untar it and put it into the `paper/data` subdirectory where you are running the `paper_code.R` (so the directory structure is `paper/data/dogs-vs-cats`).
+1. Download the `cache.tar.gz` file, e.g. via:
+```bash
+wget https://zenodo.org/records/18466801/files/cache.tar.gz
+```
+2. Unpack the file using `tar -xzf cache.tar.gz`, which creates a folder named `cache`.
+3. Move this folder into the parent directory of `paper`.
 
-To do this in the Docker image you can, e.g., put the files into the parent directory of the `paper` directory (which will be mounted) and then after starting the container, copy the files into the correct location.
-Assuming the unpacked cache files are in `/mnt/data/cache`, you can copy them into the correct location with:
+After starting the docker container with the correct mount instructions (as shown earlier), run:
 
 ```bash
 cp -r /mnt/data/cache/R/mlr3torch /root/.cache/R
 cp -r /mnt/data/cache/torch /root/.cache/torch
 ```
 
+If (2) fails, download `dogs-vs-cats.tar.gz` from Zenodo, untar it and put it into the `paper/data` subdirectory where you are running the `paper_code.R` (so the directory structure is `paper/data/dogs-vs-cats/`).
+
+
 ### Other errors
 
 When reproducing the results with `knitr` in the docker container, we sometimes encountered issues with the weight downloads for the ResNet-18 model.
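The cache restore described in the README (unpack `cache.tar.gz`, mount it, copy into `/root/.cache`) can be rehearsed with throwaway directories. The sketch below is purely illustrative: the temporary paths and file names stand in for the real mount point and cache contents.

```shell
# Dry run of the cache restore using temporary directories in place of
# /mnt/data/cache and /root/.cache. The layout mirrors the README's cp commands.
src=$(mktemp -d)   # stands in for the unpacked cache at /mnt/data/cache
dst=$(mktemp -d)   # stands in for /root/.cache inside the container

# fake unpacked cache: an R/mlr3torch folder and a torch folder
mkdir -p "$src/R/mlr3torch" "$src/torch"
touch "$src/R/mlr3torch/some-dataset.rds" "$src/torch/some-weights.pt"

# the two copy steps from the README, retargeted at the sandbox
mkdir -p "$dst/R"
cp -r "$src/R/mlr3torch" "$dst/R"
cp -r "$src/torch" "$dst"

ls "$dst/R/mlr3torch" "$dst/torch"
```

After the copies, the cache layout matches what `mlr3torch` and `torch` expect under `/root/.cache` (an `R/mlr3torch` subfolder and a `torch` subfolder).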

paper/benchmark/benchmark.R (55 additions, 1 deletion)

@@ -2,10 +2,28 @@ library(batchtools)
 library(mlr3misc)
 
 setup = function(reg_path, python_path, work_dir) {
+
+  print_setup_info(reg_path, python_path, work_dir)
+
+
+  if (file.exists(reg_path)) {
+    msg <- sprintf("Registry already exists at path %s. Delete the folder to run the benchmark again.", reg_path)
+    if (!interactive()) {
+      stop(msg)
+    }
+    answer <- readline(sprintf("Registry already exists at path %s. Delete it to run the benchmark again? (y/n)", reg_path))
+    if (answer == "y") {
+      unlink(reg_path, recursive = TRUE)
+    } else {
+      stop(msg)
+    }
+  }
+
   reg = makeExperimentRegistry(
     file.dir = reg_path,
     work.dir = work_dir,
-    packages = "checkmate"
+    packages = "checkmate",
+    seed = 123
   )
   reg$cluster.functions = makeClusterFunctionsInteractive()
 
@@ -48,6 +66,7 @@ setup = function(reg_path, python_path, work_dir) {
   )
 
   addAlgorithm("pytorch", fun = function(instance, job, data, jit, ...) {
+    print(instance)
     f = function(..., python_path) {
       library(reticulate)
       x = try(
@@ -68,6 +87,7 @@ setup = function(reg_path, python_path, work_dir) {
   })
 
   addAlgorithm("rtorch", fun = function(instance, job, opt_type, jit, ...) {
+    print(instance)
     assert_choice(opt_type, c("standard", "ignite"))
     if (opt_type == "ignite") {
      instance$optimizer = paste0("ignite_", instance$optimizer)
@@ -77,6 +97,7 @@ setup = function(reg_path, python_path, work_dir) {
   })
 
   addAlgorithm("mlr3torch", fun = function(instance, job, opt_type, jit, ...) {
+    print(instance)
     if (opt_type == "ignite") {
       instance$optimizer = paste0("ignite_", instance$optimizer)
     }
@@ -93,3 +114,36 @@ REPLS = 10L
 EPOCHS = 20L
 N = 2000L
 P = 1000L
+
+print_setup_info = function(reg_path, python_path, work_dir) {
+  cat("Session Info:\n")
+  print(sessionInfo())
+  cat("Library Paths:\n")
+  for (path in .libPaths()) {
+    cat(" -", path, "\n")
+  }
+  cat("Working Directory:", getwd(), "\n")
+
+  cat("Subfolders of working directory:\n")
+  for (folder in list.files(work_dir)) {
+    cat(" -", folder, "\n")
+  }
+
+  # Function arguments
+  cat("--- FUNCTION ARGUMENTS ---\n")
+  cat(" Registry Path:", reg_path, "\n")
+  cat(" Python Path:", python_path, " (", if (file.exists(python_path)) "exists" else "does not exist", ")\n")
+  cat(" Work Directory:", work_dir, "\n\n")
+  cat("Cuda is available:", torch::cuda_is_available(), "\n")
+  out <- try(callr::r(function(python_path) {
+    reticulate::use_python(python_path, required = TRUE)
+    return(reticulate::py_config())
+  }, show = TRUE, args = list(python_path = python_path)), silent = TRUE)
+  if (inherits(out, "try-error")) {
+    cat("Error occurred while calling Python:\n")
+    print(out)
+  } else {
+    cat("Python configuration:\n")
+    print(out)
+  }
+}
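The registry guard added to `setup()` (fail hard when non-interactive, prompt when interactive) follows a common fail-safe pattern. A minimal non-interactive shell analogue, with a made-up registry path rather than anything from the repository, looks like this:

```shell
# Non-interactive analogue of the registry guard: refuse to proceed when the
# registry folder already exists, so earlier results cannot be overwritten.
guard_registry() {
  if [ -e "$1" ]; then
    echo "Registry already exists at path $1. Delete the folder to run the benchmark again." >&2
    return 1
  fi
  mkdir -p "$1"
  echo "created $1"
}

reg=$(mktemp -d)/registry-demo
guard_registry "$reg"                          # first run creates the registry
guard_registry "$reg" || echo "second run refused"
```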

paper/benchmark/linux-cpu.R (1 addition, 4 deletions)

@@ -1,15 +1,12 @@
 library(here)
 
+set.seed(42)
 source(here("benchmark", "benchmark.R"))
 
 # Change this when not running this in the docker image
 # Below is the correct python path for the CPU docker image.
 PYTHON_PATH = "/opt/venv/bin/python"
 
-if (dir.exists(here("benchmark", "registry-linux-gpu"))) {
-  stop("Registry already exists. Delete it to run the benchmark again.")
-}
-
 setup(
   here("benchmark", "registry-linux-cpu"),
   PYTHON_PATH,

paper/benchmark/linux-gpu-optimizer.R (6 additions, 3 deletions)

@@ -2,10 +2,16 @@ library(here)
 
 source(here("benchmark", "benchmark.R"))
 
+set.seed(43)
+
 # Change this when not running this in the docker image
 # Below is the correct python path for the CUDA docker image
 PYTHON_PATH = "/usr/bin/python3"
 
+if (!torch::cuda_is_available()) {
+  stop("Cuda is not available for R-torch, please use the correct docker image.")
+}
+
 problem_design = expand.grid(
   list(
     n = N,
@@ -20,9 +26,6 @@ problem_design = expand.grid(
   stringsAsFactors = FALSE
 )
 
-if (dir.exists(here("benchmark", "registry-linux-gpu-optimizer"))) {
-  stop("Registry already exists. Delete it to run the benchmark again.")
-}
 
 setup(
   here("benchmark", "registry-linux-gpu-optimizer"),

paper/benchmark/linux-gpu.R (6 additions, 4 deletions)

@@ -2,10 +2,16 @@ library(here)
 
 source(here("benchmark", "benchmark.R"))
 
+set.seed(44)
+
 # Change this when not running this in the docker image
 # Below is the correct python path for the CUDA docker image
 PYTHON_PATH = "/usr/bin/python3"
 
+if (!torch::cuda_is_available()) {
+  stop("Cuda is not available for R-torch, please use the correct docker image.")
+}
+
 problem_design = expand.grid(
   list(
     n = N,
@@ -20,10 +26,6 @@ problem_design = expand.grid(
   stringsAsFactors = FALSE
 )
 
-if (dir.exists(here("benchmark", "registry-linux-gpu"))) {
-  stop("Registry already exists. Delete it to run the benchmark again.")
-}
-
 setup(
   here("benchmark", "registry-linux-gpu"),
   PYTHON_PATH,

paper/benchmark/time_rtorch.R (0 additions, 3 deletions)

@@ -1,9 +1,6 @@
 time_rtorch = function(epochs, batch_size, n_layers, latent, n, p, device, jit, seed, optimizer, mlr3torch = FALSE) {
   library(mlr3torch)
   library(torch)
-  mlr3pipelines::po
-  mlr3torch::LearnerTorch
-  mlr3::lrn
   torch_set_num_threads(1)
   torch_manual_seed(seed)
 

paper/extract.R (10 additions, 1 deletion)

@@ -68,7 +68,16 @@ code_lines <- c(
   "options(mlr3torch.cache = TRUE)",
   "lgr::get_logger(\"mlr3\")$set_threshold(\"warn\")",
   code_lines,
-  "saveRDS(plt, \"roc.rds\")",
+  "library(\"ggplot2\")",
+  "plt = plt +",
+  "  theme(",
+  "    axis.text.x = element_text(size = 12),",
+  "    axis.text.y = element_text(size = 12),",
+  "    axis.title.x = element_text(size = 12),",
+  "    axis.title.y = element_text(size = 12)",
+  "  )",
+  "print(plt)",
+  "ggsave(here::here(\"roc.png\"), plt, width = 4, height = 4, dpi = 300)",
   "Sys.time()",
   "sessionInfo()"
 )

paper/paper_code.R (11 additions, 7 deletions)

@@ -5,11 +5,7 @@
 # Some setup code
 Sys.time()
 options(mlr3torch.cache = TRUE)
-lgr::get_logger('mlr3')$set_threshold('warn')
-
-# 2.2 Main dependencies
-
-# mlr3
+lgr::get_logger("mlr3")$set_threshold("warn")
 library("mlr3")
 set.seed(42)
 task <- tsk("mtcars")
@@ -367,7 +363,15 @@ task_subset$filter(subset)
 rr <- resample(task_subset, glrn, rsmp("holdout"))
 plt <- autoplot(rr, type = "roc")
 
-# Save plot so it can be modified later
-saveRDS(plt, "roc.rds")
+library("ggplot2")
+plt = plt +
+  theme(
+    axis.text.x = element_text(size = 12),
+    axis.text.y = element_text(size = 12),
+    axis.title.x = element_text(size = 12),
+    axis.title.y = element_text(size = 12)
+  )
+print(plt)
+ggsave(here::here("roc.png"), plt, width = 4, height = 4, dpi = 300)
 Sys.time()
 sessionInfo()
