furrr works perfectly with future.batchtools. If you have a loop with 3 elements you get 3 jobs on the cluster:
library(furrr)
library(future.batchtools)
plan(batchtools_sge)
nothingness <- future_map(c(2, 2, 2), ~Sys.sleep(.x))
qstat
job-ID prior name user state submit/start at queue slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
14375 0.50500 jobb856749 fred r 06/03/2023 22:23:03 [email protected] 1
14376 0.50500 job2b8fc0d fred r 06/03/2023 22:23:03 [email protected] 1
14377 0.50500 jobfdf28f1 fred r 06/03/2023 22:23:03 [email protected] 1
With foreach, however, I only get 1 job:
library(foreach)
library(future)
library(furrr)
library(future.batchtools)
library(doFuture)
mu <- 1.0
sigma <- 2.0
registerDoFuture()
plan(batchtools_sge)
x %<-% {
  foreach(i = 1:3) %dopar% {
    Sys.sleep(3)
    set.seed(123)
    rnorm(i, mean = mu, sd = sigma)
  }
}
qstat
job-ID prior name user state submit/start at queue slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
14378 0.50500 job7472e4f fred r 06/03/2023 22:26:03 [email protected] 1
When it returns, I get the following two strange warnings:
Warning messages:
1: executing %dopar% sequentially: no parallel backend registered
2: UNRELIABLE VALUE: Future ( ) unexpectedly generated random numbers without specifying argument 'seed'. There is a risk that those random numbers are not statistically sound and the overall results might be invalid. To fix this, specify 'seed=TRUE'. This ensures that proper, parallel-safe random numbers are produced via the L'Ecuyer-CMRG method. To disable this check, use 'seed=NULL', or set option 'future.rng.onMisuse' to "ignore".
Why does it say there is no parallel backend registered when I'm running registerDoFuture()?
Alternatively, I tried %dofuture% instead of %dopar% but it still only generates 1 job.
x %<-% {
  foreach(i = 1:10) %dofuture% {
    Sys.sleep(3)
    set.seed(123)
    rnorm(i, mean = mu, sd = sigma)
  }
}
f = futureOf(x)
resolved(f)
x
qstat
job-ID prior name user state submit/start at queue slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
14379 0.50500 job4b3cd22 fred r 06/03/2023 22:28:48 all.q@ipxxx 1
This time I only get the random number warning:
Warning message:
UNRELIABLE VALUE: At least one of iterations 1-10 of the foreach() %dofuture% { … }, part of chunk #1 ( doFuture2-1 ), unexpectedly generated random numbers without declaring so. There is a risk that those random numbers are not statistically sound and the overall results might be invalid. To fix this, specify foreach() argument '.options.future = list(seed = TRUE)'. This ensures that proper, parallel-safe random numbers are produced via the L'Ecuyer-CMRG method. To disable this check, set option 'doFuture.rng.onMisuse' to "ignore".
I also tried other foreach options, such as .options.future = list(scheduling = 1), but they don't seem to have any effect.
Is it possible that foreach somehow chunks all the iterations into one task? Or is something else not working?
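One possible reading, offered as an assumption rather than a confirmed diagnosis: x %<-% { ... } is itself a single future, so the whole foreach() loop is shipped to the cluster as one job. Inside that job, nested futures default to sequential processing and registerDoFuture() has not been called there, which would explain both the single qstat entry and the "no parallel backend registered" warning. Below is a minimal sketch of the comparison I have in mind, calling foreach() directly without the %<-% wrapper. The multisession plan is only a local stand-in so the sketch runs without SGE, and chunk.size = 1 reflects my understanding of the doFuture chunking options (one iteration per future).

```r
library(foreach)
library(future)
library(doFuture)

# Assumption: multisession stands in for batchtools_sge so this runs locally.
# On the cluster, plan(batchtools_sge) would take its place.
plan(multisession, workers = 2)

mu <- 1.0
sigma <- 2.0

# foreach() is called directly, not inside `x %<-% { ... }`, so each chunk
# becomes its own future. chunk.size = 1 asks for one iteration per chunk.
# seed = TRUE requests parallel-safe RNG streams, as the warning suggests,
# so set.seed() inside the loop body is dropped.
x <- foreach(
  i = 1:3,
  .options.future = list(seed = TRUE, chunk.size = 1)
) %dofuture% {
  Sys.sleep(1)
  rnorm(i, mean = mu, sd = sigma)
}

# One result per iteration, with 1, 2 and 3 values respectively.
print(lengths(x))

# Reset the plan so no background workers are left running.
plan(sequential)
```

If this reading is right, the same call under plan(batchtools_sge) should show three entries in qstat instead of one.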
Thanks
Fred