Description
I am using furrr to run code in parallel on an HPC cluster, with mirai as the backend. I have an outer function containing an inner function that is run with future_map. I want to run the outer function in parallel as well, which means the inner function also runs in parallel inside each outer call. A general idea of what I am doing is in the code below:
library(magrittr)
library(dplyr)
library(data.table)
library(mirai)
library(lme4)
library(furrr)
library(future.mirai)
library(purrr) # provides safely(), used in the outer future_map call
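outer_f <- function(df){
  # Inner worker, applied to each element of `vector` (body omitted)
  inner_f1 <- function(args, data){
  }
  # Inner parallel map; `vector` is a placeholder for the inputs
  test <- future_map(vector, inner_f1, df) # output is a dataframe
  # Model fit on the combined result (formula omitted)
  inner_f2 <- function(df){
    glmer(test)
  }
  inner_f2(test)
}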
# slurm configuration
slurm_config <- cluster_config(
command = "sbatch",
options = "#SBATCH --job-name=mirai
#SBATCH --mem=4G
#SBATCH --cpus-per-task=2
#SBATCH --ntasks=1
#SBATCH --time=100:00:00"
)
workers <- 12
#Get local ip
ip_info <- system("ip addr show ens192", intern = TRUE)
ipv4_line <- grep("inet ", ip_info)
local_ip <- gsub(".*inet (.*)/.*", "\\1", ip_info[ipv4_line])
print(local_ip)
#Generate a random port within the allowed range
port <- floor(runif(1, min=60001, max=63000))
daemons(n = workers, url = paste0("tcp://",local_ip,":",port), remote = slurm_config)
plan(mirai_cluster)
results <- future_map(1:10,
~ safely(outer_f)(df),
.options = furrr_options(seed = 4445556)
)
So far, setting cpus-per-task=2 seems to work in terms of parallelising the inner function (based on tests I've done), but I am unsure how to properly declare the nested parallelisation I am attempting. There is only one declaration of daemons, and I believe the way I am doing it might cause memory issues if the future_map for the outer function is run thousands of times. Basically, I am wondering how the inner and outer daemons should be declared under Slurm when a nested parallelised map is to be run, especially considering that later on I will need to implement a version of the code where inner_f2 is also parallelised with future_map.
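For reference, the future framework describes nested parallelism as a list of plans, one per level, and treats any level that is not listed as sequential. A nested declaration for the setup above might look like the following. This is an untested sketch, not a confirmed answer: it assumes the daemons() call above has already launched the Slurm daemons, that future.mirai is installed on the compute nodes, and that two local workers per job is the right match for --cpus-per-task=2. Whether mirai_cluster and mirai_multisession combine cleanly in this way is itself an assumption.

```r
library(future)
library(future.mirai)

# Assumes mirai::daemons(..., remote = slurm_config) has already been called,
# so the outer Slurm daemons are running.
plan(list(
  # Outer level: futures are sent to the existing Slurm daemons.
  mirai_cluster,
  # Inner level: each Slurm job starts its own local mirai daemons.
  # workers = 2 is an assumption chosen to match --cpus-per-task=2.
  tweak(mirai_multisession, workers = 2)
))
```

With a plan like this, the outer future_map call would use the first level and the future_map inside outer_f would use the second, so the number of local processes per Slurm job stays bounded by the inner workers setting.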