RAM keeps increasing until crash when running many brms/Stan models in parallel with futures #643

Yunrui-Liu · 2022-09-07T14:58:05Z

@HenrikBengtsson
In my project, I plan to run thousands of Bayesian multilevel models using the brms package, which is time-consuming, so I am using the furrr package to run these models in parallel. Everything works, but RAM usage keeps increasing until R crashes. As shown in the attached figure, memory usage ramped up overnight from Friday to Saturday, maxed out at ~250 GB (an insane amount of RAM), and then R crashed.

I already call rm(fit) and gc() after saving each model's results. I also tried plan(callr, workers = parallel::detectCores()-1) instead of plan(multisession, workers = parallel::detectCores()-1), but it did not make a big difference.

I am wondering whether you have any experience running brms models in parallel efficiently and quickly, and how to solve the memory problem.

Here are the details of the code:
rm(list = ls())

# Load packages
library(tidyverse)
library(dplyr)
library(rstan)
library(lme4)
library(brms)
library(tidybayes)
library(parallel)
library(furrr)

# Set working directory
setwd("…/data")
data_path <- "…/Analysis"

# Load data

# Create index
index <- crossing(
  Event = c(unique(nested.psw_small$Event), "none"), # Event_length = 15
  Trait = unique(bfi_wide$Trait),                    # Trait_length = 5
  RE = c("int", "intslope"),                         # RE_length = 2
  Match = c("matched", "unmatched")) %>%             # Match_length = 2
  filter(!(RE == "int" & Event == "none") &
         !(Match == "matched" & Event == "none") &
         !(RE == "int" & Match == "unmatched"))

# Create multilevel model function
growth_fun <- function(event, trait, re, match){

  lapply(1:10, function(x){

    rstan_options(auto_write = TRUE)

    # set prior
    Prior <- c(set_prior("cauchy(0,1)", class = "sd"),
               set_prior("cauchy(0,1)", class = "sigma"))

    # get formula
    if (re == "intslope" && event != "none") {
      f <- value ~ new.wave*le.group + (new.wave|PROC_SID)
    } else if (re == "int" && match == "matched") {
      f <- value ~ new.wave*le.group + (1|PROC_SID)
    } else {
      f <- value ~ new.wave + (new.wave|PROC_SID)
    }

    # set specifications for models with convergence issues
    if ((trait == "A" && event %in% c("ChldMvOut", "MoveIn", "ChldBrth")) ||
        (trait == "E" && event %in% c("ParDied")) ||
        (trait == "C" && event %in% c("Retire")) ||
        (trait == "O" && event %in% c("Unemploy")) ||
        event %in% c("PartDied", "FrstJob", "Divorce", "LeftPar")) {
      Iter <- 8000; Warmup <- 4000; treedepth <- 20
    } else {
      Iter <- 2000; Warmup <- 1000; treedepth <- 10
    }

    # run models
    fit <- brm(formula = f, data = df, prior = Prior, iter = Iter, warmup = Warmup,
               control = list(adapt_delta = 0.99, max_treedepth = treedepth))

    # save the fitted model, then drop it from memory
    file <- sprintf("%s/data/Results_data/Test/%s_%s_%s_chain%s_%s.RData",
                    data_path, trait, event, match, x, re)
    save(fit, file = file)
    rm(fit)
    gc()
  })
}

# Run multilevel model
plan(multisession, workers = parallel::detectCores() - 1, gc = TRUE)

index %>%
  mutate(mod = future_pmap(list(Event, Trait, RE, Match), growth_fun))
