Skip to content

Multiple inits for multi-path Pathfinder? #907

Open
@colinorourke

Description

@colinorourke

Alluded to in #874, I've noticed that if you try to set initial values to the Pathfinder algorithm, in particular the multi-path variety of this algorithm, cmdstanr currently doesn't allow the paths to begin from different initial positions. The requirement for the Pathfinder algorithm seems to be that the inits are a list containing just a single initial value. While this makes sense for the other variational algorithms, for Pathfinder this means that each L-BFGS starts from the same place, which doesn't seem ideal. Maybe this is due to some underlying restriction of cmdstan itself?

Here's an example (I've simplified some of the output to remove warnings from Stan about gradients & Pareto k values):

# Load necessary packages
library(cmdstanr)
#> This is cmdstanr version 0.7.1
#> - CmdStanR documentation and vignettes: mc-stan.org/cmdstanr
#> - CmdStan path: C:/Users/corour/Documents/.cmdstan/cmdstan-2.34.0
#> - CmdStan version: 2.34.0
library(brms)
#> Loading required package: Rcpp
#> Loading 'brms' package (version 2.20.4). Useful instructions
#> can be found by typing help('brms'). A more detailed introduction
#> to the package is available through vignette('brms_overview').

# Setup a model. This is just an example from brms.
prior1 = prior(normal(0, 10), class = b) +
  prior(cauchy(0, 2), class = sd)
stan_code = make_stancode(
  count ~ zAge + zBase * Trt + (1 | patient),
  data = epilepsy,
  family = poisson(),
  prior = prior1
)
stan_data = make_standata(
  count ~ zAge + zBase * Trt + (1 | patient),
  data = epilepsy,
  family = poisson(),
  prior = prior1
)
mod = cmdstan_model(write_stan_file(stan_code))

# Create a function that generates some initial values. It will generate
# different sets of initial values depending on the value of `chain_id`
init_fn = function(chain_id) {
  old_rng_state = get(".Random.seed", envir = globalenv())
  on.exit(assign(".Random.seed", value = old_rng_state, envir = globalenv()))
  new_seed = rngtools::RNGseq(n = chain_id + 1L, seed = 1313L)[[chain_id]]
  assign(".Random.seed", value = new_seed, envir = globalenv())
  b = runif(4L,-2, 2) #vector[Kc]
  Intercept = runif(1L,-2, 2)
  sd_1 = runif(1L, 0, 2) #vector<lower=0>[M_1]
  z_1 = array(runif(1 * 59,-2, 2), dim = c(1, 59)) #array[M_1] vector[N_1]
  list(
    b = b,
    Intercept = Intercept,
    sd_1 = sd_1,
    z_1 = z_1
  )
}

# Create initial values for cmdstanr in the "list-of-lists" format
init_list_4 = lapply(1:4, \(i) init_fn(i))
init_list_1 = list(init_fn(1))

# Try to run Pathfinder with num_paths=4 using the "list-of-lists" style
# of setting initial values. This just produces an error.
mod_w_init_list = mod$pathfinder(
  data = stan_data,
  seed = 123,
  init = init_list_4,
  num_paths = 4,
  refresh = 1000,
  history_size = 10
)
#> Error: 'init' has the wrong length. See documentation of 'init' argument.

# Run Pathfinder with num_paths=4 but using just a list with a single
# initialization value. This runs but re-uses the same init for each path
mod_w_init_list = mod$pathfinder(
  data = stan_data,
  seed = 123,
  init = init_list_1,
  num_paths = 4,
  refresh = 1000,
  history_size = 10
)
#> Path [1] :Initial log joint density = -16938.506748 
#> Path [1] : Iter      log prob        ||dx||      ||grad||     alpha      alpha0      # evals       ELBO    Best ELBO        Notes  
#>             194      -6.539e+02      7.973e-05   5.342e-02    1.000e+00  1.000e+00      4851 -1.110e+03 -1.110e+03                   
#> Path [1] :Best Iter: [76] ELBO (-785.130279) evaluations: (4851) 
#> Path [2] :Initial log joint density = -16938.506748 
#> Path [2] : Iter      log prob        ||dx||      ||grad||     alpha      alpha0      # evals       ELBO    Best ELBO        Notes  
#>             194      -6.539e+02      7.973e-05   5.342e-02    1.000e+00  1.000e+00      4851 -1.135e+03 -1.135e+03                   
#> Path [2] :Best Iter: [77] ELBO (-789.101802) evaluations: (4851) 
#> Path [3] :Initial log joint density = -16938.506748 
#> Path [3] : Iter      log prob        ||dx||      ||grad||     alpha      alpha0      # evals       ELBO    Best ELBO        Notes  
#>             194      -6.539e+02      7.973e-05   5.342e-02    1.000e+00  1.000e+00      4851 -1.245e+03 -1.245e+03                   
#> Path [3] :Best Iter: [76] ELBO (-787.813011) evaluations: (4851) 
#> Path [4] :Initial log joint density = -16938.506748 
#> Path [4] : Iter      log prob        ||dx||      ||grad||     alpha      alpha0      # evals       ELBO    Best ELBO        Notes  
#>             194      -6.539e+02      7.973e-05   5.342e-02    1.000e+00  1.000e+00      4851 -1.116e+03 -1.116e+03                   
#> Path [4] :Best Iter: [76] ELBO (-788.891651) evaluations: (4851) 
#> Total log probability function evaluations:23304 
#> Finished in  1.8 seconds.

# Run using the function-style of specifying inits. This seems to just repeat
# the single set of inits for for each of the 4 paths.
mod_w_inits = mod$pathfinder(
  data = stan_data,
  seed = 123,
  init = init_fn,
  num_paths = 4,
  refresh = 1000,
  history_size = 10
)
#> Path [1] :Initial log joint density = -16938.506748
#> Path [1] : Iter      log prob        ||dx||      ||grad||     alpha      alpha0      # evals       ELBO    Best ELBO        Notes  
#>             194      -6.539e+02      7.973e-05   5.342e-02    1.000e+00  1.000e+00      4851 -1.110e+03 -1.110e+03                   
#> Path [1] :Best Iter: [76] ELBO (-785.130279) evaluations: (4851) 
#> Path [2] :Initial log joint density = -16938.506748
#> Path [2] : Iter      log prob        ||dx||      ||grad||     alpha      alpha0      # evals       ELBO    Best ELBO        Notes  
#>             194      -6.539e+02      7.973e-05   5.342e-02    1.000e+00  1.000e+00      4851 -1.135e+03 -1.135e+03                   
#> Path [2] :Best Iter: [77] ELBO (-789.101802) evaluations: (4851) 
#> Path [3] :Initial log joint density = -16938.506748 
#> Path [3] : Iter      log prob        ||dx||      ||grad||     alpha      alpha0      # evals       ELBO    Best ELBO        Notes  
#>             194      -6.539e+02      7.973e-05   5.342e-02    1.000e+00  1.000e+00      4851 -1.245e+03 -1.245e+03                   
#> Path [3] :Best Iter: [76] ELBO (-787.813011) evaluations: (4851) 
#> Path [4] :Initial log joint density = -16938.506748 
#> Path [4] : Iter      log prob        ||dx||      ||grad||     alpha      alpha0      # evals       ELBO    Best ELBO        Notes  
#>             194      -6.539e+02      7.973e-05   5.342e-02    1.000e+00  1.000e+00      4851 -1.116e+03 -1.116e+03                   
#> Path [4] :Best Iter: [76] ELBO (-788.891651) evaluations: (4851) 
#> Total log probability function evaluations:23304 
#> Finished in  1.8 seconds.

# Run using the default random inits. Different inits are used for each
# Pathfinder path.
mod_wo_inits = mod$pathfinder(
  data = stan_data,
  seed = 123,
  num_paths = 4,
  refresh = 1000,
  history_size = 10
)
#> Path [1] :Initial log joint density = -5218.225137 
#> Path [1] : Iter      log prob        ||dx||      ||grad||     alpha      alpha0      # evals       ELBO    Best ELBO        Notes  
#>             399      -6.539e+02      8.264e-05   4.839e-02    9.763e-01  9.763e-01      9976 -1.649e+05 -1.649e+05                   
#> Path [1] :Best Iter: [54] ELBO (-735.976769) evaluations: (9976) 
#> Path [2] :Initial log joint density = -6622.413033 
#> Path [2] : Iter      log prob        ||dx||      ||grad||     alpha      alpha0      # evals       ELBO    Best ELBO        Notes  
#>             308      -6.539e+02      8.655e-05   7.312e-02    1.000e+00  1.000e+00      7701 -1.418e+10 -1.418e+10                   
#> Path [2] :Best Iter: [50] ELBO (-745.757922) evaluations: (7701) 
#> Path [3] :Initial log joint density = -12283.643776 
#> Path [3] : Iter      log prob        ||dx||      ||grad||     alpha      alpha0      # evals       ELBO    Best ELBO        Notes  
#>             310      -6.539e+02      1.200e-04   1.092e-01    8.474e-01  8.474e-01      7751 -6.558e+04 -6.558e+04                   
#> Path [3] :Best Iter: [58] ELBO (-743.025505) evaluations: (7751) 
#> Path [4] :Initial log joint density = -95375.955585 
#> Path [4] : Iter      log prob        ||dx||      ||grad||     alpha      alpha0      # evals       ELBO    Best ELBO        Notes  
#>             193      -6.539e+02      3.622e-04   8.454e-02    1.000e+00  1.000e+00      4826 -2.231e+03 -2.231e+03                   
#> Path [4] :Best Iter: [82] ELBO (-786.378632) evaluations: (4826) 
#> Total log probability function evaluations:34154 
#> Finished in  2.6 seconds.

Created on 2024-01-22 with reprex v2.0.2

R version info:

 setting  value
 version  R version 4.3.2 (2023-10-31 ucrt)
 os       Windows 10 x64 (build 19045)
 system   x86_64, mingw32
 ui       RStudio
 language (EN)
 collate  English_United States.utf8
 ctype    English_United States.utf8
 tz       America/Los_Angeles
 date     2024-01-22
 rstudio  2023.06.1+524 Mountain Hydrangea (desktop)
 pandoc   3.1.1 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions