Enable caching of compiled Stan functions #870

@wds15

I use Stan-defined functions extensively in my R workflow, so I want to cache the compiled C++ code to avoid recompilation every time R restarts. The functions should be recompiled only if they actually change.

Here is a possible implementation that does what I want; it works for me on Linux and macOS. The code uses an internal cmdstanr function, so it would be great to support this behaviour directly in cmdstanr.

This approach also lifts the current restriction that functions cannot be exposed from models with pre-compiled model binaries. This would be very helpful for brms models as well: pre-compiled models (enabled via the cmdstanr_write_stan_file_dir option) are very convenient there, but expose_functions does not currently work in that case.

suppressPackageStartupMessages(library(cmdstanr))
suppressPackageStartupMessages(library(here))

## utility function, works with cmdstanr 0.6.0 (and Stan 2.32 at least),
## which exports the functions from Stan to R
cmdstan_expose_functions <- function(...) {
    pseudo_model_code <- paste(c("functions {", ..., "}"), collapse="\n")
    functions_hash <- rlang::hash(pseudo_model_code)
    model_name <- paste0("model-functions-", functions_hash)
    ## note: cmdstanr somehow only compiles standalone functions
    ## while it is compiling the model (and does not allow exporting
    ## the functions if the model is not being compiled). This is why
    ## force_compile=TRUE is a safe option
    ##pseudo_model <- cmdstanr::cmdstan_model(cmdstanr::write_stan_file(pseudo_model_code), compile_standalone=TRUE, force_compile=TRUE, stanc_options=list(name=paste0("model-functions-", functions_hash)))
    ##pseudo_model$functions
    ## but things seem to work OK if we abuse the internals a bit... tested with cmdstanr 0.6.1
    ## note that we have to set the model name manually to a
    ## deterministic string (depending only on the Stan functions being
    ## compiled)
    stan_file <- cmdstanr::write_stan_file(pseudo_model_code)
    pseudo_model <- cmdstanr::cmdstan_model(stan_file, stanc_options=list(name=model_name))
    pseudo_model$functions$existing_exe <- FALSE
    pseudo_model$functions$external <- FALSE
    stancflags_standalone <- c("--standalone-functions", paste0("--name=", model_name))
    pseudo_model$functions$hpp_code <- cmdstanr:::get_standalone_hpp(stan_file, stancflags_standalone)
    pseudo_model$expose_functions(FALSE, FALSE) ## will return the functions in an environment
    pseudo_model$functions
}
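
The caching above depends on the model name being a deterministic function of the Stan source. The following is a minimal sketch of just that step, assuming rlang is installed; `functions_model_name` is a hypothetical helper introduced here for illustration and is not part of cmdstanr.

```r
## Sketch: derive a deterministic model name from Stan function source.
## `functions_model_name` is a hypothetical helper, not a cmdstanr function.
functions_model_name <- function(...) {
    ## wrap the function bodies in a functions block, as in the code above
    pseudo_model_code <- paste(c("functions {", ..., "}"), collapse = "\n")
    paste0("model-functions-", rlang::hash(pseudo_model_code))
}

source_v1 <- "real heavy_work(real m) { return m * m; }"
source_v2 <- "real heavy_work(real m) { return m * m * m; }"

name_a <- functions_model_name(source_v1)
name_b <- functions_model_name(source_v1)  ## same source again
name_c <- functions_model_name(source_v2)  ## edited source

identical(name_a, name_b)  ## compare names for identical source
identical(name_a, name_c)  ## compare names after an edit
```

Because the name depends only on the source text, any edit to a function yields a new name and therefore a fresh compilation, while unchanged source maps to the existing cache entry. As far as I know, rlang documents the hash as reproducible only for the same R version, so an R upgrade may trigger a one-time recompile.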

example Stan function

stan_function <- "
real heavy_work(real m) {
   return m * m;
}
"

We want to cache the binaries needed to create the Stan functions in the R session.
For this to work we need to

  • get cmdstanr to always write Stan files to the same persistent directory
  • set up Rcpp to also use a persistent caching directory

By default, cmdstanr and Rcpp use caching directories inside the
temporary R directory, which gets wiped every time we restart R

options(cmdstanr_write_stan_file_dir=here("brms-cache"))

create cache directory if not yet available

dir.create(here("brms-cache"), FALSE)

cache exposed Stan functions

options(rcpp.cache.dir=here("rcpp-cache"))
dir.create(here("rcpp-cache"), FALSE)
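
The two option settings above could be bundled into a single helper, sketched below under some assumptions: `setup_function_cache` and the sub-directory names are hypothetical, and only the option names `cmdstanr_write_stan_file_dir` and `rcpp.cache.dir` come from the code above.

```r
## Sketch: point cmdstanr and Rcpp at persistent cache directories.
## `setup_function_cache` and the sub-directory names are hypothetical.
setup_function_cache <- function(root) {
    stan_dir <- file.path(root, "stan-cache")
    rcpp_dir <- file.path(root, "rcpp-cache")
    for (d in c(stan_dir, rcpp_dir)) {
        ## create the directory (and parents) if it does not exist yet
        dir.create(d, recursive = TRUE, showWarnings = FALSE)
    }
    options(cmdstanr_write_stan_file_dir = stan_dir, rcpp.cache.dir = rcpp_dir)
    invisible(list(stan = stan_dir, rcpp = rcpp_dir))
}

## A temporary root is used here only so the sketch runs anywhere; a real
## cache needs a directory that survives R restarts.
cache_dirs <- setup_function_cache(file.path(tempdir(), "function-cache"))
getOption("cmdstanr_write_stan_file_dir")
getOption("rcpp.cache.dir")
```

In a project one would pass a persistent location instead, for example one built with here() as in the code above, and call the helper once at the top of the script before any Stan functions are exposed.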

compile the functions the first time

system.time(funs <- cmdstan_expose_functions(stan_function))
#> ld: warning: duplicate -rpath '/Users/weberse2/.cmdstanr/cmdstan-2.32.2/stan/lib/stan_math/lib/tbb' ignored
#> Compiling standalone functions...
#>    user  system elapsed 
#>   6.459   1.134   8.958

do it again... now this is a lot faster as no more compilation takes place

system.time(funs_cached <- cmdstan_expose_functions(stan_function))
#> Compiling standalone functions...
#>    user  system elapsed 
#>   0.295   0.308   0.930

ls(funs)
#> [1] "compiled"     "existing_exe" "external"     "fun_names"    "heavy_work"  
#> [6] "hpp_code"
ls(funs_cached)
#> [1] "compiled"     "existing_exe" "external"     "fun_names"    "heavy_work"  
#> [6] "hpp_code"


funs$heavy_work(10)
#> [1] 100
funs_cached$heavy_work(10)
#> [1] 100

Session Info

sessionInfo()
#> R version 4.1.0 (2021-05-18)
#> Platform: aarch64-apple-darwin20 (64-bit)
#> Running under: macOS 13.6.1
#> 
#> Matrix products: default
#> BLAS:   /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRblas.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRlapack.dylib
#> 
#> locale:
#> [1] C
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] here_1.0.1     cmdstanr_0.6.1
#> 
#> loaded via a namespace (and not attached):
#>  [1] Rcpp_1.0.11          pillar_1.8.1         compiler_4.1.0      
#>  [4] highr_0.9            tools_4.1.0          digest_0.6.30       
#>  [7] lattice_0.20-45      evaluate_0.17        lifecycle_1.0.3     
#> [10] tibble_3.1.8         checkmate_2.1.0      gtable_0.3.1        
#> [13] pkgconfig_2.0.3      rlang_1.1.1          Matrix_1.3-3        
#> [16] reprex_2.0.2         DBI_1.1.3            cli_3.4.1           
#> [19] yaml_2.3.6           xfun_0.37            fastmap_1.1.0       
#> [22] dplyr_1.0.10         withr_2.5.0          styler_1.5.1        
#> [25] stringr_1.4.1        knitr_1.40           generics_0.1.3      
#> [28] fs_1.5.2             vctrs_0.5.0          rprojroot_2.0.3     
#> [31] tidyselect_1.2.0     grid_4.1.0           glue_1.6.2          
#> [34] R6_2.5.1             processx_3.7.0       fansi_1.0.3         
#> [37] distributional_0.3.2 rmarkdown_2.20       tensorA_0.36.2      
#> [40] purrr_0.3.5          farver_2.1.1         ggplot2_3.4.2       
#> [43] posterior_1.4.1      magrittr_2.0.3       ps_1.7.1            
#> [46] backports_1.4.1      scales_1.2.1         htmltools_0.5.3     
#> [49] assertthat_0.2.1     abind_1.4-5          colorspace_2.0-3    
#> [52] utf8_1.2.2           stringi_1.7.8        munsell_0.5.0       
#> [55] RcppEigen_0.3.3.9.3

Created on 2023-11-02 with [reprex v2.0.2](https://reprex.tidyve
