Description
I use Stan-defined functions extensively in my R workflow, so I want to cache the compiled C++ code to avoid recompiling every time R restarts. The functions should only be recompiled if they actually change.
Here is a possible implementation that does what I want. It works for me on Linux and macOS. The code uses an internal cmdstanr function, so it would be great to support this behaviour directly in cmdstanr.
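Since the implementation depends on an unexported function, a guard along these lines could flag early when a cmdstanr update has removed or renamed it. This is only a sketch: `has_internal` is a hypothetical helper name, and `get_standalone_hpp` is the internal function used in the code in this issue.

```r
# Hypothetical helper: TRUE if `name` is defined in the namespace of `pkg`
# (exported or not), FALSE if the package is missing or the object is gone.
has_internal <- function(pkg, name) {
  requireNamespace(pkg, quietly = TRUE) &&
    exists(name, envir = asNamespace(pkg), inherits = FALSE)
}

# Check for the internal function this workaround relies on.
if (!has_internal("cmdstanr", "get_standalone_hpp")) {
  message("cmdstanr internal 'get_standalone_hpp' not found; ",
          "the workaround in this issue will not work")
}
```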
This approach also lifts the current restriction on exposing functions from models with pre-compiled model binaries. This would be very helpful for brms models as well, since for these the use of pre-compiled models (enabled via the option cmdstanr_write_stan_file_dir) is very convenient, but expose_functions does not work at the moment in this case.
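The caching idea itself (recompile only when the function code changes) can be illustrated without cmdstanr as a lookup keyed on a content hash. The sketch below is an assumption-laden stand-in: `cache_key` and `needs_compile` are hypothetical helpers, the `.stamp` file stands in for real compilation output, and `tools::md5sum` replaces the `rlang::hash` call used in the actual implementation.

```r
# Hypothetical helper: derive a deterministic name from the Stan function code.
cache_key <- function(...) {
  code <- paste(c("functions {", ..., "}"), collapse = "\n")
  tmp <- tempfile(fileext = ".stan")
  on.exit(unlink(tmp))
  writeLines(code, tmp)
  paste0("model-functions-", unname(tools::md5sum(tmp)))
}

# Hypothetical helper: TRUE when nothing is cached for this key yet.
needs_compile <- function(key, cache_dir) {
  !file.exists(file.path(cache_dir, paste0(key, ".stamp")))
}

# Fresh demo directory (a real cache would live outside tempdir()).
cache_dir <- tempfile("demo-cache")
dir.create(cache_dir)

key <- cache_key("real heavy_work(real m) { return m * m; }")
first <- needs_compile(key, cache_dir)    # TRUE: nothing cached yet

# Stand-in for the compilation step: leave a marker file in the cache.
invisible(file.create(file.path(cache_dir, paste0(key, ".stamp"))))
second <- needs_compile(key, cache_dir)   # FALSE: same code, cache hit

# Changing the function body changes the key, so a recompile is needed.
changed <- needs_compile(
  cache_key("real heavy_work(real m) { return 2 * m; }"),
  cache_dir
)                                          # TRUE
```

Unchanged code maps to the same key across R sessions, which is why the model name has to depend only on the function text.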
suppressPackageStartupMessages(library(cmdstanr))
suppressPackageStartupMessages(library(here))
## utility function works with cmdstanr 0.6.0 which exports via
## cmdstanr (and Stan 2.32 at least) the functions from Stan to R
cmdstan_expose_functions <- function(...) {
  pseudo_model_code <- paste(c("functions {", ..., "}"), collapse="\n")
  functions_hash <- rlang::hash(pseudo_model_code)
  model_name <- paste0("model-functions-", functions_hash)
  ## note: cmdstanr somehow only compiles standalone functions
  ## whenever one is compiling the model (and not allowing to export
  ## the functions if one is not compiling it). This is why
  ## force_compile=TRUE is a safe option
  ##pseudo_model <- cmdstanr::cmdstan_model(cmdstanr::write_stan_file(pseudo_model_code), compile_standalone=TRUE, force_compile=TRUE, stanc_options=list(name=paste0("model-functions-", functions_hash)))
  ##pseudo_model$functions
  ## but things seem to work ok if we abuse the internals a bit... tested with cmdstanr 0.6.1
  ## note that we have to set the model name manually to a
  ## deterministic string (depending only on the Stan functions being
  ## compiled)
  stan_file <- cmdstanr::write_stan_file(pseudo_model_code)
  pseudo_model <- cmdstanr::cmdstan_model(stan_file, stanc_options=list(name=model_name))
  pseudo_model$functions$existing_exe <- FALSE
  pseudo_model$functions$external <- FALSE
  stancflags_standalone <- c("--standalone-functions", paste0("--name=", model_name))
  pseudo_model$functions$hpp_code <- cmdstanr:::get_standalone_hpp(stan_file, stancflags_standalone)
  pseudo_model$expose_functions(FALSE, FALSE) ## will return the functions in an environment
  pseudo_model$functions
}
Example Stan function
stan_function <- "
real heavy_work(real m) {
return m * m;
}
"
We want to cache the binaries needed to create the Stan functions in the R session.
For this to work we need to
- get cmdstanr to always write Stan files to the same persistent directory
- set up Rcpp to also use a persistent caching directory
By default, cmdstanr and Rcpp use caching directories in the
temporary R directory, which gets wiped out every time we restart R.
options(cmdstanr_write_stan_file_dir=here("brms-cache"))
Create the cache directory if not yet available
dir.create(here("brms-cache"), FALSE)
Cache the exposed Stan functions
options(rcpp.cache.dir=here("rcpp-cache"))
dir.create(here("rcpp-cache"), FALSE)
Compile the functions the first time
system.time(funs <- cmdstan_expose_functions(stan_function))
#> ld: warning: duplicate -rpath '/Users/weberse2/.cmdstanr/cmdstan-2.32.2/stan/lib/stan_math/lib/tbb' ignored
#> Compiling standalone functions...
#> user system elapsed
#> 6.459 1.134 8.958
Do it again... now this is a lot faster as no more compilation takes place
system.time(funs_cached <- cmdstan_expose_functions(stan_function))
#> Compiling standalone functions...
#> user system elapsed
#> 0.295 0.308 0.930
ls(funs)
#> [1] "compiled" "existing_exe" "external" "fun_names" "heavy_work"
#> [6] "hpp_code"
ls(funs_cached)
#> [1] "compiled" "existing_exe" "external" "fun_names" "heavy_work"
#> [6] "hpp_code"
funs$heavy_work(10)
#> [1] 100
funs_cached$heavy_work(10)
#> [1] 100
Session Info
sessionInfo()
#> R version 4.1.0 (2021-05-18)
#> Platform: aarch64-apple-darwin20 (64-bit)
#> Running under: macOS 13.6.1
#>
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRblas.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRlapack.dylib
#>
#> locale:
#> [1] C
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] here_1.0.1 cmdstanr_0.6.1
#>
#> loaded via a namespace (and not attached):
#> [1] Rcpp_1.0.11 pillar_1.8.1 compiler_4.1.0
#> [4] highr_0.9 tools_4.1.0 digest_0.6.30
#> [7] lattice_0.20-45 evaluate_0.17 lifecycle_1.0.3
#> [10] tibble_3.1.8 checkmate_2.1.0 gtable_0.3.1
#> [13] pkgconfig_2.0.3 rlang_1.1.1 Matrix_1.3-3
#> [16] reprex_2.0.2 DBI_1.1.3 cli_3.4.1
#> [19] yaml_2.3.6 xfun_0.37 fastmap_1.1.0
#> [22] dplyr_1.0.10 withr_2.5.0 styler_1.5.1
#> [25] stringr_1.4.1 knitr_1.40 generics_0.1.3
#> [28] fs_1.5.2 vctrs_0.5.0 rprojroot_2.0.3
#> [31] tidyselect_1.2.0 grid_4.1.0 glue_1.6.2
#> [34] R6_2.5.1 processx_3.7.0 fansi_1.0.3
#> [37] distributional_0.3.2 rmarkdown_2.20 tensorA_0.36.2
#> [40] purrr_0.3.5 farver_2.1.1 ggplot2_3.4.2
#> [43] posterior_1.4.1 magrittr_2.0.3 ps_1.7.1
#> [46] backports_1.4.1 scales_1.2.1 htmltools_0.5.3
#> [49] assertthat_0.2.1 abind_1.4-5 colorspace_2.0-3
#> [52] utf8_1.2.2 stringi_1.7.8 munsell_0.5.0
#> [55] RcppEigen_0.3.3.9.3
Created on 2023-11-02 with [reprex v2.0.2](https://reprex.tidyve