Hi Turing team! I've recently developed a package called ParallelMCMC.jl. Its entire goal is to implement novel algorithms that parallelize MCMC across the sequence. While MCMC is traditionally viewed as a sequential algorithm, this recent paper showed that by using another algorithm called DEER, one can generate tens to hundreds of thousands of MCMC samples orders of magnitude faster than sequential versions of the same algorithms.
I have nearly finished implementing the Parallel MALA sampler from the paper and am already seeing promising results (e.g., 25 ms to draw 10k samples from a moderately sized Bayesian logistic regression problem). I expect to have this package registered soon, after some more tweaks/TLC (extra eyes would be appreciated if anyone is interested), but I wanted to gauge interest in integrating this into the ecosystem. As a note, this approach requires a GPU to be practically useful.