For now, I'd take the current ODE Bayesian Parameter estimation benchmarks offline and stop quoting them as up-to-date evidence that Julia is currently 3-5 times faster than Stan for ODE problems. If I knew Julia better I'd fix them myself, but I don't, so here are the issues with them that I currently know of:
- The Lorenz benchmark fails to sample from the correct posterior for either library
- The Lorenz benchmark uses 500+1000 warmup+samples for Turing, but 1000+1000 for Stan
- The Lotka-Volterra benchmark samples from two different posteriors (parameters = `sigma1.1, sigma1.2, theta_1, theta_2, theta_3, theta_4` for Stan and parameters = `theta[1], theta[2], theta[3], theta[4], σ[1]` for Turing)
- Neither benchmark makes it clear what Stan's actual sampling time is, because the time reported through `@btime` includes model compilation (see the first sketch below for timing the two separately)
- DynamicHMC only reports runtime and nothing else that is relevant
- AFAIK, in the benchmarks Stan always uses `adapt_delta=0.85` (its default) while Turing uses `adapt_delta=0.65`, which potentially lowers runtime but also ESS (see the second sketch below for matching the settings)
- Runtime isn't a good metric anyway; I would rather display per-parameter ESS/s histograms/boxplots over several runs with fewer (post-warmup) samples (see the third sketch below)
- The interesting problems with ODE models tend to only come up in non-toy problems, but those also tend to take much longer to run, making it difficult (I assume) to include them in automatic benchmarks
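
To make the compilation point concrete, here is a minimal sketch of timing Stan's compilation and sampling separately. It assumes StanSample.jl and uses a toy Stan program; the benchmarks would plug in their own Lotka-Volterra / Lorenz programs and data.

```julia
using StanSample

# Toy Stan program, just for illustration.
stan_code = """
parameters { real theta; }
model { theta ~ normal(0, 1); }
"""

# Constructing the SampleModel compiles the Stan program -- this is the
# part that currently gets folded into the reported "sampling" time.
compile_time = @elapsed sm = SampleModel("toy", stan_code)

# Only this call should count towards Stan's sampling time.
sample_time = @elapsed rc = stan_sample(sm)

println("compile: $(round(compile_time; digits=2))s, sample: $(round(sample_time; digits=2))s")
```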
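
For the warmup/adapt_delta mismatch: in Turing both the number of adaptation steps and the target acceptance rate are arguments of the `NUTS` constructor, so matching Stan's settings should only take something like the following (toy model standing in for the benchmark's ODE model):

```julia
using Turing

# Toy model, only a stand-in for the benchmark's ODE model.
@model function demo(x)
    θ ~ Normal(0, 1)
    x ~ Normal(θ, 1)
end

# NUTS(n_adapts, target_accept): 1000 adaptation steps and
# adapt_delta = 0.85, matching the settings used for Stan.
chain = sample(demo(1.0), NUTS(1000, 0.85), 1000)
```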
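
And a sketch of what I mean by ESS/s instead of runtime, assuming MCMCChains' `ess_rhat` (exact names may differ between versions); repeating this over several runs would give the per-parameter histograms/boxplots:

```julia
using Turing, MCMCChains

# Same toy stand-in model as above.
@model function demo(x)
    θ ~ Normal(0, 1)
    x ~ Normal(θ, 1)
end

# Wall-clock sampling time for this run.
t = @elapsed chain = sample(demo(1.0), NUTS(1000, 0.85), 1000)

# Per-parameter effective sample size (and R-hat); dividing the ESS
# values by `t` gives per-parameter ESS/s.
diagnostics = ess_rhat(chain)
println(diagnostics)
```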