
Summary of issues with ODE Bayesian parameter estimation benchmarks #496

Open
@nsiccha

Description

For now, I'd take the current ODE Bayesian parameter estimation benchmarks offline and stop quoting them as up-to-date evidence that Julia is currently 3-5 times faster than Stan for ODE problems. If I knew Julia better, I'd fix them myself, but I don't, so here are the issues with them that I currently know of:

  • The Lorenz benchmark fails to sample from the correct posterior for either library
  • The Lorenz benchmark uses 500+1000 warmup+samples for Turing, but 1000+1000 for Stan
  • The Lotka-Volterra benchmark samples from two different posteriors (parameters = sigma1.1, sigma1.2, theta_1, theta_2, theta_3, theta_4 for Stan and parameters = theta[1], theta[2], theta[3], theta[4], σ[1] for Turing)
  • Neither benchmark makes it clear what Stan's actual sampling time is, because the time reported through @btime includes compilation (see the first sketch below this list)
  • DynamicHMC only reports runtime and nothing else that is relevant
  • AFAIK, in the benchmarks Stan always uses adapt_delta=0.85 (its default) while Turing uses adapt_delta=0.65, which potentially lowers Turing's runtime but also its ESS (see the second sketch below this list)
  • Runtime isn't a good metric anyway; I would probably display per-parameter ESS/s histograms/boxplots over several runs with fewer (post-warmup) samples (see the third sketch below this list)
  • The interesting problems with ODE models tend to come up only with non-toy problems, but those also tend to take much longer to run, making it difficult (I assume) to include them in automatic benchmarks
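
First sketch, on keeping compilation out of Stan's reported time: this assumes the benchmarks go through StanSample.jl (the SampleModel/stan_sample interface) and BenchmarkTools; the model file name and data tuple below are hypothetical placeholders.

```julia
# Sketch: compile the Stan model once, outside the timed region, so only sampling is measured.
# Assumes StanSample.jl; "lotka_volterra.stan" and `stan_data` are hypothetical placeholders.
using StanSample, BenchmarkTools

model_code = read("lotka_volterra.stan", String)
stan_data = (N = 100, ts = collect(0.1:0.1:10.0), y = rand(100, 2))

# SampleModel compiles the model here, before any timing starts.
sm = SampleModel("lotka_volterra", model_code)

# @belapsed times only the repeated sampling calls, so compilation is excluded.
sampling_time = @belapsed stan_sample($sm; data = $stan_data)
```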
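
Second sketch, on using the same adapt_delta for both samplers: the Turing NUTS(n_adapts, δ) constructor is standard; the Stan-side `delta` keyword is an assumption about StanSample.jl and may differ by version, so that call is left commented out.

```julia
# Sketch: pin both samplers to the same target acceptance rate.
using Turing

adapt_delta = 0.85   # whichever value is chosen, use it for both libraries

@model function demo()   # stand-in for the benchmark's actual ODE model
    x ~ Normal(0, 1)
end

# Turing: NUTS(n_adapts, target_acceptance). How warmup interacts with the
# requested number of draws depends on the Turing version.
chain = sample(demo(), NUTS(1_000, adapt_delta), 1_000)

# Stan via StanSample.jl -- the `delta` keyword is an assumption, check the docs:
# rc = stan_sample(sm; data = stan_data, num_warmups = 1_000, num_samples = 1_000, delta = adapt_delta)
```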
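
Third sketch, on reporting per-parameter ESS/s instead of raw runtime: ess_rhat comes from MCMCChains; how the :ess column of the returned table is accessed below is an assumption about the current API.

```julia
# Sketch: per-parameter ESS per second instead of raw runtime.
using Turing, MCMCChains

@model function demo()   # stand-in for the benchmark's actual ODE model
    x ~ Normal(0, 1)
end

# Time only the sampling call; model and sampler construction stay outside.
t0 = time()
chain = sample(demo(), NUTS(1_000, 0.85), 1_000)
sampling_time = time() - t0

ess_table = ess_rhat(chain)   # per-parameter ESS and R-hat
# Accessing the :ess column this way is an assumption about the returned table type.
ess_per_sec = ess_table[:, :ess] ./ sampling_time
```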
