Updating EpiNow2 version to 1.6.1 by commit hash #324
Conversation
|
Thank you for your contribution @PatrickTCorbett 🚀! Your pkgdown site is ready for download 👉 here 👈!
|
Exciting! The changes here look very nice. In the long-superseded #181, I had been thinking of having #163 in before big version jumps. There will likely be changes in performance and/or model fit, and it would be good to have eyes on that change. I'm a little nervous about merging this into main (rather than something like a release branch) without benchmarking changes in performance. What do you think, Micah? Where does this fit into your timeline?
|
I understand the rationale that you want to make sure changes don't break the underlying model fits. I don't agree that we want to run chains as long as referenced in #163, though. I think CI needs to stay light, and a model fit that runs that many iterations is going to push multiple minutes. Which parameters are you most concerned with testing (the accept stat, the number of high R-hat values, something else)? A general test I can imagine which would run quickly would be along the lines of running a chain with ~250 samples and testing that the mean accept stat is above a generic threshold (something like 0.9). Let me know if you have other thoughts.
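As a hedged sketch of that lightweight check (the helper name and threshold are illustrative, not existing package code): with an rstan fit, the per-iteration acceptance statistics could be pulled via rstan::get_sampler_params(fit, inc_warmup = FALSE) (column "accept_stat__"), and the test then reduces to a mean-over-threshold comparison:

```r
# Illustrative helper (hypothetical name): does the mean acceptance
# statistic from a short chain clear a generic threshold?
mean_accept_ok <- function(accept_stats, threshold = 0.9) {
  mean(accept_stats) >= threshold
}

# Toy stand-in for ~250 post-warmup draws from a well-behaved chain;
# in a real test these would come from rstan::get_sampler_params().
set.seed(42)
toy_stats <- pmin(1, rnorm(250, mean = 0.95, sd = 0.03))
mean_accept_ok(toy_stats)  # TRUE
```

The point of the sketch is that the CI-facing test only needs the sampler's acceptance statistics, not a long chain, so it stays fast.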
|
I think we're talking past each other a little bit. My concern is around merging a big version bump to main for tomorrow's production run without an actual test of the impact on the model fits. The package's tests (and your proposed new test) ensure the model will run, but not the quality of the runs -- we could be getting nonsense and be none the wiser. Some off-the-top-of-my-head options for how to handle this change: One option would be to run on simulated data, like I describe in the issue. I like this idea because we can compare the Rt estimates to a known truth and see that the model continues to recover it. Another option would be to compare the model to past (pre-version-jump) model fits. We could use Nate's new backtesting infrastructure and see if the results are similar. @natemcintosh or @kgostic might have some other things come to mind?
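One hedged sketch of the simulated-data option (the helper and the stand-in draws are hypothetical, not code from this repo): fit on data with a known true Rt and check that the truth falls inside a central posterior credible interval.

```r
# Hypothetical recovery check: is the known truth inside the central
# credible interval of the posterior Rt draws?
recovers_truth <- function(rt_draws, true_rt, level = 0.95) {
  lo <- (1 - level) / 2
  ci <- quantile(rt_draws, probs = c(lo, 1 - lo))
  true_rt >= ci[[1]] && true_rt <= ci[[2]]
}

# Stand-in draws; a real test would use posterior samples from the fit.
set.seed(1)
draws <- rnorm(1000, mean = 1.98, sd = 0.1)
recovers_truth(draws, true_rt = 1.98)  # TRUE
```

A check like this would catch a version bump that silently shifts the posterior away from the known truth, which is exactly the failure mode a "does it run" test cannot see.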
|
Not trying to talk past each other, just trying to clarify the ask. I think we are saying something very similar, if anything. Can you clarify which known truth you are hoping to test: the Rt value, a diagnostic like mean_accept_stat, something else, or both diagnostics and actual values?
|
athowes
left a comment
Bumping to a new version seems good to me. Perhaps I am more naive or optimistic than Zach, but I'd be surprised if this would lead to worse performance / getting nonsense. EpiNow2 has tests and so on. We also have some tests here.
Regarding the changes to the tests made in this PR, I don't especially understand the rationale for them. (In particular test-fit_model.R and test-model_rt.R.)
If the rationale is to "add more" testing in some sense, I'd start with:
- fit multiple models in setup.R
- test that all the diagnostics indicate a "good" model fit for these multiple models
If you're looking to go further, one could simulate data and then check recovery. I think that might be quite a lot to do, and it could be fine to rely on EpiNow2 to do that. We did do those types of tests in epidist (see blog post, or code). However, epidist is a modelling package rather than a wrapper around a modelling package (and so it justifies doing more tests there).
|
Ready for review
natemcintosh
left a comment
A few questions and small suggestions, but mostly looks pretty good.
|
Ok, I think this looks good. I just ran another test (comparing the 2025-03-12 release to the map_data.csv for this branch), and the results were all close, within sampler error. Let's go ahead and just merge this after Thursday!
|
I also ran my comparison script for runs on June 30th (this branch and main) with identical results, so this should be good to merge.
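A minimal sketch of what such a run-to-run comparison can look like (the helper name, locations, and tolerance are illustrative; the actual comparison script is not shown in this thread):

```r
# Hypothetical comparison: flag whether the Rt estimates from two runs
# differ anywhere by more than a sampler-error margin.
runs_agree <- function(run_a, run_b, tol = 0.05) {
  stopifnot(length(run_a) == length(run_b))
  all(abs(run_a - run_b) <= tol)
}

# Toy example: branch vs. main estimates for a few locations.
branch_rt <- c(CA = 1.02, NY = 0.97, TX = 1.10)
main_rt   <- c(CA = 1.01, NY = 0.98, TX = 1.11)
runs_agree(branch_rt, main_rt)  # TRUE
```

Picking the tolerance to match observed chain-to-chain variability keeps the check from flagging ordinary sampler noise as a regression.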
|
Thanks for all your reviews. Glad to see we got this merged in!
* updating epinow2 version to 1.6.1 by commit hash
* switching dist_spec to nonparametric
* update news md
* updating tests
* updating tests
* roxygenizing
* updating news with acurate info
* adding mean_accept_stat test
* refactoring model fit tests
* init test for rt
* updating diagnostic test expected values
* updating tests
* moving rt tests to fit model tests
* adding regular sampler params ack in
* moving all fits into setup
* updating expected tests
* testing some different toy sampler options
* adding long warmup to test
* updating tests
* updating tests
* removing p divergent test
* updating tests
* renaming rt to r0 to be correct
* misunderstood r0 / effective repr. number and swapping to actually compare current rt in test
* Update tests/testthat/setup.R (Co-authored-by: Nate McIntosh <NMcIntosh@cdc.gov>)
* Update tests/testthat/test-fit_model.R (Co-authored-by: Nate McIntosh <NMcIntosh@cdc.gov>)
* updating tests to pull rt
* explicitly pulling out most recent data in gostic_data
* refactoring to remove magrittr pipe in favor of base R pipe
* fixing typo in test name

Co-authored-by: Nate McIntosh <NMcIntosh@cdc.gov>
Co-authored-by: Patrick Corbett <75850221+PatrickTCorbett@users.noreply.github.com>
Updating the build to pull in version 1.6.1 from EpiNow2 (an update from 1.4.1.9000). This moves us to a version roughly one year newer than we were previously running. Refactoring code to accommodate the changes in how EpiNow2 processes, such as
Versioning Details / Dates
- 1.6.1 --> Oct 2024
- 1.4.1.9000 --> Nov 2023
Update: Building on #163, we will also research a reasonable test on the gostic_toy_rt dataset, checking that the estimated Rt on March 1st (synthetic data) is close to the true value of ~1.98.
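The planned check could take roughly this shape (a sketch under assumptions: the helper name, the tolerance, and the way the March 1st estimate would be extracted from the fit are all illustrative):

```r
# Hypothetical test body: the estimated Rt for March 1st on the synthetic
# gostic_toy_rt series should land near the known truth of ~1.98.
rt_near_truth <- function(estimated_rt, true_rt = 1.98, tol = 0.25) {
  abs(estimated_rt - true_rt) <= tol
}

rt_near_truth(2.05)  # TRUE: within tolerance of the known truth
```

In a testthat file this would likely sit inside an expect_true() (or an expect_equal() with a tolerance argument), with the estimate pulled from the fit produced in setup.R.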