
fixed expected model output files to match current code, and added README.md to explain details #12

Merged
trobacker merged 3 commits into main from mctr/update-tests on Oct 6, 2025

Conversation

@matthewcornell (Member) commented Oct 1, 2025

From @trobacker :

For future reference:

We unplugged the underlying models (lgb.LGBMRegressor and sarix.SARIX, which pull in numpyro, jax, etc.), which we determined were nondeterministic with respect to OS, and replaced them with fixed results. We felt this solution (mocking out the underlying models) has value because it still tests the code in idmodels, on the assumption that the underlying models are tested within their own packages.

We essentially retested/rediscovered that reproducibility of model_output files was not working across different hardware. The problem didn't seem to be about setting the RNG seed at a high level in our code; rather, somewhere in the modeling pieces themselves (i.e., within jax or lightgbm), results seemed to differ across hardware.
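To make concrete what "setting rng at a high level" means here, a minimal stdlib-only sketch (the helper name and seed value are illustrative, not from the repo). Per the discussion, seeding like this makes runs repeatable on a single machine but did not make jax/lightgbm output match across hardware:

```python
import random

# Sketch of high-level seeding (helper name is hypothetical). This alone
# did not make jax/lightgbm results match across hardware, even though it
# makes runs repeatable on one machine.
def seed_everything(seed: int = 42) -> random.Random:
    random.seed(seed)  # Python's global RNG
    # Equivalents for the real stack (commented out to stay stdlib-only):
    # np.random.seed(seed)                  # NumPy global state
    # key = jax.random.PRNGKey(seed)        # jax uses explicit PRNG keys
    # lgb.LGBMRegressor(random_state=seed)  # lightgbm estimator-level seed
    return random.Random(seed)  # a reproducible local generator

# On one machine, identical seeds give identical draws:
assert seed_everything(7).random() == random.Random(7).random()
```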

Given that these models have performed well in the past, I don't think this is a reason to stop using them, so we used the patch approach to get the tests passing for now, keeping in mind that there is this odd irreproducibility across hardware.

We might want to consider keeping track of what hardware we run models on (Docker?) because of this issue.
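The "patch approach" described above might look roughly like the sketch below. The factory name, the dotted patch target, and the fit/predict call shape are assumptions for illustration, not the actual idmodels test code:

```python
from unittest import mock

# Hypothetical stand-in for a nondeterministic regressor: predict() always
# returns the same fixed values, so expected model_output files match on
# any OS/hardware while the surrounding idmodels code still executes.
def make_mocked_regressor(fixed_preds):
    regressor = mock.MagicMock()
    regressor.fit.return_value = regressor  # allow fit(...).predict(...) chaining
    regressor.predict.return_value = list(fixed_preds)
    return regressor

# In a test, mock.patch would swap this in where idmodels imports the real
# class (the dotted path below is illustrative):
#   with mock.patch("idmodels.gbq.lgb.LGBMRegressor",
#                   return_value=make_mocked_regressor([1.0, 2.0, 3.0])):
#       ...run the model and compare against the expected model_output files...
```

The key design point from the discussion: the mock fixes only the model's outputs, so the test still exercises the idmodels pipeline code around it, while correctness of the real estimators is left to their own packages' test suites.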


@matthewcornell (Member, Author) commented:

@trobacker and I are debugging the failing tests.

…eterministic results across OSs, in spite of seeds that are consistent across OSs, etc.
@trobacker (Contributor) left a comment:


Yes!


@trobacker trobacker merged commit 35b7752 into main Oct 6, 2025
1 check passed
