
fixed expected model output files to match current code, and added README.md to explain details #12

Merged
trobacker merged 3 commits into main from mctr/update-tests on Oct 6, 2025

Conversation

@matthewcornell (Member) commented Oct 1, 2025

From @trobacker :

For future reference:

We unplugged the underlying models (lgb.LGBMRegressor and sarix.SARIX, which pull in numpyro, jax, etc.), which we determined were nondeterministic with respect to OS, and replaced them with fixed results. We felt this solution (mocking out the underlying models) has value because it still tests the code in idmodels, on the assumption that the underlying models are tested within their own packages.

We essentially retested/rediscovered that reproducibility of model_output files was not working across different hardware. The problem didn't seem to be about setting the RNG seed at a high level in our code; rather, somewhere in the modeling pieces themselves (i.e., within jax or lightgbm), results seemed to differ across hardware.
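To make concrete what "setting rng at a high level" means here, a minimal stdlib-only sketch (the helper name and seed value are illustrative, not from the repo). Per the discussion, seeding like this makes runs repeatable on a single machine but did not make jax/lightgbm output match across hardware:

```python
import random

# Sketch of high-level seeding (helper name is hypothetical). This alone
# did not make jax/lightgbm results match across hardware, even though it
# makes runs repeatable on one machine.
def seed_everything(seed: int = 42) -> random.Random:
    random.seed(seed)  # Python's global RNG
    # Equivalents for the real stack (commented out to stay stdlib-only):
    # np.random.seed(seed)                  # NumPy global state
    # key = jax.random.PRNGKey(seed)        # jax uses explicit PRNG keys
    # lgb.LGBMRegressor(random_state=seed)  # lightgbm estimator-level seed
    return random.Random(seed)  # a reproducible local generator

# On one machine, identical seeds give identical draws:
assert seed_everything(7).random() == random.Random(7).random()
```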

Given that these models have performed well in the past, I don't think this is a reason to stop using them, so we used the patch approach to get the tests passing for now, keeping in mind that there is this odd irreproducibility across hardware.

We might want to consider keeping track of what hardware we run models on (Docker?) because of this issue.
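The "patch approach" described above might look roughly like the sketch below. The factory name, the dotted patch target, and the fit/predict call shape are assumptions for illustration, not the actual idmodels test code:

```python
from unittest import mock

# Hypothetical stand-in for a nondeterministic regressor: predict() always
# returns the same fixed values, so expected model_output files match on
# any OS/hardware while the surrounding idmodels code still executes.
def make_mocked_regressor(fixed_preds):
    regressor = mock.MagicMock()
    regressor.fit.return_value = regressor  # allow fit(...).predict(...) chaining
    regressor.predict.return_value = list(fixed_preds)
    return regressor

# In a test, mock.patch would swap this in where idmodels imports the real
# class (the dotted path below is illustrative):
#   with mock.patch("idmodels.gbq.lgb.LGBMRegressor",
#                   return_value=make_mocked_regressor([1.0, 2.0, 3.0])):
#       ...run the model and compare against the expected model_output files...
```

The key design point from the discussion: the mock fixes only the model's outputs, so the test still exercises the idmodels pipeline code around it, while correctness of the real estimators is left to their own packages' test suites.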


@matthewcornell (Member, Author) commented:

@trobacker and I are debugging the failing tests.

…eterministic results across OSs, in spite of seeds that are consistent across OSs, etc.
@trobacker (Contributor) left a comment:


Yes!


@trobacker trobacker merged commit 35b7752 into main Oct 6, 2025
1 check passed
