Conversation
…ADME.md to explain details
@trobacker and I are debugging the failing tests.
…eterministic results across OSs, in spite of seeds that are consistent across OSs, etc.
From @trobacker :
For future reference:
We replaced the underlying models (`lgb.LGBMRegressor` and `sarix.SARIX` -> `numpyro`, `jax`, etc.), which we decided were nondeterministic with respect to OS, with fixed results. We feel that this solution (mocking out the underlying models) has value because it still tests the code in `idmodels`, on the assumption that the underlying models are tested within their own packages.
We essentially retested/rediscovered that model_output files are not reproducible across different hardware. The cause did not seem to be how the rng is set at a high level in the code; rather, something in the modeling source code (i.e. something within `jax` or `lightgbm`) seemed to behave differently across hardware.
Given that these models have performed well in the past, I don't think this is a reason to not use them, so we used the patch approach to get the tests to pass for now, keeping in mind that there is this odd irreproducibility across hardware.
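For anyone reading later, here is a minimal sketch of the general patch approach described above, not the actual test code from this PR. `UnderlyingModel` and `make_forecast` are hypothetical stand-ins (for a third-party model and for wrapper code like ours); only `unittest.mock` is a real API here.

```python
import random
from unittest import mock


class UnderlyingModel:
    """Hypothetical stand-in for a third-party model with platform-dependent output."""

    def fit(self, X, y):
        return self

    def predict(self, X):
        # Unseeded randomness simulates results we cannot pin down across machines.
        return [random.random() for _ in X]


def make_forecast(X_train, y_train, X_test):
    """Hypothetical wrapper code under test: fit, predict, then post-process."""
    model = UnderlyingModel()
    model.fit(X_train, y_train)
    preds = model.predict(X_test)
    # Post-processing we own: clip negatives to zero and round to 2 decimals.
    return [round(max(p, 0.0), 2) for p in preds]


def test_make_forecast_with_mocked_model():
    fixed = [1.234, -0.5, 2.0]
    # Replace only the nondeterministic call with fixed results.
    with mock.patch.object(UnderlyingModel, "predict", return_value=fixed) as fake_predict:
        result = make_forecast([[0], [1]], [0.0, 1.0], [[2], [3], [4]])
    fake_predict.assert_called_once()
    assert result == [1.23, 0.0, 2.0]
```

The test still exercises everything in the wrapper (fitting call, clipping, rounding) while the part we decided was nondeterministic is pinned to known values.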
We might want to consider keeping track of what hardware we run models on (docker?) due to this issue.
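One lightweight way to start tracking this, short of docker, might be to write a small environment record next to each set of model outputs. This is only a sketch using the standard library `platform` module; the function names and the choice of fields are assumptions, not something we have in the repo.

```python
import json
import platform


def describe_environment():
    """Collect basic OS/hardware details from the standard library."""
    return {
        "system": platform.system(),        # e.g. Linux, Darwin, Windows
        "release": platform.release(),
        "machine": platform.machine(),      # e.g. x86_64, arm64
        "processor": platform.processor(),  # may be an empty string on some systems
        "python_version": platform.python_version(),
    }


def write_environment_record(path):
    """Write the environment details as JSON to `path` and return them."""
    record = describe_environment()
    with open(path, "w", encoding="utf-8") as f:
        json.dump(record, f, indent=2, sort_keys=True)
    return record
```

Saving this alongside the model_output files would at least let us tell afterwards whether two differing runs came from different machines.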