Open
Labels: bug (Something isn't working)
Description
What happens?
Training the same model on the same data with the DuckDB backend yields slightly different model parameters (m/u probabilities) on each run. The differences are very small (<1e-10), so this is likely a floating-point effect.
Although the differences are small, they make developing linkage models more cumbersome: the parameters always change slightly, and it is not immediately clear where the differences between runs come from.
My current workaround is to round the model parameters in splink.Linker._settings_obj after training (to 10 decimal places, which seems to work reliably), but this feels like a hack.
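For illustration, here is a minimal sketch of how differences of this size can arise from floating-point arithmetic alone. It assumes, without confirmation for Splink or DuckDB, that the order in which values are accumulated varies between runs (for example under parallel aggregation). The numbers are arbitrary and not taken from the model.

```python
import math

# Floating-point addition is not associative: the grouping changes the result.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

print(left_to_right == right_to_left)       # the two results are not bit-identical
print(abs(left_to_right - right_to_left))   # the gap is far below 1e-10

# math.fsum returns the correctly rounded sum, so it does not depend on order.
values = [0.1, 0.2, 0.3]
ordered = math.fsum(values)
reordered = math.fsum(reversed(values))
print(ordered == reordered)
```

If accumulation order is indeed the cause, the run-to-run differences would be expected to stay at this tiny scale rather than grow.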
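The rounding workaround can be sketched roughly as follows. This is not Splink's API: `round_floats` is a hypothetical helper, and the nested dictionaries are made-up stand-ins for trained parameters, not the real layout of the settings object.

```python
from typing import Any


def round_floats(obj: Any, ndigits: int = 10) -> Any:
    """Return a copy of obj with every float rounded to ndigits decimals."""
    if isinstance(obj, float):
        return round(obj, ndigits)
    if isinstance(obj, dict):
        return {key: round_floats(value, ndigits) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return type(obj)(round_floats(item, ndigits) for item in obj)
    return obj  # strings, ints, None, etc. pass through unchanged


# Hypothetical parameters from two training runs, differing by about 1e-12.
run_a = {"first_name": {"m": [0.95, 0.05], "u": [0.01, 0.99]}}
run_b = {"first_name": {"m": [0.95 + 3e-12, 0.05 - 3e-12], "u": [0.01 + 2e-12, 0.99 - 2e-12]}}

print(run_a == run_b)                              # raw values differ
print(round_floats(run_a) == round_floats(run_b))  # rounded values match
```

One caveat: two values that sit on either side of a rounding boundary can still round differently, so comparing with a tolerance (for example `math.isclose`) may be more robust than rounding when the goal is only to check whether two runs agree.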
To Reproduce
reproduce_em_nondeterminism.py
OS:
macOS
Splink version:
4.0.12
Have you tried this on the latest master branch?
- I agree
Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?
- I agree