WIP: Adding Domain adaptation #51
Just added small fixes to the bugs you found @andrewkern -- one of the bugs was related to the current hack of using neut.fvec as our fake target domain data: I just copied these 2000 data points 5 times to give 10000 observations to match the simulations. With this change and a few array shape fixes the model begins training with
if argsDict["domain_adaptation"]:
    empirical = np.loadtxt(trainingDir + "empirical.fvec", skiprows=1)
    emp = np.reshape(empirical, (empirical.shape[0], nDims, numSubWins))
    emp1 = np.concatenate((emp,emp,emp,emp,emp))
This is the copy-5x line. It should be removed in the future, once the user passes in empirical target domain data of the same length as their training set simulations.
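Until real target data of the right length is available, one way to avoid the hard-coded 5x copy is to tile the empirical array up to the number of simulated examples. This is only a sketch: `match_target_length` and `n_sims` are hypothetical names that are not part of this PR, and it assumes examples sit along the first axis of the array, as in the reshape above.

```python
import numpy as np


def match_target_length(emp, n_sims):
    """Repeat (or truncate) `emp` along axis 0 so it has exactly `n_sims` rows.

    Hypothetical helper, not part of diploSHIC. `emp` is assumed to have one
    example per entry along its first axis.
    """
    n_emp = emp.shape[0]
    if n_emp == 0:
        raise ValueError("empirical array is empty")
    # Number of whole copies needed to reach at least n_sims rows.
    reps = -(-n_sims // n_emp)  # ceiling division
    # Tile only along axis 0; every other axis is left unchanged.
    tiled = np.tile(emp, (reps,) + (1,) * (emp.ndim - 1))
    # Trim the overshoot when n_sims is not a multiple of n_emp.
    return tiled[:n_sims]
```

With 2000 empirical examples and 10000 simulations this gives the same array as the five-way `np.concatenate`, but it keeps working if either count changes.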
Running this now! One warning I'm getting has to do with the metrics on the early stopping criterion.
Fixed the callback issue -- the code now changes the monitored metric from 'val_accuracy' to 'val_predictor_accuracy' for checkpointing and early stopping when using domain adaptation.
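The metric switch can be sketched as a small helper that picks the monitor name once and reuses it for both callbacks. The function names and the `patience` value here are hypothetical, not taken from this PR. The sketch also assumes the metric is renamed because the domain-adaptive model has more than one output, so Keras prefixes the metric with the output name (`predictor`).

```python
def monitor_metric(domain_adaptation):
    """Return the validation metric to monitor (names as quoted in this thread)."""
    return "val_predictor_accuracy" if domain_adaptation else "val_accuracy"


def callback_kwargs(domain_adaptation, checkpoint_path, patience=5):
    """Build keyword arguments for the two Keras callbacks.

    Hypothetical helper: the returned dicts are meant to be unpacked into
    keras.callbacks.EarlyStopping(**early) and
    keras.callbacks.ModelCheckpoint(**checkpoint).
    `patience=5` is an arbitrary placeholder value.
    """
    monitor = monitor_metric(domain_adaptation)
    early = {"monitor": monitor, "patience": patience, "mode": "max"}
    checkpoint = {
        "filepath": checkpoint_path,
        "monitor": monitor,
        "save_best_only": True,
        "mode": "max",  # accuracy should be maximised
    }
    return early, checkpoint
```

Keeping the name in one place means checkpointing and early stopping cannot end up monitoring different metrics.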
…only publish to PyPi with tagged versions
!!This PR is still a WIP!!
Adding domain adaptation following what was done for SIA and ReLERNN in https://www.biorxiv.org/content/10.1101/2023.03.01.529396v1 (their code lives at https://github.com/ziyimo/popgen-dom-adapt).
This requires two major changes to diploshic:
That should be it for the major implementation changes. The rest of this PR is small changes to the interfacing script that handles the logic of using the original model by default and then switching to the domain adaptive model with the CLI argument --domain-adaptation.
Currently, by default, if you turn on domain adaptation then the code assumes that you have .fvec feature vector files created from your target domain data and stored in your training directory, named empirical.fvec.
Current steps left undone: