This is a thread for refactoring the model weight conversion + config conversion.
I am happy that we’re doing this as the code for loading models is quickly becoming MUCH more orderly and neat!
Currently, I am refactoring the config under the branch sonia_restructure_model_adaption.
The two main new files are:
- model_loader.py, which loads the model config and weights
- model_config_registry.py, which contains config overrides that are sometimes Prisma-relevant. While we try to load the config dynamically, sometimes the config does not contain all the necessary information (e.g. number of attention heads, or whether to normalize the output), so we add it in here.
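To illustrate the override idea, here is a minimal sketch, assuming the registry is a plain dictionary keyed by model name. The names `CONFIG_OVERRIDES` and `apply_overrides`, the model name, and the config fields are all hypothetical and are not taken from model_config_registry.py.

```python
from typing import Any, Dict

# Hypothetical registry: model name -> fields missing from the loaded config.
CONFIG_OVERRIDES: Dict[str, Dict[str, Any]] = {
    "example/vit-base": {"n_heads": 12, "normalize_output": True},
}


def apply_overrides(model_name: str, loaded_config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the loaded config with any registered overrides applied."""
    merged = dict(loaded_config)  # copy so the caller's dict is not mutated
    merged.update(CONFIG_OVERRIDES.get(model_name, {}))
    return merged


# Stand-in for a config that was loaded dynamically but lacks some fields.
loaded = {"d_model": 768, "n_layers": 12}
config = apply_overrides("example/vit-base", loaded)
```

Models without a registry entry pass through unchanged, so an entry is only needed when the dynamically loaded config is incomplete.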
What still needs to be done
I've already adapted most of our models, but some are still missing. It would be great if someone could pick them up, adapt them to the Prisma repo, and then run the pytest tests on them to check that the adapted model's output matches the original's.
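The kind of check meant here can be sketched as follows. This is an assumption-laden illustration, not the repo's actual test: `outputs_match` and the two stand-in "models" are hypothetical, and plain Python lists stand in for tensors. In a real test the comparison would more likely be done with something like `torch.allclose` on the model outputs.

```python
import math
from typing import Callable, List, Sequence

Model = Callable[[Sequence[float]], List[float]]


def outputs_match(original: Model, adapted: Model, inputs: Sequence[float],
                  rel_tol: float = 1e-5, abs_tol: float = 1e-6) -> bool:
    """Run both models on the same inputs and compare outputs element-wise."""
    ref = original(inputs)
    out = adapted(inputs)
    if len(ref) != len(out):
        return False
    return all(math.isclose(r, o, rel_tol=rel_tol, abs_tol=abs_tol)
               for r, o in zip(ref, out))


# Hypothetical stand-ins for the original and the adapted model.
def original_model(xs: Sequence[float]) -> List[float]:
    return [2.0 * x for x in xs]


def adapted_model(xs: Sequence[float]) -> List[float]:
    return [x + x for x in xs]


def test_adapted_matches_original() -> None:
    # pytest collects this function by its test_ prefix.
    assert outputs_match(original_model, adapted_model, [0.1, 0.5, -3.0])
```

Using a tolerance rather than exact equality matters because weight conversion can change the order of floating-point operations.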
What needs to be adapted, in order of priority
- There's a list of models that are currently failing here
- There may be more model names in the dev Prisma branch that I didn't add to the list; it would be good to double check.
- The VJEPA family, which is open source (the encoder + the predictor)