[Feature] Model Adapting Overhaul #169

@soniajoseph

This is a thread for refactoring the model weight conversion + config conversion.

I am happy that we’re doing this, as the model-loading code is quickly becoming MUCH more orderly and neat!

Currently, I am refactoring the config under the branch sonia_restructure_model_adaption.

The two main new files are:

  1. model_loader.py, which loads the model config and weights
  2. model_config_registry.py, which contains config overrides, some of them Prisma-specific. We try to load the config dynamically, but the source config sometimes lacks necessary information (e.g. number of attention heads, or whether to normalize the output), so we add it here.
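
To make the override idea concrete, here is a minimal sketch of how a registry entry could fill in fields that a dynamically loaded config lacks. The names `CONFIG_OVERRIDES`, `resolve_config`, the model key `example/vit-base`, and the field names are all hypothetical and only for illustration; they are not taken from model_config_registry.py.

```python
from copy import deepcopy
from typing import Any, Dict

# Hypothetical registry: model name -> fields the source config does not provide.
# Keys and values are illustrative assumptions, not Prisma's actual entries.
CONFIG_OVERRIDES: Dict[str, Dict[str, Any]] = {
    "example/vit-base": {"n_heads": 12, "normalize_output": True},
}


def resolve_config(model_name: str, loaded: Dict[str, Any]) -> Dict[str, Any]:
    """Return the loaded config with any registry overrides applied on top."""
    config = deepcopy(loaded)  # leave the caller's dict untouched
    config.update(CONFIG_OVERRIDES.get(model_name, {}))  # overrides win
    return config


# A config as it might come back from dynamic loading, missing the head count.
loaded = {"d_model": 768, "n_layers": 12}
resolved = resolve_config("example/vit-base", loaded)
```

Models without a registry entry pass through unchanged, so the registry only needs entries for configs that are actually incomplete.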

What still needs to be done
I've already adapted most of our models, but some are still missing. It would be great if someone could pick them up, adapt them to the Prisma repo, and then run the pytest suite to check that each adapted model's output matches the original's.
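
For anyone picking a model up, the check boils down to something like the sketch below. It is a toy, pure-Python illustration under assumptions: `outputs_match`, the stand-in models, and the tolerance are all hypothetical, and the real tests in the repo may be structured differently (with tensors, a tolerance-based comparison such as `torch.allclose` would play the role of `math.isclose` here).

```python
import math
from typing import Callable, List, Sequence

Model = Callable[[Sequence[float]], List[float]]


def outputs_match(original: Model, adapted: Model,
                  inputs: Sequence[Sequence[float]], atol: float = 1e-4) -> bool:
    """True if both models give element-wise close outputs on every input."""
    for x in inputs:
        ref, out = original(x), adapted(x)
        if len(ref) != len(out):  # shape mismatch counts as a failure
            return False
        if any(not math.isclose(r, o, rel_tol=0.0, abs_tol=atol)
               for r, o in zip(ref, out)):
            return False
    return True


# Toy stand-ins for an original model and two adaptations of it.
def original_model(x: Sequence[float]) -> List[float]:
    return [2.0 * v for v in x]


def adapted_model(x: Sequence[float]) -> List[float]:
    return [v + v for v in x]  # same function, written differently


def broken_model(x: Sequence[float]) -> List[float]:
    return [2.0 * v + 0.01 for v in x]  # drifts beyond the tolerance


def test_adapted_matches_original() -> None:
    inputs = [[0.0, 1.0, -3.5], [10.0, 0.25]]
    assert outputs_match(original_model, adapted_model, inputs)
    assert not outputs_match(original_model, broken_model, inputs)
```

An absolute tolerance is used because small floating-point differences between two implementations are expected; the right threshold depends on the model and dtype.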

What needs to be adapted, in order of priority

  1. There's a list of models that are currently failing here
  2. There may be more model names in the dev Prisma branch that I didn't add to the list; it would be good to double-check.
  3. The VJEPA family, which is open source (the encoder + the predictor)
