
Feature Request: Load pre-trained weights (.pth files etc.) into a Flux model #2164

Open
ejmeitz opened this issue Jan 16, 2023 · 6 comments

Comments

@ejmeitz

ejmeitz commented Jan 16, 2023

Motivation and description

I think it would be useful to load pre-trained weights from PyTorch or TensorFlow. I'm sure this has been discussed before (e.g., in the PyTorch feature parity doc), but I could not find an open issue on this.

Possible Implementation

My current workaround takes the .pth file and opens it with Pickle.jl. I am still working on parsing the resulting dictionary to create a Flux model. It would make my life much easier if there were an associated Flux function to just load pre-trained weights and evaluate a model.
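
A rough sketch of that workaround, assuming the .pth file holds a standard state_dict (the file name and key names below are illustrative):

```julia
using Pickle  # Pickle.jl can deserialize PyTorch pickle files

# Load the pickled state_dict into a Dict mapping parameter names to arrays
state = Pickle.Torch.THload("model.pth")

# Inspect what was saved, e.g. "fc.weight" => (5, 10), "fc.bias" => (5,)
for (name, tensor) in state
    println(name, " => ", size(tensor))
end
```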

@ToucheSir
Member

Are you aware of https://fluxml.ai/Flux.jl/stable/saving/#Flux.loadmodel!? I'm not sure it's our place to be including functionality for converting between PyTorch and Flux model structures given the lack of uniformity on the PyTorch side (excepting specific cases like Metalhead.jl where we have a known, limited set of models to map from).
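
A minimal sketch of what loadmodel! covers, copying parameters between two models of identical structure (the models here are illustrative):

```julia
using Flux

src = Chain(Dense(10 => 5, relu), Dense(5 => 2))
dst = Chain(Dense(10 => 5, relu), Dense(5 => 2))

# Recursively copies src's arrays into dst in place;
# throws an error if the two structures don't line up.
Flux.loadmodel!(dst, src)

dst[1].weight == src[1].weight  # true
```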

@ejmeitz
Author

ejmeitz commented Jan 16, 2023

Yes, I believe we discussed this on Slack yesterday haha. That is what I plan on using for my implementation, as it is only for a single model. Just thought I'd post an issue here since I also saw it come up in several open threads online and in the PyTorch feature parity document in this repo.

@ToucheSir
Member

I would say it's partially covered by the "We should expose the possibility to load pretrained weights" point under "PyTorch Extras" in #1431. As for more general solutions, were someone to come up with a general Dict -> nested struct transformation that works with most PyTorch models, we could consider depending on, integrating, or advertising it on the Flux side.
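
Purely as illustration, here is a hypothetical shape such a transformation might take for the simplest case of Dense/Linear layers; the key scheme ("layers.0.weight") and function name are assumptions, not an existing API:

```julia
using Flux

# Hypothetical sketch: map flat PyTorch state_dict keys like
# "layers.0.weight" onto the Dense layers of a Flux Chain.
function load_pytorch_dense!(chain::Chain, state::AbstractDict)
    for (i, layer) in enumerate(chain.layers)
        layer isa Dense || continue
        # PyTorch's Linear and Flux's Dense both store weights as (out, in),
        # but depending on how the .pth loader handles PyTorch's row-major
        # layout, a permutedims may be needed here.
        copyto!(layer.weight, state["layers.$(i-1).weight"])
        copyto!(layer.bias, state["layers.$(i-1).bias"])
    end
    return chain
end
```

A fully general version would have to walk arbitrary nested structs and handle the many ways PyTorch models name their submodules, which is exactly the uniformity problem mentioned above.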

@ejmeitz
Author

ejmeitz commented Jan 16, 2023

I'll let you know if what I come up with is general enough.

Flux should have all the same layer types and hyperparameters as PyTorch (just with different names), correct?

@ToucheSir
Member

Not necessarily, which is another reason this is difficult to generalize. In general we try to keep close to PyTorch if there's no good reason to diverge, but that's not a hard rule.

@CarloLucibello
Member

Some scripts for porting weights can be found in the Metalhead repo: https://github.com/FluxML/Metalhead.jl/tree/master/scripts
