There seems to be a mismatch between the paths the checkpoint is saved to and the paths the resume code loads from.
See the original code:
Lines 906 to 908 in 029e2fb
if model_dir_to_resume:
    encoder.load_state_dict(torch.load(os.path.join(model_dir_to_resume, "encoder.pt")))
    kb_config = KBLaMConfig.from_pretrained(os.path.join(model_dir_to_resume, "kb_config.json"))
This needs to be changed to:
if model_dir_to_resume:
    encoder.load_state_dict(torch.load(os.path.join(model_dir_to_resume, "_encoder/encoder.pt")))
    kb_config = KBLaMConfig.from_pretrained(os.path.join(model_dir_to_resume, "config.json"))
    # or kb_config = KBLaMConfig.from_pretrained(model_dir_to_resume)
Why this change is needed:
• encoder.pt is saved inside a subfolder, _encoder/, not directly in the checkpoint directory
• the config is saved as config.json, not kb_config.json
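To make this kind of path mismatch fail early with a readable message, the resume branch could check the expected files before loading anything. The following is only a minimal sketch: `resolve_resume_paths` is a hypothetical helper that does not exist in the repository, and the layout it assumes (`_encoder/encoder.pt` and `config.json`) is the one reported in this issue.

```python
import os
import tempfile


def resolve_resume_paths(model_dir_to_resume):
    """Return (encoder_path, config_path) for a checkpoint directory.

    Hypothetical helper, not part of the repository. The layout checked
    here (_encoder/encoder.pt and config.json) is the one reported in
    this issue and is an assumption about how checkpoints are saved.
    """
    encoder_path = os.path.join(model_dir_to_resume, "_encoder", "encoder.pt")
    config_path = os.path.join(model_dir_to_resume, "config.json")

    # Collect every missing file so the error names all of them at once.
    missing = [p for p in (encoder_path, config_path) if not os.path.isfile(p)]
    if missing:
        raise FileNotFoundError(f"Checkpoint files not found: {missing}")
    return encoder_path, config_path


# Usage with a throwaway directory that mimics the reported layout.
with tempfile.TemporaryDirectory() as ckpt_dir:
    os.makedirs(os.path.join(ckpt_dir, "_encoder"))
    for rel in (os.path.join("_encoder", "encoder.pt"), "config.json"):
        open(os.path.join(ckpt_dir, rel), "w").close()

    encoder_path, config_path = resolve_resume_paths(ckpt_dir)
    print(os.path.relpath(encoder_path, ckpt_dir))
    print(os.path.relpath(config_path, ckpt_dir))
```

The returned paths could then be passed to `torch.load` and `KBLaMConfig.from_pretrained` as in the corrected snippet above, so a wrong directory layout surfaces as a single `FileNotFoundError` rather than a failure partway through loading.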