Closed
Hi, I tried saving a checkpoint and resuming training from it. The parameters appear to load, but the results after resuming are worse than training from scratch.
Here is the code I modified.
if resume:
    checkpoint = torch.load(checkpoint_path)
    model.load_state_dict(checkpoint['model_state_dict'])
    optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
    start_epoch = checkpoint['epoch']
    loss = checkpoint['loss']
    model.train()

torch.save({
    'epoch': epoch + 1,
    'model_state_dict': model.module.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
    'loss': loss,
}, checkpoint_path)
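One thing that may be worth checking (a guess, not a confirmed cause): the save path writes `model.module.state_dict()`, whose keys have no `module.` prefix, while the load path calls `load_state_dict` on `model` itself. If `model` is already wrapped in `DataParallel` or `DistributedDataParallel` at load time, it expects keys that start with `module.`. The sketch below compares the two key sets on plain dictionaries. The helper names `strip_module_prefix` and `key_mismatch` are hypothetical and are not part of PyTorch.

```python
from collections import OrderedDict


def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with a leading `prefix` removed from each key.

    Hypothetical helper: works on any mapping, including a PyTorch state dict.
    """
    return OrderedDict(
        (key[len(prefix):] if key.startswith(prefix) else key, value)
        for key, value in state_dict.items()
    )


def key_mismatch(saved_keys, model_keys):
    """Return (missing, unexpected) key lists, both sorted.

    missing:    keys the model expects but the checkpoint lacks
    unexpected: keys in the checkpoint that the model does not have
    """
    saved, expected = set(saved_keys), set(model_keys)
    return sorted(expected - saved), sorted(saved - expected)


# Stand-in dictionaries; with a real run these would be
# checkpoint['model_state_dict'] and model.state_dict().
saved = {"fc.weight": 0, "fc.bias": 0}                   # saved via model.module
wrapped = {"module.fc.weight": 0, "module.fc.bias": 0}   # keys of a wrapped model

missing, unexpected = key_mismatch(saved.keys(), wrapped.keys())
print("missing:", missing)
print("unexpected:", unexpected)

# After stripping the prefix from the wrapped model's keys, the sets agree.
print(key_mismatch(saved.keys(), strip_module_prefix(wrapped).keys()))
```

With a real model, the same comparison can be run on `checkpoint['model_state_dict'].keys()` and `model.state_dict().keys()` before calling `load_state_dict`. If both lists come back empty, the key names are not the problem and the cause lies elsewhere.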