I'm a PhD candidate of Jonathan Berant's, and we are trying to continue
training from a saved checkpoint using your DocumentQA model.
Is this supported in the code, and if so, what is the best way to do it?
To be more specific: we use ablate_triviaqa_unfiltered.py as our training script,
and it seems that "checkpoint" and "parameter_checkpoint" are meant to support this.
However, it is unclear why there are two different variables for this, and why they are restored in two separate places:
in _train_async() in trainer.py:
Line 501 (notice that checkpoint is passed to saver.restore, not
parameter_checkpoint; is this a bug?):
if parameter_checkpoint is not None:
    print("Restoring parameters from %s" % parameter_checkpoint)
    saver = tf.train.Saver()
    saver.restore(sess, checkpoint)
    saver = None
Line 351:
if checkpoint is not None:
    print("Restoring from checkpoint...")
    saver.restore(sess, checkpoint)
    print("Loaded checkpoint: " + str(sess.run(global_step)))
else:
    print("Initializing parameters...")
    sess.run(tf.global_variables_initializer())
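
To make the question concrete, here is a plain-Python sketch of the behavior we would have expected from the two arguments. It is only our guess at the intent, not code from trainer.py: choose_restore_path is a hypothetical helper, the "resume" / "parameters_only" / "initialize" labels are our own names, and rejecting the case where both arguments are set is an assumption on our part.

```python
def choose_restore_path(checkpoint=None, parameter_checkpoint=None):
    """Decide what to restore before training starts (hypothetical helper).

    Returns an (action, path) tuple:
      ("resume", checkpoint)                    restore everything, including global_step
      ("parameters_only", parameter_checkpoint) restore weights only, from parameter_checkpoint
      ("initialize", None)                      start from freshly initialized variables
    """
    if checkpoint is not None and parameter_checkpoint is not None:
        # Assumption: the two options are mutually exclusive.
        raise ValueError("Pass either checkpoint or parameter_checkpoint, not both")
    if parameter_checkpoint is not None:
        # The path restored here is parameter_checkpoint itself, which is
        # what we expected the code at line 501 to pass to saver.restore.
        return ("parameters_only", parameter_checkpoint)
    if checkpoint is not None:
        return ("resume", checkpoint)
    return ("initialize", None)
```

If this matches the intended design, then passing checkpoint to saver.restore inside the parameter_checkpoint branch would restore from None whenever only parameter_checkpoint is set.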
Thanks!