The current models give rather different outcomes when re-trained from different seeds. One possibility is that we aren't fully optimizing the models; we could work harder at optimization using something like CyclicLR. OTOH, CyclicLR's periodic learning-rate increases make the validation loss oscillate, which is at odds with an early-stopping regularization procedure if we wanted to go there.
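For reference, the triangular schedule that CyclicLR implements (from Smith's cyclical learning rates paper) can be sketched in a few lines of plain Python; `base_lr`, `max_lr`, and `step_size` below are illustrative values, and in practice one would just use `torch.optim.lr_scheduler.CyclicLR`:

```python
import math

def cyclic_lr(step, base_lr=1e-4, max_lr=1e-2, step_size=2000):
    """Triangular cyclical learning rate (illustrative sketch).

    The LR ramps linearly from base_lr to max_lr over step_size steps,
    then back down, repeating. step_size is half the cycle length.
    """
    cycle = math.floor(1 + step / (2 * step_size))
    x = abs(step / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)

# LR starts at base_lr, peaks at max_lr mid-cycle, returns to base_lr:
print(cyclic_lr(0))     # base_lr
print(cyclic_lr(2000))  # max_lr (peak of first cycle)
print(cyclic_lr(4000))  # base_lr again (cycle complete)
```

The periodic return to `max_lr` is what conflicts with early stopping: validation loss spikes near each peak, so a naive patience criterion can fire on schedule noise rather than true overfitting.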