I would like to know what hardware you used to train each model type and how long each training run took. Could you provide these details? P.S. I couldn't find them in your paper.