I hope you are well.
In Figure 4 of your paper, you show that the time overhead of pausing VGG19 is roughly ten times smaller than that of pausing ResNet152. However, VGG19 has more than twice as many parameters as ResNet152. Could you please explain this difference? Given the parameter counts, shouldn't pausing ResNet152 be faster than pausing VGG19? And if the complexity of the architecture is taken into account, how does it affect the pausing time?