Great work!
I noticed that in the appendix, you report that LightningDiT-1p0B/1 (1.03B parameters) has 12.88 GFLOPs at an image resolution of 512.
Could you kindly share the method you used to compute the FLOPs? Additionally, does this number correspond to the FLOPs of a single block, or of the entire model?
I have tried estimating the FLOPs with both fvcore and torch.profiler, but I was unable to reproduce the reported number with either tool.
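
For context, a rough sanity check of the reported number could look like the sketch below. It is only a back-of-the-envelope estimate, assuming about one multiply-accumulate per parameter per token and ignoring the attention matmuls, norms, and activations. The token count of 1024 is a placeholder assumption (it is not taken from the paper), and the helper names are hypothetical.

```python
def estimate_forward_flops(num_params: float, num_tokens: int, flops_per_mac: int = 2) -> float:
    """Rough forward cost of a dense transformer.

    Assumes one multiply-accumulate per parameter per token.
    Set flops_per_mac=1 to count MACs instead of FLOPs.
    """
    return flops_per_mac * num_params * num_tokens


def implied_tokens(reported_flops: float, num_params: float, flops_per_mac: int = 2) -> float:
    """Token count at which the rough estimate would match a reported figure."""
    return reported_flops / (flops_per_mac * num_params)


NUM_PARAMS = 1.03e9          # parameter count quoted for LightningDiT-1p0B/1
REPORTED_FLOPS = 12.88e9     # 12.88 GFLOPs, as quoted from the appendix
ASSUMED_TOKENS = 1024        # ASSUMPTION: placeholder sequence length, not from the paper

estimate = estimate_forward_flops(NUM_PARAMS, ASSUMED_TOKENS)
print(f"rough full-model estimate: {estimate / 1e9:.1f} GFLOPs")
print(f"tokens implied by the reported figure: {implied_tokens(REPORTED_FLOPS, NUM_PARAMS):.2f}")
```

If the estimate and the reported figure differ by a large factor, that may point to a difference in counting convention (MACs vs. FLOPs, a single block vs. the entire model, or a different sequence length) rather than to an error on either side.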
Thank you very much for your time and assistance.