Describe your feature request
Any chance we could get a ViT model pretrained with MAE?
Describe the reference code or paper
Masked Autoencoders Are Scalable Vision Learners
Describe the possible solution
Please pretrain the ViT with MAE and add a link to the weights in the README. Thanks in advance!
Additional context
None.