A PaddlePaddle reimplementation of facebookresearch's SwAV, the model released with the paper *Unsupervised Learning of Visual Features by Contrasting Cluster Assignments*.
Some features require the develop branch of PaddlePaddle. For installation instructions, refer to installation.md.
Prepare the data under the following directory structure:

```text
dataset/
└── ILSVRC2012
    ├── train
    └── val
```
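Before launching a long training run, it can be worth confirming the dataset is laid out as expected. The helper below is an illustrative sketch (it is not part of PASSL); it only checks that the `train` and `val` sub-directories exist under the dataset root:

```python
import os

def check_imagenet_layout(root="dataset/ILSVRC2012"):
    """Return a list of expected sub-directories that are missing under root."""
    expected = [os.path.join(root, split) for split in ("train", "val")]
    return [path for path in expected if not os.path.isdir(path)]

missing = check_imagenet_layout()
if missing:
    print("Missing directories:", missing)
else:
    print("Dataset layout looks OK.")
```

If the function returns a non-empty list, fix the directory layout before starting training.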
SwAV is pretrained on 4 nodes (32 GPUs in total) with a global batch size of 4096:
```shell
# Note: set the following environment variables,
# then run the script on each node.
unset PADDLE_TRAINER_ENDPOINTS
export PADDLE_NNODES=4
export PADDLE_MASTER="xxx.xxx.xxx.xxx:12538"
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
export FLAGS_stop_check_timeout=3600

python -m paddle.distributed.launch \
    --nnodes=$PADDLE_NNODES \
    --master=$PADDLE_MASTER \
    --devices=$CUDA_VISIBLE_DEVICES \
    passl-train \
    -c ./configs/swav_resnet50_224_pt_in1k_4n32c_dp_fp16o1.yaml
```
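In data-parallel training the global batch size is the per-GPU batch size times the total number of workers. A quick check of the arithmetic behind the 4096 figure, assuming 8 GPUs per node as in `CUDA_VISIBLE_DEVICES` above (the helper name is illustrative, not a PASSL API):

```python
def per_gpu_batch(global_batch, nnodes, gpus_per_node):
    """Split a global batch size evenly across all GPUs."""
    total_gpus = nnodes * gpus_per_node
    assert global_batch % total_gpus == 0, "global batch must divide evenly"
    return global_batch // total_gpus

# Pretraining: 4 nodes x 8 GPUs -> 32 workers, 128 samples each per step
print(per_gpu_batch(4096, nnodes=4, gpus_per_node=8))  # → 128

# Linear probe: 1 node x 8 GPUs -> 32 samples per GPU
print(per_gpu_batch(256, nnodes=1, gpus_per_node=8))   # → 32
```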
By default, linear classification on frozen features/weights uses momentum SGD with a batch size of 256, and can be run on a single 8-GPU node.
- Download the pretrained model:

  ```shell
  mkdir -p pretrained/swav
  wget -O ./pretrained/swav/swav_resnet50_in1k_800ep_bz4096_pretrained.pdparams https://passl.bj.bcebos.com/models/swav/swav_resnet50_in1k_800ep_bz4096_pretrained.pdparams
  ```
- Train the linear classification model:

  ```shell
  unset PADDLE_TRAINER_ENDPOINTS
  export PADDLE_NNODES=1
  export PADDLE_MASTER="127.0.0.1:12538"
  export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
  export FLAGS_stop_check_timeout=3600

  python -m paddle.distributed.launch \
      --nnodes=$PADDLE_NNODES \
      --master=$PADDLE_MASTER \
      --devices=$CUDA_VISIBLE_DEVICES \
      passl-train \
      -c ./configs/swav_resnet50_224_lp_in1k_1n8c_dp_fp32.yaml
  ```
To perform end-to-end fine-tuning for SwAV:
- First, download the data split text files with the following commands:

  ```shell
  cd PASSL/dataset/ILSVRC2012
  wget "https://raw.githubusercontent.com/google-research/simclr/master/imagenet_subsets/10percent.txt"
  wget "https://raw.githubusercontent.com/google-research/simclr/master/imagenet_subsets/1percent.txt"
  ```
- Then, download the pretrained model to `./pretrained/swav/swav_resnet50_in1k_800ep_bz4096_pretrained.pdparams`:

  ```shell
  mkdir -p pretrained/swav
  wget -O ./pretrained/swav/swav_resnet50_in1k_800ep_bz4096_pretrained.pdparams https://passl.bj.bcebos.com/models/swav/swav_resnet50_in1k_800ep_bz4096_pretrained.pdparams
  ```
- Finally, run the training with the pretrained PASSL-format checkpoint:

  ```shell
  unset PADDLE_TRAINER_ENDPOINTS
  export PADDLE_NNODES=1
  export PADDLE_MASTER="127.0.0.1:12538"
  export CUDA_VISIBLE_DEVICES=0,1,2,3
  export FLAGS_stop_check_timeout=3600

  python -m paddle.distributed.launch \
      --nnodes=$PADDLE_NNODES \
      --master=$PADDLE_MASTER \
      --devices=$CUDA_VISIBLE_DEVICES \
      passl-train \
      -c ./configs/swav_resnet50_224_ft_in1k_1n4c_dp_fp32.yaml \
      -o Global.pretrained_model=./pretrained/swav/swav_resnet50_in1k_800ep_bz4096_pretrained
  ```
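The trailing `-o key=value` flag overrides an entry of the YAML config via a dotted path (here, `Global.pretrained_model`). As a minimal sketch of how such dotted-key overrides typically map onto a nested config dict (illustrative only; this is not PASSL's actual implementation):

```python
def apply_override(cfg, dotted_key, value):
    """Set cfg['a']['b'] = value for dotted_key 'a.b', creating levels as needed."""
    keys = dotted_key.split(".")
    node = cfg
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value
    return cfg

# Override the pretrained checkpoint path in a loaded config
cfg = {"Global": {"epochs": 20}}
apply_override(cfg, "Global.pretrained_model",
               "./pretrained/swav/swav_resnet50_in1k_800ep_bz4096_pretrained")
print(cfg["Global"]["pretrained_model"])
```

Any key reachable in the YAML tree can be overridden this way without editing the config file.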
More directly runnable configurations are provided; see SwAV Configurations.
| Model | Phase | Dataset | Configs | GPUs | Epochs | Top1 Acc (%) | Links |
| --- | --- | --- | --- | --- | --- | --- | --- |
| resnet50 | pretrain | ImageNet2012 | config | A100*N2C16 | 800 | - | model \| log |
| resnet50 | linear probe | ImageNet2012 | config | A100*N1C8 | 100 | 75.3 | model \| log |
| resnet50 | finetune-semi10 | ImageNet2012 | config | A100*N1C4 | 20 | 69.0 | model \| log |
| resnet50 | finetune-semi1 | ImageNet2012 | config | A100*N1C4 | 20 | 55.0 | model \| log |
```bibtex
@misc{caron2021unsupervised,
    title={Unsupervised Learning of Visual Features by Contrasting Cluster Assignments},
    author={Mathilde Caron and Ishan Misra and Julien Mairal and Priya Goyal and Piotr Bojanowski and Armand Joulin},
    year={2021},
    eprint={2006.09882},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
```