I reviewed the training and inference code for RAM++ and noticed a potential bug: the normalization parameters differ between the two stages:
- During inference, the normalization parameters are the standard ImageNet statistics:
  `mean=[0.485, 0.456, 0.406]`, `std=[0.229, 0.224, 0.225]`
- During training, they are the CLIP preprocessing statistics:
  `mean=[0.48145466, 0.4578275, 0.40821073]`, `std=[0.26862954, 0.26130258, 0.27577711]`
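
For reference, a minimal sketch of what aligning the inference transform could look like, assuming a torchvision-based pipeline (the resize size, the transform composition, and the `get_transform` helper below are illustrative, not the repo's actual code):

```python
from torchvision import transforms

# Constants used at training time (the CLIP preprocessing statistics).
TRAIN_MEAN = (0.48145466, 0.4578275, 0.40821073)
TRAIN_STD = (0.26862954, 0.26130258, 0.27577711)

# Constants currently used at inference time (standard ImageNet statistics).
INFER_MEAN = (0.485, 0.456, 0.406)
INFER_STD = (0.229, 0.224, 0.225)

def get_transform(image_size=384, align_with_training=True):
    """Build the inference transform, optionally reusing the training stats."""
    mean, std = (TRAIN_MEAN, TRAIN_STD) if align_with_training else (INFER_MEAN, INFER_STD)
    return transforms.Compose([
        transforms.Resize((image_size, image_size)),
        transforms.ToTensor(),
        transforms.Normalize(mean=mean, std=std),
    ])
```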
I tested this on my custom dataset (20 classes), and aligning the two sets of normalization parameters brings a modest improvement in AP:
- AP improves from 0.51594603 to 0.5217806.
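
In case it helps reproduce the comparison, macro-averaged AP over the 20 classes can be computed along these lines (a sketch using scikit-learn; the array shapes and the `macro_ap` helper are illustrative assumptions, not my exact evaluation code):

```python
import numpy as np
from sklearn.metrics import average_precision_score

def macro_ap(y_true: np.ndarray, y_scores: np.ndarray) -> float:
    """Macro-averaged AP for multi-label predictions.

    Assumes y_true is an (N, C) binary label matrix and y_scores an
    (N, C) matrix of per-class sigmoid scores.
    """
    return float(np.mean([
        average_precision_score(y_true[:, c], y_scores[:, c])
        for c in range(y_true.shape[1])
    ]))
```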