Experiment Reproduction of Deep Polynomial Neural Networks
More specifically, a reproduction of Table 5 (ResNet18 on the CIFAR10 image classification task) with the reported training setup, using PyTorch Lightning and mmpretrain.
Accuracy: "accuracy" here refers to top-1 accuracy.
[ ] The reported 94% accuracy is not achieved; currently at 92%.
[ ] The ProdPoly model has 5.2M parameters instead of the reported 6M - this might signal a bug in this implementation.
The baseline ResNet18 on CIFAR10 with PyTorch Lightning in this link is reported to reach 93-94% accuracy within 40-50 epochs using their learning rate scheduler.
Using the above guide, with the simple LR scheduler and the batch size reported in the paper:
"Each method is trained for 120 epochs with batch size 128. The SGD optimizer is used with initial learning rate of 0.1. The learning rate is multiplied with a factor of 0.1 in epochs 40; 60; 80; 100."
[p. 7 of the PDF, just above Tables 4 and 5] achieved 88% accuracy, though the paper claims ~94%.
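For reference, here is a minimal sketch of that schedule in plain PyTorch. The learning rate, milestones, epoch count and batch size come from the quoted passage; momentum and weight decay are my assumptions, since the paper does not specify them there.

```python
import torch
from torch.optim import SGD
from torch.optim.lr_scheduler import MultiStepLR

# Placeholder model; in the actual run this is the ResNet18 backbone.
model = torch.nn.Linear(32 * 32 * 3, 10)

# lr=0.1 is from the quoted passage; momentum/weight_decay are assumed.
optimizer = SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)

# Multiply the learning rate by 0.1 at epochs 40, 60, 80 and 100.
scheduler = MultiStepLR(optimizer, milestones=[40, 60, 80, 100], gamma=0.1)

for epoch in range(120):
    # ... one training epoch over CIFAR10 with batch size 128 ...
    scheduler.step()
```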
(I assume) the paper uses an mmpretrain implementation of ResNet, as described in this folder, more specifically this passage.
The discrepancy could be explained by the difference between the native PyTorch implementation and the mm implementation.
That difference is mainly where the ReLUs are located in the backbone, and an identity path that appears in the mm implementation but not in the PyTorch one. See the diff that was generated with the mmpretrain_v_lightning.py script.
✅ Using the mmpretrain implementation with `get_model` and the reported training setup in the paper achieved 92% accuracy.
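A minimal sketch of loading that baseline via `get_model` (the config name is the one referenced further down; no pretrained weights are loaded, since training is from scratch):

```python
from mmpretrain import get_model

# Build the CIFAR10 ResNet18 defined by the resnet18_8xb16_cifar10 config.
model = get_model('resnet18_8xb16_cifar10', pretrained=False)

# The result is a regular torch.nn.Module, so it can be wrapped in a
# PyTorch Lightning module and trained with the schedule quoted above.
print(type(model))
```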
You will want to install mmpretrain from source, as you'll need to modify the backbone to the PiNet version:
1. Install it from source - link.
Important: you can use my forked version of mmpretrain, as the backbones are already modified. You can look at the commits; they are simple and few.
2. Install mmcv with mim (2 commands).
The modifications to the mm ResNet backbones are placed in this folder.
I've used the `pinet_relu.py` backbone modification, which yields (from what I understand) a 2nd-order polynomial expansion. You can see the diff - orig | pinet relu - between the baseline mmpretrain backbone and the modified PiNet backbone.
It adds (what I assume is) instance normalization as `norm3` and a Hadamard product in lines 135-136 to yield a 2nd-order polynomial expansion (see the `second` variable).
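To make the idea concrete, here is a minimal sketch of a residual block with such a 2nd-order term. This is not the exact `pinet_relu.py` code; the `norm3` and `second` names just mirror the description above, and the use of instance normalization is my reading of the diff.

```python
import torch
import torch.nn as nn


class PiBasicBlock(nn.Module):
    """Sketch of a ResNet basic block with a 2nd-order (Hadamard) term."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        # Extra normalization applied before the Hadamard product
        # (assumed to be instance norm, as described above).
        self.norm3 = nn.InstanceNorm2d(channels, affine=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        identity = x
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        # 2nd-order term: element-wise (Hadamard) product with the identity path.
        second = self.norm3(out) * identity
        out = second + identity
        return self.relu(out)


if __name__ == "__main__":
    block = PiBasicBlock(64)
    y = block(torch.randn(2, 64, 32, 32))
    print(y.shape)  # torch.Size([2, 64, 32, 32])
```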
- In Table 5 of the paper, the ProdPoly ResNet has [2, 2, 1, 1] residual blocks. Those changes are made in the PiResNet backbone in this commit, as a custom depth in `arch_settings`, and in the config file added in the following commit - notice the `"18pi"` depth. See the sketch after this list for what this can look like.
- Used this very simple guide to add the custom PiNet backbone with the changes described above and seen in those commits. Notice that the `resnet18_8xb16_cifar10` config uses a `ResNet_CIFAR` backbone.
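Here is a sketch of what the custom depth and the config change could look like in mmpretrain terms. The class name `PiResNet_CIFAR` and the exact entries are my assumptions for illustration; the real changes live in the linked commits.

```python
from mmpretrain.models.backbones import ResNet_CIFAR
from mmpretrain.models.backbones.resnet import BasicBlock
from mmpretrain.registry import MODELS


@MODELS.register_module()
class PiResNet_CIFAR(ResNet_CIFAR):
    """CIFAR-style ResNet with an extra '18pi' depth of [2, 2, 1, 1] blocks."""

    # Copy the stock settings and add the ProdPoly variant from Table 5.
    arch_settings = dict(ResNet_CIFAR.arch_settings)
    arch_settings['18pi'] = (BasicBlock, (2, 2, 1, 1))
```

A config that inherits from `resnet18_8xb16_cifar10` would then only need to override the backbone entry to select the new depth, along these lines:

```python
# Hypothetical config override (merged with the base resnet18_8xb16_cifar10
# config); only the backbone fields shown here change.
model = dict(
    backbone=dict(
        type='PiResNet_CIFAR',
        depth='18pi',  # selects the [2, 2, 1, 1] stage setting above
    ),
)
```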
You can also see the diff between the baseline and the PiNet model, both implemented in mmpretrain, generated with the script in the comparisons folder, and its diff.