Description
Hi Jiahui,
Thanks for the great work. I'm trying to reproduce AutoSlim for CIFAR-10 (Table 2).
Could you please provide the detailed hyperparameters you used for it?
I'm able to train the baseline MobileNetV2 1.0x to 7.9% Top-1 error using the following hyperparameters (sketched in code after the list):
- 0.1 initial learning rate
- linear learning rate decay
- 128 batch size
- 300 epochs of training
- 5e-4 weight decay
- 0.9 Nesterov momentum
- no label smoothing
- no weight decay for bias and gamma
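
In PyTorch terms, my setup looks roughly like this. This is just a sketch under my own assumptions: the parameter-group split is how I implement "no weight decay for bias and gamma" (all 1-D parameters are excluded from decay), and `model` is a placeholder:

```python
import torch

def build_optimizer(model, lr=0.1, momentum=0.9, weight_decay=5e-4):
    decay, no_decay = [], []
    for name, p in model.named_parameters():
        if not p.requires_grad:
            continue
        # 1-D parameters are BN gamma/beta and conv/linear biases:
        # these get no weight decay
        if p.ndim == 1 or name.endswith(".bias"):
            no_decay.append(p)
        else:
            decay.append(p)
    return torch.optim.SGD(
        [{"params": decay, "weight_decay": weight_decay},
         {"params": no_decay, "weight_decay": 0.0}],
        lr=lr, momentum=momentum, nesterov=True,
    )

def set_linear_lr(optimizer, base_lr, epoch, num_epochs=300):
    # linear decay: lr goes from base_lr down to 0 over num_epochs
    lr = base_lr * (1.0 - epoch / num_epochs)
    for group in optimizer.param_groups:
        group["lr"] = lr
```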
To train AutoSlim, I use MobileNetV2 1.5x with the exact same hyperparameters, but trained for only 50 epochs on a training subset (80% of the full training set). Then, during greedy slimming, I use the held-out 20% of the training set as a validation set to decide channel counts. For greedy slimming, I shrink each layer in steps of 10%, which gives the 10 groups mentioned in the paper.
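
For reference, here is roughly the greedy slimming loop I'm running. It's only a sketch: `set_channel_config`, `evaluate`, and `compute_flops` are hypothetical helpers standing in for however the slimmable model exposes per-layer width selection, validation accuracy on the 20% split, and FLOPs counting:

```python
def greedy_slim(model, val_loader, num_layers, flops_target, num_groups=10):
    # start from the full 1.5x network: every layer at 100% of its channels,
    # counted in units of 10% (num_groups = 10 as in the paper)
    config = [num_groups] * num_layers
    while compute_flops(config) > flops_target:  # e.g. the 88 MFLOPs target
        best_layer, best_acc = None, -1.0
        for layer in range(num_layers):
            if config[layer] <= 1:
                continue  # keep at least one 10% group per layer
            trial = list(config)
            trial[layer] -= 1  # tentatively shrink this layer by one group
            set_channel_config(model, trial)   # hypothetical helper
            acc = evaluate(model, val_loader)  # hypothetical helper
            if acc > best_acc:
                best_layer, best_acc = layer, acc
        if best_layer is None:
            break  # every layer is already at its minimum width
        config[best_layer] -= 1  # keep the shrink that hurt accuracy least
    return config
```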
The final architecture is trained with the same hyperparameters listed above, but I failed to reach the 6.8% Top-1 error reported in the paper; I'm getting around 7.8%.
Could you please share the final architecture for AutoSlim-MobileNetV2 CIFAR-10 with 88 MFLOPs? Also, it would be great if you could let me know the hyperparameters you used for the CIFAR experiments.
Thanks,
Rudy