
reproducing CIFAR10 results for AutoSlim #40

Open

Description

@RudyChin

Hi Jiahui,

Thanks for the great work. I'm trying to reproduce AutoSlim for CIFAR-10 (Table 2).
Could you please provide the detailed hyperparameters you used for it?

I'm able to train the baseline MobileNetV2 1.0x to 7.9% Top-1 error using the following hyperparameters (a sketch of this setup follows the list):

  • 0.1 initial learning rate
  • linear learning rate decay
  • 128 batch size
  • 300 epochs of training
  • 5e-4 weight decay
  • 0.9 Nesterov momentum
  • no label smoothing
  • no weight decay for bias and gamma
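
For concreteness, this is roughly how I set up the optimizer and learning-rate schedule. It is a sketch only: the torchvision `mobilenet_v2` here is a placeholder backbone, and the stride changes needed for 32x32 CIFAR inputs are omitted:

```python
import torch
import torchvision

# Placeholder backbone; CIFAR-10 stride adaptations omitted.
model = torchvision.models.mobilenet_v2(num_classes=10)

# Apply 5e-4 weight decay to conv/linear weights only; biases and
# BatchNorm gamma/beta (all 1-D parameters) get no weight decay.
decay, no_decay = [], []
for name, param in model.named_parameters():
    (no_decay if param.ndim == 1 else decay).append(param)

optimizer = torch.optim.SGD(
    [{"params": decay, "weight_decay": 5e-4},
     {"params": no_decay, "weight_decay": 0.0}],
    lr=0.1, momentum=0.9, nesterov=True)

# Linear learning-rate decay from 0.1 toward 0 over 300 epochs.
epochs = 300
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda epoch: 1.0 - epoch / epochs)
```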

To train AutoSlim, I use MobileNetV2 1.5x with the exact same hyperparameters, but trained for only 50 epochs on a subset of the training data (80% of the full training set). Then, during greedy slimming, I use the remaining 20% of the training set as a validation set to decide channel counts. For greedy slimming, I shrink each layer in steps of 10%, which yields the 10 groups mentioned in the paper (a rough sketch of this loop follows below).
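
For reference, this is roughly the greedy slimming loop I use. `evaluate`, `current_flops`, and `set_channel_ratios` are placeholder names for my own slimmable-network helpers, not functions from this repository:

```python
def greedy_slim(model, val_loader, num_layers, flops_target,
                evaluate, current_flops, set_channel_ratios,
                step=0.1, min_ratio=0.1):
    # One width ratio per slimmable layer, starting from the full 1.5x model.
    ratios = [1.0] * num_layers
    while current_flops(model, ratios) > flops_target:
        best_acc, best_layer = None, None
        for i in range(num_layers):
            if ratios[i] - step < min_ratio:
                continue  # keep at least one 10% channel group per layer
            trial = list(ratios)
            trial[i] -= step  # shrink layer i by one 10% step
            set_channel_ratios(model, trial)
            # Accuracy on the held-out 20% split decides which layer to shrink.
            acc = evaluate(model, val_loader)
            if best_acc is None or acc > best_acc:
                best_acc, best_layer = acc, i
        if best_layer is None:
            break  # every layer is already at its minimum width
        ratios[best_layer] -= step  # commit the least-harmful shrink
    return ratios
```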

The final architecture is trained with the same hyperparameters listed above, but I failed to obtain the 6.8% Top-1 error reported in the paper; I'm getting around 7.8%.

Could you please share the final architecture for AutoSlim-MobileNetV2 on CIFAR-10 with 88 MFLOPs? It would also be great if you could let me know the hyperparameters you used for the CIFAR experiments.

Thanks,
Rudy
