This repository was archived by the owner on Sep 18, 2024. It is now read-only.

Error when running Resnet18 with Slim Pruner #3947

Open
@crawlingcub

Describe the issue:

SlimPruner raises a RuntimeError when the configured sparsity level is very small. I am training ResNet18 on the ImageNet dataset.

Config: [{'sparsity': 3.9214609171977314e-06, 'op_types': ['BatchNorm2d']}]

I am using sparsifying_training_epochs=3

Please let me know if you need more details.
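
For context, here is a rough sketch of how few channels this sparsity corresponds to. It assumes the BatchNorm2d layout of torchvision's ResNet18 and assumes the pruner turns the global sparsity into a channel count by truncation, `int(total * sparsity)`; both are assumptions for illustration, and `channels_to_prune` is a hypothetical helper, not an NNI function.

```python
# Assumption: BatchNorm2d channel counts of torchvision's ResNet18.
BN_CHANNELS = (
    [64]           # stem bn1
    + [64] * 4     # layer1: 2 blocks x 2 BN layers
    + [128] * 5    # layer2: 4 block BN layers + 1 downsample BN
    + [256] * 5    # layer3: same pattern as layer2
    + [512] * 5    # layer4: same pattern as layer2
)


def channels_to_prune(total_channels: int, sparsity: float) -> int:
    """Hypothetical helper: channel count selected after truncating to int."""
    return int(total_channels * sparsity)


total = sum(BN_CHANNELS)
sparsity = 3.9214609171977314e-06  # value from the config above
k = channels_to_prune(total, sparsity)
print(f"total BN channels: {total}, channels to prune: {k}")
```

Under these assumptions the product of channel count and sparsity is well below 1, so the truncated count is 0 and no channel is selected for pruning.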

Error log:

Traceback (most recent call last):
....
  File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/nni/algorithms/compression/pytorch/pruning/iterative_pruner.py", line 89, in compress
    self.update_mask()
  File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/nni/algorithms/compression/pytorch/pruning/dependency_aware_pruner.py", line 78, in update_mask
    super(DependencyAwarePruner, self).update_mask()
  File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/nni/compression/pytorch/compressor.py", line 339, in update_mask
    masks = self.calc_mask(wrapper, wrapper_idx=wrapper_idx)
  File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/nni/algorithms/compression/pytorch/pruning/dependency_aware_pruner.py", line 65, in calc_mask
    sparsity=sparsity, wrapper=wrapper, wrapper_idx=wrapper_idx)
  File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/nni/algorithms/compression/pytorch/pruning/structured_pruning_masker.py", line 702, in calc_mask
    self._get_global_threshold()
  File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/nni/algorithms/compression/pytorch/pruning/structured_pruning_masker.py", line 695, in _get_global_threshold
    all_bn_weights.view(-1), k, largest=False)[0].max()
RuntimeError: operation does not have an identity.
/pytorch/aten/src/THC/THCTensorTopK.cuh:107: gatherTopK: block: [0,0,0], thread: [992,0,0] Assertion `writeIndex < outputSliceSize` failed.
/pytorch/aten/src/THC/THCTensorTopK.cuh:107: gatherTopK: block: [0,0,0], thread: [993,0,0] Assertion `writeIndex < outputSliceSize` failed.
....
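
A possible reading of the traceback: if the `k` passed to `torch.topk` is 0, the selection is empty, and reducing an empty tensor with `.max()` is what PyTorch reports as "operation does not have an identity". The pure-Python sketch below mimics that threshold step with a guard for the empty case. `global_threshold` is a hypothetical stand-in written for illustration, not NNI's code or an official fix, and sorting a list here replaces `torch.topk(..., largest=False)`.

```python
from typing import Optional, Sequence


def global_threshold(abs_weights: Sequence[float], sparsity: float) -> Optional[float]:
    """Return the largest of the k smallest magnitudes, or None when k == 0."""
    k = int(len(abs_weights) * sparsity)
    if k == 0:
        # Nothing would be pruned; skip the max() over an empty selection.
        return None
    smallest_k = sorted(abs_weights)[:k]
    return smallest_k[-1]


weights = [0.5, 0.1, 0.3, 0.2]
print(global_threshold(weights, 0.5))                     # two smallest kept, threshold is their max
print(global_threshold(weights, 3.9214609171977314e-06))  # k == 0, so no threshold
```

A caller using this sketch would have to treat `None` as "keep every channel" when building the masks.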

Environment:

  • NNI version: 2.3
  • Training service (local|remote|pai|aml|etc): local
  • Client OS: ubuntu 18.04
  • Python version: 3.7
  • PyTorch/TensorFlow version: PyTorch 1.8.1
  • Is conda/virtualenv/venv used?: conda
  • Is running in Docker?: no

How to reproduce it?:
Run SlimPruner with the above config on a ResNet18 model trained on ImageNet.
