Description
When modifying a model and training from scratch, it is often better to use group norm instead of batch norm, because detection models typically only fit small batch sizes on a single GPU.
If you simply take an existing detection model from mmdet and change "BN" to "GN", you will get errors, because the number of features in each layer must be an exact multiple of the "num_groups" attribute.
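For concreteness, this constraint can be reproduced directly with PyTorch's `torch.nn.GroupNorm` (a minimal sketch, independent of mmdet):

```python
import torch.nn as nn

# GroupNorm requires num_channels to be an exact multiple of num_groups.
nn.GroupNorm(num_groups=32, num_channels=64)  # fine: 2 channels per group

try:
    nn.GroupNorm(num_groups=32, num_channels=48)  # 48 is not divisible by 32
except ValueError as ex:
    print(ex)  # "num_channels must be divisible by num_groups"
```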
While it would be best to explicitly set num_groups for each layer, I think there is a simple heuristic that can make prototyping and swapping normalization layers easier. We could allow "num_groups" to be a special value (e.g. "auto"), in which case a reasonable number of groups is chosen automatically for the layer.
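The intended usage would then look something like the following; the `'auto'` value is the proposed (currently hypothetical) extension, and the config fragment is only illustrative:

```python
# Hypothetical config snippet: 'auto' would be resolved per layer inside
# build_norm_layer, so a single norm_cfg could be shared across components
# with different channel counts.
norm_cfg = dict(type='GN', num_groups='auto', requires_grad=True)
model = dict(
    backbone=dict(norm_cfg=norm_cfg),
    neck=dict(norm_cfg=norm_cfg),
)
```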
Consider the current build_norm_layer function:
```python
def build_norm_layer(cfg, num_features, postfix=''):
    """Build normalization layer.

    Args:
        cfg (dict): The norm layer config, which should contain:

            - type (str): Layer type.
            - layer args: Args needed to instantiate a norm layer.
            - requires_grad (bool, optional): Whether stop gradient updates.
        num_features (int): Number of input channels.
        postfix (int | str): The postfix to be appended into norm abbreviation
            to create named layer.

    Returns:
        (str, nn.Module): The first element is the layer name consisting of
            abbreviation and postfix, e.g., bn1, gn. The second element is the
            created norm layer.
    """
    if not isinstance(cfg, dict):
        raise TypeError('cfg must be a dict')
    if 'type' not in cfg:
        raise KeyError('the cfg dict must contain the key "type"')
    cfg_ = cfg.copy()

    layer_type = cfg_.pop('type')
    if layer_type not in NORM_LAYERS:
        raise KeyError(f'Unrecognized norm type {layer_type}')

    norm_layer = NORM_LAYERS.get(layer_type)
    abbr = infer_abbr(norm_layer)

    assert isinstance(postfix, (int, str))
    name = abbr + str(postfix)

    requires_grad = cfg_.pop('requires_grad', True)
    cfg_.setdefault('eps', 1e-5)
    if layer_type != 'GN':
        layer = norm_layer(num_features, **cfg_)
        if layer_type == 'SyncBN':
            layer._specify_ddp_gpu_num(1)
    else:
        assert 'num_groups' in cfg_
        layer = norm_layer(num_channels=num_features, **cfg_)

    for param in layer.parameters():
        param.requires_grad = requires_grad

    return name, layer
```

We could insert code after the `assert 'num_groups' in cfg_` line to allow the setting of "num_groups" to be "auto". Perhaps num_groups could be a dictionary that allows for more specific parameters related to which heuristic you want to choose, but in this case I just did something simple:
Enumerate all group counts that are valid for the layer (i.e. that evenly divide the number of features) and build a list of "info" dictionaries recording, for each candidate, the number of channels per group. I take the square root of the total number of features as the "ideal" value for both the number of groups and the channels per group. Each candidate is then scored by the product of the absolute differences between this ideal and its group count and between this ideal and its channels per group, and the candidate that minimizes this score is chosen (with 1 minus the number of groups as a tiebreaker, so ties go to more groups).
```python
if cfg_['num_groups'] == 'auto':
    valid_num_groups = [
        factor for factor in range(1, num_features)
        if num_features % factor == 0
    ]
    infos = [
        {'ng': ng, 'nc': num_features / ng}
        for ng in valid_num_groups
    ]
    ideal = num_features ** 0.5
    for item in infos:
        item['heuristic'] = abs(ideal - item['ng']) * abs(ideal - item['nc'])
    chosen = sorted(infos, key=lambda x: (x['heuristic'], 1 - x['ng']))[0]
    cfg_['num_groups'] = chosen['ng']
```

There are lots of ways you could automatically choose a setting for num_groups that is reasonable and feasible given num_features, but I found this method to work reasonably well.
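As a rough sanity check of what the heuristic picks, here is the same logic wrapped in a standalone helper (`auto_num_groups` is introduced purely for this illustration):

```python
def auto_num_groups(num_features):
    # Same heuristic as the snippet above, factored into a function.
    valid_num_groups = [
        factor for factor in range(1, num_features)
        if num_features % factor == 0
    ]
    ideal = num_features ** 0.5
    infos = [{'ng': ng, 'nc': num_features / ng} for ng in valid_num_groups]
    for item in infos:
        item['heuristic'] = abs(ideal - item['ng']) * abs(ideal - item['nc'])
    return sorted(infos, key=lambda x: (x['heuristic'], 1 - x['ng']))[0]['ng']


print(auto_num_groups(256))  # 16 -> 16 groups of 16 channels
print(auto_num_groups(64))   # 8  -> 8 groups of 8 channels
```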
If there is interest in a feature like this, I can make the PR.