[fix] some param will be set to optimizer twice when using MLM transformer heads #965
butterluo wants to merge 9 commits into facebookresearch:main
Conversation
When using MMFTransformer with an MLM head, a warning occurs that will become an error in the future:
UserWarning: optimizer contains a parameter group with duplicate parameters; in future, this will cause an error; see github.com/pytorch/pytorch/issues/40967 for more information
This is caused by the get_optimizer_parameters() function, which collects some parameters twice when parameters belonging to the backbone are tied to a head in the head's tie_weights() method (e.g., the MLM head).
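A minimal standalone sketch (not the MMF code itself) of how weight tying makes the same Parameter object appear in two modules, and how deduplicating by identity before building the optimizer avoids the warning; the module names here are illustrative:

```python
import torch
from torch import nn

backbone = nn.Linear(8, 8)
head = nn.Linear(8, 8)
head.weight = backbone.weight  # what a head's tie_weights() effectively does

# Naively concatenating both modules' parameters repeats the tied weight,
# which triggers the UserWarning when handed to an optimizer:
params = list(backbone.parameters()) + list(head.parameters())

# Deduplicate by object identity before building the param groups:
seen, unique_params = set(), []
for p in params:
    if id(p) not in seen:
        seen.add(id(p))
        unique_params.append(p)

optimizer = torch.optim.SGD(unique_params, lr=0.1)  # no duplicate warning
```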
apsdehal
left a comment
Thanks for this change and helping make MMF better. I have left some comments and then this PR should be ready to land.
mmf/models/transformers/base.py
Outdated
```python
def get_optimizer_parameters(self, config):
    lr = config.optimizer.params.lr
    ...
    backbone_param_set = set()
```
Let's name this trunk_params_set.
Thanks for your review. I will change my code according to your suggestion.
mmf/models/transformers/base.py
Outdated
```python
    self, config, module_name, base_lr, module, parameters, param_list, backbone_param_set=None
):
    lr_multiplier = config.get("lr_multiplier", 1.0)
    if backbone_param_set is None:
```
If this is None, make it an empty list, [].
Thanks for your review. I will change my code according to your suggestion.
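A hedged sketch of the default handling the reviewer suggests; the enclosing method's name is not visible in the diff, so `set_lr_for_module` below is a placeholder:

```python
def set_lr_for_module(  # placeholder name; the real method name is not in the diff
    self, config, module_name, base_lr, module, parameters, param_list,
    trunk_params_set=None,
):
    # Default to None and normalize inside the body rather than using a
    # mutable default argument, which Python would share across calls.
    if trunk_params_set is None:
        trunk_params_set = []  # the empty list the reviewer suggests
    lr_multiplier = config.get("lr_multiplier", 1.0)
    ...
```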
mmf/models/transformers/base.py
Outdated
```python
    lr_multiplier = config.get("lr_multiplier", 1.0)
    if backbone_param_set is None:
        module_param = list(module.named_parameters())
    else:
```
Now, you can remove this else condition.
Thanks for your review. I will change my code according to your suggestion.
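With None normalized to an empty container up front, a single code path suffices; a sketch of the shape this comment points at (whether the set holds parameter names or ids is not visible in the diff, so names are assumed here):

```python
if trunk_params_set is None:
    trunk_params_set = []
# One path for both cases: filtering against an empty container is a no-op,
# so the else branch above becomes unnecessary.
module_param = [
    (name, param)
    for name, param in module.named_parameters()
    if name not in trunk_params_set  # skip weights the trunk already owns
]
```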
@apsdehal has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Co-authored-by: Amanpreet Singh <apsdehal@gmail.com>
@butterluo has updated the pull request. You must reimport the pull request before landing.
@butterluo has updated the pull request. You must reimport the pull request before landing.
Hi @apsdehal, does this bug-fix code pass your review? Also, the CI auto-merge seems to be blocked by https://github.com/facebookresearch/mmf/issues/966. If the code has passed your review, how can I merge it into master?
apsdehal
left a comment
It looks good to me. Thanks for making the updates. Can you fix lint issues? I will then work on landing it.
@butterluo has updated the pull request. You must reimport the pull request before landing.
@apsdehal has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@butterluo has updated the pull request. You must reimport the pull request before landing.
@apsdehal I've reformatted the bug-fix code using PyCharm. Does that fix the lint issues?
@butterluo has updated the pull request. You must reimport the pull request before landing.
@butterluo has updated the pull request. You must reimport the pull request before landing.
Summary:
When using MMFTransformer with an MLM head, a warning occurs that will become an error in the future: "UserWarning: optimizer contains a parameter group with duplicate parameters; in future, this will cause an error; see github.com/pytorch/pytorch/issues/40967 for more information"
This is caused by the get_optimizer_parameters() function, which collects some parameters twice when parameters belonging to the backbone are tied to a head in the head's tie_weights() method.
Tested locally
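As one way to verify locally, a small helper (not part of the PR) that asserts no parameter is registered with the optimizer more than once:

```python
def assert_no_duplicate_params(optimizer):
    """Fail if any parameter appears more than once across param groups."""
    seen = set()
    for group in optimizer.param_groups:
        for param in group["params"]:
            assert id(param) not in seen, "parameter registered with optimizer twice"
            seen.add(id(param))
```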