Why certain parameters are in multiple places for Nemo 2.0 #13256
-
I see |
-
Hi, thanks for the question, and apologies for the confusion here. These duplicate parameters arise because NeMo handles precision exclusively through the mixed precision plugin, while PyTorch Lightning and Megatron each handle precision in slightly different ways. Any precision-related settings in NeMo 2.0 should be specified on `MegatronMixedPrecision` directly. You can find some examples of various mixed precision plugins here.
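
For illustration, here is a minimal sketch of keeping the precision settings in one place and handing them to the plugin. It assumes NeMo 2.0 exposes the plugin as `nemo.lightning.MegatronMixedPrecision` with a `precision` argument; `PRECISION_KWARGS` and `build_trainer_plugins` are hypothetical names made up for this example, and `"bf16-mixed"` is only an example value.

```python
import importlib.util

# Single source of truth for precision settings.
# The value below is an illustrative assumption, not a recommendation.
PRECISION_KWARGS = {
    "precision": "bf16-mixed",  # Lightning-style precision string
}


def build_trainer_plugins():
    """Return the plugin list for the trainer, or [] if NeMo is not installed."""
    if importlib.util.find_spec("nemo") is None:
        return []
    # Imported lazily so the sketch still loads on machines without NeMo.
    from nemo import lightning as nl

    return [nl.MegatronMixedPrecision(**PRECISION_KWARGS)]
```

The returned list would then be passed as `plugins=` when constructing the trainer, instead of repeating precision options on the trainer or the model config.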