Description
When I set moe_loss_weight: 0, training fails with the following error:
[rank7]: File "/home/syx/miniconda3/envs/lmf/lib/python3.11/site-packages/composer/trainer/trainer.py", line 2907, in <lambda>
[rank7]: **kwargs: self._train_microbatches(microbatches, loss_dict, **kwargs).item(),
[rank7]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank7]: File "/home/syx/miniconda3/envs/lmf/lib/python3.11/site-packages/composer/trainer/trainer.py", line 3075, in _train_microbatches
[rank7]: microbatch_loss_dict = self._train_microbatch(use_grad_scaling, current_batch_size, is_final_microbatch)
[rank7]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank7]: File "/home/syx/miniconda3/envs/lmf/lib/python3.11/site-packages/composer/trainer/trainer.py", line 3209, in _train_microbatch
[rank7]: microbatch_loss_dict[k] = loss.detach().clone().mean() * (microbatch_size / current_batch_size)
[rank7]: ^^^^^^^^^^^
[rank7]: AttributeError: 'float' object has no attribute 'detach'
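
The last frame calls `.detach()` on every value in the loss dict, so the error suggests that at least one value is a plain Python float rather than a tensor. My guess (not confirmed) is that this is the MoE auxiliary loss when its weight is 0. Below is a minimal sketch of the failure and of one possible guard. The helper names `as_loss_tensor` and `scaled_microbatch_loss` are hypothetical and are not part of Composer; only the expression inside `scaled_microbatch_loss` is taken from the traceback.

```python
import torch


def as_loss_tensor(loss, device=None):
    """Hypothetical helper: return `loss` as a tensor so `.detach()` is available."""
    if isinstance(loss, torch.Tensor):
        return loss
    # Plain Python numbers (e.g. a 0.0 loss term) have no `.detach()` method,
    # so wrap them in a 0-dim tensor first.
    return torch.as_tensor(float(loss), device=device)


def scaled_microbatch_loss(loss, microbatch_size, current_batch_size):
    """Hypothetical wrapper around the expression on the failing traceback line."""
    loss = as_loss_tensor(loss)
    return loss.detach().clone().mean() * (microbatch_size / current_batch_size)


# Reproduce the error message from the traceback with a bare float.
raw_error = None
try:
    (0.0).detach()
except AttributeError as exc:
    raw_error = str(exc)

print(raw_error)

# With the guard in place, a float loss no longer raises.
print(scaled_microbatch_loss(0.0, microbatch_size=2, current_batch_size=8))
```

If this diagnosis is right, the cleaner fix is probably for the model's loss function to return a tensor for every entry (even when the weight is 0) rather than patching the trainer, but I have not verified where the float is produced.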