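Hi, I have a small optimization to suggest. Is there any particular reason not to simplify line 84 as follows?

> [line 84] p_data_fp32.add_(-group['weight_decay'] * group['lr'], p_data_fp32)

into

> p_data_fp32.mul_(1 - group['weight_decay'] * group['lr'])

Note the `1 -` in the factor: the original call adds a scaled copy of the tensor to itself, so the multiplier has to be `1 - weight_decay * lr` rather than `-weight_decay * lr`. Other lines could be simplified in the same manner.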
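
To check what such a rewrite has to preserve, here is a minimal sketch in plain Python, with a list of floats standing in for the tensor (an assumption, so that it runs without PyTorch). It assumes the two-argument `add_(scalar, tensor)` form computes `p + scalar * tensor`; the helper names are hypothetical and not from the code under discussion. Under that assumption the update scales every element by `1 - weight_decay * lr`, so a single multiply needs that factor, and multiplying by `-weight_decay * lr` alone gives a different result.

```python
import math


def decay_with_add(p, weight_decay, lr):
    # Stand-in for p.add_(-weight_decay * lr, p):
    # p <- p + (-weight_decay * lr) * p
    return [x + (-weight_decay * lr) * x for x in p]


def decay_with_mul(p, weight_decay, lr):
    # Stand-in for p.mul_(1 - weight_decay * lr):
    # p <- (1 - weight_decay * lr) * p
    return [x * (1 - weight_decay * lr) for x in p]


def decay_with_bare_mul(p, weight_decay, lr):
    # Stand-in for p.mul_(-weight_decay * lr), which drops the "1 -" term.
    return [x * (-weight_decay * lr) for x in p]


# Illustrative values only; they are not taken from the optimizer.
params = [0.5, -1.25, 3.0]
wd, lr = 0.01, 0.001

added = decay_with_add(params, wd, lr)
scaled = decay_with_mul(params, wd, lr)
bare = decay_with_bare_mul(params, wd, lr)

# The add form and the (1 - wd * lr) multiply agree up to rounding.
assert all(math.isclose(a, s) for a, s in zip(added, scaled))

# The bare multiply does not match the add form.
assert not any(math.isclose(a, b) for a, b in zip(added, bare))
```

If the sketch reflects the real tensor semantics, the multiply form saves one temporary product per step, but the two forms can differ in the last bits of floating-point rounding.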