https://github.com/microsoft/DeepSpeedExamples/blob/ab4e2e54620d0e80ead128b30dd39d9d55751eab/applications/DeepSpeed-Chat/training/step2_reward_model_finetuning/main.py#L263
The score and accuracy divisions should be moved out of the for loop.
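To illustrate the suggestion, here is a minimal sketch of the pattern: accumulate the running sums inside the loop and divide once after it. It is not the code from `main.py`. The function name `evaluate_sketch` and the plain-list batch format are hypothetical, chosen so the example runs without a model or PyTorch.

```python
def evaluate_sketch(batches):
    """Hypothetical stand-in for an evaluation loop.

    `batches` is an iterable of (chosen_scores, rejected_scores) pairs,
    each a non-empty list of floats of equal length. This batch format is
    an assumption made for illustration only.
    """
    correct = 0
    total = 0
    score_sum = 0.0
    steps = 0
    for chosen, rejected in batches:
        # Only accumulate inside the loop.
        correct += sum(c > r for c, r in zip(chosen, rejected))
        total += len(chosen)
        score_sum += sum(chosen) / len(chosen)
        steps += 1
    if steps == 0:
        raise ValueError("no batches to evaluate")
    # Divide once, after the loop has finished.
    acc = correct / total
    mean_score = score_sum / steps
    return mean_score, acc
```

If the divisions sat inside the loop instead, the running score would be divided again on every step, and the final values would no longer be the mean over all evaluated batches.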