Using LLaMA in reward model training

Hi, 

I encountered the error **TypeError: LlamaModel.forward() got an unexpected keyword argument 'head_mask'** when training the LLaMA-7B model in step 2 (reward model training).

Is `head_mask` actually used anywhere when training the reward model?

Also, is there a quick fix for this error?
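
One possible workaround, assuming `head_mask` is not actually needed (which is the open question above), would be to drop keyword arguments that the model's `forward()` does not accept before calling it. The sketch below is only an illustration: `call_with_supported_kwargs` and `FakeLlamaModel` are hypothetical names, and the stand-in class merely mimics a `forward()` without a `head_mask` parameter; it is not the real `LlamaModel`.

```python
import inspect


def call_with_supported_kwargs(fn, *args, **kwargs):
    """Call fn, dropping keyword arguments its signature does not accept."""
    params = inspect.signature(fn).parameters
    # If fn itself takes **kwargs, pass everything through unchanged.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return fn(*args, **kwargs)
    supported = {k: v for k, v in kwargs.items() if k in params}
    return fn(*args, **supported)


class FakeLlamaModel:
    """Hypothetical stand-in whose forward() has no head_mask parameter."""

    def forward(self, input_ids, attention_mask=None):
        return {"input_ids": input_ids, "attention_mask": attention_mask}


model = FakeLlamaModel()

# head_mask is silently dropped because forward() does not declare it.
out = call_with_supported_kwargs(
    model.forward,
    input_ids=[1, 2, 3],
    attention_mask=[1, 1, 1],
    head_mask=None,
)
```

In the real training code, the equivalent change would presumably be to stop passing `head_mask` at the call site in the reward model wrapper, but only if it is always `None` there.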

Many thanks