I use two methods to run inference: 1) merge the two LoRA models (SFT and RLHF), then run inference with the merged model; 2) load the adapters directly and run inference with them. These two approaches give very different results.
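For reference, here is a toy sketch of why I would expect the two paths to agree. It uses plain Python lists rather than any real LoRA library, and every matrix value, name, and the row-vector convention `y = x W` are made up for illustration. It also ignores the LoRA scaling factor and any quantization. It only shows that adding both low-rank deltas into the weight is arithmetically the same as applying both adapters at run time, and that dropping the SFT delta in the adapter path (one possible mismatch, an assumption and not something I have confirmed) changes the output.

```python
# Toy illustration only: hypothetical 2x2 weights and rank-1 LoRA factors.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def add(a, b):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]

W = [[1, 2], [3, 4]]                    # base weight (made-up values)
B_sft, A_sft = [[1], [0]], [[2, 1]]     # hypothetical SFT adapter factors
B_rl, A_rl = [[0], [1]], [[1, 3]]       # hypothetical RLHF adapter factors

delta_sft = matmul(B_sft, A_sft)        # low-rank update from SFT
delta_rl = matmul(B_rl, A_rl)           # low-rank update from RLHF

x = [[1, 1]]                            # a single input row

# Path 1: fold both deltas into the weight, then run the merged weight.
W_merged = add(add(W, delta_sft), delta_rl)
y_merged = matmul(x, W_merged)

# Path 2: keep the base weight and add each adapter's contribution at run time.
y_adapters = add(add(matmul(x, W), matmul(x, delta_sft)), matmul(x, delta_rl))

# Path 2 with the SFT adapter left out (assumed failure mode, for comparison).
y_rl_only = add(matmul(x, W), matmul(x, delta_rl))

print(y_merged, y_adapters, y_rl_only)
```

In this toy case the first two outputs match and the third differs, so a large gap between the two methods suggests that the two paths are not actually applying the same set of weights.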