GRPOTrainer: liger_kernel loss implementation ignores PEFT/LoRA adapters on lm_head #4612

@junho328

Description

I noticed a compatibility issue in GRPOTrainer when using use_liger_loss=True combined with a PEFT (LoRA) model where the lm_head is targeted for training.

In the compute_liger_loss method, the code directly passes the lm_head.weight of the unwrapped model to the LigerFusedLinearGRPOLoss.

def compute_liger_loss(self, unwrapped_model, inputs):
    # ...
    last_hidden_state = self._get_last_hidden_state(...)

    # Issue: unwrapped_model.lm_head.weight is the frozen base weight when LoRA is active
    loss, metrics = self.liger_grpo_loss(
        _input=last_hidden_state,
        lin_weight=unwrapped_model.lm_head.weight, 
        selected_token_ids=completion_ids,
        # ...
    )
    # ...

If the user configures LoRA with target_modules including "lm_head", the unwrapped_model.lm_head becomes a LoraLayer.

In this case:

  1. unwrapped_model.lm_head.weight refers to the frozen base model weights.
  2. The trainable parameters are in the separate LoRA adapters (lora_A, lora_B).
  3. Since LigerFusedLinearGRPOLoss computes the operation using only the provided lin_weight, the calculation effectively ignores the LoRA adapters.

The LoRA adapters attached to the lm_head therefore do not contribute to the loss calculation and receive no gradient updates at all. The model behaves as if the lm_head were frozen, even though the user intended to train it.
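The mismatch above can be illustrated with a minimal NumPy stand-in (this is a sketch, not the real peft/Liger code): the fused loss only ever sees the base weight, while the actual forward pass of a LoRA-adapted lm_head uses the base weight plus the low-rank update.

```python
# Sketch: why passing only the base lm_head weight ignores LoRA adapters.
import numpy as np

rng = np.random.default_rng(0)
hidden, vocab, rank = 8, 16, 2

base_weight = rng.normal(size=(vocab, hidden))  # frozen lm_head.weight
lora_A = rng.normal(size=(rank, hidden))        # trainable adapter
lora_B = rng.normal(size=(vocab, rank))         # trainable adapter
x = rng.normal(size=(4, hidden))                # stand-in last_hidden_state

# What the Liger path effectively computes: logits from the base weight only.
logits_base_only = x @ base_weight.T

# What a LoRA-adapted lm_head actually computes in its forward pass.
logits_with_lora = x @ (base_weight + lora_B @ lora_A).T

# The two disagree, so no gradient signal ever reaches lora_A / lora_B.
assert not np.allclose(logits_base_only, logits_with_lora)
```

Because `lin_weight` is the only projection matrix the fused kernel touches, the `lora_B @ lora_A` term is silently dropped from both the forward and backward pass.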

If lm_head is adapted via PEFT, GRPOTrainer should either:

  • Raise a warning or error preventing the usage of use_liger_loss=True when lm_head is in target_modules.
  • Or, handle the merging of weights before passing them to the Liger kernel (though this might negate the memory benefits of using Liger).
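The first option could be as simple as a guard in the trainer's setup. A hedged sketch (the function name `validate_liger_peft_config` is illustrative, not part of trl's API):

```python
# Hypothetical guard: fail fast instead of silently training nothing.
def validate_liger_peft_config(use_liger_loss, target_modules):
    """Raise if use_liger_loss=True would silently ignore a LoRA lm_head."""
    if use_liger_loss and target_modules and "lm_head" in target_modules:
        raise ValueError(
            "use_liger_loss=True is incompatible with LoRA on lm_head: "
            "the fused kernel only sees the frozen base lm_head.weight, "
            "so the lm_head adapters would receive no gradients."
        )
```

A configuration like `target_modules=["q_proj", "v_proj"]` would pass this check, while any list containing `"lm_head"` would raise before training starts.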

Metadata

Assignees

No one assigned

    Labels

    ⚡ PEFT (Related to PEFT) · 🏋 GRPO (Related to GRPO) · 🐛 bug (Something isn't working)
