Enable "apply_lora_to_output" in models with tied embedding #1960

Open
@felipemello1

Description

Many model families with small models (<=3B parameters) have tied embeddings, meaning that the output projection uses the same weight as the input tok_embeddings. Examples are gemma, qwen, and llama 3.2.

These models currently don't support "apply_lora_to_output". This is because, in the past, we passed a lambda function as the output_proj, e.g. lambda x: x @ tok_embeddings.weight.

Recently, we changed this and started passing the TiedLinear module instead. We need TiedLinear so that it works well with FSDP and other techniques.
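
For context, a tied output projection is roughly the following: the module owns no weight of its own and reuses the embedding weight for the final projection. This is a minimal sketch, not necessarily torchtune's exact TiedLinear implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TiedLinear(nn.Module):
    # Sketch of a tied output projection: it has no weight of its own
    # and reuses the token embedding weight to produce the logits.
    def __init__(self, tied_module: nn.Embedding):
        super().__init__()
        self.tied_module = tied_module

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # [batch, seq, embed_dim] -> [batch, seq, vocab_size]
        return F.linear(x, self.tied_module.weight)
```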

This task is to enable LoRA on top of this TiedLinear, like we already do for nn.Linear in models that do not have tied embeddings, e.g. llama 3.1. A rough sketch of what this could look like is below.
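
One possible shape for this, sketched under the assumption that the adapter only needs to add a trainable low-rank update on top of the frozen tied projection (the class and argument names are illustrative, not an existing torchtune API):

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class LoRATiedLinear(nn.Module):
    # Illustrative LoRA adapter over a tied output projection: the base
    # path projects with the (frozen) embedding weight, while only the
    # low-rank lora_a/lora_b matrices are trained.
    def __init__(
        self,
        tied_module: nn.Embedding,
        rank: int,
        alpha: float,
        dropout: float = 0.0,
    ):
        super().__init__()
        self.tied_module = tied_module
        embed_dim = tied_module.embedding_dim
        vocab_size = tied_module.num_embeddings
        self.scaling = alpha / rank
        self.dropout = nn.Dropout(p=dropout)
        # Low-rank decomposition of the update to the output projection
        self.lora_a = nn.Linear(embed_dim, rank, bias=False)
        self.lora_b = nn.Linear(rank, vocab_size, bias=False)
        nn.init.kaiming_uniform_(self.lora_a.weight, a=math.sqrt(5))
        nn.init.zeros_(self.lora_b.weight)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen base path: project hidden states with the tied embedding weight
        base = F.linear(x, self.tied_module.weight)
        # Trainable low-rank update, scaled by alpha / rank
        lora_update = self.lora_b(self.lora_a(self.dropout(x))) * self.scaling
        return base + lora_update
```

With something like this in place, the model builders could swap TiedLinear for the LoRA variant when "apply_lora_to_output" is set, the same way they already swap nn.Linear for a LoRA linear in models without tied embeddings.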

After adding this feature, the configs for the llama 3.2, qwen, and gemma models have to be updated to include the flag.

Metadata


    Labels

    community help wanted: We would love the community's help completing this issue
