Do you have plans on adding megatron support for contrastive learning? #7566

@Lossfull

Describe the feature
As the title says, I was very excited to train a large foundation MoE embedding model with the ms-swift framework, since it supports both InfoNCE loss and Megatron expert parallelism. Only later did I find out that these two features are not compatible. Is there any chance you could add support for task_type = embeddings and InfoNCE loss in Megatron?
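
For context, the loss in question can be sketched as in-batch InfoNCE: each query is scored against every positive in the batch, and the other rows act as negatives. This is only a minimal pure-Python illustration of the general technique, not ms-swift's or Megatron's implementation; the function name `info_nce` and the default `temperature` are assumptions made for this example.

```python
import math


def info_nce(queries, positives, temperature=0.05):
    """In-batch InfoNCE (illustrative sketch, not ms-swift's code).

    positives[i] is the match for queries[i]; every other row in
    `positives` is treated as a negative for query i.
    """
    def normalize(v):
        # Scale to unit length so the dot product is cosine similarity.
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    q = [normalize(v) for v in queries]
    p = [normalize(v) for v in positives]

    total = 0.0
    for i, qi in enumerate(q):
        # Temperature-scaled cosine similarity of query i with each candidate.
        logits = [sum(a * b for a, b in zip(qi, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching positive as the target class.
        total += log_denom - logits[i]
    return total / len(q)


if __name__ == "__main__":
    batch = [[1.0, 0.0], [0.0, 1.0]]
    print(info_nce(batch, batch))        # aligned pairs: loss close to zero
    print(info_nce(batch, batch[::-1]))  # swapped pairs: large loss
```

Because the negatives come from the rest of the batch, the loss depends on every embedding computed in the same step. That is likely one reason combining it with a parallel training backend takes extra work, though the issue itself does not say what the actual blocker is.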

    Labels

    enhancement: New feature or request
