[MiniLLM] Why are the teacher probs not extracted offline? #328

@huge123

Thanks for sharing this great work.

But I have a question: as the title suggests, why don't we extract the teacher's output probabilities in advance to save memory and training time, given that the training prompts are loaded from a fixed dataset?
