
Implement Knowledge distillation by Functional Mapping #121

@manncodes

Description


  • The paper focuses on two important aspects of knowledge distillation: consistency and patience.
  • In function matching, the authors argue that distillation should not only match the teacher's predictions on the original target data; the support of the data distribution should also be widened. They do this with mixup augmentation: interpolating between data points (optionally combined with out-of-domain data) so the student matches the teacher's function across a broader input distribution, with both models seeing the same augmented view of each sample.
  • The other component of the training recipe is patience: knowledge distillation benefits from very long training schedules. A minimal sketch of the recipe follows this list.
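The sketch below is one possible PyTorch rendition of the recipe, assuming `teacher` and `student` are classification models that return logits and that only the distillation loss is used (no ground-truth labels). The `mixup` helper, hyperparameters, and optimizer choice are illustrative assumptions, not taken from the paper or this repository.

```python
# Sketch of function-matching distillation (hypothetical names, not this repo's API).
# Consistency: teacher and student see the exact same mixup-augmented batch.
# Patience: a long cosine schedule; only the distillation loss is optimized.
import torch
import torch.nn.functional as F


def mixup(x, alpha=1.0):
    """Mix each image with a randomly permuted partner. Labels are not needed:
    the teacher supplies the target distribution for the mixed input."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    index = torch.randperm(x.size(0), device=x.device)
    return lam * x + (1.0 - lam) * x[index]


def distill(student, teacher, loader, epochs=1000, lr=1e-3, temperature=1.0, device="cuda"):
    teacher.eval()
    student.train()
    optimizer = torch.optim.AdamW(student.parameters(), lr=lr)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
        optimizer, T_max=epochs * len(loader)
    )
    for _ in range(epochs):          # "patience": very long schedules help
        for x, _ in loader:          # ground-truth labels are ignored
            x = mixup(x.to(device))  # widen the support of the data distribution
            with torch.no_grad():
                t_logits = teacher(x)  # teacher sees the same augmented view
            s_logits = student(x)
            # KL divergence between softened teacher and student distributions
            loss = F.kl_div(
                F.log_softmax(s_logits / temperature, dim=-1),
                F.softmax(t_logits / temperature, dim=-1),
                reduction="batchmean",
            ) * temperature ** 2
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            scheduler.step()
```

Dropping the ground-truth label term and relying purely on the teacher's outputs over mixed inputs is what lets the student track the teacher as a function rather than memorize the finite training set; the long schedule is what makes this pay off.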
