- Paper: Knowledge distillation: A good teacher is patient and consistent
- Paper Link: https://arxiv.org/abs/2106.05237
Description
- The paper focuses on two important aspects of knowledge distillation: consistency and patience.
- Consistency is framed as function matching: the authors argue that knowledge distillation should not just match the teacher's predictions on the original target data, but should also increase the support of the data distribution. They use mixup augmentation, interpolating between data points (and optionally using out-of-domain images), so that the student matches the teacher's function across a much wider region of input space, with both networks seeing the exact same augmented view of each sample (see the sketch after this list).
- The other component of the training recipe is patience: knowledge distillation benefits from very long training schedules, far longer than typical supervised training.
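
Below is a minimal sketch of a "consistent" function-matching distillation step, assuming a PyTorch setup; `student`, `teacher`, `optimizer`, and the mixup helper are hypothetical stand-ins for illustration, not the authors' actual code. The key points it illustrates are that teacher and student receive the identical mixed batch, and that the loss matches the teacher's full predictive distribution; "patience" then amounts to running this step for a very long schedule.

```python
import torch
import torch.nn.functional as F

def mixup(x, alpha=1.0):
    """Interpolate each image with a randomly permuted partner (mixup)."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0), device=x.device)
    return lam * x + (1.0 - lam) * x[perm]

def distill_step(student, teacher, x, optimizer, temperature=1.0):
    """One function-matching step: teacher and student see the SAME mixed batch."""
    x_mixed = mixup(x)                      # consistency: identical augmented view
    with torch.no_grad():
        t_logits = teacher(x_mixed)         # teacher is frozen
    s_logits = student(x_mixed)
    # KL divergence between softened teacher and student distributions
    loss = F.kl_div(
        F.log_softmax(s_logits / temperature, dim=-1),
        F.softmax(t_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```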
Results:
