Advanced Machine Learning (in French: Machine Learning Avancé, MLA)
The goal of this project is to reproduce the results of a recent research paper in the field of Deep Learning (DL). Our group chose the paper "Net2Net: Accelerating Learning via Knowledge Transfer" by Chen et al. (2016), which introduces the Net2Net techniques for quickly transferring the knowledge of a previously trained network to a wider or deeper one.
To create and activate a conda virtual environment, run the following commands:

```bash
conda create -n mla python=3.10
conda activate mla
```
Then install a version of PyTorch with CUDA support (if you have an NVIDIA GPU) that is compatible with your NVIDIA driver version. Have a look at the PyTorch website to find the right command to run. For instance, if your GPU driver supports CUDA 11.8, run:

```bash
pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cu118
```
Install the other dependencies by running the following command from the root directory of the project:

```bash
pip3 install -r requirements.txt
```
To execute the Jupyter notebooks, install the `ipykernel` package in the conda environment and register a kernel for it:

```bash
pip3 install ipykernel
python -m ipykernel install --user --name=mla
```
Try running the demo notebooks `src/inceptionv2_cifar/demo_net2wider.ipynb` and `src/inceptionv2_cifar/demo_net2deeper.ipynb` to check that everything is working fine.
To exit the virtual environment, and remove it if needed, run the following commands:

```bash
conda deactivate
conda remove -n mla --all
```
All the scripts in this repository should be run from the root directory of the project (so that paths are resolved correctly). For instance, to run the script `src/inceptionv2_cifar/main.py`, use the following command:

```bash
python3 src/inceptionv2_cifar/main.py
```
When developing a neural network for a specific task, it is common to start with a relatively simple architecture and then make it more complex to achieve better performance. In such a workflow, each new architecture is generally trained from scratch and does not take advantage of what the previous architectures have learned, which is costly in both time and money. To address this problem, the authors of the paper propose a strategy for transferring the knowledge learned by a neural network to a larger one, so that the latter takes less time to train.
Beyond this application, the authors also raise the idea of using their approach to build lifelong learning systems. Indeed, it is common to want to increase the capabilities of an existing neural network by training it on a larger dataset, and in that case the model must be made more complex to capture the larger data distribution. Here again, rather than training a new, more complex architecture from scratch, it is better to take advantage of the knowledge acquired by the previous model.
The main idea of the paper is "function-preserving initialization": after training a teacher network, we build a more complex student network whose weights are initialized so that it produces exactly the same outputs as the teacher. Training the student by gradient descent then starts from the teacher's level of performance, so the student is guaranteed to end up at least as good as the teacher.
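In the paper's notation, if the teacher computes $y = f(x; \theta)$, the student's parameters $\theta'$ are chosen so that $\forall x,\; g(x; \theta') = f(x; \theta)$. Concretely, the Net2WiderNet rule (with the replication map written $\phi$ here to avoid clashing with the student $g$) widens layer $i$ from $n$ to $q > n$ units using a random mapping with $\phi(j) = j$ for $j \le n$ and $\phi(j)$ drawn uniformly from $\{1, \dots, n\}$ otherwise; the new weights are

$$U^{(i)}_{k,j} = W^{(i)}_{k,\phi(j)}, \qquad U^{(i+1)}_{j,h} = \frac{1}{|\{x : \phi(x) = \phi(j)\}|}\, W^{(i+1)}_{\phi(j),h},$$

so each replicated unit's outgoing weights are divided by its replication count, and the next layer receives the same pre-activations as before.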
The authors propose two techniques for increasing the complexity of a network. The first, Net2WiderNet, increases the number of neurons in a layer (or, equivalently for CNNs, the number of filters of a convolution) without changing the network's predictions; it can be applied to several layers of the network. The second, Net2DeeperNet, increases the number of layers of the network, again without changing its predictions. The two techniques can be combined to obtain a student network that is both wider and deeper than the teacher.
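To make the widening rule concrete, here is a minimal PyTorch sketch (ours, independent of the implementation under `src/`; all names are illustrative) that widens the hidden layer of a toy MLP and checks that the teacher's function is preserved:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy teacher: a 4 -> 3 -> 2 MLP (names are illustrative, not from the repo).
teacher = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2))

def net2wider(fc1: nn.Linear, fc2: nn.Linear, new_width: int):
    """Widen fc1's output (and fc2's input) from n to new_width units,
    preserving the function computed by fc2(relu(fc1(x)))."""
    n = fc1.out_features
    assert new_width > n
    # Random mapping: unit j of the wide layer copies teacher unit mapping[j];
    # the first n units copy themselves.
    mapping = torch.cat([torch.arange(n), torch.randint(0, n, (new_width - n,))])
    counts = torch.bincount(mapping, minlength=n).float()  # replication counts

    wide1 = nn.Linear(fc1.in_features, new_width)
    wide2 = nn.Linear(new_width, fc2.out_features)
    with torch.no_grad():
        wide1.weight.copy_(fc1.weight[mapping])  # copy incoming weights
        wide1.bias.copy_(fc1.bias[mapping])
        # Divide outgoing weights by how many times each teacher unit was
        # copied, so the next layer's pre-activations are unchanged.
        wide2.weight.copy_(fc2.weight[:, mapping] / counts[mapping])
        wide2.bias.copy_(fc2.bias)
    return wide1, wide2

wide1, wide2 = net2wider(teacher[0], teacher[2], new_width=5)
student = nn.Sequential(wide1, nn.ReLU(), wide2)

x = torch.randn(8, 4)
assert torch.allclose(teacher(x), student(x), atol=1e-6)
print("Function preserved: student output matches teacher output.")
```

Net2DeeperNet proceeds analogously by inserting a new layer initialized to the identity; a convolutional sketch is given after the task list below.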
- Update the "Quick Start" section of the README.md file to help users install PyTorch with CUDA support for a different NVIDIA GPU driver version.
- Gather the two Net2Net techniques in a single class, and create a package for it.
- Apply Net2WiderNet to Inception-V2 (the code needs to be adapted to take batch normalization and concatenation into account).
- Apply Net2DeeperNet to Inception-V2 (a minimal identity-initialization sketch is given after this list).
- Establish the pipeline to reproduce the results of the paper and set up the experiments on the university's GPU cluster.
- Update the README files in the `src` directory to describe the code.
- Download the ImageNet dataset and train Inception-V2 from scratch on it.
- Add dropout to the Inception-V2 architecture (or some random noise to the replicated weights) to help the student network learn to use its full capacity.
- Implement the "Random pad" baseline method to compare the Net2WiderNet technique against (a sketch is given after this list).
- Introduce a multiplicative factor to modulate the number of output filters of each branch of the Inception module, and check that setting this factor to $\sqrt{0.3}$ (as in the paper) reduces the number of parameters by 60%.
- Split the code of the Net2WiderNet technique into several functions to make it more readable.
- Create a package for models (LeNet, Inception-V2).
- Add a method to widen multiple layers of a network.
- Update the parameter files for Inception-V2/ImageNet and LeNet/MNIST to match that of Inception-V2/CIFAR-10 (and the training code).
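As referenced in the Net2DeeperNet item above, here is a minimal sketch (ours, not the repository's code) of the identity initialization that makes an inserted convolution function-preserving in ReLU networks, since $\mathrm{ReLU}(\mathrm{ReLU}(x)) = \mathrm{ReLU}(x)$. Handling batch normalization, as the Net2WiderNet item notes, requires extra care (the new BN layer's scale and shift must undo its normalization) and is omitted here:

```python
import torch
import torch.nn as nn

def identity_conv(channels: int, kernel_size: int = 3) -> nn.Conv2d:
    """Return a conv layer initialized to compute the identity:
    for each channel, a single 1 at the kernel center, zeros elsewhere."""
    assert kernel_size % 2 == 1, "an odd kernel is needed for a centered identity"
    conv = nn.Conv2d(channels, channels, kernel_size,
                     padding=kernel_size // 2, bias=True)
    with torch.no_grad():
        conv.weight.zero_()
        center = kernel_size // 2
        for c in range(channels):
            conv.weight[c, c, center, center] = 1.0
        conv.bias.zero_()
    return conv

# Deepen a toy conv stack: the inserted layer preserves the function
# because ReLU(ReLU(x)) == ReLU(x).
teacher = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())
student = nn.Sequential(teacher[0], nn.ReLU(), identity_conv(8), nn.ReLU())

x = torch.randn(2, 3, 16, 16)
assert torch.allclose(teacher(x), student(x))
print("Inserted layer is function-preserving.")
```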
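"Random pad", the baseline referenced above, widens a layer by padding the weight matrices with freshly initialized random values instead of replicating existing units, so unlike Net2WiderNet it is not function-preserving. A rough sketch under that reading of the baseline (names are ours):

```python
import torch
import torch.nn as nn

def random_pad(fc1: nn.Linear, fc2: nn.Linear, new_width: int):
    """Widen fc1/fc2 by padding their weight matrices with random values
    (baseline; NOT function-preserving, unlike Net2WiderNet)."""
    n = fc1.out_features
    wide1 = nn.Linear(fc1.in_features, new_width)   # freshly initialized
    wide2 = nn.Linear(new_width, fc2.out_features)
    with torch.no_grad():
        wide1.weight[:n].copy_(fc1.weight)      # keep the teacher's weights...
        wide1.bias[:n].copy_(fc1.bias)
        wide2.weight[:, :n].copy_(fc2.weight)   # ...and leave the new rows and
        wide2.bias.copy_(fc2.bias)              # columns at their random init.
    return wide1, wide2
```

Because the extra columns of `wide2.weight` stay randomly initialized, the widened network's outputs differ from the teacher's, which is exactly the behavior Net2WiderNet is compared against.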