Motivation and description
A common practice in machine learning is to take a pre-trained model and fine-tune it on a particular dataset. This typically involves freezing the weights in some layers while fitting the output layer(s) on the new data.
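With Flux's newer explicit-mode optimizers, this freezing is typically done on the optimizer state tree via `Optimisers.freeze!`; a minimal sketch (the toy model and the choice of frozen layer are illustrative only):

```julia
using Flux, Optimisers

# Toy "pre-trained" model: a feature extractor followed by a fresh head.
model = Chain(Dense(10 => 32, relu), Dense(32 => 2))
opt_state = Optimisers.setup(Optimisers.Adam(1f-3), model)

# Freeze the first layer's optimizer state so only the head gets updated.
Optimisers.freeze!(opt_state.layers[1])
```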
Unfortunately, this functionality appears to be incompatible with the current implementation of the `ToDevice` callback, as the following code shows:
```julia
function on(::EpochBegin, ::Phase, cb::ToDevice, learner)
    model!(learner, cb.movemodelfn(learner.model))
end

function model!(learner, model)
    learner.model = model
    learner.params = setupoptimstate(model, learner.optimizer)
end

setupoptimstate(model, ::Flux.Optimise.AbstractOptimiser) = Flux.params(model)
setupoptimstate(model, optim) = Optimisers.setup(optim, model)
```

This essentially means that `learner.params` is set to the parameters of the full model at the start of each epoch. Thus, even if we try to freeze the layers manually with `Flux.freeze!(learner.params.layers[1:end-1])`, this will be undone by `ToDevice`.
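To see concretely why the freeze is lost: `Optimisers.setup` always builds a fresh state tree with nothing frozen, so re-running it discards any earlier `freeze!` (sketch with a toy model):

```julia
using Flux, Optimisers

model = Chain(Dense(10 => 32, relu), Dense(32 => 2))
state = Optimisers.setup(Optimisers.Adam(1f-3), model)
Optimisers.freeze!(state.layers[1])  # the first layer is now frozen

# Equivalent to what ToDevice triggers at every EpochBegin:
state = Optimisers.setup(Optimisers.Adam(1f-3), model)
# `state` has been rebuilt from scratch, so the frozen flag is gone.
```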
Possible Implementation
One solution that works with Flux's new explicit optimizers is to create a callback that freezes layers after `ToDevice` has run. An example is given below:
```julia
using Flux, FluxTraining

# Callback that (re)applies layer freezing at the start of each epoch.
mutable struct LayerFreezing{F} <: FluxTraining.Callback
    accessor::F
end

# The callback needs write access to `learner.params`.
function FluxTraining.stateaccess(::LayerFreezing)
    return (; params = FluxTraining.Write())
end

function FluxTraining.on(
        ::FluxTraining.EpochBegin,
        ::FluxTraining.AbstractTrainingPhase,
        freezer::LayerFreezing,
        learner)
    Flux.freeze!(freezer.accessor(learner.params))
end

# Make sure this runs after ToDevice has rebuilt learner.params.
FluxTraining.runafter(::LayerFreezing) = (FluxTraining.ToDevice,)
```

However, perhaps we should consider whether it is necessary for `ToDevice` to move the model to the GPU at the start of every epoch. Maybe the `Callback` interface could be extended to allow some one-time setup code to run before the first epoch is executed?
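For completeness, attaching the workaround to a learner might look roughly like this; the `Learner` keywords and the accessor (freezing everything except the last layer) are sketched and may need adjusting to the exact FluxTraining version:

```julia
using Flux, FluxTraining, Optimisers

# `model` and `lossfn` defined as usual for a Learner.
# The accessor selects which part of learner.params to freeze.
freezer = LayerFreezing(params -> params.layers[1:end-1])

learner = Learner(model, lossfn;
                  optimizer = Optimisers.Adam(1f-3),
                  callbacks = [ToGPU(), freezer])
```

Because of the `runafter` declaration, the freeze is re-applied each epoch only after `ToDevice` has rebuilt `learner.params`.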