How to apply Transfer Learning with Flux

I have a trained neural network which is already giving good results, and I’d like to use the parameters previously obtained for that model as the starting parameters for training the ANN on a new task (basically, it’s a transfer learning problem). So far, I’m training the network with the following function:

using Flux
using Flux: params, throttle
using Flux.Optimise: Optimiser, WeightDecay, ADAGrad

function flux_training(x_train::Array{Float64,2}, y_train::Array{Float64,2}, n_epochs::Int, lambda::Real)
    model = Chain(Dense(54, 54, sigmoid), Dense(54, 54, sigmoid), Dense(54, 12, leakyrelu))
    loss(x, y) = Flux.mse(model(x), y)
    ps = params(model)
    # samples are stored row-wise, so transpose to Flux's column-wise convention
    dataset = Flux.Data.DataLoader(x_train', y_train', batchsize = 32, shuffle = true)
    opt = Optimiser(WeightDecay(lambda), ADAGrad())
    evalcb() = @show(loss(x_train', y_train'))
    for epoch in 1:n_epochs
        println("Epoch $epoch")
        t = @elapsed Flux.train!(loss, ps, dataset, opt, cb = throttle(evalcb, 3))
        println("  took $(t)s")
    end

    y_hat = model(x_train')'

    return y_hat, model
end

and I save the model created by doing:

weights = params(model)
using BSON: @save
@save "mymodel.bson" weights

How can I initialize the weights in my training function as the values that were previously saved, to train the ANN for a new task?

I’m not sure how up-to-date it is, but there is an example of transfer learning in the Flux model zoo.

That example shows a nice way to freeze part of the weights and train the rest. If you just want to reload and retrain all the weights, the documentation has an example.

The model weights are initialized to random values when the layers are constructed (usually; you can modify that too if you need to). If you then load new values or otherwise modify them, that becomes the starting point for training.
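As a small sketch of that idea (with made-up layer sizes, not the ones from the question), you can overwrite one model's randomly initialized parameter arrays with another's before training:

```julia
using Flux

# Two identically shaped models: each Dense layer draws random weights at
# construction time, so the two start out different.
m1 = Chain(Dense(4, 3, sigmoid), Dense(3, 2))
m2 = Chain(Dense(4, 3, sigmoid), Dense(3, 2))

# Copy m1's parameter arrays into m2 in place; training m2 now starts
# from m1's values instead of the random initialization.
for (src, dst) in zip(params(m1), params(m2))
    dst .= src
end
```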

Thank you!

Thank you, @contradict!! This idea of initializing the weights with a pre-defined value, instead of random values, is a nice possibility, but I’m not sure how to do that…

Flux has a function loadparams! which replaces the params of an existing model. It’s a bit clunky to use, as you need to keep the code which creates the model structure around.
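A minimal sketch of that workflow, assuming the weights were saved with @save "mymodel.bson" weights as in the question:

```julia
using Flux
using BSON: @load

# The code that builds the model must be kept around so the architecture
# matches what was saved (layer sizes copied from the question).
build_model() = Chain(Dense(54, 54, sigmoid), Dense(54, 54, sigmoid), Dense(54, 12, leakyrelu))

model = build_model()             # freshly constructed, random weights
@load "mymodel.bson" weights      # the params saved earlier
Flux.loadparams!(model, weights)  # overwrite the random initialization

# `model` now holds the pretrained values and can be trained on the new task.
```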

I think BSON can save the whole Chain so you don’t need to do this (i.e. do BSON.@save "mymodel.bson" model instead).
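Something like the following (the filename is arbitrary, and the architecture here is just the one from the question), which avoids keeping the construction code around:

```julia
using Flux
using BSON

model = Chain(Dense(54, 54, sigmoid), Dense(54, 54, sigmoid), Dense(54, 12, leakyrelu))

# Save the whole Chain, structure included...
BSON.@save "whole_model.bson" model

# ...and restore it later without re-running the model-building code.
BSON.@load "whole_model.bson" model
```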

You could also try ONNXmutable.jl for longer-term storage.

The initializers are not particularly useful for that, since you have to specify them at layer creation time. This facility is mostly useful for experimenting with new layer types, or perhaps for scaling the random initialization to fit some peculiarity of your specific problem. loadparams! or one of the other methods @DrChainsaw mentioned is probably the correct solution to your problem.
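For completeness, a sketch of what specifying an initializer at creation time looks like. The initializer function here is hypothetical, and the keyword name depends on the Flux version (init in newer releases, initW in older ones):

```julia
using Flux

# A (hypothetical) scaled-down random initializer: the custom scaling is
# baked in when the layer is constructed, which is why this mechanism is
# awkward for loading pretrained weights.
my_init(dims...) = 0.01f0 .* randn(Float32, dims...)

layer = Dense(54, 54, sigmoid; init = my_init)
```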