Suppose I have a trained neural net

nn1 = Flux.Chain(Dense(90, 60, relu), Dense(60, 30, relu), Dense(30, 1))

I am aware that I can access the weights of the first two layers via params(nn1[1]) and params(nn1[2]).
However, I then set up a second network

nn2 = Flux.Chain(Dense(90, 60, relu), Dense(60, 30, relu), Dense(30, 2))

Note that nn1 and nn2 have the same structure in their first two layers.
Can I initialize the weights of the first two layers with the following?
params(nn2[1]) = params(nn1[1])
params(nn2[2]) = params(nn1[2])
I wonder what the most idiomatic solution is, but low-level answers are fine too.
Since Dense is an ordinary (immutable) struct whose fields are plain mutable arrays, you can't reassign its fields, but you can overwrite their contents in place:
nn2[1].W .= nn1[1].W
nn2[1].b .= nn1[1].b
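A quick sanity check of that approach (a minimal sketch; it assumes a Flux version where Dense stores its parameters in the fields W and b — newer releases renamed them to weight and bias):

using Flux

nn1 = Flux.Chain(Dense(90, 60, relu), Dense(60, 30, relu), Dense(30, 1))
nn2 = Flux.Chain(Dense(90, 60, relu), Dense(60, 30, relu), Dense(30, 2))

# .= broadcasts in place, copying values without aliasing the arrays
nn2[1].W .= nn1[1].W
nn2[1].b .= nn1[1].b

nn2[1].W == nn1[1].W     # true  — same values
nn2[1].W === nn1[1].W    # false — still independent arrays, so training
                         #         nn2 later will not touch nn1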
Or you can wrap this in an (unsafe) setindex! method:
function Base.setindex!(c::Chain, d::Dense, i)
    # copy d's parameters into layer i of the chain, in place
    c.layers[i].W .= d.W
    c.layers[i].b .= d.b
end
nn2[1] = nn1[1]
nn2[2] = nn1[2]
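(The overload is "unsafe" in the sense that it commits type piracy: it adds a Base.setindex! method for types Flux owns, which can surprise unrelated code. Note also that it copies into the arrays of the existing layer rather than replacing the layer itself — Chain typically stores its layers in a tuple, so the layer slot cannot be reassigned.)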
But I think it's better to construct nn2 from nn1 directly:
nn2 = Flux.Chain(deepcopy(nn1[1:2])..., Dense(30, 2))
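The deepcopy matters here: without it, nn2 would alias nn1's layers, and training one net would silently update the other. A minimal sketch:

# Without deepcopy the first two layers are shared, not copied:
shared = Flux.Chain(nn1[1:2]..., Dense(30, 2))
shared[1].W === nn1[1].W    # true — same underlying array

# With deepcopy, fresh arrays are allocated:
nn2 = Flux.Chain(deepcopy(nn1[1:2])..., Dense(30, 2))
nn2[1].W === nn1[1].W       # false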
Also, you may use the loadparams! function:
Flux.loadparams!(nn2[1:2], params(nn1[1:2]))
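loadparams! zips the destination's params with the values you pass and copies each array in place, failing on a size mismatch. Roughly (a sketch of the idea, not the exact Flux source):

for (p, x) in zip(Flux.params(nn2[1:2]), Flux.params(nn1[1:2]))
    size(p) == size(x) || error("param size mismatch")
    copyto!(p, x)    # in-place copy keeps the destination arrays
end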
It is a more generic approach, since it doesn't rely on the internal structure of Dense. If you want, you can add some sugar:
Base.setindex!(c::Chain, x, i) = Flux.loadparams!(c[i], params(x))
nn2[1:2] = nn1[1:2]
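With that method defined, a quick check (again just a sketch) that the assignment really copied the values:

nn2[1:2] = nn1[1:2]
all(p == x for (p, x) in zip(params(nn2[1:2]), params(nn1[1:2])))    # true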