For those still interested in using BFGS (or L-BFGS) to train Flux models, I made a small utility package to facilitate this.
That’s really cool. I’ll try that out later today.
You could define
randomize!(_) = nothing
function randomize!(d::Dense)
    d.W .= randn(Float32, size(d.W))
    d.b .= randn(Float32, size(d.b))
end
function randomize!(c::Chain)
    for l in c.layers
        randomize!(l)
    end
end
and now you can easily reset every Dense layer to normally distributed random parameters, even in nested models like
Chain(x -> [x], Dense(1,2,σ),
      Chain(Dense(2,7,σ), Dense(7,4,σ)),
      Dense(4,1,σ), first)
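To see why the recursion reaches the inner layers, here is a minimal sketch that runs in plain Julia without Flux. `ToyDense` and `ToyChain` are hypothetical stand-ins that I made up for illustration; they only mimic the `W`, `b` and `layers` fields used above and are not Flux's own types. The point is the dispatch: the `ToyChain` method calls `randomize!` on each layer, so a nested chain recurses into itself, while plain functions such as `first` hit the fallback and are left alone.

```julia
# Hypothetical stand-in for a dense layer: just a weight matrix and a bias vector.
struct ToyDense
    W::Matrix{Float32}
    b::Vector{Float32}
end
# Convenience constructor (an assumption of this sketch): start from all zeros.
ToyDense(nin::Integer, nout::Integer) =
    ToyDense(zeros(Float32, nout, nin), zeros(Float32, nout))

# Hypothetical stand-in for a chain: a tuple of arbitrary layers or functions.
struct ToyChain{T<:Tuple}
    layers::T
    ToyChain(layers...) = new{typeof(layers)}(layers)
end

# Fallback: anything that is not a ToyDense or ToyChain is left untouched.
randomize!(_) = nothing

# Overwrite the parameters in place with standard normal samples.
function randomize!(d::ToyDense)
    d.W .= randn(Float32, size(d.W))
    d.b .= randn(Float32, size(d.b))
    return nothing
end

# Recurse into every layer; nested chains dispatch back to this method.
randomize!(c::ToyChain) = foreach(randomize!, c.layers)

model = ToyChain(x -> [x], ToyDense(1, 2),
                 ToyChain(ToyDense(2, 7), ToyDense(7, 4)),
                 ToyDense(4, 1), first)
randomize!(model)
```

The structs are immutable, but `.=` writes into the existing arrays, so the model keeps its identity and only the parameter values change.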