How to use Flux to write a Multi-Head Output Network

mcabbott · August 26, 2024, 3:14am

What AI doesn’t seem to know is that you need to tell Flux to look for parameters inside your struct, by marking it as a layer with the Flux.@layer macro. Otherwise Flux.setup finds no parameters and the model cannot be trained:

julia> Flux.setup(Adam(), PolicyNetwork(2, 3, 4))
┌ Warning: setup found no trainable parameters in this model
└ @ Optimisers ~/.julia/packages/Optimisers/yDIWk/src/interface.jl:32
()

julia> Flux.@layer PolicyNetwork  # Defines methods for functor, show

julia> Flux.setup(Adam(), PolicyNetwork(2, 3, 4))  # now this sees parameters
(hidden1 = (layers = ((weight = Leaf(Adam(0.001, (0.9, 0.999), 1.0e-8), (Float32[0.0 0.0; 0.0 0.0;

After this, PolicyNetwork(2, 3, 4) |> gpu will move all the parameters to the GPU, so your second definition should not be needed.

Note also that adding type parameters might be a good idea, for performance:

struct PolicyNetwork{A,B,C,D}
    hidden1::A
    hidden2::B
    mu::C
    std::D
end
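
Putting the pieces together, here is a minimal sketch of what the whole thing might look like. The original PolicyNetwork definition is not shown in this reply, so the three-argument constructor (taken here to mean input, hidden and output sizes), the layer choices, and the softplus on the std head are all assumptions for illustration. It also assumes a Flux version recent enough to provide Flux.@layer.

```julia
using Flux

# Parametric struct, so every field has a concrete type.
struct PolicyNetwork{A,B,C,D}
    hidden1::A
    hidden2::B
    mu::C
    std::D
end

# Hypothetical constructor: the meaning of the three arguments
# (input, hidden, output sizes) is an assumption, not the original code.
function PolicyNetwork(n_in::Int, n_hidden::Int, n_out::Int)
    return PolicyNetwork(
        Chain(Dense(n_in => n_hidden, relu)),
        Chain(Dense(n_hidden => n_hidden, relu)),
        Dense(n_hidden => n_out),            # head 1: mean
        Dense(n_hidden => n_out, softplus),  # head 2: std (softplus is an assumed choice)
    )
end

# Tell Flux to look inside the struct for parameters.
Flux.@layer PolicyNetwork

# Forward pass: shared trunk, then two heads returned as a tuple.
function (m::PolicyNetwork)(x)
    h = m.hidden2(m.hidden1(x))
    return m.mu(h), m.std(h)
end

model = PolicyNetwork(2, 3, 4)
opt_state = Flux.setup(Adam(), model)  # now finds the parameters

x = rand(Float32, 2, 5)                # 2 features, batch of 5
mu, sigma = model(x)
```

Because the struct is marked with Flux.@layer, the same model |> gpu call should move both heads along with the shared trunk, with no separate GPU definition.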
