Problem with model and gradient descent in Flux

bertschi · October 22, 2024, 5:47pm

I don't think that will work as expected:

While you can pass functions to Chain, they are opaque to Flux, i.e., Flux cannot see inside them to find parameters. Further, your function x -> Dense(5, 5) never calls the dense layer! It builds a fresh layer on every call and returns the layer itself instead of applying it to x.
Simply use model1 = Dense(5 => 5, relu) or Chain(Dense(5 => 5), relu) instead.
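
To make the difference concrete, here is a small sketch. The names bad, good and n_params are made up for illustration, and the input size is arbitrary:

```julia
using Flux

x = rand(Float32, 5, 3)   # 5 features, batch of 3

# Anonymous function that constructs a layer but never applies it to its input
bad = x -> Dense(5 => 5)
bad(x) isa Dense           # true: you get a layer back, not an array

# A proper layer is applied to the input and returns an array
good = Dense(5 => 5, relu)
size(good(x))              # (5, 3)

# The proper layer exposes its parameters: 5*5 weights + 5 biases
n_params = length(first(Flux.destructure(good)))
```

Since bad creates a new, randomly initialised Dense on every call, there would be nothing persistent to train even if it did apply the layer.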

You can combine all your model parts into a single model:

using Flux

model = Chain(Dense(4 => 5, relu), # your model1
              Parallel(tuple, # combine both model outputs into tuple
                       Dense(5 => 6),   # model2
                       Dense(5 => 7)))  # model3

# Use as follows. Note that I have changed the dimensions to make it easier to see where each value comes from.
batch = rand(Float32, 4, 8)  # 4 input features, batch of 8 (Float32 matches the layer parameters)
size.(model(batch))  # will be ((6, 8), (7, 8))

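# trueX and trueY are the targets for model2 and model3; random placeholders here
trueX = rand(Float32, 6, 8)
trueY = rand(Float32, 7, 8)

grads = gradient(model) do m
    m2, m3 = m(batch)
    loss1 = abs2.(m2 .- trueX)   # squared error, so residuals of opposite sign don't cancel
    loss2 = abs2.(m3 .- trueY)
    sum(loss1) + sum(loss2)      # gradient needs a scalar loss
end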
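
To actually do gradient descent with this, the gradient has to be passed to an optimiser step. What follows is only a minimal sketch, not necessarily what you need: it assumes a Flux version that provides Flux.setup and Flux.update!, plain Descent with a learning rate of 0.1, mean squared error, and random placeholder targets. The names loss, opt_state, before and after are mine.

```julia
using Flux

model = Chain(Dense(4 => 5, relu),
              Parallel(tuple, Dense(5 => 6), Dense(5 => 7)))

batch = rand(Float32, 4, 8)
trueX = rand(Float32, 6, 8)   # placeholder target for the first head
trueY = rand(Float32, 7, 8)   # placeholder target for the second head

# Scalar loss over both heads (MSE is an assumption, pick what fits your problem)
function loss(m)
    m2, m3 = m(batch)
    Flux.mse(m2, trueX) + Flux.mse(m3, trueY)
end

# Optimiser state for the whole combined model
opt_state = Flux.setup(Descent(0.1), model)

before = loss(model)
for _ in 1:100
    grads = gradient(loss, model)              # tuple with one entry per argument
    Flux.update!(opt_state, model, grads[1])   # updates all three Dense layers in place
end
after = loss(model)
```

Because everything lives in one Chain, a single update! call trains model1, model2 and model3 together.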
