Flux.Chain vs expand everything in a function

DiracFermion1 · August 25, 2021, 2:39pm

Hi, I am trying to understand the difference between using Flux.Chain and expanding all operations in a single function.

For instance, given the following object:

Chain(Dense(d_in, d_hidden, tanh), Dense(d_hidden, d_hidden, tanh), Dense(d_hidden, d_in, tanh))

if I replace it with

function f(x)
    out = Dense(d_in, d_hidden, tanh)(x)
    out = Dense(d_hidden, d_hidden, tanh)(out)
    out = Dense(d_hidden, d_in, tanh)(out)
end

Does anyone know whether these two are exactly the same in terms of training/testing behavior?

Thank you!

ToucheSir · August 25, 2021, 2:59pm

They are not exactly the same, because the second example constructs the 3 Dense layers inside f and discards them as soon as f returns. Every call therefore builds freshly initialized layers, and there are no persistent parameters to train. If you construct the layers beforehand, pass them into the function, and pass their parameters when taking a gradient, then the behaviour will be the same. You’ll also have re-invented most of the functionality of Chain!
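To make that concrete, here is a minimal sketch of the comparison. It assumes a recent Flux release (the `Dense(in => out, σ)` pair syntax and explicit-style `Flux.gradient`); the sizes and the names `layers`, `f_fresh`, `same_output`, `same_grad` and `differs` are made up for illustration and do not come from the original post.

```julia
using Flux

d_in, d_hidden = 4, 8           # illustrative sizes (assumption)
x = rand(Float32, d_in, 5)      # a batch of 5 random inputs

# Construct the layers once, so their weights persist between calls.
layers = (
    Dense(d_in => d_hidden, tanh),
    Dense(d_hidden => d_hidden, tanh),
    Dense(d_hidden => d_in, tanh),
)

# Hand-written forward pass that receives the layers as an argument.
function f(layers, x)
    l1, l2, l3 = layers
    out = l1(x)
    out = l2(out)
    out = l3(out)
    return out
end

# A Chain wrapping the very same layer objects.
model = Chain(layers...)

# Forward passes agree because both use the same weights.
same_output = model(x) ≈ f(layers, x)

# Gradients with respect to the layers agree as well.
g_f = Flux.gradient(ls -> sum(f(ls, x)), layers)[1]
g_chain = Flux.gradient(m -> sum(m(x)), model)[1]
same_grad = g_f[1].weight ≈ g_chain.layers[1].weight

# The version from the question builds new, randomly initialized
# layers on every call, so two calls on the same input disagree.
function f_fresh(x)
    out = Dense(d_in => d_hidden, tanh)(x)
    out = Dense(d_hidden => d_hidden, tanh)(out)
    out = Dense(d_hidden => d_in, tanh)(out)
    return out
end
differs = !(f_fresh(x) ≈ f_fresh(x))
```

With the layers created outside the function, `f` and `model` share the same weights, so an optimiser updating one also updates the other. At that point the hand-written function is essentially doing what Chain already does.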
