Why are flux models declared in top-level scope?

htmj · September 29, 2021, 6:04pm

From what I understand, one of the big things to avoid in Julia is putting things in the global scope (at least unless they are declared constant).

In addition, I don’t understand how I’m supposed to have several “instances” of the same model.
E.g. I want to build an encoder-decoder RNN, and I have the following code layout inspired by the Flux docs:

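using Flux
using Parameters  # provides @with_kw (default field values)

@with_kw struct EncoderDecoder
    encoder = LSTM(2,10)
    decoder = Chain(LSTM(3,10),LSTM(10,10),Dense(10,3))
end
Flux.@functor(EncoderDecoder)
# non-constant global that the functions below refer to
model = EncoderDecoder()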

function loss(enc_in,ys)
    encoder = model.encoder
    decoder = model.decoder
    [loss stuff...]
    return loss
end

function predict(enc_in)
    encoder = model.encoder
    decoder = model.decoder
    [predict stuff...]
end

I did get the model to train and predict with this code, but say I now want two models model1::EncoderDecoder and model2::EncoderDecoder so I can train them with different learning parameters and then compare their performance. This would be impossible with the code above as all the relevant functions use the model that was declared in the beginning. Am I missing something?

CarloLucibello · September 29, 2021, 6:45pm

You should take a look at some of the model zoo examples, e.g. https://github.com/FluxML/model-zoo/blob/master/vision/mlp_mnist/mlp_mnist.jl

htmj · September 29, 2021, 7:08pm

I have looked at them, and frankly they’ve just confused me even more. The example doesn’t even do what the docs do, where they avoid explicitly passing the model into the loss function.

kristoffer.carlsson · September 29, 2021, 7:28pm

For these specific examples the expensive computations are hidden behind functions so the small performance impact of globals will be negligible. But I agree that it doesn’t look too nice.
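To illustrate the point about expensive computations sitting behind functions, here is a minimal sketch in plain Julia (no Flux). The names `data`, `sum_squares`, `uses_global` and `uses_argument` are made up for this example. The idea is that a function reading a non-constant global pays for one dynamic lookup, but the heavy loop it then calls is compiled for the concrete type it receives.

```julia
# Non-constant global: the compiler cannot assume its type stays the same.
data = rand(1000)

# The hot loop lives in its own function, which is compiled for the
# concrete type of `xs` it is called with.
function sum_squares(xs)
    total = zero(eltype(xs))
    for x in xs
        total += x^2
    end
    return total
end

# Reads the global once, then hands it to the specialized kernel.
uses_global() = sum_squares(data)

# Same work with the value passed in explicitly; no global lookup at all.
uses_argument(xs) = sum_squares(xs)

# Both paths run the same kernel on the same array.
@assert uses_global() == uses_argument(data)
```

How much the global lookup costs relative to the kernel depends on the workload, so it is worth measuring (for example with BenchmarkTools) rather than assuming.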

ToucheSir · October 1, 2021, 8:48pm

I’m not sure there’s much of a reason outside of brevity and author convenience for the model zoo. This has been reported before in https://github.com/FluxML/Flux.jl/pull/1085, and the takeaway was that it doesn’t make much of a difference for what most people are doing. That said, I personally prefer avoiding closures where possible and passing the model into the loss function. Feel free to shoot some PRs off to the model zoo for examples you think should be changed and we should be able to get em merged quickly.
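A rough sketch of the "pass the model into the loss function" style, written in plain Julia so it runs without any packages. Everything here is an assumption made for illustration: `LinearModel` is a hypothetical stand-in for a Flux model, `train!` is a hand-rolled gradient descent loop, and the gradient is derived by hand where Flux would use automatic differentiation. The part that carries over is the shape of the code: because `loss` and `train!` take the model as an argument, two instances can be trained with different learning rates and compared.

```julia
# Hypothetical stand-in for a Flux model: a single linear map y ≈ W * x.
struct LinearModel
    W::Matrix{Float64}
end

# Calling the struct runs the forward pass, as Flux layers do.
(m::LinearModel)(x) = m.W * x

# The model is an explicit argument, so nothing here touches a global.
loss(m, x, y) = sum(abs2, m(x) .- y)
total_loss(m, data) = sum(loss(m, x, y) for (x, y) in data)

# Plain gradient descent with a hand-derived gradient of `loss` w.r.t. W.
# With Flux, this gradient would come from automatic differentiation.
function train!(m::LinearModel, data; lr, epochs = 100)
    for _ in 1:epochs, (x, y) in data
        grad = 2 .* (m(x) .- y) * x'   # outer product, same size as W
        m.W .-= lr .* grad             # in-place update of the weights
    end
    return m
end

function main()
    # Toy data generated from a known linear map (illustration only).
    W_true = [1.0 2.0; 3.0 4.0]
    data = [(x, W_true * x) for x in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0])]

    # Two independent instances, trained with different learning rates.
    model1 = train!(LinearModel(zeros(2, 2)), data; lr = 0.1)
    model2 = train!(LinearModel(zeros(2, 2)), data; lr = 0.001)

    # Compare them by passing each model to the same loss function.
    return total_loss(model1, data), total_loss(model2, data)
end

l1, l2 = main()
@show l1 l2
```

If an API expects a loss that does not take the model, a closure such as `(x, y) -> loss(model1, x, y)` can adapt the explicit version per model without going back to a shared global.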
