Why are Flux models declared in top-level scope?

From what I understand, one of the main things to avoid in Julia is keeping variables in global scope (at least unless they are declared const).

In addition, I don’t understand how I’m supposed to have several “instances” of the same model.
For example, I want to build an encoder-decoder RNN, and I have the following code layout, inspired by the Flux docs:

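using Flux
using Parameters: @with_kw  # @with_kw is provided by Parameters.jl

# Container holding both halves of the model, with default layers
@with_kw struct EncoderDecoder
    encoder = LSTM(2,10)
    decoder = Chain(LSTM(3,10),LSTM(10,10),Dense(10,3))
end
Flux.@functor(EncoderDecoder)  # lets Flux find the trainable parameters
model = EncoderDecoder()       # non-const global used by the functions below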

function loss(enc_in,ys)
    encoder = model.encoder
    decoder = model.decoder
    [loss stuff...]
    return loss
end

function predict(enc_in)
    encoder = model.encoder
    decoder = model.decoder
    [predict stuff...]
end

I did get the model to train and predict with this code. But say I now want two models, model1::EncoderDecoder and model2::EncoderDecoder, so that I can train them with different learning parameters and then compare their performance. That seems impossible with the code above, because all the relevant functions use the global model that was declared at the beginning. Am I missing something?

You should take a look at some of the model zoo examples, e.g. https://github.com/FluxML/model-zoo/blob/master/vision/mlp_mnist/mlp_mnist.jl

I have looked at them, and frankly they’ve just confused me even more. The example doesn’t even do what the docs do, where they avoid explicitly passing the model into the loss function.

For these specific examples, the expensive computations are hidden behind function calls, so the small performance impact of globals will be negligible. But I agree that it doesn’t look too nice.
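
To illustrate what "hidden behind functions" means here, a rough plain-Julia sketch (no Flux involved; the names `w`, `slow_sum`, `kernel`, and `barrier_sum` are made up for this example). Both functions read the same non-const global, but only the first does its work directly on the untyped global:

```julia
# A non-const global, playing the role of `model` in the question.
w = rand(Float32, 10, 10)

# The loop body reads the untyped global on every iteration,
# so the compiler cannot specialize the loop on the type of `w`.
function slow_sum()
    s = 0.0f0
    for i in eachindex(w)
        s += w[i]
    end
    return s
end

# Function barrier: `w` is looked up once, then `kernel` is compiled
# for the concrete argument type and does the real work.
kernel(a) = sum(a)
barrier_sum() = kernel(w)

println(slow_sum() ≈ barrier_sum())
```

A call like `model(x)` inside a loss function is in the second situation: the global is looked up once and the heavy lifting happens inside specialized methods, which is why the overhead tends not to matter in practice.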

I’m not sure there’s much of a reason outside of brevity and author convenience for the model zoo. This has been reported before in https://github.com/FluxML/Flux.jl/pull/1085, and the takeaway was that it doesn’t make much of a difference for what most people are doing. That said, I personally prefer avoiding closures where possible and passing the model into the loss function. Feel free to shoot some PRs off to the model zoo for examples you think should be changed and we should be able to get em merged quickly.
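
For what it's worth, here is a minimal sketch of the "pass the model into the loss function" style, which also covers the question of training two instances with different settings. It assumes a recent Flux (0.14 or later) with the explicit `Flux.setup` / `Flux.train!(loss, model, data, opt_state)` API. `build_model` and `train_model!` are hypothetical helpers, and a small `Dense` chain stands in for the `EncoderDecoder` from the question, so treat it as an outline rather than a drop-in solution:

```julia
using Flux

# Hypothetical stand-in for the EncoderDecoder in the question;
# plain Dense layers keep the sketch short.
build_model() = Chain(Dense(2 => 10, tanh), Dense(10 => 3))

# The model is an explicit argument, so nothing depends on a global.
loss(model, x, y) = Flux.mse(model(x), y)

# Hypothetical helper: trains whichever model it is given, in place.
function train_model!(model, data; lr = 0.01, epochs = 10)
    opt_state = Flux.setup(Adam(lr), model)
    for _ in 1:epochs
        # train! calls loss(model, x, y) for each (x, y) in data
        Flux.train!(loss, model, data, opt_state)
    end
    return model
end

# Random toy data, only here to make the sketch runnable.
x = rand(Float32, 2, 32)
y = rand(Float32, 3, 32)
data = [(x, y)]

# Two independent instances trained with different learning rates.
model1 = train_model!(build_model(), data; lr = 0.01)
model2 = train_model!(build_model(), data; lr = 0.001)

@show loss(model1, x, y) loss(model2, x, y)
```

Because every function takes the model as an argument, `model1` and `model2` never interfere with each other, and the same `loss` can be reused to compare them.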
