Sequence-to-one modelling and Flux.reset!

JLDC · November 2, 2021, 3:57pm

I’m trying to build a sequence-to-one model using Flux and I’m running into an error that I have not been able to solve.

Following the thread "Building simple sequence-to-one RNN with Flux" (New to Julia, JuliaLang), it seems this wasn't an issue at that time, but I'm not sure since I didn't try it out back then.

The problem is that when I try to use Flux.reset! I get an error message.

Here’s a working example to replicate the error:

using Flux
x = [rand(Float32, 2, 32) for _ ∈ 1:10]
y = rand(Float32, 1, 32)

mutable struct MyModel
    rnn
    fc
end
Flux.@functor MyModel

function (m::MyModel)(x)
    Flux.reset!(m.rnn) # THIS IS THE PROBLEMATIC LINE, IF REMOVED THE CODE RUNS FINE
    [m.rnn(x[i]) for i ∈ 1:length(x)-1]
    m.fc(m.rnn(x[end]))
end
m = MyModel(RNN(2, 5), Dense(5, 1))

loss(x, y) = Flux.mse(m(x), y)
opt = Descent(1e-2)
ps = Flux.params(m)
# This works
loss(x, y)
# This doesn't work
Flux.train!(loss, ps, [(x, y)], opt)

Strangely enough, loss(x, y) works just fine but when used with Flux.train! it throws an error.

ERROR: LoadError: DimensionMismatch("new dimensions (5, 1) must be consistent with array size 160")

Am I doing something wrong in the way I’m building my model or is this a bug?

Thanks in advance for your help!

ToucheSir · November 2, 2021, 5:37pm

Without diagnosing the specific issue, I would strongly recommend calling reset! outside of your loss function. That unfortunately means ditching train!, but IME train! doesn’t really work for RNNs in the first place because of the assumptions it makes around input batching.
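
A minimal sketch of what that could look like, reusing the names from the example above. It assumes a Flux release from around the time of this thread, with the implicit `Flux.params` API and stateful `RNN` layers; newer releases changed both, so treat it as an illustration rather than a drop-in fix. The `data` variable is only a placeholder for the single `(x, y)` pair that was passed to `train!`.

```julia
using Flux

x = [rand(Float32, 2, 32) for _ ∈ 1:10]
y = rand(Float32, 1, 32)

struct MyModel
    rnn
    fc
end
Flux.@functor MyModel

# The forward pass no longer calls reset!; the caller is responsible for it.
function (m::MyModel)(x)
    [m.rnn(x[i]) for i ∈ 1:length(x)-1]  # warm up the hidden state
    m.fc(m.rnn(x[end]))                  # predict from the last step only
end

m = MyModel(RNN(2, 5), Dense(5, 1))
loss(x, y) = Flux.mse(m(x), y)
opt = Descent(1e-2)
ps = Flux.params(m)

data = [(x, y)]  # placeholder dataset: one batch of sequences
for (xb, yb) in data
    Flux.reset!(m.rnn)                     # reset state outside the gradient call
    gs = gradient(() -> loss(xb, yb), ps)  # implicit-parameter gradient
    Flux.Optimise.update!(opt, ps, gs)
end
```

The only structural change from the original is where `Flux.reset!` is called: once per batch, before `gradient`, instead of inside the model's forward pass.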

JLDC · November 2, 2021, 10:42pm

Thank you so much, that does indeed solve the issue!
