I’m trying to build a sequence-to-one model using Flux and I’m running into an error that I have not been able to solve.
I’m following the thread “Building simple sequence-to-one RNN with Flux - New to Julia - JuliaLang”. It seems this wasn’t an issue at that time, but I can’t be sure since I didn’t try it out back then.
The problem is that when I try to use Flux.reset! I get an error message.
Here’s a minimal working example that reproduces the error:
using Flux
x = [rand(Float32, 2, 32) for _ ∈ 1:10]
y = rand(Float32, 1, 32)
mutable struct MyModel
    rnn
    fc
end
Flux.@functor MyModel
function (m::MyModel)(x)
    Flux.reset!(m.rnn) # THIS IS THE PROBLEMATIC LINE, IF REMOVED THE CODE RUNS FINE
    [m.rnn(x[i]) for i ∈ 1:length(x)-1]
    m.fc(m.rnn(x[end]))
end
m = MyModel(RNN(2, 5), Dense(5, 1))
loss(x, y) = Flux.mse(m(x), y)
opt = Descent(1e-2)
ps = Flux.params(m)
# This works
loss(x, y)
# This doesn't work
Flux.train!(loss, ps, [(x, y)], opt)
Strangely enough, loss(x, y) works just fine, but when used with Flux.train! it throws an error:
ERROR: LoadError: DimensionMismatch("new dimensions (5, 1) must be consistent with array size 160")
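For what it’s worth, the numbers in the message line up with the shapes in my example: 5 is the hidden size of the RNN and 160 = 5 × 32, where 32 is the batch size. The sketch below is plain Julia with no Flux involved. It rests on an assumption I have not verified, namely that the recurrent state starts as a 5×1 array and that something in the gradient pass tries to force the batched 5×32 state back into that shape. The names `hidden`, `batch`, `state0` and `state` are made up for illustration and are not Flux internals.

```julia
# Plain Julia, no Flux: the sizes are taken from the example above.
hidden, batch = 5, 32

# Assumption: the recurrent state starts out as a single column (hidden × 1) ...
state0 = zeros(Float32, hidden, 1)

# ... and grows to hidden × batch once it is broadcast against a batched input.
state = state0 .+ rand(Float32, hidden, batch)
@show size(state) length(state)

# Forcing the batched array back into the initial shape cannot work,
# because 160 elements do not fit into a 5 × 1 array.
err = try
    reshape(state, size(state0))
catch e
    e
end
@show typeof(err)
```

If that guess is right, the mismatch would come from the initial state and the batched state having different shapes, rather than from the model definition itself, but I can’t confirm that this is what Flux.reset! actually triggers under Flux.train!.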
Am I doing something wrong in the way I’m building my model or is this a bug?
Thanks in advance for your help!