I have built a time series forecasting model in Flux.jl that includes an LSTM layer (the problem is the same when I use a GRU layer instead). I can train the model without errors, and model(train_samples) works just fine. However, when I run model(val_samples) or model(test_samples), the model does not return a vector of forecasted targets; it throws a DimensionMismatch error instead.
My model is the following:

model = Chain(
    Flux.flatten,
    LSTM(3, 32),
    Dense(32, 32, relu),
    Dense(32, 1)
)
The data I have is:
julia> size(train_samples)
(3, 1, 7642)
julia> size(val_samples)
(3, 1, 955)
julia> size(test_samples)
(3, 1, 955)
And the labels are:
julia> size(train_targets)
(1, 7642)
....
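In case it is relevant, my understanding is that Flux.flatten reshapes an array to (everything else, last dimension), so the first layer should hand the LSTM a 3-row matrix in every case:

julia> size(Flux.flatten(train_samples))
(3, 7642)

julia> size(Flux.flatten(val_samples))
(3, 955)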
I train the model with:

ps = Flux.params(model)
opt = Flux.RMSProp()
loss(x, y) = Flux.Losses.mae(model(x), y)
epochs = 300
loss_history = []
for epoch in 1:epochs
    # the whole training set is passed as a single batch
    Flux.train!(loss, ps, [(train_samples, train_targets)], opt)
    train_loss = loss(train_samples, train_targets)
    push!(loss_history, train_loss)
    println("Epoch = $epoch Training Loss = $train_loss")
end
I correctly get results for the training data:
julia> model(train_samples)
1×7642 Matrix{Float32}:
0.0961234 0.0967543 0.0972749 0.0951103 0.0937003 0.091914 0.0913435 … 38.6692 43.5427 43.0876 43.2159 43.4824 43.612 43.5726 43.5831
After training completes successfully, this is the error I get when calling the model on the validation samples:
julia> model(val_samples)
ERROR: DimensionMismatch: array could not be broadcast to match destination
Stacktrace:
[1] check_broadcast_shape
@ .\broadcast.jl:553 [inlined]
[2] check_broadcast_shape
@ .\broadcast.jl:554 [inlined]
[3] check_broadcast_axes
@ .\broadcast.jl:556 [inlined]
[4] instantiate
@ .\broadcast.jl:297 [inlined]
[5] materialize!
@ .\broadcast.jl:884 [inlined]
[6] materialize!
@ .\broadcast.jl:881 [inlined]
[7] muladd(A::Matrix{Float32}, B::Matrix{Float32}, z::Matrix{Float32})
@ LinearAlgebra C:\Users\User\AppData\Local\Programs\Julia-1.9.3\share\julia\stdlib\v1.9\LinearAlgebra\src\matmul.jl:249
[8] (::Flux.LSTMCell{Matrix{Float32}, Matrix{Float32}, Vector{Float32}, Tuple{Matrix{Float32}, Matrix{Float32}}})(::Tuple{Matrix{Float32}, Matrix{Float32}}, x::Matrix{Float64})
@ Flux C:\Users\User\.julia\packages\Flux\ljuc2\src\layers\recurrent.jl:314
[9] Recur
@ C:\Users\User\.julia\packages\Flux\ljuc2\src\layers\recurrent.jl:134 [inlined]
[10] macro expansion
@ C:\Users\User\.julia\packages\Flux\ljuc2\src\layers\basic.jl:53 [inlined]
[11] _applychain(layers::Tuple{typeof(Flux.flatten), Flux.Recur{Flux.LSTMCell{Matrix{Float32}, Matrix{Float32}, Vector{Float32}, Tuple{Matrix{Float32}, Matrix{Float32}}}, Tuple{Matrix{Float32}, Matrix{Float32}}}, Dense{typeof(relu), Matrix{Float32}, Vector{Float32}}, Dense{typeof(identity), Matrix{Float32}, Vector{Float32}}}, x::Array{Float64, 3})
@ Flux C:\Users\User\.julia\packages\Flux\ljuc2\src\layers\basic.jl:53
[12] (::Chain{Tuple{typeof(Flux.flatten), Flux.Recur{Flux.LSTMCell{Matrix{Float32}, Matrix{Float32}, Vector{Float32}, Tuple{Matrix{Float32}, Matrix{Float32}}}, Tuple{Matrix{Float32}, Matrix{Float32}}}, Dense{typeof(relu), Matrix{Float32}, Vector{Float32}}, Dense{typeof(identity), Matrix{Float32}, Vector{Float32}}}})(x::Array{Float64, 3})
@ Flux C:\Users\User\.julia\packages\Flux\ljuc2\src\layers\basic.jl:51
[13] top-level scope
@ REPL[134]:1
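Inspecting the model after training seems to show that the LSTM's hidden state has kept the batch size of the training data (assuming the Recur wrapper stores its state in the state field, which is how I read the Flux source):

julia> size.(model[2].state)
((32, 7642), (32, 7642))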
Applying the identical data structures to a model with only Dense layers and no RNN cell does not lead to this problem: I can get forecasts for the training, validation, and test samples without issues (see the quick check after the model below).
model = Chain(
    Flux.flatten,
    Dense(3, 32, relu),
    Dense(32, 32, relu),
    Dense(32, 1)
)
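A quick check with random inputs of the same shapes (a sketch, not my actual data) confirms that the Dense-only chain accepts any batch size:

m2 = Chain(Flux.flatten, Dense(3, 32, relu), Dense(32, 32, relu), Dense(32, 1))
size(m2(rand(Float32, 3, 1, 7642)))   # (1, 7642)
size(m2(rand(Float32, 3, 1, 955)))    # (1, 955) -- no error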
I don't understand where the dimension mismatch originates. My suspicion is that I have wrongly defined the RNN chain, so that the batch dimension of train_samples somehow persists into later calls, but I don't understand how this is possible. A minimal reproduction of what I suspect is below.
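This sketch (with made-up shapes, assuming the LSTM's recurrent state keeps the batch size of the last call) reproduces the same error:

m = Chain(LSTM(3, 32), Dense(32, 1))
m(rand(Float32, 3, 10))   # first call: internal state now has batch size 10
m(rand(Float32, 3, 5))    # ERROR: DimensionMismatch, as above
Flux.reset!(m)            # resetting the recurrent state...
m(rand(Float32, 3, 5))    # ...and the same call now works

If my suspicion is right, calling Flux.reset!(model) before forecasting on val_samples should avoid the error, but I am not sure whether that is the intended usage.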
Thank you in advance for any help with this.