I am currently trying to implement an LSTM model for a regression task using the Flux.jl library. Building the model itself is fairly straightforward, but I am having trouble understanding the right shape for the input arrays used to train it: an error related to the dimensions of the arrays is thrown, so I am not sure whether I am feeding the model the correct input shape. Does somebody have any clue why this is happening?

Here's the code and error to reproduce the issue with some random data (20 samples, 6 input variables, 1 target variable, sequence length 100):

```julia
using Flux

# Create training and validation sets
x_train, y_train = [rand(6, 100) for i in 1:20], [rand(1, 100) for i in 1:20]
x_valid, y_valid = [rand(6, 100) for i in 1:20], [rand(1, 100) for i in 1:20]

# Define loss function
function mseLoss(x, y)
    loss = Flux.mse(model(x), y)
    Flux.reset!(model)
    return loss
end

# Create initial model
model = Chain(
    LSTM(6, 20),
    LSTM(20, 20),
    LSTM(20, 20),
    Dense(20, 1))

# Train model
evalcb = () -> @show mseLoss(x_valid, y_valid)
Flux.train!(mseLoss, params(model), zip(x_train, y_train), Flux.ADAM(0.01), cb = Flux.throttle(evalcb, 30))
```

```
ERROR: DimensionMismatch("matrix A has dimensions (80,6), vector B has length 20")
```

However, if I remove the callback function from the training routine, no error is thrown:

```julia
Flux.train!(mseLoss, params(model), zip(x_train, y_train), Flux.ADAM(0.01))
```
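If I understand correctly, `Flux.train!` iterates the zipped data one `(x, y)` pair at a time, so during training the loss only ever sees a single 6×100 matrix. A minimal sketch of that iteration (plain Julia, no Flux needed) shows what shapes the loss receives per step:

```julia
# Sketch of how the zipped training data is iterated: each gradient step
# receives one (x, y) pair, i.e. a single 6×100 input matrix and a 1×100
# target matrix, not the full 20-element vectors.
x_train = [rand(6, 100) for i in 1:20]
y_train = [rand(1, 100) for i in 1:20]

for (x, y) in zip(x_train, y_train)
    @assert size(x) == (6, 100)   # one sample per step
    @assert size(y) == (1, 100)
end
```

The callback, by contrast, calls `mseLoss(x_valid, y_valid)` on the whole validation set, so the model receives a `Vector` of 20 matrices rather than a single matrix, which may be where the mismatch comes from.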