RNN (LSTM) Training issues with loss function

jparcinski · January 17, 2021, 12:11pm

Hi, I'm having trouble understanding one issue; maybe someone can help me.

I’m trying to build an LSTM or a simple RNN for predicting price values. To understand how to do this in Julia, I started from the example here: A Basic RNN. I copied the code and got the same results as in the example.

Later I switched the example data to my own dataset of price values (using the same array format, etc.!). Unfortunately, whatever hyperparameters I set, the same problem kept happening. The network seemed to work fine and could successfully predict values (in some cases, mostly with the RNN rather than the LSTM); however, during training the loss was reported multiple times per epoch (anywhere from 2 to 83 values) instead of once, as in the example.

┌ Info: Epoch 1
└ @ Main C:\Users\japa.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121

sum(loss.(hist_test, target_test)) = 5195.030318554835
sum(loss.(hist_test, target_test)) = 14109.728906001292

┌ Info: Epoch 2
└ @ Main C:\Users\japa.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121

sum(loss.(hist_test, target_test)) = 4988.243513415623
sum(loss.(hist_test, target_test)) = 9636.505577573585

Each training epoch reported multiple loss values. Depending on the model and other parameters (I think), it was either 2 (as here) or sometimes as many as 80. Can someone please explain why this happens? When I just evaluate the loss function before training, I always get a single result:

println("Training loss before = ", sum(loss.(hist_train, target_train)))
Training loss before = 293785.2022681254

My code:
using Flux
using Flux: @epochs
#FUNCTIONS

#Model 1
simple_rnn = Flux.RNN(1, 1, (x -> x))

#Model 2 - same results. Here I executed code with Model 1.
simple_rnn = Chain(LSTM(memory,8), Dense(8,1))

function eval_model(x)
    out = simple_rnn.(x)[end]   # run the whole sequence, keep the last output
    Flux.reset!(simple_rnn)     # clear the hidden state before the next sequence
    out
end

#PARAMETERS
loss(x, y) = abs(sum((eval_model(x) .- y)))
ps = Flux.params(simple_rnn)
opt = Flux.ADAM()

#TRAINING
evalcb() = @show(sum(loss.(hist_test, target_test)))
@epochs num_epochs Flux.train!(loss, ps, zip(hist_train, target_train), opt, cb = Flux.throttle(evalcb, 1))
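
A possible reading of the output above, offered as an assumption rather than a confirmed diagnosis: Flux.train! calls cb after every training sample, and Flux.throttle(evalcb, 1) lets the call through at most once per second, so the number of printed lines would depend on how long an epoch takes rather than on the loss itself. The sketch below imitates that behaviour in plain Julia with a simulated clock; make_throttled and count_callbacks are hypothetical names used only for illustration and are not Flux's implementation.

```julia
# Hypothetical helper (not Flux's code): call `f` at most once per `timeout`
# seconds. The time source `clock` is injectable so the demo is deterministic.
function make_throttled(f, timeout; clock = time)
    last_call = -Inf
    return function ()
        now = clock()
        if now - last_call >= timeout
            last_call = now
            f()
        end
        return nothing
    end
end

# Simulate one epoch of `n_steps` updates that each take `step_seconds`,
# invoking the throttled callback after every step, and count how often it fires.
function count_callbacks(n_steps, step_seconds; timeout = 1.0)
    fake_time = Ref(0.0)
    calls = Ref(0)
    cb = make_throttled(() -> (calls[] += 1), timeout; clock = () -> fake_time[])
    for _ in 1:n_steps
        fake_time[] += step_seconds   # pretend one training step has elapsed
        cb()
    end
    return calls[]
end

fast_epoch = count_callbacks(100, 0.001)    # whole epoch lasts about 0.1 s
slow_epoch = count_callbacks(1000, 0.002)   # whole epoch lasts about 2 s
println("fast epoch: ", fast_epoch, " callback call(s)")
println("slow epoch: ", slow_epoch, " callback call(s)")
```

If this is what is happening here, a short epoch would print one value and a longer epoch would print several, which would fit a larger dataset or a slower model producing more lines per epoch.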

Input data format:

hist_train:
1000-element Array{Array{Float64,1},1}:
[228.925, 229.072, 229.367, 229.203, 229.28, 229.399, 229.438]
[229.072, 229.367, 229.203, 229.28, 229.399, 229.438, 229.732]
[229.367, 229.203, 229.28, 229.399, 229.438, 229.732, 230.634]
[229.203, 229.28, 229.399, 229.438, 229.732, 230.634, 231.369]
[229.28, 229.399, 229.438, 229.732, 230.634, 231.369, 231.761]
[229.399, 229.438, 229.732, 230.634, 231.369, 231.761, 232.06]
[229.438, 229.732, 230.634, 231.369, 231.761, 232.06, 231.605]
[229.732, 230.634, 231.369, 231.761, 232.06, 231.605, 231.998]

target_train:
1000-element Array{Float64,1}:
227.233
226.61
226.373
226.538
226.688
226.37
226.276
226.405
226.42
226.891
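
For reference, arrays in this shape can be built from a flat price series roughly as follows. This is only a sketch: make_windows is a hypothetical helper, and using the value immediately after each window as its target is an assumption, since the post does not say how target_train was derived.

```julia
# Hypothetical helper: split a price series into windows of `memory` values and,
# as an assumption, take the value right after each window as its target.
function make_windows(prices::Vector{Float64}, memory::Int)
    n = length(prices) - memory
    hist = [prices[i:i+memory-1] for i in 1:n]
    target = [prices[i+memory] for i in 1:n]
    return hist, target
end

# The first ten prices that appear in the hist_train listing above.
prices = [228.925, 229.072, 229.367, 229.203, 229.28,
          229.399, 229.438, 229.732, 230.634, 231.369]

hist, target = make_windows(prices, 7)
println(length(hist), " windows of length ", length(hist[1]))
println(length(target), " targets")
```

With memory = 7 the windows overlap by six values, matching the pattern in the listing, and each element of hist is a plain vector of Float64.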

So to sum up: I have the same model and the exact same code, and I’m using the same input data format. I even tried setting exactly 7 memory values (as in the example). Yet when I run my code, it behaves differently. Please help, because I don’t know what is happening; maybe I missed something. Thanks!
