I am trying to create an RNN to predict the Google stock price based on the opening price for each day.
I am following a tutorial where they do it in TensorFlow, and I am trying to do the same in Flux.
My training data is a sequence of 1258 daily opening prices. I want to use the trailing 60 days of data to predict the price for the next day, so I have arranged the training data as a Vector of 60 matrices, each 1 × 1198. Is that right? Sequence × features × batch size?
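For reference, here is a sketch of how the windowing could be done, assuming `prices` is the Vector of 1258 opening prices (the `rand` call below is just a placeholder for the real data):

```julia
# Build sliding 60-day windows from a price series.
# prices is assumed to be a Vector{Float32} of length 1258.
prices = rand(Float32, 1258)   # placeholder for the real opening prices

lookback = 60
nsamples = length(prices) - lookback           # 1198 windows

# Flux's recurrent layers take a sequence as a Vector of timesteps,
# each a features × batch matrix — here 1 × 1198.
X = [reshape(prices[t:t+nsamples-1], 1, nsamples) for t in 1:lookback]
y = reshape(prices[lookback+1:end], 1, nsamples)   # next-day targets, 1 × 1198

@assert length(X) == 60 && size(X[1]) == (1, 1198)
```

So sample `i` sees `prices[i]` through `prices[i+59]` and is trained to predict `prices[i+60]`.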
Here is how they build the RNN in Keras:
# Initialising the RNN
regressor = Sequential()
# Adding the first LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = 50, return_sequences = True, input_shape = (X_train.shape[1], 1)))
regressor.add(Dropout(0.2))
# Adding a second LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))
# Adding a third LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))
# Adding a fourth LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = 50))
regressor.add(Dropout(0.2))
# Adding the output layer
regressor.add(Dense(units = 1))
# Compiling the RNN
regressor.compile(optimizer = 'adam', loss = 'mean_squared_error')
And here is what I came up with in Flux:
lstm_layer_1 = Flux.LSTM(1 => 50)
dropout_layer_1 = Flux.Dropout(0.2)
lstm_layer_2 = Flux.LSTM(50 => 50)
dropout_layer_2 = Flux.Dropout(0.2)
lstm_layer_3 = Flux.LSTM(50 => 50)
dropout_layer_3 = Flux.Dropout(0.2)
lstm_layer_4 = Flux.LSTM(50 => 50)
dropout_layer_4 = Flux.Dropout(0.2)
output_layer = Flux.Dense(50 => 1)
model = Chain(lstm_layer_1,
              dropout_layer_1,
              lstm_layer_2,
              dropout_layer_2,
              lstm_layer_3,
              dropout_layer_3,
              lstm_layer_4,
              dropout_layer_4,
              output_layer)
Reading the Keras documentation for LSTM, return_sequences determines whether the layer returns the full sequence of outputs or only the output for the last timestep. The fourth LSTM layer leaves return_sequences at false (the Keras default), so only its final output is passed to the Dense layer.
Would adding something like x -> x[end] after lstm_layer_4 give us the last value and replicate that behaviour? Or is something else needed?
Thanks for your help!