Batch training for LSTMs in Flux or Knet

danielw2904 · October 5, 2020, 6:06am

I’m new to Flux, so apologies if this is a dumb question or has already been answered (I could not figure it out from a search here or the model zoo).

I am trying to train an LSTM on a sequence of choices where more than one thing can be chosen per time period. Let’s say we have 3 periods and 5 possible choices. The “multi-hot” representation would look like this:

X₁ = [[0;1;0;0;1], [1;0;0;0;0], [0;0;1;1;0]]

where the choices at time t are

X₁[t]
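A minimal sketch of how such multi-hot vectors could be built from lists of chosen indices. The helper name `multihot` and the `choices` variable are hypothetical and only for illustration; they are not part of Flux.

```julia
# Hypothetical helper: build a multi-hot Float32 vector of length n
# with ones at the chosen positions and zeros elsewhere.
function multihot(chosen::AbstractVector{<:Integer}, n::Integer)
    v = zeros(Float32, n)
    v[chosen] .= 1f0
    return v
end

# Choices per period, given as indices into the 5 possible options.
choices = [[2, 5], [1], [3, 4]]

# One multi-hot vector per period, matching the layout of X₁ above.
X = [multihot(c, 5) for c in choices]
```

Using `Float32` here is a design choice for this sketch, since Flux layers use `Float32` parameters by default.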

I then create my labels by using the choices of the next period:

y₁ = X₁[2:end]
X₁ = X₁[1:end-1]

And run the following model:

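using Flux

# Iterate over (input, label) pairs, one time step per batch (default batchsize = 1).
data = Flux.Data.DataLoader((X₁, y₁))
m = Chain(LSTM(5,5), softmax)
# Sum the per-step losses over all time steps passed in.
loss(x, y) = sum(logitcrossentropy.(m.(x), y))
loss(X₁, y₁)
opt = ADAM()
ps = Flux.params(m)
for _ in 1:10000
    Flux.train!(loss, ps, data, opt)
end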

Which seems to work:

julia> Flux.reset!(m)
julia> m(X₁[2])
5-element Array{Float32,1}:
 0.07553877
 0.19324568
 0.31979644
 0.3028726
 0.10854646

My questions are:
a) Does this seem correct?
b) When does the hidden state have to be reset?
c) How can I extend this example to batch training?

If this is more easily done in Knet I will try that.

Thanks! Any help is appreciated!

dhairyagandhi96 · October 5, 2020, 5:54pm

You typically want to reset after calculating the loss since you don’t want to accumulate the gradients beyond one optimisation step.
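A rough sketch of what resetting once per sequence could look like. It assumes the stateful recurrent API that Flux had around the time of this thread (roughly versions 0.11 to 0.13, where `Flux.reset!`, `Flux.params` and `ADAM` are available); later releases changed the recurrent layers. The toy data and the name `seqloss` are made up for illustration. The `softmax` layer is left out here because `logitcrossentropy` already applies a softmax internally and expects raw logits.

```julia
using Flux

# Assumed toy data in the thread's format: one sequence of multi-hot vectors.
X = [Float32[0, 1, 0, 0, 1], Float32[1, 0, 0, 0, 0], Float32[0, 0, 1, 1, 0]]
xs, ys = X[1:end-1], X[2:end]   # labels are the next period's choices

m = LSTM(5, 5)   # outputs are treated as logits, so no softmax layer

# Hypothetical per-sequence loss: reset the hidden state, then run the
# model over the time steps in order and sum the per-step losses.
function seqloss(xs, ys)
    Flux.reset!(m)
    sum(Flux.logitcrossentropy(m(x), y) for (x, y) in zip(xs, ys))
end

ps = Flux.params(m)
opt = ADAM()
for _ in 1:100
    # One optimisation step per pass over the sequence.
    gs = gradient(() -> seqloss(xs, ys), ps)
    Flux.Optimise.update!(opt, ps, gs)
end
```

Because the reset happens inside the loss, every optimisation step starts from the initial hidden state, so no state (and no gradient history) carries over from the previous step.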

I would be confused about how the labels are encoded, but I guess it’s fine to do it this way.

I guess the question is what you’re looking to do with it?

Also, here is a simpler LSTM example

danielw2904 · October 5, 2020, 6:07pm

Thank you, that helps a lot! So basically the loss in the example runs the model on one batch of sequences and then resets the hidden state. Am I getting this right?
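For reference, with the stateful recurrent layers of Flux from that period, a batch of sequences is usually laid out as one matrix per time step, of size features × batch. The sketch below shows that layout in plain Julia. The helper `batch_sequences` and the second sequence are hypothetical, and the sketch assumes all sequences have the same length (no padding or masking).

```julia
# Hypothetical helper: turn equal-length sequences (each a vector of
# feature vectors) into one matrix per time step, of size features × batch.
function batch_sequences(seqs)
    T = length(first(seqs))
    @assert all(length(s) == T for s in seqs) "sequences must have equal length"
    return [reduce(hcat, [s[t] for s in seqs]) for t in 1:T]
end

# Two made-up sequences of multi-hot vectors (3 periods, 5 choices).
seq_a = [Float32[0, 1, 0, 0, 1], Float32[1, 0, 0, 0, 0], Float32[0, 0, 1, 1, 0]]
seq_b = [Float32[1, 0, 0, 0, 0], Float32[0, 0, 0, 1, 1], Float32[0, 1, 0, 0, 0]]

Xb = batch_sequences([seq_a, seq_b])
size(Xb[1])   # (5, 2): 5 features, 2 sequences in the batch
```

Each column of `Xb[t]` is one sequence's input at time `t`, so the model can be called on `Xb[1]`, `Xb[2]`, ... in order and reset once the whole batch of sequences has been processed.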

danielw2904 · October 5, 2020, 6:57pm

Oh, I’m confused all right. But what would be the alternative?

denizyuret · October 18, 2020, 7:02pm

Knet has some RNN examples in the tutorial: https://github.com/denizyuret/Knet.jl/tree/master/tutorial

danielw2904 · October 18, 2020, 7:57pm

Thank you. I will take a look!
