Is it possible to use batching in Flux for sequences of different lengths?

jmeeks29ig · February 13, 2020, 5:58pm

I have sequences that are of the dimensions 8 by N, where N is a number fluctuating between 1 and 400. I would like to be able to do batch processing on these sequences, but have not had any luck in doing so inside of Flux. Is it possible to do batch operations in Flux on sequences of varying lengths?

DrChainsaw · February 14, 2020, 8:36pm

Not sure what you are after, but iirc the way other frameworks which use 3D inputs to RNNs handle this is by padding the input and then masking the output.

I think something like this would be the equivalent in Flux:

julia> rnn = Flux.RNN(4,2);

julia> x0 = randn(Float32, 4, 3);

julia> rnn(x0)
2×3 Array{Float32,2}:
 0.784038   0.424429  0.904399
 0.818972  -0.459738  0.78797

julia> x1 = hcat(randn(Float32, 4, 2), zeros(Float32, 4, 1));

julia> mask = [1 1 0]
1×3 Array{Int64,2}:
 1  1  0

julia> rnn(x1) .* mask
2×3 Array{Float32,2}:
  0.981415   0.165332  0.0
 -0.737131  -0.215283  0.0

I guess it might also be possible to just remove the missing data from the state:

julia> x2 = randn(Float32, 4, 2);

julia> rnn.state = rnn.state[:,1:2]
2×2 Array{Float32,2}:
  0.981415   0.165332
 -0.737131  -0.215283

julia> rnn(x2)
2×2 Array{Float32,2}:
 -0.773747  0.974077
 -0.849829  0.975408

Depending on what your whole model looks like, this may be a bit cumbersome to do…
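For the 8 by N case in the question, the padding and masking idea above could be sketched in plain Julia as follows. This is only a sketch: `pad_and_mask` is a hypothetical helper (not part of Flux), and it assumes each sequence is a features × N matrix and that padding goes at the end of the shorter sequences.

```julia
# Hypothetical helper, not a Flux function: pads variable-length sequences
# with zeros and builds one mask per timestep.
# Each element of `seqs` is a (features × N_i) matrix.
function pad_and_mask(seqs::Vector{<:AbstractMatrix{T}}) where {T}
    nfeat  = size(first(seqs), 1)
    maxlen = maximum(size(s, 2) for s in seqs)
    batch  = length(seqs)
    # xs[t] is the (features × batch) input for timestep t
    xs    = [zeros(T, nfeat, batch) for _ in 1:maxlen]
    # masks[t] is (1 × batch): 1 where sequence b still has data at step t
    masks = [zeros(T, 1, batch) for _ in 1:maxlen]
    for (b, s) in enumerate(seqs)
        for t in 1:size(s, 2)
            xs[t][:, b] .= @view s[:, t]
            masks[t][1, b] = one(T)
        end
    end
    return xs, masks
end

# Three sequences with 8 features and lengths 3, 1 and 5
seqs = [randn(Float32, 8, n) for n in (3, 1, 5)]
xs, masks = pad_and_mask(seqs)
```

The output of the RNN at step `t` can then be multiplied by `masks[t]`, in the same way as `rnn(x1) .* mask` above. Because the padding sits at the end of each sequence, the outputs at the valid timesteps should not be influenced by the padded steps.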

jmeeks29ig · February 14, 2020, 9:07pm

Thanks! After asking the question I actually thought padding might be a way to go, but I hadn’t thought of how to implement it yet - I am still new to Julia and have a lot to learn in terms of optimization, so I really appreciate the example.

jeremiedb · February 19, 2020, 12:32am

Also new to Flux, but from my understanding, variable sequence lengths are handled naturally by feeding the RNN cell a vector whose length is the sequence length and whose elements have size (features, batch_size). Handling varying sequence lengths therefore shouldn’t require any padding trick; you just broadcast the RNN over a vector of whatever length the sequence has.

In the example below, the toy data has batch_size = 2, and both samples have identical data.
Then, a sequence of length 4 is built, again by duplicating the input features.

When model m is applied to the single (3,2) input, the RNN produces identical outputs for each of the two observations. That would not be the case if the (3,2) input were instead treated as a sequence of 2 timesteps of a single observation.

Broadcasting m onto the vector that repeats the (3,2) input 4 times, we get as output a vector of length 4 containing (3,2) elements. The first (3,2) element matches m(x1). The second (3,2) element matches the output of m(x1) when m(x1) is called a second time, which is consistent with the Flux recurrent cell taking one sequence element at a time.

The docs also illustrate this here: Model Reference · Flux

using Flux
using Random: seed!
seed!(1234);
m = RNN(3, 3)
# 3 features X 2 samples
seed!(1234);
x1 = rand(3,1)
x1 = cat(x1,x1, dims=2)
m1 = m(x1)
println(m1)

[-0.8223893847165972 -0.8223893847165972; 0.7631164681754794 0.7631164681754794; -0.49397430034214257 -0.49397430034214257]

# Second call on the same input: the hidden state carries over, so the output differs
m1 = m(x1)
println(m1)

[-0.9619506549502201 -0.9619506549502201; 0.9836595589647671 0.9836595589647671; 0.09047523145543104 0.09047523145543104]

# Now broadcasting over a sequence
seed!(1234);
m = RNN(3, 3)
# Vector of 4 timesteps, each a (3 features X 2 samples) matrix
x2 = [x1 for i in 1:4]
m2 = m.(x2)
println(m2[1])

[-0.8223893847165972 -0.8223893847165972; 0.7631164681754794 0.7631164681754794; -0.49397430034214257 -0.49397430034214257]

println(m2[2])

[-0.9619506549502201 -0.9619506549502201; 0.9836595589647671 0.9836595589647671; 0.09047523145543104 0.09047523145543104]
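The one-timestep-per-call behaviour can also be reproduced without Flux. The sketch below uses a toy stateful cell, `ToyCell`, which is a hypothetical stand-in and not Flux's RNN; the weights `W` and `U` and the input values are arbitrary numbers chosen for illustration.

```julia
# Toy stateful cell (a stand-in, NOT Flux's RNN): h = tanh.(W*x .+ U*h)
mutable struct ToyCell
    W::Matrix{Float64}   # input-to-hidden weights
    U::Matrix{Float64}   # hidden-to-hidden weights
    h::Matrix{Float64}   # hidden state, (hidden × batch)
end

# Calling the cell consumes ONE timestep and updates the stored state
function (c::ToyCell)(x::AbstractMatrix)
    c.h = tanh.(c.W * x .+ c.U * c.h)
    return c.h
end

W = [0.1 0.2 0.3; -0.2 0.0 0.4; 0.5 -0.1 0.2]
U = [0.3 -0.1 0.0; 0.2 0.1 -0.3; 0.0 0.4 0.1]

x1 = repeat([0.5, 0.8, 0.3], 1, 2)   # 3 features × 2 identical samples
xs = [x1 for _ in 1:4]               # sequence of 4 timesteps

# Apply the cell over the sequence, one timestep per call, in order
cell = ToyCell(W, U, zeros(3, 2))
ys = [cell(x) for x in xs]

# Same thing done by hand with a fresh cell
cell2 = ToyCell(W, U, zeros(3, 2))
first_call  = cell2(x1)
second_call = cell2(x1)
```

Here `ys[1]` corresponds to `first_call` and `ys[2]` to `second_call`, mirroring the comparison of `m2[1]` and `m2[2]` with the two `m(x1)` calls. A comprehension is used rather than broadcasting because, as far as I know, Julia does not guarantee the evaluation order of a broadcast, which matters for a stateful callable.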

CUDNN supports a fused LSTM where all timesteps are treated as one block with a 3D array as input, but I don’t know whether this approach is accessible in Flux.
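Moving between a 3D block and the vector-of-timesteps layout needs only base Julia. A minimal sketch, assuming a (features × batch × timesteps) dimension order; that ordering is an assumption for illustration, not something Flux or CUDNN is confirmed to require.

```julia
# Assumed layout: features × batch × timesteps (3 × 2 × 4 here)
x3d = reshape(collect(Float32, 1:24), 3, 2, 4)

# 3D block -> vector of timesteps, each a (features × batch) matrix
xs = [x3d[:, :, t] for t in 1:size(x3d, 3)]

# vector of timesteps -> 3D block
back = cat(xs...; dims=3)
```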
