How can I feed sequences of different lengths through an RNN and still have it be differentiable in Flux? (Minimal example included)

I have several matrices of different lengths, and I want to feed them through an RNN using Flux; however, I haven’t been able to do so. Everything I’ve tried either uses a lot of memory during the backward pass or gives an error during the backward pass. The example below works at this small scale, but my 16 GB of RAM fills up fast in my real use case, which has length-100 input vectors and length-30 output vectors.

Am I doing this right?

using Flux
using Zygote

function embed(encodings::Vector{Tuple{Int,Vector{Vector{Float32}}}}, embedder)
    embeddings = Zygote.Buffer(encodings, Tuple{Int,Vector{Float32}})
    # Batch all sequences of the same length together, so each batch can be
    # pushed through the RNN in lockstep.
    for encoding_length in 1:maximum(e -> length(e[2]), encodings)
        of_length = filter(e -> length(e[2]) == encoding_length, encodings)
        isempty(of_length) && continue
        Flux.reset!(embedder)
        local embedded
        for i in 1:encoding_length
            # feed timestep i of every sequence in the batch at once
            embedded = embedder(reduce(hcat, map(e -> e[2][i], of_length)))
            Zygote.@ignore @assert size(embedded, 2) == length(of_length)
        end
        for (i_embedded, i0) in enumerate(map(e -> e[1], of_length))
            embeddings[i0] = (i0, embedded[:,i_embedded])
        end
    end
    embeddings_vector = copy(embeddings)
    Zygote.@ignore @assert embeddings_vector == sort(embeddings_vector)
    embeddings_vector
end
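For reference, the length-grouping that embed does with repeated filter calls can be factored into a standalone helper. This is just a sketch of that one step; group_by_length is a hypothetical name, not part of Flux or Zygote:

```julia
# Group (index, sequence) pairs by sequence length, so that all sequences
# in a group can later be fed through the RNN as a single batch.
function group_by_length(encodings)
    groups = Dict{Int,Vector{eltype(encodings)}}()
    for e in encodings
        push!(get!(groups, length(e[2]), eltype(encodings)[]), e)
    end
    groups
end
```

A Dict avoids scanning the whole collection once per candidate length, which the filter-in-a-loop version above does.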

function embed_basic(encodings::Vector{Tuple{Int,Vector{Vector{Float32}}}}, embedder)
    embeddings = Zygote.Buffer(encodings, Tuple{Int,Vector{Float32}})
    for (i, e) in encodings
        Flux.reset!(embedder)  # clear the hidden state before each sequence
        embeddings[i] = (i, embedder.(e)[end])
    end
    embeddings_vector = copy(embeddings)
    Zygote.@ignore @assert embeddings_vector == sort(embeddings_vector)
    embeddings_vector
end

function demo()
    nn = LSTM(2, 3)
    encodings = [
        (1, [[1f0, 2], [3f0, 4], [5f0, 6]]),
        (2, [[1f0, 2], [3f0, 4], [5f0, 6]]),
        (3, [[1f0, 2], [3f0, 4]]),
        (4, [[1f0, 2], [3f0, 4]]),
        (5, [[1f0, 2], [3f0, 4]])]
    @show embed_basic(encodings, nn)
    g = gradient(params(nn)) do
        e1 = embed_basic(encodings, nn)
        e2 = reduce(hcat, map(x -> x[2], e1))
        @show e2
        sum(e2)
    end
end

demo()

Note that embed_basic works, but uses too much RAM. embed is faster during the forward pass, but gives an error during the backward pass: ERROR: LoadError: DimensionMismatch("cannot broadcast array to have fewer dimensions")
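One pattern that sidesteps Buffer-related backward errors entirely is to build the result functionally, with no in-place writes for Zygote to differentiate. Below is a hedged sketch of embed_basic in that style; embed_functional and its reset! keyword are hypothetical names, not Flux API (with Flux you would pass something like reset! = Flux.reset! and nn as the embedder):

```julia
# Functional variant of embed_basic: builds the result with map instead of
# writing into a Zygote.Buffer, so no mutation has to be differentiated.
# `embedder` is any callable mapping an input vector to an output vector
# (e.g. the LSTM(2, 3) above); `reset!` clears its hidden state and
# defaults to a no-op for stateless callables.
function embed_functional(encodings, embedder; reset! = identity)
    map(encodings) do (i, seq)
        reset!(embedder)              # fresh hidden state per sequence
        outputs = map(embedder, seq)  # run the whole sequence through
        (i, last(outputs))            # keep only the final output
    end
end
```

Whether this helps your memory use I can't promise, since Zygote still records every timestep for the backward pass, but it at least avoids the Buffer machinery that seems to be producing the DimensionMismatch.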

(Also, I may not understand what “embed” means. I think it’s like an encoding, but one created automatically by a neural network. They’re just variable names, of course; if they’re named poorly, the compiler doesn’t care.)