Uploading vector of vectors to GPU in Flux.jl

Hello all,

I am training a many-to-many RNN. My input and output data are both Vector{Vector{Vector{Float32}}}; the JLD file holding them is about 3 GB. I checked the model-zoo and did not find any example of uploading RNN sequences to the GPU. I tried to upload the whole thing with data_train = zip(us_train, ys_train) |> gpu, but that uses all my GPU memory. Should I convert the input to a tensor of some sort? What is the best way to upload this sequence data to the GPU?

The best way to do so is in minibatches, just like you would for any other data or framework. Unlike other layers, however, this will be a vector of arrays to represent the sequence dimension. In other words, each minibatch will look like [(features x batch size) x sequence length]. The Recurrence page of the Flux docs covers what kinds of data the built-in RNNs can work with.
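To make the shape concrete, here is a minimal sketch of what one RNN minibatch looks like in this layout (the sizes are made up for illustration):

```julia
# Hypothetical sizes, just for illustration.
features, batchsize, seqlen = 4, 5, 10

# One minibatch for a Flux RNN: a length-`seqlen` vector,
# each element a (features × batchsize) matrix for one time step.
batch = [rand(Float32, features, batchsize) for _ in 1:seqlen]

length(batch)   # 10 time steps
size(batch[1])  # (4, 5), i.e. features × batch size
```

The model is then called once per element of this vector, carrying its hidden state from one time step to the next.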


I have played around with DataLoader and it takes a [features x sequence length x batch size] tensor.
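For reference, one way to go from such a tensor batch to the vector-of-matrices form the RNN expects is to slice along the sequence dimension. This is a sketch with made-up sizes; recent Flux/MLUtils also ship an unstack helper that should do the same thing, if I recall correctly:

```julia
# Hypothetical batch: features × sequence length × batch size.
x = rand(Float32, 4, 10, 5)

# Slice out one (features × batchsize) matrix per time step.
# Recent Flux/MLUtils should also offer Flux.unstack(x; dims = 2) for this.
xs = [x[:, t, :] for t in axes(x, 2)]

length(xs)   # 10 time steps
size(xs[1])  # (4, 5)
```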

The following snippet runs, but the loss is way off. I will keep playing around and see what's wrong.

train_loader = Flux.DataLoader((data=us_train, label=ys_train), batchsize=5)

for (x, y) in train_loader
    x = x |> gpu
    y = y |> gpu

    Flux.train!(loss, ps, zip(x, y), opt, cb = evalcb)
end

x and y are already batched, so zipping them feeds samples through one at a time. Flux.train! is not really flexible enough for this, so I'd recommend using a custom training loop so that you're able to transform the data into an RNN-compatible form before feeding it to the model.
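A quick way to see the zip pitfall: zipping two multi-dimensional arrays iterates them element by element, not sample by sample:

```julia
x = rand(Float32, 3, 2)  # features × batch
y = rand(Float32, 3, 2)

# zip iterates the arrays elementwise, so this yields 6 scalar pairs,
# not 2 (sample, label) pairs.
pairs = collect(zip(x, y))
length(pairs)  # 6
```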

This is the training function I ended up having.

function seq_batch_train!(loss, ps, data, opt; cb = () -> ())
    local training_loss
    cb = Flux.Optimise.runall(cb)

    x, y = data
    x = x |> gpu
    y = y |> gpu

    gs = Flux.gradient(ps) do
        training_loss = loss(x, y)
        return training_loss
    end
    Flux.Optimise.update!(opt, ps, gs)
    cb()
end

And here is the training loop with epochs.

for epoch in 1:1
    for batch in train_loader
        seq_batch_train!(loss, ps, batch, opt, cb = evalcb)
    end
    @save "model_$(now())_epoch-$epoch.bson" m opt
end

The trick is in the loss function:

function loss(x, y)
    # Sum of squared errors over all time steps, then averaged over all elements.
    sse = [(m(x[:, xi, :]) - y[:, xi, :]).^2 for xi in axes(x, 2)] |> sum |> sum
    return sse / length(x)
end

hcat() is very slow on the GPU, so the MSE is calculated manually instead of using mse() from Flux.
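As a sanity check that no concatenation is needed, summing squared errors time step by time step gives the same total as computing them on the full 3-D array in one go (a toy example with random arrays standing in for predictions and targets):

```julia
# Toy stand-ins for predictions and targets: features × seqlen × batch.
x = rand(Float32, 4, 10, 5)
y = rand(Float32, 4, 10, 5)

# Per-time-step sum of squared errors, no hcat involved.
per_step = sum(sum((x[:, t, :] .- y[:, t, :]).^2) for t in axes(x, 2))

# Same quantity computed on the whole array at once.
whole = sum((x .- y).^2)

per_step ≈ whole  # true (up to floating-point rounding)
```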

Currently a batch size of 10 uses about 8 GB of GPU memory. I can see some tweaks to get it working with longer time series.