Hi,
I would like to know how to arrange data for time series forecasting (mini-batching) for an LSTM without exceeding GPU memory. I tried to follow the instructions from @jeremiedb in the following link:
https://github.com/FluxML/Flux.jl/issues/1360#issuecomment-734028532
Here is my code:
# load flux module
using Flux
# gpu or cpu flag
gpu_or_cpu = cpu
# ini variables
seq_len = 50
N = 82100
batch_size = 82050  # = N - seq_len, i.e. every window ends up in one single batch
no_features = 6
no_labels = 1
# mini-batching for time series (commented out, replaced by the shortcut below)
# X = rand(N, no_features)
# Y = rand(N, no_labels)
# X_tr, Y_tr = prepare_time_series_data_4_flux(X, Y, seq_len)
# function prepare_time_series_data_4_flux(X, Y, seq_len)
#     N = size(X, 1)
#     num_batches = N - seq_len
#     X_batched = Vector{Array{Float32}}(undef, seq_len)
#     Y_batched = Float32.(Y[seq_len+1:end, :]')
#     for i in 1:seq_len
#         X_batched[i] = hcat([Float32.(X[i+j, :]) for j in 0:num_batches-1]...)
#     end
#     return X_batched, Y_batched
# end
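# To make sure I understand the layout the helper above should produce, here is a
# toy version with small made-up sizes (N = 12, seq_len = 3, 2 features):
#     X_toy = rand(12, 2); Y_toy = rand(12, 1)
#     num_windows = 12 - 3                                                       # 9 windows
#     X_toy_batched = [Float32.(permutedims(X_toy[i:i+num_windows-1, :])) for i in 1:3]
#     Y_toy_batched = Float32.(Y_toy[3+1:end, :]')
#     size.(X_toy_batched)   # 3-element Vector, each entry 2x9
#     size(Y_toy_batched)    # (1, 9)
# i.e. one (features x samples) matrix per time step, with the labels aligned to
# the last time step of each window.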
# shortcut with random data instead of the mini-batching helper above
X_tr = [rand(Float32, no_features, batch_size) for i in 1:seq_len]  # seq_len matrices of size no_features x batch_size
Y_tr = rand(Float32, no_labels, batch_size)                         # one label per window (last time step)
# convert to cpu or gpu
X_tr = X_tr |> gpu_or_cpu
Y_tr = Y_tr |> gpu_or_cpu
data = (X_tr, Y_tr)
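# just to put a rough number on the inputs alone (before any LSTM activations):
sizeof(Float32) * no_features * batch_size * seq_len / 2^20   # ≈ 94 MiB for X_tr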
# select optimizer
opt = ADAM(0.001, (0.9, 0.999))
# definition of the loss function (sum of squared errors on the last time step)
function loss(X, Y)
    Flux.reset!(model)
    # Flux.stack(model.(X), 1) is seq_len x 1 x batch_size; [end, 1, :] picks the
    # last-time-step predictions as a batch_size-long vector
    mse_val = sum(abs2.(vec(Y) .- Flux.stack(model.(X), 1)[end, 1, :]))
    return mse_val
end
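# quick shape check with tiny made-up sizes (6 features, batch of 8, 5 steps),
# just to convince myself what the loss is indexing:
toy_model = Chain(LSTM(6, 4), Dense(4, 1))
toy_x = [rand(Float32, 6, 8) for _ in 1:5]
toy_out = Flux.stack(toy_model.(toy_x), 1)
size(toy_out)               # (5, 1, 8)
size(toy_out[end, 1, :])    # (8,)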
# ini of the model
model = Chain(LSTM(6, 70), LSTM(70, 70), LSTM(70, 70), Dense(70, 1, relu)) |> gpu_or_cpu
ps = Flux.params(model)
Flux.reset!(model)
# sanity check that the model runs on the input
y_model = model.(X_tr)
# test loss
loss(data...)
# train one epoch
@time Flux.train!(loss, ps, [data], opt)
Actually, I also run out of memory when I use the CPU. Does someone know what I'm doing wrong?
I'm working on Windows 10 with an i7, 16 GB RAM, and an NVIDIA Quadro P1000 with 4 GB GDDR5.
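For completeness, this is what I assume proper mini-batching would look like: instead of one huge batch, split the sample dimension into smaller chunks and hand Flux.train! a vector of (x, y) tuples (mb_size is just a guess on my side):

mb_size = 1000  # assumed mini-batch size, picked arbitrarily
idx_chunks = [i:min(i + mb_size - 1, batch_size) for i in 1:mb_size:batch_size]
data_batched = [([x[:, idx] for x in X_tr], Y_tr[:, idx]) for idx in idx_chunks]
@time Flux.train!(loss, ps, data_batched, opt)

I am not sure whether this is the intended way for recurrent models, though.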