Problems with implementing a basic DeepAR algorithm in Julia

josemanuel22 · September 29, 2023, 5:52pm

I am trying to implement a basic DeepAR model to serve as a baseline for comparison with other time series forecasting algorithms. However, when I test the following code, it does not seem to learn the AR(3) autoregressive process I use as an example, so I suspect there is an error in my implementation. Could someone help me find it? I am not very familiar with the DeepAR architecture.

losses = []
optim = Flux.setup(Flux.Adam(1e-2), model)
@showprogress for (batch_Xₜ, batch_Xₜ₊₁) in zip(loaderXtrain, loaderYtrain)
    loss, grads = Flux.withgradient(model) do m
        likelihood = 0
        Flux.reset!(m)
        model([batch_Xₜ[1]])
        for (x, y) in zip(batch_Xₜ[2:end], batch_Xₜ₊₁[2:end])
            μ, logσ = model([x])
            σ = softplus(logσ)
            ŷ = rand(Normal(μ, σ))
            likelihood = log(sqrt(2 * π)) + log(σ) + ((y - ŷ)^2 /(2 * σ^2)) + likelihood
        end
        -likelihood/length(batch_Xₜ)
    end
    Flux.update!(optim, model, grads[1])
    push!(losses, loss)
end
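
For reference, the per-step term accumulated in the loop above has the form of a Gaussian negative log-likelihood (NLL). A minimal sketch of that quantity follows, assuming the usual DeepAR-style setup in which the NLL is evaluated at the observed target `y` under the predicted `Normal(μ, σ)` rather than at a sample `ŷ`. The names `softplus_scalar`, `gaussian_nll` and `sequence_nll` are hypothetical helpers written for this illustration, not part of Flux or of the code above. Read this way, three things in the loop may be worth checking: it compares `y` with a random draw instead of with `μ`, it returns `-likelihood` although `likelihood` already holds NLL terms (so minimizing the loss would increase the NLL), and it calls `model` rather than the closure argument `m`.

```julia
# Hypothetical helpers for illustration only; no packages required.

# Maps any real number to a positive value, used here to obtain σ > 0.
softplus_scalar(x) = log1p(exp(x))

# Negative log-density of y under Normal(μ, σ); smallest in μ when μ == y.
gaussian_nll(μ, σ, y) = log(sqrt(2π)) + log(σ) + (y - μ)^2 / (2σ^2)

# Mean NLL over a sequence of predictions (μs, raw scale outputs) and targets ys.
# This is already a loss to minimize, so no extra minus sign is applied.
function sequence_nll(μs, raws, ys)
    total = 0.0
    for (μ, raw, y) in zip(μs, raws, ys)
        σ = softplus_scalar(raw)
        total += gaussian_nll(μ, σ, y)
    end
    return total / length(ys)
end

# A prediction centered on the target scores lower (better) than one that is off.
nll_good = gaussian_nll(1.0, 1.0, 1.0)
nll_bad = gaussian_nll(0.0, 1.0, 1.0)
```

Inside a training loop, the same expression would be applied to the network outputs for each time step, with the averaged value returned directly as the loss.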

gdalle · September 29, 2023, 7:56pm

Can you provide a complete reproducible example? What makes you say it does not learn well?

josemanuel22 · September 29, 2023, 9:34pm

Thank you very much for your message! Actually, the rest of the code is quite simple. I am attaching it.

ar_hparams = ARParams(;
    ϕ=[0.5f0, 0.3f0, 0.2f0],
    x₁=rand(Normal(0.0f0, 1.0f0)),
    proclen=20000,
    noise=Normal(0.0f0, 0.2f0),
)

n_series = 100

loaderXtrain, loaderYtrain, loaderXtest, loaderYtest = generate_batch_train_test_data(
    n_series, ar_hparams
)

model = Chain(
    RNN(1 => 10, relu), RNN(10 => 10, relu), Dense(10 => 16, relu), Dense(16 => 2, identity)
)

ARParams holds the parameters of the autoregressive process, and generate_batch_train_test_data is a function that generates n_series realizations of that process. loaderXtrain contains the autoregressive series and loaderYtrain contains the same series shifted by one step. That is, if loaderXtrain[1] (actually collect(loaderXtrain)[1]) represents an autoregressive process X_t, then loaderYtrain[1] is the process X_{t+1}.
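
Since the data-generation code is not shown, here is a minimal sketch of how an AR(3) series and its one-step-shifted targets could be produced. It assumes the standard form X_t = ϕ₁X_{t-1} + ϕ₂X_{t-2} + ϕ₃X_{t-3} + ε_t with Gaussian noise of standard deviation 0.2 and zero initial values. `simulate_ar` is a hypothetical stand-in, not the actual `ARParams` or `generate_batch_train_test_data`. Under that form, the coefficients above sum to 1, which means the process has a unit root and is not stationary, so long realizations can drift far from zero. That may be worth keeping in mind when feeding unnormalized values to a small RNN.

```julia
using Random

# Hypothetical stand-in for the poster's generator: simulates n steps of an AR(p)
# process with coefficients ϕ and Gaussian noise of standard deviation σ.
function simulate_ar(ϕ::Vector{Float32}, n::Int; σ=0.2f0, rng=Random.default_rng())
    p = length(ϕ)
    x = zeros(Float32, n + p)  # the first p entries act as zero initial conditions
    for t in (p + 1):(n + p)
        # x[t] = ϕ[1] * x[t-1] + ... + ϕ[p] * x[t-p] + noise
        x[t] = sum(ϕ[k] * x[t - k] for k in 1:p) + σ * randn(rng, Float32)
    end
    return x[(p + 1):end]
end

ϕ = [0.5f0, 0.3f0, 0.2f0]
series = simulate_ar(ϕ, 1_000; rng=MersenneTwister(1))

X = series[1:(end - 1)]  # inputs X_t
Y = series[2:end]        # targets X_{t+1}, i.e. the same series shifted by one step
```

Here each `Y[i]` is the value that follows `X[i]`, which matches the X_t / X_{t+1} pairing described above.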

I think it is not learning, judging by the loss after training (I’ve tried different learning rates and parameters and the results are the same). Looking at how the model approximates the series used for training, it does not get good results; in fact, the solution after training is practically identical to the solution before training.

I’m sorry if the example is not minimal. If you need more information, I will try to upload a minimal, reproducible example tomorrow. The problem is that the code for generating the AR(p) process turned out longer than it should be. I just wanted to know whether I’ve made some big mistake that I’m not seeing, or whether DeepAR is supposed to perform this poorly.
