Forecasting time series with a Neural ODE for supervised learning

I'm aiming for time series prediction. I've seen good time series predictions in two posts: "Forecasting the weather with neural ODEs" on Sebastian Callh's personal blog, and "Forecasting time series data with neural ordinary differential equations in Julia", written by dmjalal90 in this community. I'm trying to build on these, but in my case only the time series itself is available (as in many examples, such as Amazon stock market forecasting with an LSTM), where input_dim (sequence length, or n_in) is 12 and output_dim (horizon, or n_out) is 36.

The data is monthly, so the problem is to predict the next 36 months from the previous 12 months. Because I only have the time series, I build the X and Y data with a create_sequences function. As in ordinary machine learning, I treat this as a regression problem and train on the MSE loss between the predicted Y and the true Y.
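To make the windowing concrete, here is a minimal sketch on a toy series of 10 points with a 3-step input and a 2-step horizon. The helper name `make_windows` and the toy sizes are illustrative assumptions only, and `mse` is just a sketch of the regression loss described above.

```julia
# Sliding-window sketch (hypothetical helper, toy sizes chosen for illustration).
function make_windows(series::AbstractVector, n_in::Int, n_out::Int)
    n = length(series) - (n_in + n_out) + 1                     # number of (X, Y) pairs
    X = [series[i + r - 1] for r in 1:n_in, i in 1:n]           # n_in × n
    Y = [series[i + n_in + r - 1] for r in 1:n_out, i in 1:n]   # n_out × n
    return X, Y
end

# Mean squared error between predictions and targets of equal shape.
mse(y_pred, y_true) = sum(abs2, y_pred .- y_true) / length(y_true)

series = collect(1.0:10.0)
X, Y = make_windows(series, 3, 2)
println(size(X), " ", size(Y))   # each column is one training sample
```

Each column of `X` is one input window and the matching column of `Y` is the horizon that follows it.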

However, when I adapt the code from the weather-forecasting NODE example, I keep getting a dimension error whenever the input dim (12) and the output dim (36) are not the same. Also, I only use the feature pc1 for now; later, every column in the data will be trained individually, each with its own hyperparameter tuning.
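My current guess at the cause, which may be wrong: a neural ODE integrates du/dt = f(u), so f has to return a vector of the same length as the state u, and a 12 → 36 network cannot serve directly as the right-hand side. The package-free sketch below illustrates this with a fixed-step Euler loop and a separate linear readout that maps the final 12-dimensional state to 36 outputs. All names (`vector_field`, `euler_solve`, `readout`) and the random weights are hypothetical; this is not DiffEqFlux code.

```julia
using Random

rng = MersenneTwister(0)
n_state, n_hidden, n_out = 12, 16, 36

# Hypothetical random weights standing in for trained parameters.
W1 = 0.1 .* randn(rng, n_hidden, n_state)
W2 = 0.1 .* randn(rng, n_state, n_hidden)   # maps back to n_state, not n_out
R  = 0.1 .* randn(rng, n_out, n_state)      # readout applied after integration

# Right-hand side du/dt = f(u): the output length must equal length(u).
vector_field(u) = W2 * tanh.(W1 * u)

# Fixed-step explicit Euler integration from t = 0 to t = 1.
function euler_solve(f, u0; steps = 100)
    u, dt = copy(u0), 1.0 / steps
    for _ in 1:steps
        u = u .+ dt .* f(u)   # only valid because f(u) has the same size as u
    end
    return u
end

readout(u) = R * u            # 12-dimensional state → 36 forecasts

u0 = rand(rng, n_state)       # one 12-month input window
y_hat = readout(euler_solve(vector_field, u0))
println(length(y_hat))
```

If this reading is right, the 36-month horizon would have to come from a layer placed after the ODE solve (or from a larger state), not from the output size of the network inside the ODE.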

I'm new to Julia and mainly use Python, but I don't have a NODE package in Python, so I'm posting here just in case. I am using the same package versions as the original author so that the weather-forecast Julia code runs again. Experts, please help… :frowning:

As an additional question,
+1) Is it correct that what I did is supervised learning?
+2) The Neural ODE paper defines the model through dh/dt. If I use an RNN cell instead of the existing Dense layer, is that still the model defined by the rate of change of the state, as in the paper?
+3) I heard that in Julia you have to write the batching (batch size) yourself as a function. Is that correct?
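For question 3, this is the kind of hand-written batching I have in mind: a sketch that splits the sample columns into shuffled mini-batches using only Base and the Random standard library. `minibatches` is a hypothetical helper, and I am not sure whether a ready-made data loader would be the preferred route.

```julia
using Random

# Hypothetical helper: build (X_batch, Y_batch) pairs over the sample columns.
function minibatches(X::AbstractMatrix, Y::AbstractMatrix, batch_size::Int;
                     rng = Random.default_rng())
    @assert size(X, 2) == size(Y, 2) "X and Y need the same number of samples"
    idx = shuffle(rng, 1:size(X, 2))                 # random sample order
    return [(X[:, b], Y[:, b]) for b in Iterators.partition(idx, batch_size)]
end

X_demo = rand(12, 10)   # 10 samples with 12 inputs each
Y_demo = rand(36, 10)   # 10 samples with 36 targets each
batches = minibatches(X_demo, Y_demo, 4; rng = MersenneTwister(1))
println(length(batches), " batches; last batch has ", size(batches[end][1], 2), " samples")
```

The last batch is simply shorter when the number of samples is not a multiple of the batch size.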
--------------------------------------------------------------------------------------------------------data


using Random
using Dates
using Optimization
using Lux
using DiffEqFlux: NeuralODE, ADAMW, swish
using DifferentialEquations
using ComponentArrays
using BSON: @save, @load
using CSV
using DataFrames
using Statistics
using Plots
using OptimizationOptimisers

# @info "Data Pre-processing..."
# raw"..." keeps the Windows backslashes from being parsed as escape sequences
df = CSV.read(raw"C:\Users\jhlee\Desktop\Julia\NODE\data\del_t_df.csv", DataFrame)

n_train = 1488

df_train = df[1:n_train, :] # period : 1870-01-01 ~ 1993-12-01
df_valid = df[n_train+1 : n_train + 180, :] # period : 1994-01-01 ~ 2008-12-01
df_test = df[n_train+180+1 : end, :] # period : 2009-01-01 ~ 2023-12-01

features = [:pc1] ## select specific features
t_and_y(df) = df[!, :date]', Matrix(df[:, features])'

t_train, train_data = t_and_y(df_train) # (1, 1488), (1, 1488)
t_valid, valid_data = t_and_y(df_valid) # (1, 180), (1, 180)
t_test, test_data = t_and_y(df_test) # (1, 180), (1, 180)

t_train = vec(t_and_y(df_train)[1]) # convert to 1-dim vector : (1488,)
t_valid = vec(t_and_y(df_valid)[1]) # convert to 1-dim vector : (180,)
t_test = vec(t_and_y(df_test)[1]) # convert to 1-dim vector : (180,)

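function create_sequences(dataset, input_dim::Int, output_dim::Int)
    X = []   # input windows, one vector per sample
    y = []   # target windows, one vector per sample
    for i in 1:(length(dataset) - (input_dim + output_dim) + 1)
        X_seq = dataset[i:i+input_dim-1]
        y_seq = dataset[i+input_dim:i+input_dim+output_dim-1]
        push!(X, X_seq)
        push!(y, y_seq)
    end
    # stack the windows as columns: (input_dim, n_samples), (output_dim, n_samples)
    return hcat(X...), hcat(y...)
end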

input_dim = 12 ## input month
output_dim = 12 ## target month (the goal is 36, but a value different from input_dim gives the dimension error)

X_train, Y_train = create_sequences(train_data, input_dim, output_dim)
X_valid, Y_valid = create_sequences(valid_data, input_dim, output_dim)
X_test, Y_test = create_sequences(test_data, input_dim, output_dim)

t_train = t_train[1+input_dim:end-output_dim+1] # split times
t_valid = t_valid[1+input_dim:end-output_dim+1]
t_test = t_test[1+input_dim:end-output_dim+1]

println(size(t_train), size(X_train), size(Y_train))
println(size(t_valid), size(X_valid), size(Y_valid))
println(size(t_test), size(X_test), size(Y_test))

# Output:
# (1465,)(12, 1465)(12, 1465)
# (157,)(12, 157)(12, 157)
# (157,)(12, 157)(12, 157)

function minmax(x, x_min, x_max)
    z = (x .- x_min) ./ (x_max - x_min)
    return z
end

function rescale_minmax(z, x_min, x_max)
    x = (z .* (x_max - x_min)) .+ x_min
    return x
end

t_min, t_max = extrema(t_train)
X_min, X_max = extrema(X_train)

X_train = minmax(X_train, X_min, X_max)
X_valid = minmax(X_valid, X_min, X_max)
X_test = minmax(X_test, X_min, X_max)

t_train = minmax(t_train, t_min, t_max)
t_valid = minmax(t_valid, t_min, t_max)
t_test = minmax(t_test, t_min, t_max)

println(extrema(X_train), "|", extrema(t_train))
println(extrema(X_valid), "|", extrema(t_valid))
println(extrema(X_test), "|", extrema(t_test))

function neural_ode(input_dim, output_dim, t)
    f = Lux.Chain(
        Lux.Dense(input_dim, 64, swish), # input dimension → 64
        Lux.Dense(64, 32, swish),        # 64 → 32
        Lux.Dense(32, output_dim)        # 32 → output dimension
    )

    node = NeuralODE(
        f, extrema(t), Tsit5(),
        saveat=t,
        abstol=1e-9, reltol=1e-9
    )

    rng = Random.default_rng()
    p, state = Lux.setup(rng, f)

    return node, ComponentArray(p), state
end

function train_supervised(X, y, t, maxiters, lr, rng; kwargs...)
    input_dim, output_dim = size(X, 1), size(y, 1)
    node, θ, state = neural_ode(input_dim, output_dim, t)

    # Define the loss function
    function loss(θ)
        y_pred = Array(node(X, θ, state)[1])
        return sum(abs2, y_pred .- y)
    end

    adtype = Optimization.AutoZygote()
    optf = OptimizationFunction((θ, p) -> loss(θ), adtype)
    optprob = OptimizationProblem(optf, θ)
    res = solve(optprob, ADAMW(lr), maxiters=maxiters; kwargs...)

    return res.minimizer, state
end

function predict(X, t, θ, state)
    node, _, _ = neural_ode(size(X, 1), size(Y_train, 1), t)
    return Array(node(X, θ, state)[1])
end

# @info "Training supervised model..."
maxiters = 150
lr = 4e-3
rng = MersenneTwister(123)
θ, state = train_supervised(X_train, Y_train, t_train, maxiters, lr, rng)

yhat = predict(X_test, t_test, θ, state)
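
A possible way to score the forecast afterwards is sketched below, using small made-up arrays in place of `Y_test` and `yhat`. It reports the overall MSE and the MSE for each horizon step; the variable names and numbers are illustrative assumptions, not results from my data.

```julia
using Statistics

# Stand-ins for Y_test and yhat (made-up numbers, 3 horizon steps × 4 samples).
y_true = [1.0 2.0 3.0 4.0;
          2.0 3.0 4.0 5.0;
          3.0 4.0 5.0 6.0]
y_pred = y_true .+ [0.0; 1.0; 2.0]     # error grows with the horizon step

overall_mse = mean(abs2, y_pred .- y_true)                    # one number
per_step_mse = vec(mean(abs2, y_pred .- y_true; dims = 2))    # one value per horizon step
println(overall_mse, " ", per_step_mse)
```

Looking at the error per horizon step should show whether the later months are forecast worse than the earlier ones.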