Error during Optimization using Neural ODEs

Yes, the data are simulated, but I want to use this approach for real cases. Thanks anyway for the help. My ultimate goal is to work with ODEs built from real apartment data and parameters from a real apartment, so I think neural ODEs are the best approach?

Heat transfer is a linear phenomenon; I doubt you need anything fancy to model it. It would be like using deep learning to fit a straight line.

How far into the future do you need to predict the temperature?

My goal is exactly that: to see whether methods like neural ODEs can give longer-horizon predictions with less training data. In models of houses the data are sometimes inaccurate, so such models could give more accurate future predictions than a standard LSTM.

If you’ve worked with models like the RC models before and have any other examples to show me, I’d appreciate it.

Well, you can think about the problem like this: if you do have access to a good measurement of the outdoor temperature, the problem is linear and trivial to solve. Without access to the outdoor temperature, the only thing you can do is improve your prediction of the outdoor temperature, a job that the weather agency likely does better :wink:
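
To make the first point concrete, here is a minimal sketch (entirely synthetic numbers) of why the identification is trivial when the outdoor temperature and heater input are measured: the coefficients of dT/dt ≈ a*(Tout - T) + b*P fall straight out of an ordinary least-squares fit on finite differences.

# Sketch: identifying a linear RC model by ordinary least squares,
# assuming the outdoor temperature Tout and the heater input P are both measured.
# Model: dT/dt ≈ a*(Tout - T) + b*P
using LinearAlgebra

dt = 60.0                                          # sample time [s]
N  = 1_000
Tout = 10.0 .+ 2.0 .* sin.(2pi .* (1:N) ./ N)      # synthetic outdoor temperature [°C]
P    = 0.5 .+ 0.5 .* rand(N)                       # synthetic heater input
a_true, b_true = 1e-4, 2e-3

T = zeros(N); T[1] = 20.0                          # simulate the "true" plant (forward Euler)
for k in 1:N-1
    T[k+1] = T[k] + dt * (a_true * (Tout[k] - T[k]) + b_true * P[k])
end

# Least squares on finite differences: dT/dt ≈ [Tout - T  P] * [a; b]
dTdt = diff(T) ./ dt
Phi  = [(Tout[1:end-1] .- T[1:end-1]) P[1:end-1]]
a_hat, b_hat = Phi \ dTdt
@show a_hat b_hat

With real data you would replace the synthetic Tout, P and T by your measurements; the regression then recovers the two coefficients directly, no neural network needed.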

You could argue that you can make short-term weather predictions that are more accurate than what your local weather forecaster provides, but for deep learning to be the correct methodology here, you’d obviously need a ton of data, probably at least a decade’s worth.

If solar irradiation is a factor, this is an area where you can also improve slightly, but once again, deep learning would require a lot of data to do that well.

What you are trying to model is often called a “disturbance observer”; I’d look into simple disturbance observers before considering neural ODEs. If you have some real data at hand and know how far into the future you need to predict accurately (an hour, a day?), I can show an example of estimating one.
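
In the meantime, here is a rough sketch of the disturbance-observer idea, with made-up numbers throughout: augment a discrete-time room model with a slowly drifting disturbance state and estimate both states with a standard Kalman filter.

# Sketch of a simple disturbance observer: a Kalman filter on a room model
# augmented with a random-walk disturbance state d (unmeasured heat gains/losses).
# State x = [T; d]:  T[k+1] = T[k] + dt*(b*P[k] + d[k]),  d[k+1] = d[k] + noise
using LinearAlgebra

dt, b = 60.0, 2e-3
Ad = [1.0 dt; 0.0 1.0]         # state transition for [T; d]
Bd = [dt * b, 0.0]             # heater input only affects the temperature state
Cd = [1.0 0.0]                 # only the temperature is measured
Q  = Diagonal([1e-6, 1e-8])    # process noise: the disturbance drifts slowly
R  = 1e-2                      # measurement noise variance

function kalman_step(xhat, Phat, u, y)
    # predict
    xhat = Ad * xhat + Bd * u
    Phat = Ad * Phat * Ad' + Q
    # update with the temperature measurement y
    S = (Cd * Phat * Cd')[1] + R
    K = vec(Phat * Cd') ./ S
    xhat = xhat + K * (y - (Cd * xhat)[1])
    Phat = (I - K * Cd) * Phat
    return xhat, Phat
end

# Toy usage with a made-up measurement sequence; xhat[2] is the estimated disturbance
function run_observer(us, ys)
    xhat, Phat = [20.0, 0.0], Matrix(1.0I, 2, 2)
    for (u, y) in zip(us, ys)
        xhat, Phat = kalman_step(xhat, Phat, u, y)
    end
    return xhat
end
@show run_observer(fill(0.5, 100), 20.0 .+ 0.01 .* (1:100))

The estimated disturbance state then plays the role your neural network would otherwise have to learn, with only a handful of tunable numbers.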

What is your ultimate goal with the modeling? If you’re interested in controlling indoor temperature, we have something in common :blush:

I recently wrote a little tutorial using MPC for this purpose and am halfway towards implementing it in practice in my own house. I also have the house modeling left to do, but I first need to get my hands on a new indoor temp sensor :blush:

I want to define a model of how the temperature varies and then monitor consumption in a house, given other variables and parameters. I have the internal temperature data, and I can use it, for example, to optimize the parameters of my ODE. Once I have optimized the parameters of the physical model, I generate data from it and do training and forecasting with that. Exactly, deep learning examples need lots of data. From the examples, neural ODEs seem to offer a new way to get predictions further into the future. My goal for the thesis is precisely to use SciML methods, though maybe they are not the best fit.
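
Roughly, the calibration step I mean looks like this (a sketch with synthetic measurements in place of my real indoor-temperature data, placeholder parameter values, and forward-mode AD through the ODE solve):

# Sketch of the calibration step: fit RC-style parameters to measured indoor temperature.
# Synthetic "measurements" stand in for the real data; all numbers are placeholders.
using DifferentialEquations, Optimization, OptimizationOptimisers, ForwardDiff

rc_cal!(du, u, p, t) = (du[1] = (1 / p[1]) * (p[2] - p[3] * (u[1] - 20.0)))   # toy RC form

tsteps_cal = 0.0:10.0:1000.0
p_true     = [50.0, 5.0, 0.3]
prob_cal   = ODEProblem(rc_cal!, [20.0], (0.0, 1000.0), p_true)
T_meas     = Array(solve(prob_cal, Tsit5(); saveat=tsteps_cal))[1, :]         # pretend measurements

function calib_loss(p, _)
    sol = solve(remake(prob_cal; p=p), Tsit5(); saveat=tsteps_cal)
    return sum(abs2, T_meas .- Array(sol)[1, :])
end

optf  = OptimizationFunction(calib_loss, Optimization.AutoForwardDiff())
optpr = OptimizationProblem(optf, [30.0, 2.0, 0.1])          # initial parameter guess
# step size and iteration count are arbitrary for this sketch
p_fit = Optimization.solve(optpr, Optimisers.Adam(0.05); maxiters=500).u
@show p_fit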

Classic control methods are good. Use them. Neural networks should be a last resort when you don’t know things.

The attractive feature of SciML is that you can use the available science to the extent possible, and use data to improve the performance where the science is lacking. Or in other words, SciML allows you to make better predictions by using as little ML as possible.

If all the science you’re putting into the model is the RC properties, you leave quite a lot left over for the ML. There is a lot more structure to this problem you can explore before throwing a neural network at it.
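
As a rough illustration of what I mean (placeholder input signal, no training loop, just the model structure): keep the RC right-hand side exactly as it is and let a small network contribute only an additive residual, so the ML only has to learn whatever the physics misses.

# Sketch: keep the RC physics and let a small Lux network learn only an additive residual
using Lux, ComponentArrays, Random, DifferentialEquations

const nn_res = Lux.Chain(Lux.Dense(2 => 8, tanh), Lux.Dense(8 => 1))
ps_res, st_res = Lux.setup(Random.default_rng(), nn_res)
p_ude = ComponentArray(ode = [66.896, 50e6, 100.0, 0.2], nn = ps_res)

# Placeholder input; the real model would use the measured gas flow (ext_flow(t) in the code below)
fake_flow(t) = 0.5 + 0.5 * sin(t / 100)

function rc_ude!(du, u, p, t)
    A, B, C, D = p.ode
    P = fake_flow(t)
    physics  = (1 / A) * (B * P - C + D * (20 - u[1]))            # known RC dynamics
    residual = first(first(nn_res([u[1], P], p.nn, st_res)))      # small learned correction
    du[1] = physics + residual
end

prob_ude = ODEProblem(rc_ude!, [20.0], (0.0, 1000.0), p_ude)
# p_ude.nn (and, if desired, p_ude.ode) can then be trained against data with Optimization.jl

Since the residual starts near zero, the prediction is dominated by the physical part, and the network can only bend it where the data demand it.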

Thank you very much for your help! I will try multiple shooting and then try this; do you have any links to recommend? I need to create models like the one in the example I posted.
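
Something like this is the multiple-shooting loss I have in mind, following the multiple_shoot pattern from the DiffEqFlux docs (the group size and continuity weight are arbitrary, and prob_nn, ode_data and tsteps refer to the objects in the code I post below; depending on the DiffEqFlux version, multiple_shoot may also return the per-group predictions together with the loss):

# Sketch of a multiple-shooting loss with DiffEqFlux.multiple_shoot
group_size      = 25      # arbitrary: number of data points per shooting segment
continuity_term = 100.0   # arbitrary: weight of the penalty that joins the segments

segment_loss(data, pred) = sum(abs2, data .- pred)

function loss_ms(θ)
    return multiple_shoot(θ, ode_data, tsteps, prob_nn, segment_loss, Tsit5(),
                          group_size; continuity_term)
end
# loss_ms(params) can then be optimized with Optimization.jl exactly like the single-shooting loss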

Update: now the training seems to give good results, but the prediction with the updated parameters does not. Am I doing something wrong?

# Packages: ODE solvers, neural networks (Lux/Flux), optimization, interpolation, data handling, plotting
using DifferentialEquations, DiffEqFlux, Lux, Flux
using Optimization, OptimizationOptimisers, OptimizationOptimJL, OptimizationNLopt, OptimizationPolyalgorithms
using DataInterpolations, ComponentArrays, Random
using CSV, DataFrames, Plots
using Optim: LineSearches, BFGS
using Statistics: mean
using DiffEqFlux: group_ranges
using WeightInitializers: truncated_normal
rng = Random.default_rng()
# Load the data and take every 10th row of the training window
df = CSV.read("es_01b_real_data_.csv", DataFrame)
end_param = 10001
registered_gas_flow    = df[1:10:end_param, 2]   # column 2: gas flow
Registered_Temperature = df[1:10:end_param, 3]   # column 3: water temperature
tsteps                 = df[1:10:end_param, 1]   # column 1: time stamps

# Interpolate the measured signals so they can be evaluated at arbitrary times t
gas_flow        = LinearInterpolation(registered_gas_flow, tsteps)
Temperature_h20 = LinearInterpolation(Registered_Temperature, tsteps)

ext_flow(t)   = gas_flow(t)
water_temp(t) = Temperature_h20(t)


# RC model: (1/A) is the inverse thermal capacitance, B scales the gas-flow input P,
# C is a constant loss term, and D couples to the constant 20 °C outside temperature
function RC!(du, u, p, t)
    x = u[1]
    A, B, C, D = p
    P = ext_flow(t)
    du[1] = (1 / A) * (B * P - C + D * (20 - x))
end

u0= [20.0]



tspan = (0.0, 1000.0)
p = [66.896, 50e6, 100.0, 0.2]   # A, B, C, D

prob = ODEProblem(RC!, u0, tspan, p)
ode_data = Array(solve(prob, Tsit5(), saveat=tsteps))   # "ground-truth" data generated from the RC model

# Neural network that will stand in for the (20 - x) term of the RC model
const nn = Lux.Chain(Base.Fix1(broadcast, cos),
    Lux.Dense(1 => 32, cos; init_weight=truncated_normal(; std=1e-4)),
    Lux.Dense(32 => 32, cos; init_weight=truncated_normal(; std=1e-4)),
    Lux.Dense(32 => 1; init_weight=truncated_normal(; std=1e-4)))
ps, st = Lux.setup(MersenneTwister(), nn)

const params = ComponentArray{Float64}(ps)   # trainable parameters as a flat Float64 vector

# Neural ODE: A, B, C, D are kept from the physical model; the network replaces (20 - x)
function ODE_model(u, nn_params, t)
    P = ext_flow(t)
    A, B, C, D = p
    return (1 / A) .* (B .* P .- C .+ D .* first(nn([first(u)], nn_params, st)))
end

prob_nn = ODEProblem(ODE_model, u0, tspan, params)
soln_nn = Array(solve(prob_nn,saveat=tsteps))

plot(soln_nn[1,:], label="NN")
display(plot!(ode_data[1,:], label="ODE"))

function loss(θ)
    pred = Array(solve(prob_nn, Tsit5(); u0, p=θ, saveat=tsteps))
    loss = sqrt(mean(abs2.(ode_data[1,:] .- pred[1,:])))
    return loss, pred
end

loss(params)

const losses = Float64[]

function callback(θ, l, pred)
    push!(losses, l)
    println("Training || Iteration: $(length(losses)) || Loss: $(l)")
    return false
end

adtype = Optimization.AutoZygote()
optf = Optimization.OptimizationFunction((x, p) -> loss(x), adtype)
optprob = Optimization.OptimizationProblem(optf, params)
res = Optimization.solve(optprob,Optimisers.Adam(); callback, maxiters=30)
optprob2 = remake(optprob, u0 = res.u)
res2 = Optimization.solve(optprob2,
    BFGS(; initial_stepnorm=0.01, linesearch=LineSearches.MoreThuente());
    callback, maxiters=1000)

plot(losses)
display(plot!(xlabel="Iteration", ylabel="Loss"))

prob_nn = ODEProblem(ODE_model, u0, tspan, res2.u)
soln_nn =Array(solve(prob_nn, Tsit5(); u0, p=res2.u, saveat=tsteps))

plot(soln_nn[1,:], label="NN trained")
display(plot!(ode_data[1,:], label="ODE"))

# Held-out test window: rows end_param..end_test of the same file
end_test = 20001
test_data   = df[end_param:10:end_test, 2]
tsteps_test = df[end_param:10:end_test, 1]

gas_flow_test = LinearInterpolation(test_data, tsteps_test)
ext_flow_test(t) = gas_flow_test(t)


# Same RC model, driven by the test-window gas flow
function RC_test(du, u, p, t)
    x = u[1]
    A, B, C, D = p
    P = ext_flow_test(t)
    du[1] = (1 / A) * (B * P - C + D * (20 - x))
end


tspan_test = (tspan[2], 2000.0)   # test horizon starts where training ended
u1 = [ode_data[1, end]]           # initial condition: last state of the training data

function ODE_model_test(u, nn_params, t)
    P = ext_flow_test(t)
    A, B, C, D = p
    return (1 / A) .* (B .* P .- C .+ D .* first(nn([first(u)], nn_params, st)))
end

prob_test = ODEProblem(RC_test, u1, tspan_test, p)
ode_data_test = Array(solve(prob_test, saveat=tsteps_test))

prob_nn_test = ODEProblem(ODE_model_test, u1, tspan_test, res2.u)
soln_nn_test = Array(solve(prob_nn_test, Tsit5(); u0=u1, p=res2.u, saveat=tsteps_test))   # start the test prediction from u1, not the training u0

plot(tsteps,ode_data[1,:], label="ode_data_train")
plot!(tsteps_test,ode_data_test[1,:], label="ode_data_test")
plot!(tsteps,soln_nn[1,:], label="NN trained_train")
display(plot!(tsteps_test,soln_nn_test[1,:], label="NN trained_test"))

RMSE_test=sqrt(mean(abs2.(ode_data_test[1,:] .- soln_nn_test[1,:])))

[plots omitted: "PREDICTION" and "NN_TRAINED"]

Now in my code I am using a constant outside temperature of 20 °C, a simpler case than last time.

If it fits the training data but doesn’t extrapolate well, then it’s not the numerics that’s an issue but the choice of model. Neural ODEs (and machine learning in general) don’t extrapolate well, so I’m not sure this is surprising.

I didn’t understand, Chris; can you explain?

Neural ODEs can fit the data while not extrapolating correctly. They are inherently overparameterized so they can learn to fit the data with an incorrect model. Use a better model for better predictions.

Should I try to change the neural ODE model? That is, should I change the function that defines my neural ODE?

For short prediction horizons it works, but after a while it does not.

Yes, not surprising; see some of my talks on it:

Looking at my model, do you find anything strange? Do you suggest any particular modifications? Thanks for the link, I will look at it now.