DataDrivenDiffEq input question

I was playing with sparse regression in DataDrivenDiffEq.jl and noticed that passing a differential equation solution object gives wildly different answers from passing the discrete data contained in that solution. The ODE solution object gives the answer I would expect, whereas the raw discrete data gives an incorrect one. I’m pretty sure the ODE solution has some kind of interpolator built in, but I could also just be passing the discrete data incorrectly. Any help would be appreciated, thanks!

ddprob = DataDrivenProblem(sol)

vs.

ddprob = DataDrivenProblem(sol[:,:],t=sol.t)

ODE given by:

function newton(u,p,t)
    x, v = u
    dx = v
    dv = (1/mass)*F_model(x,p)
    return [dx, dv]
end

Yes, it can use interpolation, see here.

By default, the solution is evaluated at all of its time points using the underlying function of the differential equation system.

The case where you pass the raw data produces a DiscreteDataDrivenProblem, because you do not use ContinuousDataDrivenProblem as the constructor. Since you are trying to learn an ODE, you need to provide the derivatives, e.g. by using DX = Array(sol(sol.t, Val{1})). The constructor that takes the solution is able to infer that you are using a continuous solution and handles this for you.

Ok cool thanks!

Forgive my ignorance, but I’m still getting rather different answers even after I change the code. The first method, just passing in the ODE solution, gives 3 parameters, but the code below gives me a solution with 42 parameters. Looking at the source code, the difference is in the method used to get DX. In the source you linked, lines 489-497 just evaluate the ODE at X to get the derivative. Is there some way to match the accuracy of that derivative without knowing the form of the ODE? I think a UDE would probably work, but I’d like to avoid that much machinery if possible.

X = Array(sol)
DX = Array(sol(sol.t, Val{1}))
ddprob = ContinuousDataDrivenProblem(X, sol.t, DX=DX, InterpolationMethod(), p = sol.prob.p)

You don’t need to specify the interpolation method here. Can you provide me with a more in-depth example I can tweak? The right signature for a ContinuousDataDrivenProblem takes X, t, DX as positional arguments (no DX =).
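A minimal sketch of that positional call, assuming OrdinaryDiffEq and DataDrivenDiffEq are installed. The oscillator `f`, its parameter, and the sampling grid `ts` are made up for illustration and are not the system from this thread. Since `saveat` turns off dense output by default, the sketch samples a dense solution instead; whether that accounts for the difference seen above is an assumption, not something confirmed here.

```julia
using OrdinaryDiffEq, DataDrivenDiffEq

# Hypothetical toy system, used only to have a solution object to sample.
f(u, p, t) = [u[2], -p[1] * u[1]]
prob = ODEProblem(f, [1.0, 0.0], (0.0, 2.0), [1.5])
sol = solve(prob, Tsit5())          # dense output is kept (no saveat)

ts = collect(0.0:0.01:2.0)          # sampling times chosen for this sketch
X  = Array(sol(ts))                 # states, one column per time point
DX = Array(sol(ts, Val{1}))         # first derivative of the solver's interpolant

# X, t and DX are passed positionally, as described above.
ddprob = ContinuousDataDrivenProblem(X, ts, DX)
```

Both `X` and `DX` end up as matrices with one row per state and one column per entry of `ts`.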

For the system I am using now I have data for DX, so this is purely curiosity. I was thinking about a future case where I cannot define an ODE, and what the best way to get DX would be then. I pasted an example below. In Method 1 the source code automatically detects that there is a way to calculate DX, but in Method 2 the source code uses interpolation via DX = Array(sol(sol.t, Val{1})). Method 2 gives the wrong answer in this case, and I was wondering whether there is a method that is better than interpolation but does not require knowing all of the physics behind the problem. I’m pretty sure a UDE replacing F_model is one way, but I did not want to do that if it could be avoided. I also thought about using ApproxFun with the basis defined for the DataDrivenProblem, but wanted to see if you had any thoughts.

Thanks!

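# Assumed packages; newer DataDrivenDiffEq releases may also need DataDrivenSparse for STLSQ
using OrdinaryDiffEq, DataDrivenDiffEq, ModelingToolkit

const mass = 1.0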
function F_model(x, params)
    return -1*(params[1] + 2*params[2]*x + 3*params[3]*(x^2) + 4*params[4]*(x^3))
end

function newton(u,p,t)
    x, v = u
    dx = v
    dv = (1/mass)*F_model(x,p)
    return [dx, dv]
end

u0 = [1.0;0.0]
tspan = (0.0,2.0)
dt = 0.01
true_params = [1.469,-2*0.057,0.0,0.0]
prob = ODEProblem(newton,u0,tspan,true_params)
sol = solve(prob, Tsit5(), saveat = dt)

Method 1: Using ODE to get DX

ddprob = DataDrivenProblem(sol)

@variables t x(t) v(t)
u = [x; v]
basis = Basis(polynomial_basis(u, 5), u, iv = t)
opt = STLSQ(exp10.(-10:0.1:-1))
ddsol = solve(ddprob, basis, opt, options = DataDrivenCommonOptions(digits = 1))
println(get_basis(ddsol))

Method 2: Interpolation

X = Array(sol)
ddprob = ContinuousDataDrivenProblem(X, sol.t)

@variables t x(t) v(t)
u = [x; v]
basis = Basis(polynomial_basis(u, 5), u, iv = t)
opt = STLSQ(exp10.(-10:0.1:-1))
ddsol = solve(ddprob, basis, opt, options = DataDrivenCommonOptions(digits = 1))
println(get_basis(ddsol))
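
On the open question of getting DX from sampled data alone, one generic baseline is a finite-difference estimate. The sketch below is plain Julia and is not what DataDrivenDiffEq does internally. The function name `finite_difference_derivative` is hypothetical, and the sine/cosine data is made up so the result can be checked against a known derivative.

```julia
# Central differences in the interior (second order on a uniform grid),
# one-sided differences at the two end points.
# X holds one row per state and one column per time point in t.
function finite_difference_derivative(X::AbstractMatrix, t::AbstractVector)
    n = length(t)
    size(X, 2) == n || throw(DimensionMismatch("X needs one column per time point"))
    n >= 2 || throw(ArgumentError("need at least two time points"))
    DX = similar(X, float(eltype(X)))
    DX[:, 1] = (X[:, 2] - X[:, 1]) / (t[2] - t[1])
    for k in 2:n-1
        DX[:, k] = (X[:, k+1] - X[:, k-1]) / (t[k+1] - t[k-1])
    end
    DX[:, n] = (X[:, n] - X[:, n-1]) / (t[n] - t[n-1])
    return DX
end

# Made-up test data with a known derivative: x = sin(t), v = cos(t).
t = collect(0.0:0.01:2.0)
X = permutedims(hcat(sin.(t), cos.(t)))   # 2 × length(t)
DX = finite_difference_derivative(X, t)

# Largest interior error of the estimated d/dt sin(t) against cos(t).
max_err = maximum(abs.(DX[1, 2:end-1] .- cos.(t[2:end-1])))
```

The resulting matrix could then be passed as the third positional argument, ContinuousDataDrivenProblem(X, t, DX). Finite differences amplify measurement noise, so for noisy data some smoothing or a fitted interpolant is likely the better starting point.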