Multivariate OU parameter estimation

jacob-roth · June 2, 2022, 5:50pm

I am interested in estimating the parameters M,D\in\mathbb{R}^{n\times n}, D\succeq0 of an OU process

dX_t = M(X_t-\mu)d_t + DdW_t

from observed data at frequency \Delta t.

Is there a generalized likelihood inference procedure in the SDE DiffEq package that is similar to the one used for DiffEqs (Generalized Likelihood Inference · DiffEqParamEstim.jl) but with a single dataset (i.e., if we only have observed one “run” of the SDE)?

Here is my approach so far (following Parameter Estimation for Stochastic Differential Equations and Ensembles · DiffEqParamEstim.jl), but it doesn’t seem to work, as the optimization procedure just returns the initial point:

using DifferentialEquations
using DiffEqParamEstim
using LinearAlgebra
using RecursiveArrayTools
using Optim
using PyPlot
fig, ax = subplots()

# parameters
M = [-1.0; 0.0; -1.0]
D = [1.0; 0.1; 1.0]
μ = [5.0; 10.0]
p = [vec(M); vec(D); vec(μ)]
dt = 0.001
tspan = (0.0, 1.0)

# drift and diffusion
f(du, u, p, t) = begin
  M = reshape([p[1:2]; p[2:3]], (2, 2))
  μ = p[7:8]
  du .= M * (u - μ)
end
g(du, u, p, t) = begin
  D = reshape([p[4:5]; p[5:6]], (2, 2))
  du .= D
end

Set up the problem

# problem
prob = SDEProblem(f, g, μ, tspan, p, noise_rate_prototype=zeros(2, 2))
sol = solve(prob, EM(), dt=dt)
ax.cla()
ax.plot(sol.u)

Set up the estimation

# data
t = collect(range(0, stop=1, length=200))
function generate_data(t)
  sol = solve(prob, EM(), dt=dt)
  randomized = VectorOfArray([(sol(t[i]) + 0.01randn(2)) for i in 1:length(t)])
  data = convert(Array, randomized)
end
ntraj = 200 # for now, doing more trajectories than a single run
aggregate_data = convert(Array, VectorOfArray([generate_data(t) for i in 1:ntraj]))

# estimation
p0 = [0.4vec([-1.0; 0.1; -1.0]); 0.4vec([1.0; 0.1; 1.0]); 0.4vec([1.0; 1.0])]
monte_prob = EnsembleProblem(prob)
# obj = build_loss_objective(monte_prob, SOSRA2(), L2Loss(t, aggregate_data),
#   maxiters=10000, verbose=false, trajectories=ntraj,
#   parallel_type=:threads)
obj = build_loss_objective(monte_prob, EM(), dt=dt, L2Loss(t, aggregate_data),
  maxiters=10000, verbose=false, trajectories=ntraj,
  parallel_type=:threads)
result = Optim.optimize(obj, p0, Optim.BFGS())

Show the result

pstar = Optim.minimizer(result)
hcat(p0, pstar, p)
julia> hcat(p0, pstar, p)
8×3 Matrix{Float64}:
 -0.4   -0.39996    -1.0
  0.04   0.0400358   0.0
 -0.4   -0.400066   -1.0
  0.4    0.40028     1.0
  0.04   0.0400151   0.1
  0.4    0.40048     1.0
  0.4    0.399804    5.0
  0.4    0.400003   10.0

mschauer · June 2, 2022, 6:46pm

Maybe you can have it easy: I longer time ago I implemented the transition density of a OU process in Bridge.jl. So you can just have a likelihood and use a sampler. Let’s generate some toy data:

using Bridge
using LinearAlgebra
M = -I + 0.1*randn(5, 5)
@assert all(real.(eigvals(M)) .< 0) # otherwise explosion
D = randn(5, 5)
β = zeros(5)
P = LinPro(M, β, D)
x0 = zeros(5)
tt = collect(0:0.4:100.0)

# sorry some fixes for out of date code
Bridge.cholupper(a) = cholesky(Symmetric(a)).U
Bridge.rand(P::Bridge.Gaussian{<:Vector}) = P.μ + Bridge.cholupper(P.Σ)'*randn(eltype(P.μ), length(P.μ))

function generate(tt, x0, P)
    x = copy(x0)
    yy = [x]
    for i in eachindex(tt)[2:end]
        x = rand(transitionprob(tt[i-1], x, tt[i], P))
        push!(yy, x)
    end
    tt, yy
end
tt, yy = generate(tt, x0, P)

Now we can evaluate the log-likelihood for any M and D:

function loglikelihood(tt, yy, P)
    ll = 0.0
    for i in eachindex(tt)[2:end]
        ll += lp(tt[i-1], yy[i-1], tt[i], yy[i], P)
    end
    ll
end
loglikelihood(tt, yy, LinPro(M, β, D))

ChrisRackauckas · June 3, 2022, 12:33am

Indeed if it’s just an OU, using Bridge.jl in a way that’s specialized directly for OU will be much more efficient than using something for general stochastic differential equations. That’s definitely the right way to go.

Topic		Replies	Views
Simulation and Inference for Stochastic Processes General Usage question	3	1230	March 19, 2019
Losses for parameter estimation of Neural SDE Machine Learning sde	15	1752	March 1, 2021
ANN: Bridge - A statistical toolbox for diffusion processes Community package , announcement , statistics	1	1261	August 16, 2017
Estimating the parameters of ODEs with multiple initial conditions using Bayesian methods Modelling & Simulations question , diffeq	2	515	October 7, 2020
2D Lotka Volterra parameter estimation with only 1D data New to Julia diffeq , optim	7	1894	February 6, 2020

Multivariate OU parameter estimation

Related topics