Identification using LowLevelParticleFilters.jl

I am looking for an example (or some guidance) on how to fit a Kalman smoother to data via log-likelihood maximization using LowLevelParticleFilters.jl.

I already have a model structure and have collected a dataset. It is a fairly simple 3-state LTI system, of which I can measure two states. It models the (very slow) thermal dynamics of a building in the presence of several large disturbances. I have decently accurate data on a few of these, like the ambient temperature and the solar irradiance, but others, like heat gains from people and electronics, are still missing. (It is actually very similar to the setup described in the tutorials by Baggepinnen.)

Now my goals are as follows:

  • Calibrate the model parameters for 24 hour simulation performance. I will use this in an MPC context.
  • Calibrate the process and measurement noise covariance matrices.

I augmented the state-space model with an integrating disturbance, following: Disturbance gallery · LowLevelParticleFilters Documentation

I have already followed Baggepinnen's YouTube tutorials, read through the documentation, and followed the tutorials on JuliaHub.

First, I understand I need to discretize the linear model. As described in How to Tune a Kalman Filter: Step-by-Step Guide | JuliaHub, we can use the matrix exponential for this. I implemented the discrete dynamics as follows:
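
As a reminder of why the code below builds a 7×7 block matrix: for zero-order hold, the discrete matrices drop out of a single matrix exponential of the stacked continuous-time matrices (the Van Loan construction),

$$
\exp\!\left(\begin{bmatrix} A & B \\ 0 & 0 \end{bmatrix} T_s\right)
= \begin{bmatrix} A_d & B_d \\ 0 & I \end{bmatrix}.
$$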

function setup_filter(p::Vector{T}) where {T}
    # unpack parameters
    C_i, C_e, C_h, R_ia, R_ie, R_ea, R_ih, A_e, A_w = p[1:9]
    σi, σe, σh, σq = p[10:13] # process noise std dev
    σmi, σmh = p[14:15] # measurement noise std dev (interior thermostat, supply temp)
    Ti0, Te0, Th0 = p[16:18] # initial state 

    # continuous-time state-space matrices (3x3 system)
    A_cont = @SMatrix [
        (-1/R_ia-1/R_ie-1/R_ih)/C_i 1/(C_i*R_ie) 1/(C_i*R_ih);
        1/(C_e*R_ie) (-1/R_ea-1/R_ie)/C_e 0;
        1/(C_h*R_ih) 0 -1/(C_h*R_ih)
    ]
    B_cont = @SMatrix [
        1/(C_i*R_ia) 0 A_w/C_i;
        1/(C_e*R_ea) 0 A_e/C_e;
        0 1/C_h 0
    ]
    G = @SMatrix [1 / C_i; 0; 0]

    # x_aug = [Ti, Te, Th, q]' where q is the integrating disturbance
    A_cont_aug = @SMatrix [
        A_cont[1, 1] A_cont[1, 2] A_cont[1, 3] G[1];
        A_cont[2, 1] A_cont[2, 2] A_cont[2, 3] G[2];
        A_cont[3, 1] A_cont[3, 2] A_cont[3, 3] G[3];
        0.0 0.0 0.0 0.0
    ]

    B_cont_aug = @SMatrix [
        B_cont[1, 1] B_cont[1, 2] B_cont[1, 3];
        B_cont[2, 1] B_cont[2, 2] B_cont[2, 3];
        B_cont[3, 1] B_cont[3, 2] B_cont[3, 3];
        0.0 0.0 0.0
    ]

    # stack [A B; 0 0] for exact (zero-order-hold) discretization below
    AB_cont = zeros(T, 7, 7) # eltype T so ForwardDiff duals can propagate
    AB_cont[1:4, 1:4] = Matrix(A_cont_aug)
    AB_cont[1:4, 5:7] = Matrix(B_cont_aug)

    # discretize
    AB_disc = exp(AB_cont * Ts)
    A_disc = SMatrix{4,4}(AB_disc[1:4, 1:4])
    B_disc = SMatrix{4,3}(AB_disc[1:4, 5:7])

    # y = C * x + 0 * u --> output is [Ti, Th]
    C = @SMatrix [1.0 0.0 0.0 0.0; 0.0 0.0 1.0 0.0]

    # process noise covariance R1
    R1 = Diagonal([σi^2 / C_i^2, σe^2 / C_e^2, σh^2 / C_h^2, σq^2 / C_i^2])

    # measurement noise covariance R2
    R2 = Diagonal([σmi^2, σmh^2])

    # initial state distribution d0
    d0 = MvNormal(T.([Ti0, Te0, Th0, 0.0]), Diagonal([1.0, 5.0, 5.0, 1.0]))

    return KalmanFilter(
        A_disc, B_disc, C, zeros(T, ny, nu), # ny = 2, nu = 3 (defined globally, as is Ts)
        R1, R2, d0; Ts, p
    )
end

# Re-initialize with corrected filter
p = [p_init[param] for param in
     ["C_i", "C_e", "C_h", "R_ia", "R_ie", "R_ea", "R_ih", "A_e", "A_w",
      "σi", "σe", "σh", "σq", "σmi", "σmh", "Ti0", "Te0", "Th0"]]

kf = setup_filter(p);
measure = LowLevelParticleFilters.measurement(kf)

The problem here is that it doesn’t seem to be possible to autodiff through this, because of the matrix exponential. When I replace the matrix exponential with a simple forward-Euler step (not ideal for accuracy), I still get autodiff complaints: the KalmanFilter constructor in LowLevelParticleFilters.jl internally calls eigvals(), which then fails.
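
For what it's worth, a hand-rolled matrix exponential via a truncated Taylor series with scaling and squaring should in principle be differentiable, since it only uses +, * and /. A minimal sketch (exp_taylor is just an illustrative name, and the defaults are untuned):

function exp_taylor(A::AbstractMatrix; squarings = 8, order = 12)
    X = A / 2^squarings        # scale down so the Taylor series converges quickly
    E = one(X) + X             # I + X
    term = X
    for k in 2:order
        term = term * X / k    # accumulates X^k / k!
        E = E + term
    end
    for _ in 1:squarings       # square back up: exp(A) = exp(A / 2^s)^(2^s)
        E = E * E
    end
    return E
end

But I would rather use something built in.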

Should I use c2d(ss(...)) as described here instead? Will that work with autodiff?

Discretization · LowLevelParticleFilters Documentation

I would appreciate some guidance on how to approach this problem.

You can use

c2d(ss(Ac, Bc, Cc, Dc), Ts, :tustin)

for AD compatibility, and then pass KalmanFilter(..., check=false) to turn off the check that calls eigvals in the KalmanFilter constructor. This check is only there to catch users forgetting to discretize their model, and it can safely be skipped when you know that you have discretized it.
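
For reference, the Tustin (bilinear) transform only requires a linear solve, which is what makes it AD-compatible. Modulo the exact input-scaling convention c2d uses, it computes

$$
A_d = \left(I - \tfrac{T_s}{2}A\right)^{-1}\left(I + \tfrac{T_s}{2}A\right), \qquad
B_d = \left(I - \tfrac{T_s}{2}A\right)^{-1} B\,T_s,
$$

with no eigen- or Schur decomposition involved.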


Thank you!

So this setup should in principle work?

function setup_filter(p::Vector{T}) where {T}
    # unpack parameters
    C_i, C_e, C_h, R_ia, R_ie, R_ea, R_ih, A_e, A_w = p[1:9]
    σi, σe, σh, σq = p[10:13] # process noise std dev
    σmi, σmh = p[14:15] # measurement noise std dev (interior thermostat, supply temp)
    Ti0, Te0, Th0 = p[16:18] # initial state 

    # continuous-time state-space matrices (3x3 system)
    A_cont = @SMatrix [
        (-1/R_ia-1/R_ie-1/R_ih)/C_i 1/(C_i*R_ie) 1/(C_i*R_ih);
        1/(C_e*R_ie) (-1/R_ea-1/R_ie)/C_e 0;
        1/(C_h*R_ih) 0 -1/(C_h*R_ih)
    ]
    B_cont = @SMatrix [
        1/(C_i*R_ia) 0 A_w/C_i;
        1/(C_e*R_ea) 0 A_e/C_e;
        0 1/C_h 0
    ]
    G = @SMatrix [1 / C_i; 0; 0]

    # x_aug = [Ti, Te, Th, q]' where q is the integrating disturbance
    A_aug = @SMatrix [
        A_cont[1, 1] A_cont[1, 2] A_cont[1, 3] G[1];
        A_cont[2, 1] A_cont[2, 2] A_cont[2, 3] G[2];
        A_cont[3, 1] A_cont[3, 2] A_cont[3, 3] G[3];
        0.0 0.0 0.0 0.0
    ]

    B_aug = @SMatrix [
        B_cont[1, 1] B_cont[1, 2] B_cont[1, 3];
        B_cont[2, 1] B_cont[2, 2] B_cont[2, 3];
        B_cont[3, 1] B_cont[3, 2] B_cont[3, 3];
        0.0 0.0 0.0
    ]

    # state-space discretization (Tustin only needs a linear solve, so it is AD-compatible)
    C = @SMatrix [1.0 0.0 0.0 0.0; 0.0 0.0 1.0 0.0]
    D = @SMatrix zeros(2, 3) # @SMatrix needs literal sizes; ny = 2, nu = 3
    ss_disc = c2d(ss(A_aug, B_aug, C, D), Ts, :tustin)
    A_disc = SMatrix{4,4}(ss_disc.A)
    B_disc = SMatrix{4,3}(ss_disc.B)

    # process noise covariance R1
    R1 = Diagonal([σi^2 / C_i^2, σe^2 / C_e^2, σh^2 / C_h^2, σq^2 / C_i^2]) + 1e-9I

    # measurement noise covariance R2
    R2 = Diagonal([σmi^2, σmh^2]) + 1e-9I

    # initial state distribution d0
    d0 = MvNormal(T.([Ti0, Te0, Th0, 0.0]), Diagonal([1.0, 5.0, 5.0, 1.0]))

    return KalmanFilter(
        A_disc, B_disc, C, D,
        R1, R2, d0; Ts, p, check=false
    )
end

# re-initialize with corrected filter
p = [p_init[param] for param in
     ["C_i", "C_e", "C_h", "R_ia", "R_ie", "R_ea", "R_ih", "A_e", "A_w",
      "σi", "σe", "σh", "σq", "σmi", "σmh", "Ti0", "Te0", "Th0"]]

kf = setup_filter(p);
measure = LowLevelParticleFilters.measurement(kf)

With cost function:

function cost(plog::Vector{T}) where T
    p = exp10.(plog)
    kf = setup_filter(p)
    try
        return -LowLevelParticleFilters.loglik(kf, u, y, p)
    catch
        return T(Inf)
    end
end


p0 = log10.(p)
cost(p0)

res = Optim.optimize(
    cost,
    p0,
    BFGS(linesearch=LineSearches.BackTracking()),
    Optim.Options(
        show_trace = true,
        show_every = 5,
        iterations = 1000,
        time_limit = 130,
        f_reltol   = 1e-3,
        f_abstol   = 1e-3,
    ),
    autodiff = :forward,
)

popt = exp10.(res.minimizer)

Unfortunately c2d is throwing an error:

ArgumentError: matrix contains Infs or NaNs

Stacktrace:
  [1] chkfinite
    @ ~/.julia/juliaup/julia-1.11.6+0.x64.linux.gnu/share/julia/stdlib/v1.11/LinearAlgebra/src/lapack.jl:105 [inlined]
  [2] getrf!(A::Matrix{Float64}, ipiv::Vector{Int64}; check::Bool)
    @ LinearAlgebra.LAPACK ~/.julia/juliaup/julia-1.11.6+0.x64.linux.gnu/share/julia/stdlib/v1.11/LinearAlgebra/src/lapack.jl:582
  [3] getrf!
    @ ~/.julia/juliaup/julia-1.11.6+0.x64.linux.gnu/share/julia/stdlib/v1.11/LinearAlgebra/src/lapack.jl:580 [inlined]
  [4] getrf!
    @ ~/.julia/juliaup/julia-1.11.6+0.x64.linux.gnu/share/julia/stdlib/v1.11/LinearAlgebra/src/lapack.jl:787 [inlined]
  [5] #lu!#182
    @ ~/.julia/juliaup/julia-1.11.6+0.x64.linux.gnu/share/julia/stdlib/v1.11/LinearAlgebra/src/lu.jl:91 [inlined]
  [6] lu!
    @ ~/.julia/juliaup/julia-1.11.6+0.x64.linux.gnu/share/julia/stdlib/v1.11/LinearAlgebra/src/lu.jl:90 [inlined]
  [7] lu!
    @ ~/.julia/juliaup/julia-1.11.6+0.x64.linux.gnu/share/julia/stdlib/v1.11/LinearAlgebra/src/lu.jl:89 [inlined]
  [8] _lu
    @ ~/.julia/juliaup/julia-1.11.6+0.x64.linux.gnu/share/julia/stdlib/v1.11/LinearAlgebra/src/lu.jl:347 [inlined]
  [9] lu(::Matrix{Float64}; kwargs::@Kwargs{})
    @ LinearAlgebra ~/.julia/juliaup/julia-1.11.6+0.x64.linux.gnu/share/julia/stdlib/v1.11/LinearAlgebra/src/lu.jl:341
 [10] lu
    @ ~/.julia/juliaup/julia-1.11.6+0.x64.linux.gnu/share/julia/stdlib/v1.11/LinearAlgebra/src/lu.jl:341 [inlined]
 [11] \(A::Matrix{Float64}, B::Matrix{Float64})
    @ LinearAlgebra ~/.julia/juliaup/julia-1.11.6+0.x64.linux.gnu/share/julia/stdlib/v1.11/LinearAlgebra/src/generic.jl:1132
 [12] c2d_x0map(sys::StateSpace{ControlSystemsBase.Continuous, Float64}, Ts::Float64, method::Symbol; w_prewarp::Int64)
    @ ControlSystemsBase ~/.julia/packages/ControlSystemsBase/qOZW0/src/discrete.jl:81
 [13] c2d_x0map
    @ ~/.julia/packages/ControlSystemsBase/qOZW0/src/discrete.jl:43 [inlined]
 [14] c2d(sys::StateSpace{ControlSystemsBase.Continuous, Float64}, Ts::Float64, method::Symbol)
    @ ControlSystemsBase ~/.julia/packages/ControlSystemsBase/qOZW0/src/discrete.jl:32
 [15] setup_filter(p::Vector{Float64})
    @ Main ~/dev/warmteplanner-model/scripts/jl_notebook_cell_df34fa98e69747e1a8f8a730347b8e2f_X62sdnNjb2RlLXJlbW90ZQ==.jl:39
 [16] cost(plog::Vector{Float64})
    @ Main ~/dev/warmteplanner-model/scripts/jl_notebook_cell_df34fa98e69747e1a8f8a730347b8e2f_X65sdnNjb2RlLXJlbW90ZQ==.jl:16
 [17] value!!(obj::OnceDifferentiable{Float64, Vector{Float64}, Vector{Float64}}, x::Vector{Float64})
    @ NLSolversBase ~/.julia/packages/NLSolversBase/n7XXO/src/interface.jl:9
 [18] value!(obj::OnceDifferentiable{Float64, Vector{Float64}, Vector{Float64}}, x::Vector{Float64})
    @ NLSolversBase ~/.julia/packages/NLSolversBase/n7XXO/src/interface.jl:28
 [19] value!(obj::Optim.ManifoldObjective{OnceDifferentiable{Float64, Vector{Float64}, Vector{Float64}}}, x::Vector{Float64})
    @ Optim ~/.julia/packages/Optim/7krni/src/Manifolds.jl:31
 [20] (::LineSearches.var"#ϕ#1"{Optim.ManifoldObjective{OnceDifferentiable{Float64, Vector{Float64}, Vector{Float64}}}, Vector{Float64}, Vector{Float64}, Vector{Float64}})(α::Float64)
    @ LineSearches ~/.julia/packages/LineSearches/b4CwT/src/LineSearches.jl:24
 [21] (::BackTracking{Float64, Int64})(ϕ::LineSearches.var"#ϕ#1"{Optim.ManifoldObjective{OnceDifferentiable{Float64, Vector{Float64}, Vector{Float64}}}, Vector{Float64}, Vector{Float64}, Vector{Float64}}, αinitial::Float64, ϕ_0::Float64, dϕ_0::Float64)
    @ LineSearches ~/.julia/packages/LineSearches/b4CwT/src/backtracking.jl:63
 [22] BackTracking
    @ ~/.julia/packages/LineSearches/b4CwT/src/backtracking.jl:34 [inlined]
 [23] BackTracking
    @ ~/.julia/packages/LineSearches/b4CwT/src/backtracking.jl:24 [inlined]
 [24] perform_linesearch!(state::Optim.BFGSState{Vector{Float64}, Matrix{Float64}, Float64, Vector{Float64}}, method::BFGS{InitialStatic{Float64}, BackTracking{Float64, Int64}, Nothing, Nothing, Flat}, d::Optim.ManifoldObjective{OnceDifferentiable{Float64, Vector{Float64}, Vector{Float64}}})
    @ Optim ~/.julia/packages/Optim/7krni/src/utilities/perform_linesearch.jl:58
 [25] update_state!(d::OnceDifferentiable{Float64, Vector{Float64}, Vector{Float64}}, state::Optim.BFGSState{Vector{Float64}, Matrix{Float64}, Float64, Vector{Float64}}, method::BFGS{InitialStatic{Float64}, BackTracking{Float64, Int64}, Nothing, Nothing, Flat})
    @ Optim ~/.julia/packages/Optim/7krni/src/multivariate/solvers/first_order/bfgs.jl:143
 [26] optimize(d::OnceDifferentiable{Float64, Vector{Float64}, Vector{Float64}}, initial_x::Vector{Float64}, method::BFGS{InitialStatic{Float64}, BackTracking{Float64, Int64}, Nothing, Nothing, Flat}, options::Optim.Options{Float64, Nothing}, state::Optim.BFGSState{Vector{Float64}, Matrix{Float64}, Float64, Vector{Float64}})
    @ Optim ~/.julia/packages/Optim/7krni/src/multivariate/optimize/optimize.jl:65
 [27] optimize
    @ ~/.julia/packages/Optim/7krni/src/multivariate/optimize/optimize.jl:43 [inlined]
 [28] #optimize#77
    @ ~/.julia/packages/Optim/7krni/src/multivariate/optimize/interface.jl:274 [inlined]
 [29] top-level scope
    @ ~/dev/warmteplanner-model/scripts/jl_notebook_cell_df34fa98e69747e1a8f8a730347b8e2f_X65sdnNjb2RlLXJlbW90ZQ==.jl:29

I have also tried :fwdeuler. That does work and converges; however, the estimate is somewhat nonsensical.

I think that is related to the issue with the cost function here: QuanserInterface.jl/examples/pendulum_identification.jl at main · JuliaComputing/QuanserInterface.jl

The line search can sometimes take steps that are too long, testing parameters that are no good. I sometimes have luck with

using Optim, Optim.LineSearches
...
BFGS(alphaguess = LineSearches.InitialStatic(alpha=0.1))

or some other small value.

You could also move setup_filter inside the try-catch to guard against failures like that.
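
Something like this (a sketch; u, y and setup_filter as defined above):

function cost(plog::Vector{T}) where {T}
    p = exp10.(plog)
    try
        kf = setup_filter(p) # construction can also fail for poor parameters
        return -LowLevelParticleFilters.loglik(kf, u, y, p)
    catch
        return T(Inf) # signal to the optimizer that this point is no good
    end
end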

Thanks, I can try that.

If I want to optimize for 24 hour simulation performance, I guess I can’t use LowLevelParticleFilters.loglik because it will use all the measured data while filtering as well?

Optimization of a several-step prediction objective is rather expensive: you would essentially have to implement a manual loop that performs the filtering and, at each time step, performs 24 h worth of prediction to yield the predicted value to compare to data in the cost function. In other words, the O(N_{data}) complexity of one-step prediction minimization turns into O(N_{data} N_{pred}). This is certainly not impossible, and if you don't have too fast a sampling rate it might not be very slow either.
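
A rough sketch of what that could look like for your linear model, assuming a 15-minute sample time (96 steps = 24 h), the setup_filter and data u, y from above, and the filtered estimates returned by forward_trajectory (multistep_cost is just an illustrative name):

function multistep_cost(plog::Vector{T}; horizon = 96) where {T} # 96 × 15 min = 24 h
    p = exp10.(plog)
    kf = setup_filter(p)
    sol = forward_trajectory(kf, u, y, p)    # one filtering pass, O(N_data)
    A, B, C = kf.A, kf.B, kf.C               # discrete-time model matrices
    c = zero(T)
    for t in 1:length(y)-horizon             # O(N_data × N_pred) in total
        x = sol.xt[t]                        # filtered state estimate at time t
        for k in 0:horizon-1                 # open-loop simulation, no corrections
            x = A * x + B * u[t+k]
        end
        c += sum(abs2, y[t+horizon] - C * x) # 24 h ahead prediction error
    end
    return c
end

This scores only the endpoint of each horizon; one could also accumulate errors along the whole horizon.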

I think this method is described here already, if that is what you are referring to? Identification of unstable systems · ControlSystemIdentification Documentation

The sampling interval commonly used in the literature is only 15 minutes; I have even seen papers that use 30 or even 60 minutes. Should be feasible.

No, not quite. That page illustrates problems with simulating unstable systems, but your system is probably not unstable, so it's not relevant here. Pure simulation-error minimization also has O(N_{data}) complexity; it's just the combination of filtering and simulation (multi-step prediction) that happens to be expensive to compute.

I think I got confused somewhere along the way. I don’t think filtering is important when optimizing for simulation performance. When you’re simulating 24 hours into the future, you don’t have access to any measured outputs, nor to any measured disturbances besides forecasts like irradiance and ambient temperature. I think ‘tuning the covariance matrices’ only makes sense if you’re continuously filtering? I think I’m a bit lost :smiley:

If you have a data set that spans much more than 24 hours, you need to filter up until the point where each 24 h simulation starts. If you’re familiar with model-predictive control, think of the filtering as whatever leads up to the “current point in time”, and the simulation into the future as the simulation performed by the optimizer to optimize performance over the prediction horizon. It would be easier to explain this with a whiteboard, but I hope you get the picture: at each time point along the trajectory, you start a 24 h simulation from the current filter estimate.

Simulation in this context is the same as repeatedly running the prediction step of the Kalman filter, without running the correction step in which the measurements are incorporated.
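
In code terms, a sketch using the low-level interface (kf, u, p, t and horizon as before):

for k in 0:horizon-1
    predict!(kf, u[t+k], p) # time update only
end
# no correct!(kf, u[t+k], y[t+k], p) calls along the horizon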
