Hi all,
I need your advice to use automatic differentiation with Optimization.jl
.
The problem - I try to find the optimal variance of a process noise covariance matrix of a standard Kalman filter. Currently, I use gradient-free methods and it works well. However, I would like to use gradient-based methods for the sake of comparison.
MWE - Below is a representative example of the code I have implemented. (Sorry, it is quite long, but I don’t find a way to reduce it).
Code implementing the filter
struct StateSpace{T <: Real}
A::Matrix{T}
C::Matrix{T}
end
struct InitialConditions{T <: Real}
x0::Vector{T}
P0::Matrix{T}
end
struct FilterProblem{T <: Real}
y::Vector{T}
ss::StateSpace{T}
ic::InitialConditions{T}
R::T
end
struct FilterSolution{T <: Real}
x::Matrix{T}
P::Vector{Matrix{T}}
end
function solve(prob::FilterProblem)
# Unpack the problem
(; y, ss, ic, R) = prob
(; A, C) = ss
(; x0, P0) = ic
# Preallocate the state and covariance matrices
nt = length(y)
ny, nx = size(C)
x = similar(x0, nx, nt)
P = [similar(P0) for _ in 1:nt]
xpred = similar(x0)
Ppred = similar(P0)
Kk = similar(x0, nx, ny)
T = similar(P0)
# Initial conditions
x[:, 1] .= x0
P[1] .= P0
optf = OptimizationFunction(optimQ!, Optimization.AutoForwardDiff())
probQ = OptimizationProblem(optf, [-2., -2.], prob, lb = [-20., -20.], ub = [2., 2.])
θ = Optimization.solve(probQ, Optimization.LBFGS()).u
Q = Diagonal(θ)
# Kalman filter algorithm
for k in 2:nt
# Prediction step
xpred .= A*x[:, k - 1]
Ppred .= A*P[k-1]*A' .+ Q
# Update step
ik = y[k] .- C*xpred
Sk = C*Ppred*C' .+ R
Kk = Ppred*(C'/Sk)
T .= I - Kk*C
x[:, k] .= xpred .+ Kk*ik
P[k] .= T*Ppred*T' .+ Kk*R*Kk'
end
return FilterSolution(x, P)
end
function optimQ!(θ, p)
# Unpack the problem
(; y, ss, ic, R) = p
(; A, C) = ss
(; x0, P0) = ic
# Preallocate the state and covariance matrices
nt = length(y)
ny, nx = size(C)
x = similar(x0)
P = similar(P0)
xpred = similar(x0)
Ppred = similar(P0)
Kk = similar(x0, nx, ny)
T = similar(P0)
# Initial conditions
x .= x0
P .= P0
Q = Diagonal(10 .^θ)
# Kalman filter algorithm
J = zero(eltype(x0))
for k in 2:nt
# Prediction step
xpred .= A*x
Ppred .= A*P*A' .+ Q
# Update step
ik = y[k] .- C*xpred
Sk = C*Ppred*C' .+ R
Kk = Ppred*(C'/Sk)
T .= I - Kk*C
x .= xpred .+ Kk*ik
P .= T*Ppred*T' .+ Kk*R*Kk'
J += logdet(Sk) + ik'*(Sk\ik)
end
return J
end
Script - Estimation of the state vector of a spring-mass-damper
using LinearAlgebra, Optimization
using ForwardDiff
includet("./KFilter.jl")
# Model parameters
m = 1. # mass
ω0 = 10π # natural angular frequency
ξ = 0.01 # damping ratio
# Initial conditions
x0 = 1.
v0 = -2.
# Time vector
t = 0:0.01:10.
# Noiseless free response
Ω0 = ω0*√(1 - ξ^2)
Ai = x0
Bi = (v0 + ξ*ω0*x0)/Ω0
y = (Ai*cos.(Ω0*t) .+ Bi*sin.(Ω0*t)).*exp.(-ξ*ω0*t)
v = Ω0*(Bi*cos.(Ω0*t) .- Ai*sin.(Ω0*t)).*exp.(-ξ*ω0*t) .- ξ*ω0*y
# Add noise to the measurements
σ = 0.1
yn = y .+ σ*randn(eltype(y), length(t))
# State-space model
Ac = [0. 1.; -ω0^2 -2ξ*ω0]
Cc = [1. 0.]
Ad = exp(Ac*step(t))
Cd = Cc
ss = StateSpace(Ad, Cd)
ic = InitialConditions([x0, v0], 1e-10*[1. 0.; 0. 1.])
R = σ^2
# Kalman filter
prob = FilterProblem(yn, ss, ic, R)
x = solve(prob).x
Some results
-
With
ForwardDiff.jl
(Optimization.AutoForwardDiff()
), I get the standard errorERROR: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{typeof(optimQ!), Float64}, Float64, 2}) The type `Float64` exists, but no method is defined for this combination of argument types when trying to construct it.
As you can see in the code, I tried to make the code generic (it is type stable), but it doesn’t work and I don’t know why
-
With
Zygote.jl
(Optimization.AutoZygote()
), I get this error:Mutating arrays is not supported -- called copyto!(Matrix{Float64}, ...) This error occurs when you ask Zygote to differentiate operations that change the elements of arrays in place (e.g. setting values with x .= ...)
I understand this error, because I use preallocations and broadcasting (but how to avoid this).
-
With
Enzyme.jl
(Optimization.AutoEnzyme()
), I get this warning:TODO forward zero-set of memory copy used memset rather than runtime type
and finally this error:
ERROR: Enzyme compilation failed due to illegal type analysis. This usually indicates the use of a Union type, which is not fully supported with Enzyme.API.strictAliasing set to true [the default]. Ideally, remove the union (which will also make your code faster), or try setting Enzyme.API.strictAliasing!(false) before any autodiff call. To toggle more information for debugging (needed for bug reports), set Enzyme.Compiler.VERBOSE_ERRORS[] = true (default false)
The only
Union
I have is the signature of the for loop.
Finally, for the sake of the completeness, here are some extra info:
Julia Version 1.11.5
Commit 760b2e5b73 (2025-04-14 06:53 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Windows (x86_64-w64-mingw32)
CPU: 8 × Intel(R) Core(TM) i7-10610U CPU @ 1.80GHz
WORD_SIZE: 64
LLVM: libLLVM-16.0.6 (ORCJIT, skylake)
Threads: 8 default, 0 interactive, 4 GC (on 8 virtual cores)
Environment:
JULIA_EDITOR = code
[7da242da] Enzyme v0.13.41
[f6369f11] ForwardDiff v0.10.38
[7f7a1694] Optimization v4.2.0
[e88e6eb3] Zygote v0.7.7
If you have any advice for helping me to make it work, it would be great!
Thank you (and sorry for the long topic)