[ANN] QuadraticKalman.jl - Quadratic State-Space Filtering

I’m excited to announce the first release of QuadraticKalman.jl (my first package!), a Julia package that implements Kalman filtering and smoothing for state-space models with quadratic measurement equations.

Overview

QuadraticKalman.jl implements the Quadratic Kalman Filter of Monfort et al. (2015, Journal of Econometrics) for the following state-space model with Gaussian noise:

State equation:

X_{t} = \mu + \Phi X_{t-1} + \Omega \epsilon_t, \quad \epsilon_t \sim \mathcal{N}(\mathbf{0},\mathbb{I}_N)

Measurement Equation:

Y_t = A + B X_t + \alpha Y_{t-1} + \sum_{i=1}^M e_i X_t^\prime C_i X_t + D \eta_t, \quad \eta_t \sim \mathcal{N}(\mathbf{0}, \mathbb{I}_M), e_i := \mathbb{I}_M[:,i]
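
To make the notation concrete, here is a quick plain-Julia simulation of one step of the model (my own sketch with made-up dimensions and parameter values, not the package’s API):

```julia
using LinearAlgebra

# One step of the state and measurement equations above (illustrative values).
N, M = 2, 1                                  # state and measurement dimensions
mu, Phi, Omega = zeros(N), 0.9 * Matrix(I, N, N), 0.1 * Matrix(I, N, N)
A, B, alpha = zeros(M), ones(M, N), fill(0.2, M, M)
C = [fill(0.5, N, N)]                        # one symmetric C_i per measurement dim
D = 0.05 * Matrix(I, M, M)

X_prev, Y_prev = zeros(N), zeros(M)
X = mu + Phi * X_prev + Omega * randn(N)                 # state equation
quad = [X' * C[i] * X for i in 1:M]                      # e_i picks out entry i
Y = A + B * X + alpha * Y_prev + quad + D * randn(M)     # measurement equation
```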

The Quadratic Kalman Filter improves on naively bolting quadratic terms onto the classic Kalman Filter by preserving the covariance structure of the linearized (augmented) state-space. Although less flexible than the EKF, UKF, and Quadrature Kalman Filter, the QKF is computationally cheaper and more accurate when the measurement equation is known to be quadratic.
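
Concretely, the construction in the paper (as I read it) exploits the identity X_t^\prime C_i X_t = \mathrm{vec}(C_i)^\prime \mathrm{vec}(X_t X_t^\prime): stacking the state with its vectorized outer product makes the measurement equation linear in the augmented state,

Z_t := \begin{pmatrix} X_t \\ \mathrm{vec}(X_t X_t^\prime) \end{pmatrix}, \quad Y_t = A + \tilde{B} Z_t + \alpha Y_{t-1} + D \eta_t

where the i-th row of \tilde{B} stacks the i-th row of B with \mathrm{vec}(C_i)^\prime, so a linear Kalman recursion can be run on Z_t.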

Key Features

  • Kalman filtering and smoothing for quadratic state-space models
  • Extends the original formulation with autoregressive measurement equations
  • Automatic differentiation of negative log-likelihood using ForwardDiff.jl
  • Visualization tools for filtered and smoothed states
  • Efficient parameter-model conversion for optimization
  • Automatic reparametrization for positive-definite covariance matrices (see the sketch after this list)
  • 9x–62x faster than an equivalent R implementation
  • Numerically stable implementation
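
As a sketch of the reparametrization and AD bullets (the general technique, not the package’s actual code): map an unconstrained vector to a positive-definite covariance through a log-Cholesky transform, and ForwardDiff.jl differentiates straight through it.

```julia
using LinearAlgebra, ForwardDiff

# Map an unconstrained vector to a positive-definite covariance matrix.
# eltype(v) keeps ForwardDiff.Dual numbers flowing through the buffer.
function vec_to_cov(v, n)
    L = zeros(eltype(v), n, n)
    k = 1
    for j in 1:n, i in j:n
        L[i, j] = (i == j ? exp(v[k]) : v[k])   # exp keeps the diagonal positive
        k += 1
    end
    return L * L'                               # PD by construction
end

# Toy stand-in for a negative log-likelihood that involves a covariance.
nll(v) = logdet(vec_to_cov(v, 2)) + tr(vec_to_cov(v, 2))
g = ForwardDiff.gradient(nll, randn(3))         # n(n+1)/2 = 3 free parameters
```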

Future Planned Features

  • A mutating version of the filter optimized for memory allocations
  • Extend the filter for state-dependent noise
  • Compatibility with ReverseDiff.jl

Edit: Changed R comparison after properly benchmarking

Congratulations, this looks very cool. The example docs look great! I like the nice plotting of the uncertainty.

8.8x faster than equivalent R implementation

Normally I’d expect R to be 100x slower, because this kind of iterative matrix multiply-add is usually pretty slow there. Do you know if that version is calling a C library? The equations look well-suited to Julia, which should then be about the same speed as C. Or is there something in the computation that’s bogging down Julia?

Thanks @apo383! I appreciate the compliments and feedback.

I agree 8.8x is a little slow. For the initial release I optimized the non-mutating (out-of-place) version of the filter and smoother for readability, design, and AD-friendliness, and the 8.8x result is for that version. Once I get around to optimizing the mutating version, it should be much faster. I will then need another iteration to make that code AD-friendly, which will likely mean getting it to play well with Enzyme.jl. I tried it once through Enzyme and it crashed, but I’ve done some related work previously and think I can get it to work.

The key bottleneck I faced is a robustness technique that corrects slightly non-PSD covariance update matrices: set the negative eigenvalues to 0 and reconstruct the matrix. Currently I maintain AD-friendliness through that routine by using ForwardDiff.jl, which can handle eigen(Symmetric(A)).
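
In code, that repair looks roughly like this (my own sketch, with a hypothetical helper name rather than the package’s exact routine):

```julia
using LinearAlgebra

# Clip negative eigenvalues of a nearly-PSD symmetric matrix to zero and
# reconstruct; ForwardDiff can differentiate through eigen(Symmetric(A)).
function nearest_psd(A)
    vals, vecs = eigen(Symmetric(A))
    return Symmetric(vecs * Diagonal(max.(vals, 0)) * vecs')
end
```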

I’d definitely welcome tips on how to write code that is both AD-friendly and limits memory allocation.

That does sound like a likely bottleneck that probably swamps everything else. I briefly looked at the paper you referenced, and the algorithm equations look pretty straightforward, so the eigenvalue fixing seems like the only demanding issue.

I imagine the R code is using BLAS or something, but have no idea how they might be doing AD, which I thought was quite limited in R.

I briefly looked through QuadraticKalman.jl and saw a function d_eigen, but couldn’t find it used anywhere. Are d_eigen and AD for future use? You may find DifferentiationInterface.jl helpful for trying out various approaches.
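
For example, it lets you swap AD backends by changing a single argument (minimal sketch with a toy objective):

```julia
using DifferentiationInterface
import ForwardDiff

f(x) = sum(abs2, x)                            # toy stand-in for the objective
g = gradient(f, AutoForwardDiff(), randn(3))   # swap the backend to try others
```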

Probably a question for the original authors, but I wonder why they had to do the eigenvalue trick. The usual way to avoid it is a square-root filter, which updates the covariance square root so that the covariance is always PSD by construction. I’m imagining that they couldn’t formulate it that way.
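
The idea, as a minimal sketch of a standard square-root prediction step (not specific to the QKF):

```julia
using LinearAlgebra

# Propagate a Cholesky factor S (P = S*S') instead of P itself, so the
# reconstructed covariance is PSD by construction.
function sqrt_predict(S, Phi, Omega)
    R = qr([S' * Phi'; Omega']).R   # M'M = Phi*P*Phi' + Omega*Omega' = R'R
    return R'                       # new (lower-triangular up to sign) factor
end

S = cholesky([1.0 0.3; 0.3 1.0]).L
Phi, Omega = [0.9 0.0; 0.1 0.8], 0.1 * Matrix(I, 2, 2)
S1 = sqrt_predict(S, Phi, Omega)
S1 * S1' ≈ Phi * (S * S') * Phi' + Omega * Omega'   # true
```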

So I implemented some proper benchmarking and found I was underselling myself. For some reason the speedup is only around 10x in the 2-dimensional case, but it is around 60x in the scalar case and 25x in the 5-dimensional case.

I think a square-root version of the filter would have a much larger augmented state-space, which might pose a challenge for the R code. It would be interesting to see whether Julia could handle it with reasonable speed.

And yes, d_eigen is an artifact of an older version of the filter that I tried to make work with Enzyme.jl.