Eigenvalue sensitivites using autodiff

Hofsmo · April 26, 2022, 1:12pm

Hi,
I am quite new to both julia and autodiff and was wondering if someone can point me in the right direction. I have a dynamic system described using state space representation. I want to calculate the derivative of the eigenvalues of the system matrix A with respect to one parameter a_{12}.

The system matrix can have many states, but for testing I made a small matrix A=\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & 0\end{bmatrix}. For this system I can easily calculate the derivative of the eigenvalue analytically. However, this gets a lot harder when the system gets larger.

To test autodiff I made the following function.

using ForwardDiff
get_eig_1(a, b, c) = (-b+sqrt(Complex(b^2-4*a*c)))/(2*a)
f(x) = get_eig_1(1, a11, x*a_21) 
ForwardDiff.derivative(f, 0.7)

This does not work and from what I can tell it is because the function get_eig_1 returns a complex number. For this simple problem I guess autodiff does not make sense, but I am trying to understand what is going on and why it doesn’t work before trying something more difficult.

Is there a way to make autodifferentiation work for the function get_eig_1 and will it still be possible when the system matrix gets large?

Cheers Sigurd

antoine-levitt · April 26, 2022, 1:42pm

In this case it’s just an API problem: ForwardDiff.derivative(x -> real(f(x)), 0.7) works fine, and similarly for the imaginary part. There are open issues related to complex numbers on the forwarddiff issue tracker, but I don’t know if this one is covered.

Autodiff for eigenvalues of symmetric matrices should just work out of the box with forwarddiff since eigen and eigvals for Symmetric, Hermitian, and SymTridiagonal by dlfivefifty · Pull Request #513 · JuliaDiff/ForwardDiff.jl · GitHub. For non-symmetric you probably have to roll your own (and contribute it back!). It should work fine for large problems (as long as you can diagonalize the matrix of course, so not too large…)

Hofsmo · April 26, 2022, 2:17pm

Thanks, for the quick reply. It seems too work .

The system matrix A will in most cases not be symmetric. I guess then my approach would be to find a suitable method for calculaing the eigenvalues for my problem. Does it matter if the method is iterative or not?

antoine-levitt · April 29, 2022, 1:37pm

This will work out of the box after Support complex output in `derivative` by antoine-levitt · Pull Request #583 · JuliaDiff/ForwardDiff.jl · GitHub

will in most cases not be symmetric. I guess then my approach would be to find a suitable method for calculaing the eigenvalues for my problem. Does it matter if the method is iterative or not?

If your method is iterative, you cannot use the work above (since that assumes a full eigendecomposition). In that case you need to solve a linear system for the sensitivities (in quantum mechanics it’s sometimes known as a Sternheimer equation, I would imagine it has other names in other communities). In the symmetric case and for a single well-separated eigenpair (x,lambda), it looks like Q(A-lambda)Q dx = -Q dA x, with Q the orthogonal projector onto the orthogonal of span(x).

stevengj · April 29, 2022, 5:07pm

See e.g. section 4 of my course notes on adjoint methods for a tutorial introduction. (I hadn’t heard the term “Sternheimer equation” before, I’ll have to mention this in a future revision of the notes!)

However, since @Hofsmo only wants the derivative of the eigenvalue, not the eigenvector, note that in this case you not need to solve a linear system. If you have a (simple) eigenpair Ax = \lambda x and a corresponding left eigenvector A^T y = \lambda y (y = x for a symmetric A), then the derivative of the eigenvalue with respect to any parameter p of the matrix A(p) is:

\frac{\partial \lambda}{\partial p} = \frac{y^T \frac{\partial A}{\partial p} x}{y^T x}

(In physics, in the Hermitian case where y = \bar{x}, this is called the Hellmann–Feynman theorem.) So, given a “forward” solver to find x,y,\lambda (e.g. by Arnoldi or some other iterative method), the derivatives are just some dot products.

Hofsmo · May 2, 2022, 6:30am

Thanks for the good answers.

I just found the answer mentioned by stevengj in a book and it seems like it will solve my problem. In any case the answers realted to autodiff were just as interesting. I didn’t just want to solve my problem, but I also wanted to understand why it didn’t work with autodiff.

Thanks to both of you

Topic		Replies	Views
Native eigenvals for differentiable programming Machine Learning	23	4878	February 13, 2023
Computing matrix derivatives for eigenvaule sensitivity analysis Numerics autodiff , enzyme	6	351	April 12, 2024
Derivative of eigenvalues and eigenvectors of Hermitian Matrix by automatic differentiation General Usage question	19	4511	June 11, 2018
Automatic differentiation of f(x) constructed by decomposition of a matrix M(x) Optimization (Mathematical) differentiation , flux , machine-learning , numerics	7	2867	January 8, 2020
Forward mode Autodiff for Matrix multiplication General Usage autodiff	3	781	November 25, 2021

Eigenvalue sensitivites using autodiff

Related topics