Hi,
I am quite new to both Julia and autodiff and was wondering if someone could point me in the right direction. I have a dynamic system described by a state-space representation, and I want to calculate the derivative of the eigenvalues of the system matrix A with respect to one parameter a_{12}.
The system matrix can have many states, but for testing I made a small matrix A=\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & 0\end{bmatrix}. For this system I can easily calculate the derivative of the eigenvalue analytically. However, this gets a lot harder when the system gets larger.
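(For reference, the analytic result in this 2\times 2 case: \det(A - \lambda I) = \lambda^2 - a_{11}\lambda - a_{12}a_{21} = 0, so \lambda = \frac{a_{11} \pm \sqrt{a_{11}^2 + 4 a_{12} a_{21}}}{2} and \frac{\partial \lambda}{\partial a_{12}} = \pm\frac{a_{21}}{\sqrt{a_{11}^2 + 4 a_{12} a_{21}}}.)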
To test autodiff I made the following function.
using ForwardDiff

# Eigenvalue from the characteristic polynomial a*λ^2 + b*λ + c = 0
get_eig_1(a, b, c) = (-b + sqrt(Complex(b^2 - 4*a*c))) / (2*a)

a11, a21 = 1.0, -2.0                # example values, chosen so the eigenvalues are complex
f(x) = get_eig_1(1, -a11, -x*a21)   # λ^2 - a11*λ - a12*a21 = 0, with x playing the role of a12

ForwardDiff.derivative(f, 0.7)
This does not work, and from what I can tell it is because get_eig_1 returns a complex number. For this simple problem I guess autodiff does not make much sense, but I am trying to understand what is going on and why it does not work before trying something more difficult.
Is there a way to make autodifferentiation work for the function get_eig_1 and will it still be possible when the system matrix gets large?
In this case it’s just an API problem: ForwardDiff.derivative(x -> real(f(x)), 0.7) works fine, and similarly for the imaginary part. There are open issues related to complex numbers on the ForwardDiff issue tracker, but I don’t know whether this particular case is covered.
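For concreteness, a minimal sketch of that workaround, reusing the f defined above (the variable names are just illustrative):

dλ_re = ForwardDiff.derivative(x -> real(f(x)), 0.7)   # derivative of the real part
dλ_im = ForwardDiff.derivative(x -> imag(f(x)), 0.7)   # derivative of the imaginary part
dλ = dλ_re + im * dλ_im                                # complex derivative dλ/da12

Since f maps a real parameter to a complex eigenvalue, the real and imaginary parts can be differentiated independently and recombined.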
The system matrix A will in most cases not be symmetric. I guess then my approach would be to find a suitable method for calculating the eigenvalues for my problem. Does it matter if the method is iterative or not?
If your method is iterative, you cannot use the work above (since that assumes a full eigendecomposition). In that case you need to solve a linear system for the sensitivities (in quantum mechanics it’s sometimes known as a Sternheimer equation; I would imagine it has other names in other communities). In the symmetric case and for a single well-separated eigenpair (x, \lambda), it looks like Q(A - \lambda I)Q \, dx = -Q \, dA \, x, with Q the orthogonal projector onto the orthogonal complement of \mathrm{span}(x).
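For concreteness, a minimal sketch of that projected system in Julia, assuming a symmetric A, a simple well-separated eigenpair with norm(x) == 1, and dA = \partial A/\partial p (the function name is just illustrative):

using LinearAlgebra

function eigvec_sensitivity(A, dA, λ, x)
    Q = I - x * x'                   # orthogonal projector onto the complement of span(x)
    M = Q * (A - λ * I) * Q          # singular along x, invertible on the complement
    dx = pinv(M) * (-Q * (dA * x))   # minimum-norm solution of Q(A-λI)Q dx = -Q dA x
    return dx
end

In practice one would solve this system with an iterative method rather than forming pinv explicitly, but the structure is the same.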
See e.g. section 4 of my course notes on adjoint methods for a tutorial introduction. (I hadn’t heard the term “Sternheimer equation” before, I’ll have to mention this in a future revision of the notes!)
However, since @Hofsmo only wants the derivative of the eigenvalue, not the eigenvector, note that in this case you do not need to solve a linear system. If you have a (simple) eigenpair Ax = \lambda x and a corresponding left eigenvector A^T y = \lambda y (y = x for a symmetric A), then the derivative of the eigenvalue with respect to any parameter p of the matrix A(p) is \frac{\partial \lambda}{\partial p} = \frac{y^T \frac{\partial A}{\partial p} x}{y^T x}.
(In physics, in the Hermitian case where y = \bar{x}, this is called the Hellmann–Feynman theorem.) So, given a “forward” solver to find x,y,\lambda (e.g. by Arnoldi or some other iterative method), the derivatives are just some dot products.
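To make this concrete for the 2×2 example above, here is a sketch using a dense eigendecomposition (the rows of inv(X) are scaled left eigenvectors, normalized so y_i^T x_i = 1; the numeric values are just placeholders):

using LinearAlgebra

a11, a21, a12 = 1.0, -2.0, 0.7   # placeholder parameter values
A  = [a11 a12; a21 0.0]
dA = [0.0 1.0; 0.0 0.0]          # ∂A/∂a12

λs, X = eigen(A)                 # right eigenvectors in the columns of X
Y = inv(X)                       # rows of inv(X) are left eigenvectors with yᵢᵀxᵢ = 1
dλ = diag(Y * dA * X)            # dλᵢ/da12 = yᵢᵀ (∂A/∂a12) xᵢ

The result can be checked against the real/imag ForwardDiff workaround above. For a large sparse A one would instead get the relevant x, y, \lambda from an iterative solver and compute the same dot products.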
I just found the answer mentioned by stevengj in a book, and it seems like it will solve my problem. In any case, the answers related to autodiff were just as interesting. I didn’t just want to solve my problem, I also wanted to understand why it didn’t work with autodiff.