I have a problem where I would like to apply eigenvalue sensitivity analysis, using Enzyme to compute the derivatives of the matrix A(\mathbf{p}), where \mathbf{p} = (p_1,\ldots,p_n). Per @stevengj’s notes, if I want the derivative of an eigenvalue, then for a system A(\mathbf{p}) x = \lambda x with left eigenvector y (i.e. y^* A = \lambda y^*), the sensitivity is

\frac{\partial \lambda}{\partial p_k} = \frac{y^* \left(\partial A/\partial p_k\right) x}{y^* x}.
My thought is that since I don’t actually care about the derivatives of the matrix, only their products with the eigenvectors, I should avoid explicitly constructing that object. I would also like to hand Enzyme a structure that contains all of the parameters and get all of the partials \partial \lambda/\partial p_1, \partial \lambda/\partial p_2, \ldots at once.
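Roughly, something like the following is what I have in mind (a minimal sketch: the toy symmetric A(\mathbf{p}) and the function names are just placeholders for my real setup, and I’m assuming a single reverse-mode Enzyme.autodiff call with a Duplicated parameter vector is the right way to get all of the partials in one pass):

```julia
using Enzyme, LinearAlgebra

# Toy stand-in for the real parameterized matrix:
# diagonal entries p, fixed -1 off-diagonal coupling.
function A(p)
    n = length(p)
    M = diagm(0 => p)
    for i in 1:n-1
        M[i, i+1] = -1.0
        M[i+1, i] = -1.0
    end
    return M
end

# With the eigentriple (λ, x, y) held fixed, the gradient of
# g(q) = yᵀ A(q) x / (yᵀ x) at q = p is exactly ∂λ/∂p,
# and ∂A/∂p_k is never formed explicitly.
function eigenvalue_sensitivities(p, x, y)
    g(q) = dot(y, A(q) * x) / dot(y, x)
    dp = zero(p)                                  # gradient accumulator
    Enzyme.autodiff(Reverse, Const(g), Active, Duplicated(p, dp))
    return dp                                     # dp[k] ≈ ∂λ/∂p_k
end

p = collect(1.0:5.0)
λs, X = eigen(Symmetric(A(p)))
x = X[:, 1]                   # symmetric toy case ⇒ left eigenvector == right eigenvector
∂λ∂p = eigenvalue_sensitivities(p, x, x)
```

Since g is a scalar function of the whole parameter vector, one reverse pass gives every \partial \lambda/\partial p_k without ever materializing \partial A/\partial p_k.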
(Edited: I had read the first half of your post and started to write out exactly what you suggested.)
Note that ChainRules.jl includes rules for eigenvalue differentiation, if I recall correctly. But they are based on a dense-matrix eigen routine, so they aren’t appropriate if you are computing just one eigenvalue (and its corresponding eigenvector), e.g. by some Krylov method.
Thanks for confirming and for pointing to ChainRules.jl.
The application will use sparse matrices and Krylov methods, and it’s actually a generalized EVP. Eventually, the target metric will be a function of the eigenvectors, so I’ll have to solve the adjoint problem. But I think the same sort of trick as above should work for the AD portion.
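Specifically, for A(\mathbf{p}) x = \lambda B(\mathbf{p}) x with left eigenvector y (so y^* A = \lambda y^* B), my understanding is that the same cancellation gives

\frac{\partial \lambda}{\partial p_k} = \frac{y^* \left(\partial A/\partial p_k - \lambda\, \partial B/\partial p_k\right) x}{y^* B x},

so the scalar function I would hand to Enzyme becomes q \mapsto y^* \big(A(q) - \lambda B(q)\big) x / (y^* B x), with \lambda, x, y, and the denominator held fixed at the current parameter values.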
(But for very large problems you still might want to do it manually to have more control over how the adjoint linear systems are solved. In any case, even if you are using AD it is nice to have some idea of how things work under the hood so that you can use AD effectively, and to know where it might run into trouble.)
Indeed, and ImplicitDifferentiation.jl allows you to plug in an arbitrary linear solver, either dense or iterative (lazy). It’s currently a bit cumbersome, but I plan to make it much simpler in the next release. You can also pick a different AD backend to differentiate through the conditions than the one doing the outer differentiation, which may come in handy.