Native eigenvals for differentiable programming

You could always implement the derivative yourself. The adjoint formula for differentiating eigenvalues with respect to changes in the matrix is pretty simple: see https://github.com/mitmath/18335/blob/spring21/notes/adjoint/adjoint.pdf (it’s equivalent to the Hellmann–Feynman theorem). Differentiating scalar functions of the eigenvectors is a bit more complicated but not too bad. (My notes show the case of Hermitian matrices; the extension to non-Hermitian problems is similar but involves left and right eigenvectors. IIRC, every xᵀ is replaced with the left eigenvector.)
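As a quick sanity check of the eigenvalue formula, here is a minimal sketch in dependency-free Python. It uses a real-symmetric 2×2 matrix so that the eigenpair has a closed form. The one-parameter family A(p) = A0 + p·B and every function name are made up for illustration; none of them come from the notes. It compares xᵀ(dA/dp)x for a unit eigenvector x against a centered finite difference.

```python
import math

def top_eigpair(a, b, c):
    """Largest eigenvalue and a unit eigenvector of the symmetric matrix [[a, b], [b, c]].

    Assumes the two eigenvalues are distinct (no crossing)."""
    lam = 0.5 * (a + c) + math.hypot(0.5 * (a - c), b)
    # Both (b, lam - a) and (lam - c, b) satisfy (A - lam*I) x = 0; keep the longer
    # one so a zero off-diagonal entry does not produce the zero vector.
    x = max((b, lam - a), (lam - c, b), key=lambda v: math.hypot(*v))
    norm = math.hypot(*x)
    return lam, (x[0] / norm, x[1] / norm)

def entries(p):
    """Hypothetical family A(p) = A0 + p*B, returned as the entries (a, b, c)."""
    return 2.0 + p, 1.0 + 0.5 * p, 3.0 - p

# dA/dp = B for this family, stored as (B11, B12, B22).
B = (1.0, 0.5, -1.0)

def dlam_adjoint(p):
    """x^T (dA/dp) x for a unit eigenvector x: the Hellmann-Feynman derivative."""
    _, (x1, x2) = top_eigpair(*entries(p))
    return B[0] * x1 * x1 + 2.0 * B[1] * x1 * x2 + B[2] * x2 * x2

def dlam_finite_difference(p, h=1e-6):
    """Centered finite difference of the largest eigenvalue, for comparison."""
    lam_plus = top_eigpair(*entries(p + h))[0]
    lam_minus = top_eigpair(*entries(p - h))[0]
    return (lam_plus - lam_minus) / (2.0 * h)

p = 0.3
print(dlam_adjoint(p), dlam_finite_difference(p))
```

The two printed numbers should agree to many digits. The point is that the derivative needs only the eigenvector you already computed, not the internals of the solver.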

One snag is that eigenvalues cease to be differentiable when two eigenvalues cross. (And eigenvector methods for non-Hermitian problems can become ill-conditioned near crossings.) Crossings may seem unlikely, but eigenvalue-optimization problems often push eigenvalues towards crossings (especially in Hermitian problems). There is a way around it using generalized gradients, as I review in my notes here: https://github.com/mitmath/18335/blob/spring21/notes/adjoint/eigenvalue-adjoint.pdf … but usually in my own work I’ve found that it’s possible to reformulate ostensibly eigenvalue-optimization problems in ways that avoid explicit eigenvalue computations, for example in terms of SDPs or in terms of resolvent operators.
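To see the kink concretely, here is a small sketch with a toy matrix of my choosing: A(t) = [[1 + t, 0], [0, 1 − t]], whose eigenvalues 1 + t and 1 − t cross at t = 0. The largest eigenvalue is 1 + |t|, so the one-sided slopes at the crossing disagree. The function names are illustrative only.

```python
import math

def lambda_max(a, b, c):
    """Largest eigenvalue of the symmetric matrix [[a, b], [b, c]] (closed form)."""
    return 0.5 * (a + c) + math.hypot(0.5 * (a - c), b)

def lam(t):
    # A(t) = [[1 + t, 0], [0, 1 - t]]: the eigenvalues 1 + t and 1 - t cross at t = 0.
    return lambda_max(1.0 + t, 0.0, 1.0 - t)

h = 1e-6
right_slope = (lam(h) - lam(0.0)) / h    # slope approaching the crossing from t > 0
left_slope = (lam(0.0) - lam(-h)) / h    # slope approaching the crossing from t < 0
print(right_slope, left_slope)
```

The slopes come out near +1 and −1, so there is no single derivative at t = 0. Because the largest eigenvalue of a symmetric matrix is a convex function of the matrix, what exists at the crossing is a set of subgradients (here the interval from −1 to 1), which is the kind of object a generalized-gradient approach works with.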

You really don’t want to apply automatic differentiation blindly to an iterative solver. Differentiating a recurrence relationship like this is much more expensive than differentiating the final result.
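A rough illustration, under assumptions of my own: plain power iteration on a symmetric 2×2 matrix stands in for "an iterative solver", and hand-written tangent propagation stands in for forward-mode AD. Neither is what a real eigensolver or AD tool does internally. The sketch carries a derivative through every iteration and then compares the result with the one-line formula xᵀ(dA/dp)x evaluated on the converged eigenvector.

```python
import math

def matvec(M, v):
    return (M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1])

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def power_iteration_with_tangent(A, dA, iters=200):
    """Power iteration on symmetric A, dragging d/dp of every iterate along."""
    x, dx = (1.0, 0.0), (0.0, 0.0)
    for _ in range(iters):
        y = matvec(A, x)
        # Product rule: d(A x) = dA x + A dx.
        dy = tuple(s + t for s, t in zip(matvec(dA, x), matvec(A, dx)))
        n = math.sqrt(dot(y, y))
        dn = dot(y, dy) / n                      # derivative of the norm
        x = (y[0] / n, y[1] / n)
        dx = tuple((dyi - xi * dn) / n for dyi, xi in zip(dy, x))  # d(y / n)
    lam = dot(x, matvec(A, x))                   # Rayleigh quotient, x has unit norm
    # d(x^T A x) = x^T dA x + 2 (A x)^T dx for symmetric A.
    dlam = dot(x, matvec(dA, x)) + 2.0 * dot(matvec(A, x), dx)
    return lam, dlam, x

# Hypothetical test matrix and perturbation direction dA/dp.
A = ((2.3, 1.15), (1.15, 2.7))
dA = ((1.0, 0.5), (0.5, -1.0))

lam, dlam_through_loop, x = power_iteration_with_tangent(A, dA)
dlam_from_result = dot(x, matvec(dA, x))         # uses only the converged eigenvector
print(dlam_through_loop, dlam_from_result)
```

Both numbers agree, but the first one paid for extra matrix-vector products in every iteration, and a reverse-mode tool would additionally have to keep every iterate around for the backward pass. The second needs one extra matrix-vector product and a dot product after the solver finishes.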

For the same reason, a Julia-native eigensolver wouldn’t help here. Applying AD to an eigenvalue algorithm like QR is a non-starter. (All eigenvalue algorithms for matrices bigger than 4×4 are iterative, thanks to the Abel–Ruffini theorem.)

Automatic differentiation is a great tool, but at some point you should also learn how to compute derivatives yourself, if only to understand how to use these tools effectively. Moreover, once you get to a complicated enough problem that involves external libraries, AD tools quickly hit a wall — I’ve never been able to use AD for a non-trivial problem in PDE-constrained optimization.
