Best practices for optimization problems with ODE solves and eigen calls

I am working on an optimization which has the following structure

  1. Build some large Hamiltonian matrices which are functions of the parameters I am optimizing over
  2. Project onto the lowest couple of energy levels
  3. Calculate some propagators (basically solve an ODE)
  4. Calculate some cost function based on the results (see the sketch below)
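
In code, the structure is roughly the following (a toy sketch with made-up sizes and placeholder functions, not my actual model; build_projected_hamiltonian stands in for steps 1 and 2):

using LinearAlgebra, OrdinaryDiffEq

# Placeholder for steps 1–2: build the Hamiltonian from the parameters p and
# project onto the lowest levels (here just a toy 2-level Hamiltonian).
build_projected_hamiltonian(p) = Hermitian([p[1] p[2]; p[2] -p[1]])

# Step 3: propagator U(T) from the Schrödinger equation dU/dt = -i H U
function propagator(p; T = 1.0)
    H = build_projected_hamiltonian(p)
    f(U, _, t) = -im * (H * U)
    U0 = Matrix{ComplexF64}(I, size(H)...)
    sol = solve(ODEProblem(f, U0, (0.0, T)), Tsit5(); reltol = 1e-8, abstol = 1e-8)
    return sol.u[end]
end

# Step 4: cost = gate infidelity with respect to some target unitary
const Utarget = ComplexF64[0 1; 1 0]
cost(p) = 1 - abs2(tr(Utarget' * propagator(p))) / size(Utarget, 1)^2

cost([0.5, 0.3])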

I am wondering if there are any recommendations for how to do this well! Currently I am using SciML’s Optimization.jl with NelderMead, though I have had slightly better success with NonlinearSolve’s LevenbergMarquardt + AutoFiniteDiff.
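
For reference, the Optimization.jl setup is essentially the following (a sketch, assuming a scalar cost(p) like the toy one above; the LevenbergMarquardt route instead goes through NonlinearSolve’s NonlinearLeastSquaresProblem with a vector of residuals rather than a scalar cost):

using Optimization, OptimizationOptimJL   # NelderMead() is provided via Optim.jl

p0 = [0.5, 0.3]
optf = OptimizationFunction((p, _) -> cost(p))   # derivative-free, so no AD backend needed
optprob = OptimizationProblem(optf, p0)
sol = solve(optprob, NelderMead())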

I think the main questions are:

  1. Has anyone run into similar optimization problems, and if so, what works? What are some other solvers / techniques that might be good for this type of problem?
  2. Ideally, I would be able to use forward- or reverse-mode autodiff. However, this does not work because of the eigendecomposition. Is there any way to get this to work?

Hi, could you provide a minimal example of the differentiation failures you’re running into? Ideally with vaguely realistic problem sizes and sparsities.

Maybe it would be helpful to provide a bit more background. Perhaps there are some packages already available that solve your problem without having to reinvent the wheel :slight_smile:

Right now it sounds a bit like you want to solve a quantum optimal control problem. This is not my area of expertise, but I know there is an ecosystem around QuantumControl.jl in this direction.

ForwardDiff can handle eigendecompositions if you use GenericLinearAlgebra. Or at least it could on 0.10, it’s kind of broken on 1.x.

So, the minimal non-working model is:

using LinearAlgebra
import ForwardDiff

function test(x)
    return eigen(x)
end

x0 = [[0, 1] [1, 0]]

ForwardDiff.gradient(test, x0)

and the error is

ERROR: MethodError: no method matching eigen!(::Matrix{ForwardDiff.Dual{…}}; permute::Bool, scale::Bool, sortby::typeof(LinearAlgebra.eigsortby))
The function `eigen!` exists, but no method is defined for this combination of argument types.

Closest candidates are:
  eigen!(::AbstractMatrix, ::Cholesky; sortby) got unsupported keyword arguments "permute", "scale"
   @ LinearAlgebra ~/.julia/juliaup/julia-1.12.0+0.aarch64.apple.darwin14/share/julia/stdlib/v1.12/LinearAlgebra/src/symmetriceigen.jl:265
  eigen!(::StridedMatrix{T}, ::LU{T, <:StridedMatrix{T} where T}; sortby) where T got unsupported keyword arguments "permute", "scale"
   @ LinearAlgebra ~/.julia/juliaup/julia-1.12.0+0.aarch64.apple.darwin14/share/julia/stdlib/v1.12/LinearAlgebra/src/symmetriceigen.jl:300
  eigen!(::Hermitian{T, <:BandedMatrices.BandedMatrix{T}}, ::Hermitian{T, <:BandedMatrices.BandedMatrix{T}}) where T<:Union{ComplexF64, ComplexF32} got unsupported keyword arguments "permute", "scale", "sortby"
   @ BandedMatrices ~/.julia/packages/BandedMatrices/GLf5L/src/symbanded/bandedeigen.jl:178
  ...

Maybe there is a simple change to eigen or the derivative rule that will fix this?

@abraemer, the problem is related to quantum control; however, the big issue here is how the states are constructed. I have some coupled superconducting circuits, and while the dynamics lives only on the low-energy states of the system, I need to extract those states by individually diagonalizing each transmon Hamiltonian (which is a function of the optimization parameters) and then projecting onto the low-energy states. This diagonalization step is the one I am having trouble using autodiff with. Generally, optimal control schemes don’t have this step.
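
Roughly, the construction I mean looks like this (a toy sketch with made-up dimensions and a placeholder single-transmon Hamiltonian, not the real circuit model):

using LinearAlgebra

# Placeholder single-transmon Hamiltonian depending on an optimization parameter p
single_ham(p, dim) = Hermitian(diagm(0 => p .* (0:dim-1).^2) .+ 0.1 .* ones(dim, dim))

function projected_hamiltonian(p1, p2; dim = 10, nkeep = 3)
    H1 = single_ham(p1, dim)
    H2 = single_ham(p2, dim)

    # Diagonalize each transmon and keep its lowest `nkeep` eigenvectors
    V1 = eigen(H1).vectors[:, 1:nkeep]     # dim × nkeep
    V2 = eigen(H2).vectors[:, 1:nkeep]
    V  = kron(V1, V2)                      # isometry onto the low-energy product subspace

    # Full coupled Hamiltonian (the coupling term here is just a placeholder)
    Id = Matrix{Float64}(I, dim, dim)
    Hfull = kron(H1, Id) + kron(Id, H2) + 0.01 * kron(H1, H2)

    return V' * Hfull * V                  # small effective Hamiltonian for the dynamics
end

Heff = projected_hamiltonian(1.0, 1.2)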

As I said, just use GenericLinearAlgebra:

import ForwardDiff
using LinearAlgebra
using GenericLinearAlgebra

function test(x)
    eigen(x).vectors[1]
end

x0 = [[0,1] [1,0]]

ForwardDiff.gradient(test, x0)

Note that you need some scalar function of eigen or eigvals, as ForwardDiff only takes gradients of scalar functions.

This gives the same error as above.

You need ForwardDiff 0.10; the new version is buggy.

Ah! My bad. Yes, that does work. I will take a look at their open issues to see if this is mentioned, and open one if not. How reliable is 0.10?

It’s known how to find the dependence of an eigenvector and eigenvalue on a single parameter; it’s in Golub and Van Loan, among many other places. Starting from Ax=\lambda x, differentiate both sides with respect to your parameter and get four terms from the product rule. Take the inner product against x in the symmetric case (or a left eigenvector more generally), and two of those terms cancel out. This lets you solve for \dot{\lambda}, and then the four-term expression can be used to find \dot{x}.

I presume you could let the parameter equal each matrix element in turn to build up the gradient. I suspect that would be a lot faster.
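
For concreteness, here is a minimal sketch of that formula for a symmetric matrix depending on a single parameter (toy matrix and made-up names), checked against a central finite difference:

using LinearAlgebra

A(p)  = [1.0 p; p 2.0]        # toy symmetric matrix depending on one parameter p
dA(p) = [0.0 1.0; 1.0 0.0]    # dA/dp

p = 0.3
λs, X = eigen(Symmetric(A(p)))
x = X[:, 1]                    # lowest eigenvector, unit norm

# eigenvalue derivative from the formula above: λ̇ = xᵀ (dA/dp) x
λdot = dot(x, dA(p) * x)

# compare against a central finite difference of the lowest eigenvalue
h = 1e-6
λdot_fd = (eigvals(Symmetric(A(p + h)))[1] - eigvals(Symmetric(A(p - h)))[1]) / (2h)

@show λdot λdot_fd             # these should agree to roughly 1e-9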

I’m not sure it’s quite right to describe the new (or the old) behavior as buggy. The new behavior is just a slightly different derivative definition.

I’m only aware of two open issues about eigen:

  1. performance regression in 1.x with GenericLinearAlgebra/sqrt or eigen · Issue #780 · JuliaDiff/ForwardDiff.jl · GitHub
  2. Regression in 1.0, asking for Hessian of a complex matrix function results in infinite loop · Issue #756 · JuliaDiff/ForwardDiff.jl · GitHub

As far as I can tell, yours is independent of both.

As for 0.10, it’s rock solid. I’ve been using it for years without any problems.

There is no ambiguity or controversy about the definition of the derivative of eigenvalues and eigenvectors. The old version gives the correct answer and the new one doesn’t; that’s a bug and a regression.