I am working on an optimization with the following structure:

1. Build some large Hamiltonian matrices which are functions of the parameters I am optimizing over.
2. Project onto the lowest couple of energy levels.
3. Calculate some propagators (basically solve an ODE).
4. Calculate a cost function based on the results.
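For reference, here is a minimal sketch of that pipeline on a toy model (the Hamiltonian, level structure, and target state below are all made-up stand-ins for the real circuit model):

using LinearAlgebra
using OrdinaryDiffEq

# Toy parameter-dependent Hamiltonian; a stand-in for the real model
function hamiltonian(p)
    H = diagm([0.0, 1.0, 4.0, 9.0])    # bare energy levels
    H[1, 2] = H[2, 1] = p[1]           # parameter-dependent couplings
    H[2, 3] = H[3, 2] = p[2]
    return Hermitian(H)
end

function cost(p; nlevels = 2, T = 1.0)
    H = hamiltonian(p)
    vecs = eigen(H).vectors            # diagonalize ...
    P = vecs[:, 1:nlevels]             # ... and project onto the lowest levels
    Heff = Hermitian(P' * H * P)       # effective low-energy Hamiltonian
    u0 = ComplexF64[1, 0]              # start in the lowest state
    prob = ODEProblem((u, _, t) -> -im * (Heff * u), u0, (0.0, T))
    sol = solve(prob, Tsit5())         # propagator step (solve the ODE)
    target = ComplexF64[0, 1]
    return 1 - abs2(dot(target, sol.u[end]))   # infidelity as the cost
end

cost([0.5, 0.1])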
I am wondering if there are any recommendations for how to do this well! Currently I am using SciML Optimization.jl with NelderMead, but have also had slightly better success with NonlinearSolve’s LevenbergMarquardt + AutoFiniteDiff.
I think the main questions are:

1. Has anyone run into similar optimization problems, and if so, what works? What are some other solvers/techniques that might be good for this type of problem?
2. Ideally, I would be able to use forward- or reverse-mode autodiff. However, this does not work because of the eigendecomposition. Is there any way to get this to work?
Hi, could you provide a minimal example of the differentiation failures you’re running into? Ideally with vaguely realistic problem sizes and sparsities.
Maybe it would be helpful to provide a bit more background. Perhaps there are some packages available already that solve your problem, without you having to reinvent the wheel.
Right now it sounds a bit like you want to solve a quantum optimal control problem. This is not my area of expertise, but I know there is an ecosystem around QuantumControl.jl in this direction.
using ForwardDiff
using LinearAlgebra

function test(x)
    return eigen(x)   # fails under ForwardDiff: no eigen! method for Dual matrices
end

x0 = [0 1; 1 0]
ForwardDiff.gradient(test, x0)
and the error is:
ERROR: MethodError: no method matching eigen!(::Matrix{ForwardDiff.Dual{…}}; permute::Bool, scale::Bool, sortby::typeof(LinearAlgebra.eigsortby))
The function `eigen!` exists, but no method is defined for this combination of argument types.
Closest candidates are:
eigen!(::AbstractMatrix, ::Cholesky; sortby) got unsupported keyword arguments "permute", "scale"
@ LinearAlgebra ~/.julia/juliaup/julia-1.12.0+0.aarch64.apple.darwin14/share/julia/stdlib/v1.12/LinearAlgebra/src/symmetriceigen.jl:265
eigen!(::StridedMatrix{T}, ::LU{T, <:StridedMatrix{T} where T}; sortby) where T got unsupported keyword arguments "permute", "scale"
@ LinearAlgebra ~/.julia/juliaup/julia-1.12.0+0.aarch64.apple.darwin14/share/julia/stdlib/v1.12/LinearAlgebra/src/symmetriceigen.jl:300
eigen!(::Hermitian{T, <:BandedMatrices.BandedMatrix{T}}, ::Hermitian{T, <:BandedMatrices.BandedMatrix{T}}) where T<:Union{ComplexF64, ComplexF32} got unsupported keyword arguments "permute", "scale", "sortby"
@ BandedMatrices ~/.julia/packages/BandedMatrices/GLf5L/src/symbanded/bandedeigen.jl:178
...
Maybe there is a simple change to eigen or the derivative rule that will fix this?
@abraemer, the problem is related to quantum control; however, the big issue here is how the states are constructed. I have some coupled superconducting circuits, and while the dynamics only involve the low-energy states of my system, I need to extract these states by individually diagonalizing each transmon (whose Hamiltonian is a function of the optimization parameters) and then projecting onto the low-energy states. It is this diagonalization step that I am having trouble differentiating through; optimal control schemes generally don’t have this step. A rough sketch of the construction is below.
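This is a minimal sketch only; the local transmon Hamiltonians and the coupling are toy stand-ins for the real circuit operators:

using LinearAlgebra

# Keep the lowest nkeep eigenstates of a single-transmon Hamiltonian.
# This eigen call is the step that breaks ForwardDiff.
function lowest_states(Hlocal, nkeep)
    vecs = eigen(Hermitian(Hlocal)).vectors
    return vecs[:, 1:nkeep]
end

d, nkeep = 8, 3
H1 = diagm(collect(0.0:d-1)) + 0.05 * ones(d, d)        # toy transmon 1
H2 = 1.1 * diagm(collect(0.0:d-1)) + 0.05 * ones(d, d)  # toy transmon 2
V  = 0.01 * ones(d^2, d^2)                              # toy coupling
Id = Matrix(1.0I, d, d)

# Projector onto the low-energy product basis, then the effective Hamiltonian
P = kron(lowest_states(H1, nkeep), lowest_states(H2, nkeep))
Hfull = kron(H1, Id) + kron(Id, H2) + V
Heff = Hermitian(P' * Hfull * P)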
import ForwardDiff
using LinearAlgebra
using GenericLinearAlgebra   # provides generic eigensolvers that work with Dual numbers

function test(x)
    return eigen(x).vectors[1]   # a scalar: first entry of the first eigenvector
end

x0 = [0.0 1.0; 1.0 0.0]
ForwardDiff.gradient(test, x0)
Note that you need some scalar function of eigen or eigvals, as ForwardDiff only takes gradients of scalar functions.
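If you need derivatives of a non-scalar output (say, all eigenvalues at once), ForwardDiff.jacobian is the analogue. Untested, but assuming the same GenericLinearAlgebra route as in the snippet above:

import ForwardDiff
using LinearAlgebra
using GenericLinearAlgebra

x0 = [0.0 1.0; 1.0 0.0]

# Jacobian of every eigenvalue with respect to every matrix entry
# (a 2×4 matrix for a 2×2 input)
J = ForwardDiff.jacobian(x -> eigen(x).values, x0)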
It’s known how to find the dependence of an eigenvector and eigenvalue on a single parameter; it’s in Golub and Van Loan, among many other places. Starting from $Ax = \lambda x$, differentiate both sides with respect to your parameter and get four terms from the product rule: $\dot{A}x + A\dot{x} = \dot{\lambda}x + \lambda\dot{x}$. Take the inner product against $x$ in the symmetric case (or against a left eigenvector more generally), and two of those terms cancel out. This lets you solve for $\dot{\lambda}$ (giving $\dot{\lambda} = x^\top \dot{A}\, x$ in the symmetric, normalized case), and then the four-term expression can be used to find $\dot{x}$.
I presume you could let the parameter equal each matrix element in turn to build up the full gradient. I suspect that would be a lot faster than finite differences.
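To make the eigenvalue part concrete, here is a small sketch verifying $\dot{\lambda} = x^\top \dot{A}\, x$ against a finite difference (the matrix $A(p)$ is made up for illustration):

using LinearAlgebra

A(p)  = Symmetric([2.0 p; p 1.0])      # toy parameter-dependent symmetric matrix
dA(p) = Symmetric([0.0 1.0; 1.0 0.0])  # elementwise derivative of A wrt p

p = 0.3
x = eigen(A(p)).vectors[:, 1]          # lowest (normalized) eigenvector
analytic = dot(x, dA(p) * x)           # λ̇ = xᵀ Ȧ x

h = 1e-6                               # central finite-difference check
fd = (eigvals(A(p + h))[1] - eigvals(A(p - h))[1]) / (2h)
analytic ≈ fd                          # true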
There is no ambiguity or controversy about the definition of the derivative of eigenvalues and eigenvectors. The old version gives the correct answer; the new one doesn’t. That’s a bug and a regression.