Problems with complex values in optimization objective (JuMP)

When using complex numbers in the objective function, I get an error that I struggle to understand. This minimal example illustrates it very well. Let’s say we want to find the parameters of a parametrised unitary matrix such that it agrees with a given unitary.

using LinearAlgebra
using JuMP
import HiGHS

const X = ComplexF64[0 1; 1 0]
const Y = ComplexF64[0 -im; im 0]
const Z = ComplexF64[1 0; 0 -1]

rot(op, angle) = exp(-im * angle / 2 * op)

function unitary(theta)
    return exp(-i * theta[3] * kron(Z, X)) + exp(-i * theta[2] * kron(Z, Y)) * exp(-i * theta[1] * kron(Z, X))
end

function opt(theta)
    u_target = exp(-im * pi / 4 * kron(Z, X))
    u = u_lstar(theta)
    fid = abs(tr(u_target' * u))^2 / 4
    return fid
end

model = Model(HiGHS.Optimizer)
@variable(model, -pi <= theta[1:3] <= pi)
@objective(model, Max, opt(theta))
JuMP.optimize!(model)
value(theta)

It gives the following error:

ERROR: Cannot build `GenericNonlinearExpr` because a term is complex-valued: `(0)::GenericAffExpr{ComplexF64, VariableRef}`
Stacktrace:
  [1] error(s::String)
    @ Base .\error.jl:44
  [2] _throw_if_not_real(x::GenericAffExpr{ComplexF64, VariableRef})
    @ JuMP C:\Users\xyz\.julia\packages\JuMP\7eD71\src\nlp_expr.jl:344
  [3] /(x::GenericAffExpr{ComplexF64, VariableRef}, y::NonlinearExpr)
    @ JuMP C:\Users\xyz\.julia\packages\JuMP\7eD71\src\nlp_expr.jl:401
  [4] eigtype(T::Type)
    @ LinearAlgebra C:\Users\xyz\.julia\juliaup\julia-1.12.4+0.x64.w64.mingw32\share\julia\stdlib\v1.12\LinearAlgebra\src\eigen.jl:319
  [5] exp(A::Matrix{GenericAffExpr{ComplexF64, VariableRef}})
    @ LinearAlgebra C:\Users\xyz\.julia\juliaup\julia-1.12.4+0.x64.w64.mingw32\share\julia\stdlib\v1.12\LinearAlgebra\src\dense.jl:671
...

The value returned by the optimization objective (and everything else except the matrix products) is real, so I struggle to understand what causes the problem. Are complex numbers not allowed at all?

Hi @dnldlg, complex numbers may not appear in nonlinear expressions. I’ll update the documentation and the error message to make this clearer.

I’m away from my computer for the weekend, but I’ll see if there’s a way to rewrite your problem next week.

1 Like

Hi @dnldlg, what is u_lstar? Also unitary refers to i which isn’t defined.

Do you have a minimal reproducible example?

1 Like

Apologies, I was not using a clean REPL while writing this minimal example so I didn’t catch these errors. Here is a corrected version.

using LinearAlgebra
using JuMP
import HiGHS

const X = ComplexF64[0 1; 1 0]
const Y = ComplexF64[0 -im; im 0]
const Z = ComplexF64[1 0; 0 -1]

function unitary(theta)
    return exp(-im * theta[3] * kron(Z, X)) + exp(-im * theta[2] * kron(Z, Y)) * exp(-im * theta[1] * kron(Z, X))
end

function opt(theta)
    u_target = exp(-im * pi / 4 * kron(Z, X))
    u = unitary(theta)
    fid = abs(tr(u_target' * u))^2 / 4
    return fid
end

model = Model(HiGHS.Optimizer)
@variable(model, -pi <= theta[1:3] <= pi)
@objective(model, Max, opt(theta))
JuMP.optimize!(model)
value(theta)

I think this package might not be suitable for me if no complex operations can be performed in the optimization. While one could rewrite this specific problem with orthogonal matrices, rewriting my “real” problem seems infeasible, or at least too tedious.

Yeah, if you don’t want to rewrite, JuMP is the wrong tool for the job.

See Should you use JuMP? · JuMP

If you just provide your own gradient it should be fine, I think. (Pass a grad_f! to @operator to construct a user-defined operator with a user-defined gradient. Then JuMP won’t “look inside” it and won’t care that you use complex numbers internally.)
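For concreteness, here is a rough sketch of that approach applied to the example above. It is only an illustration: the finite-difference gradient is a stand-in for a proper AD or analytic gradient, and I’ve swapped in Ipopt because HiGHS does not handle nonlinear objectives.

```julia
using LinearAlgebra
using JuMP
import Ipopt  # assumption: a nonlinear solver is needed; HiGHS is LP/MIP-only

const X = ComplexF64[0 1; 1 0]
const Y = ComplexF64[0 -im; im 0]
const Z = ComplexF64[1 0; 0 -1]

function unitary(theta)
    return exp(-im * theta[3] * kron(Z, X)) + exp(-im * theta[2] * kron(Z, Y)) * exp(-im * theta[1] * kron(Z, X))
end

# Scalar-argument form, as user-defined operators take scalar inputs.
function fidelity(t1, t2, t3)
    u_target = exp(-im * pi / 4 * kron(Z, X))
    u = unitary([t1, t2, t3])
    return abs(tr(u_target' * u))^2 / 4
end

# Illustrative central-difference gradient, filling `g` in place
# as JuMP expects. Replace with AD or an analytic gradient in practice.
function grad_fidelity!(g::AbstractVector, t...)
    h = 1e-7
    for i in 1:length(t)
        tp = collect(t); tp[i] += h
        tm = collect(t); tm[i] -= h
        g[i] = (fidelity(tp...) - fidelity(tm...)) / (2h)
    end
    return
end

model = Model(Ipopt.Optimizer)
@variable(model, theta[1:3])  # unconstrained; wrap into (-pi, pi] afterwards
@operator(model, op_fid, 3, fidelity, grad_fidelity!)
@objective(model, Max, op_fid(theta...))
optimize!(model)
value.(theta)
```

Because `op_fid` is opaque to JuMP, all the complex arithmetic stays inside the Julia function and never reaches the expression graph.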

(Not all the AD packages support differentiating through complex functions, but some of them do, e.g. Zygote.jl and Enzyme.jl. Of course, you can also just differentiate it manually. Regardless, you might want to learn a little about CR calculus to understand what is going on — for a gentle introduction, see e.g. chapter 15 of our Matrix Calculus course notes and references therein.)

Of course, once you have the gradient you can easily call other optimization packages, such as Optimization.jl, NLopt.jl, Nonconvex.jl, and Ipopt.jl.

For a three-parameter function, it’s also not crazy to do derivative-free optimization, which is a lot simpler to use (though it doesn’t scale to lots of parameters).
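As a sketch of the derivative-free route, assuming the Optim.jl package (NLopt.jl’s derivative-free algorithms would work equally well), with the fidelity reusing the definitions from the corrected example:

```julia
using LinearAlgebra
import Optim  # assumption: Optim.jl is installed

const X = ComplexF64[0 1; 1 0]
const Y = ComplexF64[0 -im; im 0]
const Z = ComplexF64[1 0; 0 -1]

function unitary(theta)
    return exp(-im * theta[3] * kron(Z, X)) + exp(-im * theta[2] * kron(Z, Y)) * exp(-im * theta[1] * kron(Z, X))
end

function fidelity(theta)
    u_target = exp(-im * pi / 4 * kron(Z, X))
    return abs(tr(u_target' * unitary(theta)))^2 / 4
end

# Optim minimizes, so negate the fidelity; theta is left unconstrained
# to avoid artificial local minima at the periodic boundary.
result = Optim.optimize(t -> -fidelity(t), zeros(3), Optim.NelderMead())
theta_opt = mod.(Optim.minimizer(result) .+ pi, 2pi) .- pi  # wrap into [-pi, pi)
```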

PS. Be careful with this. It may seem reasonable, but it can create artificial local minima at the periodic \theta boundary, where it e.g. gets stuck at \theta=\pi and can’t go downhill to some \theta = -\pi + \delta. You can just leave \theta unconstrained and take it mod 2π at the end, for example.

2 Likes