First things first, you always have to try different tolerances and I don’t see that here. Is this still an issue at lower tolerances?
Very good point! I didn’t realize that the default tolerances were pretty high. Increasing the precision definitely has a significant effect. If I set reltol = 1e-9, abstol = 1e-7, I get a much saner convergence plot, which stays the same if I continue increasing the precision all the way to 1e-14.
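A minimal sketch of that kind of tolerance check (assuming a hypothetical loss_zygote variant that forwards reltol/abstol to the solve call; x_guess stands in for some feasible set of control parameters):

using Zygote

# x_guess: placeholder control-parameter vector; loss_zygote(x; reltol, abstol) is a
# hypothetical variant that passes the tolerances on to OrdinaryDiffEq.solve
g_default = Zygote.gradient(x -> loss_zygote(x; reltol = 1e-3, abstol = 1e-6), x_guess)[1]
g_tight   = Zygote.gradient(x -> loss_zygote(x; reltol = 1e-9, abstol = 1e-7), x_guess)[1]
maximum(abs.(g_default .- g_tight))  # a large value means the defaults are too loose for the gradient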
There’s still the non-monotonic convergence, but I think what is happening is that the optimizer is running into the box constraints. I don’t know if NLOpt.LBFGS is supposed to behave like that (I don’t know how it actually takes the box constraints into account), but I suppose this could be correct. It would be nice to also have the option to use LBFGSB to double-check, since that method has the bounds built in, and I don’t think it would show that kind of non-monotonic convergence.
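For context, the bounded setup goes through Optimization.jl roughly along these lines (a sketch only; x_guess and the bound values are placeholders, and NLopt.LD_LBFGS is the actual NLopt name for its L-BFGS algorithm):

using Optimization, OptimizationNLopt, Zygote

optf = OptimizationFunction((x, _) -> loss_zygote(x), AutoZygote())
prob = OptimizationProblem(optf, x_guess;                    # x_guess: placeholder guess
                           lb = zeros(length(x_guess)),      # placeholder lower bounds
                           ub = fill(2.0, length(x_guess)))  # placeholder upper bounds
res = solve(prob, NLopt.LD_LBFGS())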
Next, did you try an Enzyme version?
Yeah, OptimizationFunction((x, _)->loss_zygote(x), AutoEnzyme()) fails with

ERROR: Enzyme execution failed.
Enzyme: Not yet implemented augmented forward for jl_f__apply_iterate (true, true, iterate, Core.apply_type, 7, 6)

after a lot of Warning: TypeAnalysisDepthLimit warnings.
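For isolating that, the same failure should be reproducible without Optimization.jl by calling Enzyme on the loss directly; a sketch (the exact call depends on the Enzyme version; x_guess is again a placeholder parameter vector):

using Enzyme

dx = zero(x_guess)  # gradient accumulator (x_guess: placeholder parameters)
Enzyme.autodiff(Reverse, loss_zygote, Active, Duplicated(x_guess, dx))
# dx should now hold the gradient of the loss, or the call errors with the same
# "Not yet implemented" message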
If both of those have issues, could you try to isolate this to an MWE on just gradients of a complex solve?
You mean a simple state-to-state optimization like this?
# Minimal state-to-state loss: propagate Ψ₀ under the parameterized dynamics f₋
# and measure the overlap of the final state with the target Ψtgt
function loss_simple(x)
    Ψ₀ = ComplexF64[1, 0, 0]
    Ψtgt = ComplexF64[0, 0, 1]
    tspan = (0.0, 1.0)
    prob = ODEProblem(f₋, Ψ₀, tspan, x)
    Ψ = OrdinaryDiffEq.solve(prob, DP5(), verbose=false, reltol = 1e-9, abstol = 1e-7).u[end]
    fid = abs2(Ψ ⋅ Ψtgt)  # ⋅ conjugates its first argument, so this is |⟨Ψ|Ψtgt⟩|²
    return 1 - fid
end
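Taking the gradient of that directly is then about the most stripped-down version of “gradient of a complex solve” I can think of (x_guess is a placeholder for some feasible parameter vector):

using Zygote

loss_simple(x_guess)                           # forward evaluation (x_guess: placeholder parameters)
g = Zygote.gradient(loss_simple, x_guess)[1]   # reverse-mode gradient through the ODE solve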
That gives me a very crazy / interesting convergence plot where, every other iteration, the fidelity drops back to exactly zero. It turns out the reason for that is that the optimizer really wants to push the first control parameter (the duration of the first sub-pulse) to its lower bound of zero; then the system just doesn’t evolve at all, and the resulting fidelity is exactly zero. So it seems like a good idea to put the lower bound not at zero but at, e.g., 0.1. That helps a lot and produces a much better-looking convergence plot. There are still the non-monotonic dips, but those again are where the optimizer pushes against the constraints.
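Concretely, that just means shifting the lower bounds when constructing the problem, something like (same packages as in the sketch above; the bound values are purely illustrative):

optf = OptimizationFunction((x, _) -> loss_simple(x), AutoZygote())
lb = fill(0.1, length(x_guess))   # e.g., keep the sub-pulse durations away from zero
ub = fill(2.0, length(x_guess))   # illustrative upper bound
prob = OptimizationProblem(optf, x_guess; lb = lb, ub = ub)
res = solve(prob, NLopt.LD_LBFGS())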
So I still can’t tell with absolute certainty that the gradients are good, but it seems to me that with the increased precision it’s probably fine, and it’s more the optimizer that’s being a bit wonky (to my taste). I’ll be able to tell for sure once I put in a bit more work on QuantumGradientGenerators to extend it to parameterized control fields (it’s basically a very specialized version of forward-mode AD). In the meantime, it would also be nice to have LBFGSB in Optimization.jl (#277), because I think that’ll be less “wonky”.
Oh, one more thing: OptimizationFunction((x, _)->loss_simple(x), AutoFiniteDiff()) behaves substantially differently (it gets stuck for the more complicated control problem and finds a reasonable but different solution for the simple problem). So, clearly, the finite-difference gradients are quite different from the Zygote ones. In the past, I’ve used FiniteDifferences to check Zygote for pretty simple functions, and they’ve always matched up pretty well. Is applying finite differences to an entire ODE solve asking too much of it?
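A direct way to check this (just a sketch, not something from the runs above): compare a higher-order finite-difference gradient of loss_simple against the Zygote one at the tightened solver tolerances. If the issue is that the adaptive stepping makes the loss noisy at the scale of the finite-difference step, the two should agree much better there. With x_guess again a placeholder:

using FiniteDifferences, Zygote

g_zygote = Zygote.gradient(loss_simple, x_guess)[1]
g_fd = FiniteDifferences.grad(central_fdm(5, 1), loss_simple, x_guess)[1]
maximum(abs.(g_fd .- g_zygote)) / maximum(abs.(g_zygote))  # relative deviation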