As a follow-up to my NLP modeling survey post, I am pleased to announce the release of the Rosetta OPF project. The goal of this project is to highlight the basic technical requirements for solving the kinds of continuous non-convex mathematical optimization problems that I regularly encounter. To that end, Rosetta OPF provides example implementations of the AC Optimal Power Flow (AC-OPF) problem in five of Julia’s nonlinear optimization modeling layers. Based on the findings of the NLP modeling survey, Rosetta OPF currently includes AC-OPF implementations in GalacticOptim, JuMP, ADNLPModels, Nonconvex, and Optim. If I have missed any applicable optimization framework, please let me know. Other implementations are welcome!
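For readers unfamiliar with the problem, here is a heavily simplified sketch of what an AC-OPF model looks like in JuMP (polar voltage coordinates): a tiny two-bus system with one generator and one load. The data values and variable names below are illustrative placeholders of my own, not the Rosetta OPF implementation itself.

```julia
using JuMP, Ipopt

# Illustrative 2-bus system (placeholder data, not a Rosetta OPF case)
g, b = 4.0, -10.0          # line conductance / susceptance (p.u.)
pd, qd = 1.0, 0.3          # active / reactive load at bus 2 (p.u.)

model = Model(Ipopt.Optimizer)
set_silent(model)
@variable(model, 0.9 <= vm[1:2] <= 1.1, start = 1.0)  # voltage magnitudes
@variable(model, va[1:2], start = 0.0)                # voltage angles
@variable(model, 0.0 <= pg <= 2.0)                    # generator active power
@variable(model, -1.0 <= qg <= 1.0)                   # generator reactive power

@constraint(model, va[1] == 0.0)  # reference bus angle

# Power balance at bus 1 ties generation to the line flow (nonconvex,
# hence @NLconstraint)
@NLconstraint(model, pg == vm[1]^2*g - vm[1]*vm[2]*(g*cos(va[1]-va[2]) + b*sin(va[1]-va[2])))
@NLconstraint(model, qg == -vm[1]^2*b - vm[1]*vm[2]*(g*sin(va[1]-va[2]) - b*cos(va[1]-va[2])))

# Power balance at bus 2 absorbs the load
@NLconstraint(model, -pd == vm[2]^2*g - vm[1]*vm[2]*(g*cos(va[2]-va[1]) + b*sin(va[2]-va[1])))
@NLconstraint(model, -qd == -vm[2]^2*b - vm[1]*vm[2]*(g*sin(va[2]-va[1]) - b*cos(va[2]-va[1])))

@objective(model, Min, 100.0*pg^2 + 300.0*pg)  # quadratic generation cost
optimize!(model)
```

The real benchmark problems have thousands of buses, branch flow limits, and thermal constraints, but the trigonometric power balance equations above are the core nonconvexity that every AD backend has to handle.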
At this time I would appreciate feedback and implementation improvement suggestions from the package developers. Below is a brief report on each implementation (in alphabetical order) and some specific inquiries for the developers.
GalacticOptim (@ChrisRackauckas,@Vaibhavdixit02): This implementation is working with GalacticOptim#master / ForwardDiff v0.10.25 and is converging to the expected solution. The primary limitation is that the AD system I am using does not take advantage of the problem’s sparsity. Suggestions for improvements are welcome.
JuMP (@odow,@miles.lubin): This implementation is working with JuMP v0.23.2 / Ipopt v1.0.2 and scales to large problem sizes. Suggestions for implementation improvements are welcome.
NLPModels (@abelsiqueira): This implementation is working with ADNLPModels v0.3.1 / NLPModelsIpopt v0.9.1. The primary limitation is that the AD system does not take advantage of the problem’s sparsity. Suggestions for improvements are welcome.
Nonconvex (@mohamed82008): So far I have not been able to get a basic implementation working. This is in part due to the Zygote issue but also the problem information from Ipopt seems off (e.g., it reports 0 terms in the Jacobian and Hessian). I have also noticed that there are some differences in the solver traces between Model and DictModel in a small test problem. If you could make a PR with a working implementation that would be much appreciated.
Optim (@pkofod): It is difficult for me to verify whether this implementation is correct (tested with Optim v1.6.2). At first glance the IPNewton algorithm seems to work and converges. However, when I check the details of the converged point, it has significant constraint violations. I did verify that if I provide IPNewton with a near-optimal starting point it will converge with minimal constraint violations. This makes me suspect that the algorithm is the issue rather than the model’s implementation. I also tried testing with LBFGS and NelderMead, but they yielded stack overflow errors. In any case, please let me know if there are any obvious changes I should make or other directions to investigate.
We have the capability to do that with ModelingToolkit as the AD backend, but the implementation has a few bugs. I had a branch to fix it up but forgot about it; let me get that ready and I’ll ping you so you can try it out and see if it satisfies your requirements.
Doing the constraint Jacobians in the ForwardDiff mode with SparseDiffTools.jl would be easy, so we should do that first. Hessians need Hessian support in SparseDiffTools.jl, so that would take a bit more. And as @Vaibhavdixit02 said, the MTK form can do all of the sparsity pretty easily, so we just need to merge that. We’ll make sure to finish it now that someone is interested.
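To illustrate what the sparse ForwardDiff mode buys you: SparseDiffTools.jl colors the columns of a known sparsity pattern so that many columns can be differentiated in one forward pass. A minimal sketch with a toy tridiagonal residual standing in for the OPF constraint function (the function and pattern here are my own placeholders, and I am assuming the `matrix_colors` / `forwarddiff_color_jacobian!` keyword API):

```julia
using SparseArrays, SparseDiffTools

# Toy in-place residual with a tridiagonal Jacobian (a stand-in for the
# AC-OPF constraint function)
function residual!(r, x)
    n = length(x)
    for i in 1:n
        r[i] = x[i]^2
        i > 1 && (r[i] -= x[i-1])
        i < n && (r[i] -= x[i+1])
    end
    return r
end

n = 10
x = rand(n)

# Known sparsity pattern (tridiagonal); for OPF this would come from the
# network incidence structure
pattern = spdiagm(-1 => ones(n-1), 0 => ones(n), 1 => ones(n-1))
colors = matrix_colors(pattern)   # groups structurally orthogonal columns

J = similar(pattern)
forwarddiff_color_jacobian!(J, residual!, x; colorvec = colors, sparsity = pattern)
```

With the coloring, the number of forward passes scales with the number of colors (3 here) rather than the number of variables, which is the kind of speedup the dense-AD implementations above are missing.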
While some packages are getting debugged and upgraded, I thought it would be good to share some preliminary results so we can track progress after the proposed fixes / improvements are made. To that end, below is a table of results for each of the Rosetta-OPF implementations on 59 standard AC-OPF benchmark instances.
I checked the solution quality and, when a converged solution is found, it is the same (up to 5 decimal places) across all of the methods considered. Hence, the most interesting distinctions between the implementations are the run times. The following table shows the solver run times (in seconds) for each of the 59 problem instances. All of the solvers were given a wall-clock time limit of 7200 seconds (i.e., 2 hours). A value of N.D. (No Data) indicates that the implementation failed to output the result data, usually because the solver was cut off by the runtime limit before converging. A value of INF. (Infeasible) indicates that the optimization terminated, but the solution failed to meet a constraint feasibility criterion (e.g., individual constraint violations greater than 1e-6). Note, no data is available yet for Nonconvex as the current implementation is not operational; see the previous discussion for details.
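For reference, the classification rule behind the table entries can be written down in a few lines. This is a hypothetical sketch of the logic, not the actual Rosetta OPF benchmarking scripts, and the function and argument names are my own:

```julia
# Hypothetical post-processing of one solve: classify the result the way
# the table does (names here are illustrative, not the Rosetta OPF code)
function classify(result_available::Bool, max_violation::Float64, time::Float64;
                  tol = 1e-6, limit = 7200.0)
    result_available || return "N.D."     # no result data (e.g., time limit hit)
    max_violation > tol && return "INF."  # terminated point violates constraints
    return string(round(time, digits = 2))  # otherwise report run time (seconds)
end
```

So a row shows a number only when the solver both returned data and satisfied every constraint to within the 1e-6 tolerance.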
The non-JuMP implementations have plenty of room for improvement. In the interest of modeling flexibility, they allocate quite a lot and use a bunch of dictionary look-ups.
We also have some results using Symbolics.jl to compute symbolic derivatives which show that SymbolicAD.jl is 3-5x faster at computing derivatives than JuMP’s AD for AC-OPF. It also fixes JuMP’s INF. numerical issue in case2853_sdet.
Do you use AD for Jacobian and Hessian computations? These matrices tend to be very sparse, but I do not know if an AD tool can leverage the sparsity structure. For rectangular-coordinate PF I ended up building Jacobian and Hessian oracles, as they are orders of magnitude faster than AD.
Two clarifications on the power systems side. In this work we are focusing on AC-OPF (the optimization variant of AC Power Flow). For AC-PF I agree that hard-coding the Jacobian is the way to go, but in that context you do not usually need the sparse Hessian, in my experience.
We are also studying the variant of the problem in polar voltage coordinates (rather than rectangular) because it presents a more general case for the AD system (the Hessian is constant in the rectangular voltage case). JuMP does a nice job in the rectangular voltage case in my experience as well; you just need to make sure to use @constraint instead of @NLconstraint to get the faster Hessian computation.
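To make the @constraint vs. @NLconstraint distinction concrete, here is a minimal sketch with the rectangular-coordinate voltage magnitude bound, which is quadratic in the real and imaginary voltage parts. The variable names are illustrative:

```julia
using JuMP

model = Model()
@variable(model, vr)   # real part of the bus voltage
@variable(model, vi)   # imaginary part of the bus voltage
vmin, vmax = 0.9, 1.1

# In rectangular coordinates the voltage bound is quadratic, so it can be
# expressed with @constraint; JuMP then treats it as a structured quadratic
# constraint with a constant Hessian:
@constraint(model, vr^2 + vi^2 <= vmax^2)
@constraint(model, vr^2 + vi^2 >= vmin^2)

# The same expression passed through @NLconstraint would instead go
# through the general nonlinear interface, where the (constant) Hessian is
# recomputed by AD at every iteration:
# @NLconstraint(model, vr^2 + vi^2 <= vmax^2)
```

The model content is identical either way; the difference is purely in which derivative machinery JuMP uses, which is why the rectangular case benefits so much from @constraint.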
I tried on the 2383 case (I cannot go higher because of memory), and the results are very good! Now I think my derivative code must be doing something wrong to use so much memory and time, especially because the number of allocations is still way lower.
Full benchmark (SymbolicAD):
0.483459 seconds (3.50 M allocations: 206.808 MiB, 19.78% gc time)
Hessian benchmark (formula):
5.110287 seconds (392.95 k allocations: 2.945 GiB, 16.31% gc time)
Jacobian benchmark (formula):
3.133732 seconds (37.68 k allocations: 1.309 GiB, 41.19% gc time)
For my research I need the OPF Jacobians and Hessians in rectangular coordinates. As you said, the AD should have an easier job since the Hessians are constant, yet both ForwardDiff and SparseDiffTools took much more time than hard-coded derivatives in my tests. It seems to me that an appropriate choice of AD library is critical, as on the other hand SymbolicAD seems to fare well.
Per @ccoffrin’s comment, is this for PF or OPF? For the 500_goc, I read that as: we’re computing the Jacobian in 1e-3 seconds while the formula takes 3e-2 seconds, so something is off.
I don’t know if SymbolicAD computes the Hessians or not.
I always got this type of error:
To clarify my suggestion: you shouldn’t run run_unit_benchmark directly, because it solves the problem multiple times and checks the solution against JuMP (which is the source of the failure; JuMP’s derivatives can be slightly incorrect to within some tolerance).
I was assuming you would copy the relevant parts of the function for your own purposes.