AC Optimal Power Flow in Various Nonlinear Optimization Frameworks

As a follow-up to my NLP modeling survey post, I am pleased to announce the release of the Rosetta OPF project. The goal of this project is to highlight the basic technical requirements for solving the kinds of continuous non-convex mathematical optimization problems that I regularly encounter. To that end, Rosetta OPF provides example implementations of the AC Optimal Power Flow (AC-OPF) problem in five of Julia’s nonlinear optimization modeling layers. Based on the findings of the NLP modeling survey, Rosetta OPF currently includes AC-OPF implementations in GalacticOptim, JuMP, ADNLPModels, Nonconvex, and Optim. If I have missed any applicable optimization framework, please let me know. Other implementations are welcome!
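For readers who have not seen the problem before, here is a heavily simplified sketch of the polar-coordinate AC-OPF formulation in JuMP (a made-up two-bus example with no shunts, transformer taps, or thermal limits; it is not the rosetta-opf code, just a flavor of the constraints):

    using JuMP, Ipopt

    # Toy two-bus system: a generator at bus 1 serves a load at bus 2 over one line.
    g, b = 4.0, -10.0      # line series conductance / susceptance (p.u.)
    pd, qd = 1.0, 0.3      # active / reactive load at bus 2 (p.u.)

    model = Model(Ipopt.Optimizer)
    @variable(model, 0.9 <= vm[1:2] <= 1.1, start = 1.0)  # voltage magnitudes
    @variable(model, va[1:2], start = 0.0)                 # voltage angles
    @variable(model, 0.0 <= pg <= 2.0)                     # generator active power
    @variable(model, -1.0 <= qg <= 1.0)                    # generator reactive power
    @constraint(model, va[1] == 0.0)                       # slack bus angle reference

    # Branch flows in polar form, from bus 1 to 2 and from bus 2 to 1.
    @NLexpression(model, p12,  g*vm[1]^2 - vm[1]*vm[2]*(g*cos(va[1]-va[2]) + b*sin(va[1]-va[2])))
    @NLexpression(model, q12, -b*vm[1]^2 - vm[1]*vm[2]*(g*sin(va[1]-va[2]) - b*cos(va[1]-va[2])))
    @NLexpression(model, p21,  g*vm[2]^2 - vm[2]*vm[1]*(g*cos(va[2]-va[1]) + b*sin(va[2]-va[1])))
    @NLexpression(model, q21, -b*vm[2]^2 - vm[2]*vm[1]*(g*sin(va[2]-va[1]) - b*cos(va[2]-va[1])))

    # Power balance at each bus: generation minus load equals outgoing flow.
    @NLconstraint(model, pg == p12)
    @NLconstraint(model, qg == q12)
    @NLconstraint(model, -pd == p21)
    @NLconstraint(model, -qd == q21)

    @objective(model, Min, 100.0 * pg^2 + 500.0 * pg)      # quadratic generation cost
    optimize!(model)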

At this time I would appreciate feedback and implementation improvement suggestions from the package developers. Below is a brief report on each implementation (in alphabetical order) and some specific inquiries for the developers.

GalacticOptim (@ChrisRackauckas,@Vaibhavdixit02): This implementation is working with GalacticOptim#master / ForwardDiff v0.10.25 and is converging to the expected solution. The primary limitation is that the AD system I am using does not take advantage of the problem’s sparsity. Suggestions for improvements are welcome.

JuMP (@odow,@miles.lubin): This implementation is working with JuMP v0.23.2 / Ipopt v1.0.2 and scales to large problem sizes. Suggestions for implementation improvements are welcome.

NLPModels (@abelsiqueira): This implementation is working with the releases ADNLPModels v0.3.1 / NLPModelsIpopt v0.9.1. The primary limitation is that the AD system does not take advantage of the problem’s sparsity. Suggestions for improvements are welcome.
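For comparison, the ADNLPModels / NLPModelsIpopt pairing is set up roughly as follows (a generic toy problem with illustrative data, not the OPF model; the constructor arguments follow the documented positional form):

    using ADNLPModels, NLPModelsIpopt

    # Rosenbrock-style objective with one nonlinear constraint; ADNLPModels builds
    # the (dense) AD-based derivative oracles and NLPModelsIpopt passes them to Ipopt.
    f(x) = (x[1] - 1.0)^2 + 100.0 * (x[2] - x[1]^2)^2
    c(x) = [x[1]^2 + x[2]^2]                 # constraint body
    x0 = [0.5, 0.5]
    lvar, uvar = [-2.0, -2.0], [2.0, 2.0]    # variable bounds
    lcon, ucon = [0.0], [1.0]                # 0 <= x1^2 + x2^2 <= 1

    nlp = ADNLPModel(f, x0, lvar, uvar, c, lcon, ucon)
    stats = ipopt(nlp)                       # solve with Ipopt
    println(stats.solution)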

Nonconvex (@mohamed82008): So far I have not been able to get a basic implementation working. This is partly due to the Zygote issue, but the problem information reported by Ipopt also seems off (e.g., it reports 0 terms in the Jacobian and Hessian). I have also noticed some differences in the solver traces between Model and DictModel on a small test problem. If you could make a PR with a working implementation, that would be much appreciated.

Optim (@pkofod): It is difficult for me to verify if this implementation is correct or not (tested with Optim v1.6.2). At first glance the IPNewton algorithm seems to work and converges. However, when I check the details of the converged point it has significant constraint violations. I did verify that if I provide IPNewton with a near-optimal starting point it will converge with minimal constraint violations. This makes me suspect that the algorithm is the issue and not the model’s implementation. I also tried testing with LBFGS and NelderMead but they yielded stack overflow errors. In any case, please let me know if there are any obvious changes I should make or other directions to investigate.
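As a sanity check, here is a sketch of how the constraint violations of an IPNewton result can be inspected, using the constrained-optimization interface from the Optim documentation (a generic toy problem; the constraint functions are illustrative, not the OPF ones):

    using Optim

    fun(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2

    # One nonlinear constraint, x1^2 + x2^2, with hand-written Jacobian and Hessian
    # contributions, following the constrained-optimization example in the Optim docs.
    con_c!(c, x) = (c[1] = x[1]^2 + x[2]^2; c)
    function con_jacobian!(J, x)
        J[1, 1] = 2 * x[1]
        J[1, 2] = 2 * x[2]
        J
    end
    function con_h!(h, x, λ)
        h[1, 1] += 2 * λ[1]
        h[2, 2] += 2 * λ[1]
        h
    end

    x0 = [0.25, 0.25]
    df = TwiceDifferentiable(fun, x0; autodiff = :forward)
    lx, ux = [-0.5, -0.5], [0.5, 0.5]        # variable bounds
    lc, uc = [-Inf], [0.5^2]                 # constraint bounds
    dfc = TwiceDifferentiableConstraints(con_c!, con_jacobian!, con_h!, lx, ux, lc, uc)

    res = optimize(df, dfc, x0, IPNewton())
    x = Optim.minimizer(res)

    # Check feasibility of the returned point directly.
    cval = zeros(1); con_c!(cval, x)
    max_violation = max(maximum(max.(lc .- cval, 0.0)), maximum(max.(cval .- uc, 0.0)))
    @show max_violation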


You forgot to add constraints 🙂 I will look into the Zygote issue. Nonconvex uses Ipopt with the L-BFGS Hessian approximation by default, which explains the zero Hessian terms.

What are you looking to use sparsity for? The constraint Jacobian?

Both the objective function and the constraints are very sparse. So I would say both are important.

We have the capability to do that with ModelingToolkit as the AD backend, but the implementation has a few bugs. I had a branch to fix it up but forgot about it; let me get that ready and I’ll ping you so you can try it out and see if it satisfies your requirements.


Doing the constraint Jacobians in the ForwardDiff mode with SparseDiffTools.jl would be easy, so we should do that first. Hessians need Hessian support in SparseDiffTools.jl, so that would take a bit more. And as @Vaibhavdixit02 said, the MTK form can do all of the sparsity pretty easily, so we just need to merge that. We’ll make sure to finish it now given someone is interested.

While some packages are getting debugged and upgraded, I thought it would be good to share some preliminary results so we can track progress after the proposed fixes / improvements are made. To that end, below is a table of results for each of the Rosetta-OPF implementations on 59 standard AC-OPF benchmark instances.

I checked the solution quality and, when a converged solution is found, it is the same (up to 5 decimal places) across all methods considered. Hence, the most interesting distinctions between the implementations are the run times. The following table shows the solver run times (in seconds) for each of the 59 problem instances. All of the solvers were given a wall-clock time limit of 7200 seconds (i.e., 2 hours). A value of N.D. (No Data) indicates that the implementation failed to output result data, usually because the solver was cut off by the runtime limit before converging. A value of INF. (Infeasible) indicates that the optimization terminated but the solution failed to meet the constraint feasibility criterion (e.g., individual constraint violations greater than 1e-6). Note that no data is available yet for Nonconvex, as the current implementation is not operational; see the previous discussion for details.

Case Vars Constraints GalacticOptim JuMP NLPModels Nonconvex Optim
case3_lmbd 24 28 4.19e-01 2.31e-02 2.09e+01 N.D. INF.
case5_pjm 44 53 5.44e+00 3.17e-02 1.98e-01 N.D. INF.
case14_ieee 118 169 2.25e+02 6.10e-01 2.33e+01 N.D. INF.
case24_ieee_rts 266 315 5.46e+03 6.07e-01 4.67e+01 N.D. INF.
case30_ieee 236 348 4.67e+03 6.07e-01 3.78e+01 N.D. N.D.
case30_as 236 348 2.71e+03 5.51e-01 3.27e+01 N.D. INF.
case39_epri 282 401 N.D. 1.47e-01 6.03e+01 N.D. N.D.
case57_ieee 448 675 N.D. 7.30e-01 1.05e+02 N.D. N.D.
case60_c 518 737 N.D. 7.44e-01 2.36e+02 N.D. INF.
case73_ieee_rts 824 987 N.D. 7.60e-01 5.67e+02 N.D. N.D.
case89_pegase 1042 1649 N.D. 1.10e+00 1.74e+03 N.D. N.D.
case118_ieee 1088 1539 N.D. 1.03e+00 1.74e+03 N.D. N.D.
case162_ieee_dtc 1484 2313 N.D. 1.26e+00 4.12e+03 N.D. N.D.
case179_goc 1468 2200 N.D. 1.49e+00 N.D. N.D. N.D.
case200_activ 1456 2116 N.D. 1.01e+00 2.80e+03 N.D. N.D.
case240_pserc 2558 3617 N.D. 7.53e+00 N.D. N.D. N.D.
case300_ieee 2382 3478 N.D. 1.80e+00 N.D. N.D. N.D.
case500_goc 4254 6097 N.D. 3.25e+00 N.D. N.D. N.D.
case588_sdet 4110 5979 N.D. 2.79e+00 N.D. N.D. N.D.
case793_goc 5432 7978 N.D. 3.84e+00 N.D. N.D. N.D.
case1354_pegase 11192 16646 N.D. 1.24e+01 N.D. N.D. N.D.
case1888_rte 14480 21494 N.D. 5.36e+01 N.D. N.D. N.D.
case1951_rte 15018 22075 N.D. 2.92e+01 N.D. N.D. N.D.
case2000_goc 19008 29432 N.D. 1.66e+01 N.D. N.D. N.D.
case2312_goc 17128 25716 N.D. 2.03e+01 N.D. N.D. N.D.
case2383wp_k 17004 25039 N.D. 1.96e+01 N.D. N.D. N.D.
case2736sp_k 19088 28356 N.D. 1.62e+01 N.D. N.D. N.D.
case2737sop_k 18988 28358 N.D. 1.31e+01 N.D. N.D. N.D.
case2742_goc 24540 38196 N.D. 6.79e+01 N.D. N.D. N.D.
case2746wp_k 19520 28446 N.D. 1.52e+01 N.D. N.D. N.D.
case2746wop_k 19582 28642 N.D. 1.55e+01 N.D. N.D. N.D.
case2848_rte 21822 32129 N.D. 3.84e+01 N.D. N.D. N.D.
case2853_sdet 23028 33154 N.D. INF. N.D. N.D. N.D.
case2868_rte 22090 32393 N.D. 4.09e+01 N.D. N.D. N.D.
case2869_pegase 25086 37813 N.D. 4.94e+01 N.D. N.D. N.D.
case3012wp_k 21082 31029 N.D. 2.43e+01 N.D. N.D. N.D.
case3022_goc 23238 34990 N.D. 4.32e+01 N.D. N.D. N.D.
case3120sp_k 21608 32092 N.D. 2.49e+01 N.D. N.D. N.D.
case3375wp_k 24350 35876 N.D. 2.77e+01 N.D. N.D. N.D.
case3970_goc 35270 54428 N.D. 4.10e+01 INF. N.D. N.D.
case4020_goc 36696 56957 N.D. 1.49e+02 INF. N.D. N.D.
case4601_goc 38814 59596 N.D. 5.73e+01 N.D. N.D. N.D.
case4619_goc 42532 66289 N.D. 5.94e+01 N.D. N.D. N.D.
case4661_sdet 34758 51302 N.D. 4.06e+01 INF. N.D. N.D.
case4837_goc 41398 64030 N.D. 6.24e+01 N.D. N.D. N.D.
case4917_goc 37872 56917 N.D. 6.45e+01 N.D. N.D. N.D.
case6468_rte 49734 75937 N.D. 1.55e+02 N.D. N.D. N.D.
case6470_rte 50482 75976 N.D. 9.13e+01 N.D. N.D. N.D.
case6495_rte 50426 76124 N.D. 1.68e+02 N.D. N.D. N.D.
case6515_rte 50546 76290 N.D. 1.31e+02 N.D. N.D. N.D.
case8387_pegase 78748 118702 N.D. 1.28e+02 N.D. N.D. N.D.
case9241_pegase 85568 130826 N.D. 1.53e+02 N.D. N.D. N.D.
case9591_goc 83572 130588 N.D. 2.28e+02 N.D. N.D. N.D.
case10000_goc 76804 112352 N.D. 1.45e+02 N.D. N.D. N.D.
case10480_goc 96750 150874 N.D. 2.24e+02 N.D. N.D. N.D.
case13659_pegase 117370 170588 N.D. 1.95e+02 N.D. N.D. N.D.
case19402_goc 179562 281733 N.D. 4.62e+02 N.D. N.D. N.D.
case24464_goc 203374 313641 N.D. 3.19e+03 N.D. N.D. N.D.
case30000_goc 208624 307752 N.D. 1.54e+03 N.D. N.D. N.D.

Package Details

[54578032] ADNLPModels v0.3.1
[f6369f11] ForwardDiff v0.10.25
[a75be94c] GalacticOptim v2.4.0 https://github.com/SciML/GalacticOptim.jl.git#master
[b6b21f68] Ipopt v1.0.2
[4076af6c] JuMP v0.23.2
[f4238b75] NLPModelsIpopt v0.9.1
[01bcebdf] Nonconvex v1.0.2
[bf347577] NonconvexIpopt v0.1.4
[429524aa] Optim v1.6.2
[c36e90e8] PowerModels v0.19.5

Ipopt was run with the default linear solver, MUMPS 5.4.1.


Two comments:

  • The non-JuMP implementations have plenty of room for improvement. In the interest of modeling flexibility, they allocate quite a lot and use a bunch of dictionary look-ups.
  • We also have some results using Symbolics.jl to compute symbolic derivatives, which show that SymbolicAD.jl is 3-5x faster at computing derivatives than JuMP’s AD for AC-OPF. It also fixes JuMP’s INF. numerical issue in case2853_sdet.

Do you use AD for the Jacobian and Hessian computations? These matrices tend to be very sparse, but I do not know if an AD tool can leverage the sparsity structure. For rectangular-coordinate PF I ended up building Jacobian and Hessian oracles by hand, as they are orders of magnitude faster than AD.

Check out SparseDiffTools.jl (fast Jacobian computation through sparsity exploitation and matrix coloring): https://github.com/JuliaDiff/SparseDiffTools.jl
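As a rough sketch of what the coloring buys you (a toy tridiagonal residual; in practice the sparsity pattern would come from the model structure or a symbolic sparsity-detection pass rather than being written by hand):

    using SparseDiffTools, SparseArrays

    # Toy in-place residual with a tridiagonal Jacobian.
    function f!(dx, x)
        n = length(x)
        @. dx = x^2
        for i in 1:n-1
            dx[i]   += sin(x[i+1])
            dx[i+1] += cos(x[i])
        end
        return dx
    end

    n = 1_000
    x = rand(n)

    # Known sparsity pattern (tridiagonal).
    sparsity = spdiagm(-1 => ones(n-1), 0 => ones(n), 1 => ones(n-1))
    colors = matrix_colors(sparsity)   # structurally orthogonal column groups
    @show maximum(colors)              # number of ForwardDiff sweeps needed (3, not 1000)

    J = similar(sparsity)
    forwarddiff_color_jacobian!(J, f!, x, colorvec = colors, sparsity = sparsity)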

JuMP and SymbolicAD use sparse Jacobians and Hessians.
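Concretely, the sparse structures can be inspected through the MOI evaluator interface; a minimal sketch, assuming `model` is an already-built nonlinear JuMP model such as the rosetta-opf one:

    using JuMP
    import MathOptInterface as MOI

    # `model` is assumed to be an existing nonlinear JuMP model.
    d = NLPEvaluator(model)
    MOI.initialize(d, [:Grad, :Jac, :Hess])

    jac_structure  = MOI.jacobian_structure(d)            # Vector of (row, col) nonzeros
    hess_structure = MOI.hessian_lagrangian_structure(d)  # Vector of (row, col) nonzeros
    println("Jacobian nonzeros: ", length(jac_structure))
    println("Hessian nonzeros:  ", length(hess_structure))

    # Evaluate the sparse Jacobian at a point; values align with jac_structure.
    x = zeros(num_variables(model))
    J = zeros(length(jac_structure))
    MOI.eval_constraint_jacobian(d, J, x)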

Orders of magnitude faster compared to what AD system?

For the IEEE118 test case I got the following results:

Hessian benchmark (formula):
  0.007992 seconds (15.01 k allocations: 9.190 MiB)
Hessian benchmark (ForwardDiff):
  4.216270 seconds (767.71 k allocations: 986.110 MiB, 2.12% gc time)
Jacobian benchmark (formula):
  0.001089 seconds (1.21 k allocations: 3.787 MiB)
Jacobian benchmark (ForwardDiff):
  0.102123 seconds (37.81 k allocations: 23.546 MiB)
Jacobian benchmark (SparseDiffTools):
  0.107969 seconds (41.45 k allocations: 24.025 MiB)

The library is only about as efficient as plain ForwardDiff.jl, even though I cheated by getting the sparsity pattern from the formula-computed Jacobian.

Compared against ForwardDiff.jl and SparseDiffTools.jl.

Can SymbolicAD.jl be used outside JuMP to compute these matrices directly? That way I may be able to benchmark the library against my results.

Can SymbolicAD.jl be used outside JuMP to compute these matrices directly?

Not exactly, but you can load the IEEE118 case using PowerModels and then compute them.
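A sketch of that route with PowerModels’ documented entry points (the case file is assumed to be in the working directory):

    using PowerModels, JuMP
    import MathOptInterface as MOI

    # Parse the Matpower-format case file and build the polar-coordinate AC-OPF
    # formulation as a JuMP model.
    data = PowerModels.parse_file("pglib_opf_case118_ieee.m")
    pm = instantiate_model(data, ACPPowerModel, PowerModels.build_opf)

    # Derivatives can then be obtained from the underlying JuMP model via the
    # NLPEvaluator route sketched earlier in the thread.
    d = NLPEvaluator(pm.model)
    MOI.initialize(d, [:Grad, :Jac, :Hess])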

Here’s where we create a PowerModels.jl model:

The 118 case is here: https://github.com/odow/SymbolicAD.jl/blob/master/test/data/pglib_opf_case118_ieee.m

And here’s the function it calls where we time calling the Jacobian etc.

I’d be interested to see how far off a hard-coded derivative we are in the 30,000 bus case: https://github.com/power-grid-lib/pglib-opf/blob/master/pglib_opf_case30000_goc.m

What is the difference between the maximum of the color vector and the size of the system? I can’t see how computing, say, 10 directional derivatives compared to 1000 would even be close.

Two clarifications on the power systems side. In this work we are focusing on AC-OPF (the optimization variant of AC Power Flow). For AC-PF I agree that hard-coding the Jacobian is the way to go, but in that context you do not usually need the sparse Hessian, in my experience.

We are also studying the variant of the problem in polar voltage coordinates (rather than rectangular) because it presents a more general case for the AD system (the Hessian is constant in the rectangular voltage case). JuMP does a nice job in the rectangular voltage case in my experience as well; you just need to make sure to use @constraint instead of @NLconstraint to get the faster Hessian computation.
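To illustrate the @constraint vs @NLconstraint point, the rectangular-coordinate voltage-magnitude bound can be written either way (a schematic fragment with hypothetical vr/vi variables for the real and imaginary voltage parts):

    using JuMP, Ipopt

    model = Model(Ipopt.Optimizer)
    vmax = 1.1
    @variable(model, vr)   # real part of a bus voltage
    @variable(model, vi)   # imaginary part of a bus voltage

    # Quadratic form: stored as a structured quadratic constraint, so its constant
    # Hessian is available without going through the nonlinear AD machinery.
    @constraint(model, vr^2 + vi^2 <= vmax^2)

    # Nonlinear form: the same relation, but handled by the @NL* AD machinery.
    # @NLconstraint(model, vr^2 + vi^2 <= vmax^2)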

I tried the 2383 case (I cannot go higher because of memory), and the results are very good! Now I think my derivatives must be doing something wrong to use so much memory and time, especially because my code’s allocation count is still far lower.

Full benchmark (SymbolicAD):
  0.483459 seconds (3.50 M allocations: 206.808 MiB, 19.78% gc time)
Hessian benchmark (formula):
  5.110287 seconds (392.95 k allocations: 2.945 GiB, 16.31% gc time)
Jacobian benchmark (formula):
  3.133732 seconds (37.68 k allocations: 1.309 GiB, 41.19% gc time)

For the IEEE118 case it is:

(MaxColor, Size) = (20, 236)

For my research I need the OPF Jacobians and Hessians in rectangular coordinates. As you said, the AD should have an easier job since the Hessians are constant, yet both ForwardDiff and SparseDiffTools took much more time than hard-coded derivatives in my tests. It looks to me like an appropriate choice of AD library is critical; SymbolicAD, on the other hand, seems to fare well.

Per @ccoffrin’s comment, is this for PF or OPF? For the 500_goc, I read that as: we’re computing the Jacobian in 1e-3 seconds while the formula takes 3e-2, so something is off.

I don’t know if SymbolicAD computes the Hessians or not

Yes.

I always got this type of error:

To clarify my suggestion, you shouldn’t run run_unit_benchmark directly, because it runs the problem multiple times and checks the solution against JuMP (which is where the failure comes from; JuMP’s derivatives can be slightly incorrect to some tolerance).

I was assuming you would copy the relevant parts of the function for your own purposes.

My bad, I updated my results on the previous reply.
