Is your code public?

Yeah, that's not enough for sparse diff. (That's barely sparse.) Maybe the decompression could be optimized a bit more, though.

What do you consider sparse? The number of structural non-zeros is roughly 3% for the IEEE118 case:

```
(NonZeros, Entries) = (1506, 55696)
```

And the percentage only goes down for larger cases. For the 2383 case we have:

```
(NonZeros, Entries) = (30636, 22714756)
(MaxColor, Size) = (20, 4766)
```

That's roughly 0.15% of structural non-zeros. `SymbolicAD` seems to leverage this structure properly though.
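For reference, the density figures quoted in this exchange follow directly from the (NonZeros, Entries) pairs; a quick sanity check (the `sparsity_pct` helper is just for illustration):

```
# Percentage of structural non-zeros: 100 * nnz / total entries.
sparsity_pct(nnz, entries) = 100 * nnz / entries

sparsity_pct(1506, 55696)       # IEEE118 case: about 2.7%
sparsity_pct(30636, 22714756)   # 2383-bus case: about 0.13%
```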

Is your code public?

I just uploaded it to GitHub. Just run `test.jl` and that's it.

EDIT: I forgot to mention that right now I'm only computing the PF Hessians and Jacobians. I will compute those terms for the constraints as well, but I haven't implemented that yet.

Perhaps one other clarification: we're only computing the Jacobian and Hessian of nonlinear (i.e., not linear or quadratic) constraints and objectives. We don't compute them for linear or quadratic terms, since those are handled by the solver.

Does that include power flow constraints? They are general nonlinear constraints in polar coordinates.

Yes, what @odow is indicating is that these are the only equations for which `SymbolicAD` builds the Jacobian and Hessian. The other parts of the AC-OPF problem are quadratic and have a different interface to the solver due to their simplified Jacobian/Hessian structure.

It has been a couple of months since I made this post and I am planning to re-run these tests with the latest versions of all packages. Please let me know of any specific package improvements that you think could impact this analysis.

CC @ChrisRackauckas, @Vaibhavdixit02, @odow, @miles.lubin, @abelsiqueira, @mohamed82008, @pkofod

The new Optimization.jl sparsity handling is probably worth a try.

https://github.com/SciML/Optimization.jl/pull/265

https://github.com/SciML/Optimization.jl/pull/274

Additionally, if @YingboMa gets around to it, then it might be interesting to try tearing.

I literally had a draft open replying to your original post, saying that we added sparsity support in Optimization.jl, so you might want to give it a try.

But we haven't added documentation for it yet. The way to do it is to pass `AutoModelingToolkit(true, true)` (the two boolean fields enable sparsity for the objective and constraints, respectively).
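Putting that together, a minimal sketch of how such a call might look (the toy objective and constraint here are placeholders, not from the thread, and the exact API may differ across Optimization.jl versions):

```
using Optimization, OptimizationMOI, Ipopt

# Toy problem: minimize a quadratic subject to one nonlinear constraint.
obj(x, p) = (x[1] - 1)^2 + (x[2] - 2)^2
cons(res, x, p) = (res .= [x[1]^2 + x[2]^2])

# AutoModelingToolkit(true, true): sparsity enabled for both the
# objective and the constraints, per the note above.
optf = OptimizationFunction(obj, AutoModelingToolkit(true, true); cons = cons)
prob = OptimizationProblem(optf, [0.5, 0.5]; lcons = [-Inf], ucons = [1.0])
sol = solve(prob, Ipopt.Optimizer())
```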

I finished implementing sparsity and symbolic differentiation support in Nonconvex. Your rosetta-opf example works locally. I will make a release and update the docs shortly.

@mohamed82008, I tried with Nonconvex v1.0.3 but I am still getting this issue of Zygote crashing. Could you elaborate on how I should change these calls to use your symbolic diff approach?

```
model = Nonconvex.DictModel()
...
x0 = NonconvexCore.getinit(model)
r = Nonconvex.optimize(model, IpoptAlg(), x0)
```

First you will need Nonconvex v1.0.4. Then you can do:

```
first_order = true
options = IpoptOptions(; first_order)
sp_model = sparsify(model, hessian = !first_order)
r = Nonconvex.optimize(sp_model, IpoptAlg(), x0; options)
```

`sparsify` will use Symbolics.jl to get the sparsity pattern of the gradients/Jacobians/Hessians and then use `SparseDiffTools` + `ForwardDiff` for the differentiation. Alternatively, you can use:

```
first_order = true
options = IpoptOptions(; first_order)
sym_model = symbolify(model, hessian = !first_order, sparse = true, simplify = true)
r = Nonconvex.optimize(sym_model, IpoptAlg(), x0; options)
```

which will use a `Symbolics`-only approach, where the gradients/Jacobians/Hessians are obtained using Symbolics.jl. The `sparse` kwarg tells Symbolics to return a sparse matrix/vector for the Jacobian/Hessian/gradient.

Note that these are new features so please report any issues in the repo.

As promised, below is a table of updated results using the latest versions of all packages. Note that GalacticOptim has been renamed Optimization.

The substantive changes follow the discussion above. In particular, NonConvex and Optimization have been updated with new AD systems based on Symbolics and now support sparse Jacobian and Hessian computations. For NonConvex, this allowed me to finish building the NonConvex OPF model and confirm that the model is correct on small networks. For Optimization, the new sparse AD system based on ModelingToolkit enabled it to solve models with 73 network buses (800+ variables, 900+ constraints), a 2x increase in scale over what was possible with the previous dense ForwardDiff approach.

Great thanks to @mohamed82008 @Vaibhavdixit02 @odow for their help in debugging issues as we worked to update the rosetta-opf models.

For general awareness:

- This Zygote issue has thus far blocked using Zygote for AD on these models.
- These upstream issues [1][2] appear to be the root cause blocking scaling of the ModelingToolkit approach to larger problems. It looks like they will be resolved in future Julia versions (testing was done on Julia v1.7.2).

Case | Vars | Cons | JuMP | NLPModels | NonConvex | Optim | Optimization |
---|---|---|---|---|---|---|---|
case3_lmbd | 24 | 28 | 1.22e-02 | 2.01e+01 | 6.19e+00 | N.D. | 1.77e+00 |
case5_pjm | 44 | 53 | 1.75e-02 | 1.51e-01 | 4.28e+00 | N.D. | 1.99e-01 |
case14_ieee | 118 | 169 | 6.54e-01 | 2.30e+01 | 8.77e+01 | INF. | 7.98e+00 |
case24_ieee_rts | 266 | 315 | 5.38e-01 | 4.53e+01 | 5.61e+02 | N.D. | 2.14e+01 |
case30_ieee | 236 | 348 | 6.16e-01 | 3.65e+01 | 5.27e+02 | N.D. | 2.43e+01 |
case30_as | 236 | 348 | 6.05e-01 | 3.05e+01 | 4.88e+02 | N.D. | 2.25e+01 |
case39_epri | 282 | 401 | 6.95e-02 | 6.00e+01 | 4.40e+02 | N.D. | 2.95e+01 |
case57_ieee | 448 | 675 | 6.45e-01 | 1.05e+02 | 1.26e+03 | N.D. | 8.19e+01 |
case60_c | 518 | 737 | 6.41e-01 | 2.19e+02 | 2.51e+03 | N.D. | 1.06e+02 |
case73_ieee_rts | 824 | 987 | 6.45e-01 | 5.97e+02 | 5.74e+03 | N.D. | 2.68e+02 |
case89_pegase | 1042 | 1649 | 9.08e-01 | 1.74e+03 | N.D. | N.D. | N.D. |
case118_ieee | 1088 | 1539 | 8.36e-01 | 1.74e+03 | N.D. | N.D. | N.D. |
case162_ieee_dtc | 1484 | 2313 | 8.72e-01 | 4.26e+03 | N.D. | N.D. | N.D. |
case179_goc | 1468 | 2200 | 9.69e-01 | 7.06e+03 | N.D. | N.D. | N.D. |
case200_activ | 1456 | 2116 | 7.73e-01 | 2.80e+03 | N.D. | N.D. | N.D. |
case240_pserc | 2558 | 3617 | 3.09e+00 | N.D. | N.D. | N.D. | N.D. |
case300_ieee | 2382 | 3478 | 1.13e+00 | N.D. | N.D. | N.D. | N.D. |
case500_goc | 4254 | 6097 | 1.69e+00 | N.D. | N.D. | N.D. | N.D. |
case588_sdet | 4110 | 5979 | 1.55e+00 | N.D. | N.D. | N.D. | N.D. |
case793_goc | 5432 | 7978 | 2.01e+00 | N.D. | N.D. | N.D. | N.D. |
case1354_pegase | 11192 | 16646 | 6.82e+00 | N.D. | N.D. | N.D. | N.D. |
case1888_rte | 14480 | 21494 | 2.82e+01 | N.D. | N.D. | N.D. | N.D. |
case1951_rte | 15018 | 22075 | 1.16e+01 | N.D. | N.D. | N.D. | N.D. |
case2000_goc | 19008 | 29432 | 1.03e+01 | N.D. | N.D. | N.D. | N.D. |
case2312_goc | 17128 | 25716 | 8.91e+00 | N.D. | N.D. | N.D. | N.D. |
case2383wp_k | 17004 | 25039 | 9.37e+00 | N.D. | N.D. | N.D. | N.D. |
case2736sp_k | 19088 | 28356 | 7.67e+00 | N.D. | N.D. | N.D. | N.D. |
case2737sop_k | 18988 | 28358 | 8.01e+00 | N.D. | N.D. | N.D. | N.D. |
case2742_goc | 24540 | 38196 | 3.50e+01 | N.D. | N.D. | N.D. | N.D. |
case2746wp_k | 19520 | 28446 | 8.71e+00 | N.D. | N.D. | N.D. | N.D. |
case2746wop_k | 19582 | 28642 | 7.97e+00 | N.D. | N.D. | N.D. | N.D. |
case2848_rte | 21822 | 32129 | 1.76e+01 | N.D. | N.D. | N.D. | N.D. |
case2853_sdet | 23028 | 33154 | 1.26e+01 | N.D. | N.D. | N.D. | N.D. |
case2868_rte | 22090 | 32393 | 2.06e+01 | N.D. | N.D. | N.D. | N.D. |
case2869_pegase | 25086 | 37813 | 1.46e+01 | N.D. | N.D. | N.D. | N.D. |
case3012wp_k | 21082 | 31029 | 1.32e+01 | N.D. | N.D. | N.D. | N.D. |
case3022_goc | 23238 | 34990 | 1.38e+01 | N.D. | N.D. | N.D. | N.D. |
case3120sp_k | 21608 | 32092 | 1.26e+01 | N.D. | N.D. | N.D. | N.D. |
case3375wp_k | 24350 | 35876 | 1.47e+01 | N.D. | N.D. | N.D. | N.D. |
case3970_goc | 35270 | 54428 | 2.32e+01 | INF. | N.D. | N.D. | N.D. |
case4020_goc | 36696 | 56957 | 2.89e+01 | N.D. | N.D. | N.D. | N.D. |
case4601_goc | 38814 | 59596 | 2.84e+01 | N.D. | N.D. | N.D. | N.D. |
case4619_goc | 42532 | 66289 | 2.94e+01 | N.D. | N.D. | N.D. | N.D. |
case4661_sdet | 34758 | 51302 | 2.48e+01 | INF. | N.D. | N.D. | N.D. |
case4837_goc | 41398 | 64030 | 2.62e+01 | N.D. | N.D. | N.D. | N.D. |
case4917_goc | 37872 | 56917 | 2.78e+01 | N.D. | N.D. | N.D. | N.D. |
case6468_rte | 49734 | 75937 | 8.18e+01 | N.D. | N.D. | N.D. | N.D. |
case6470_rte | 50482 | 75976 | 4.56e+01 | N.D. | N.D. | N.D. | N.D. |
case6495_rte | 50426 | 76124 | 8.23e+01 | N.D. | N.D. | N.D. | N.D. |
case6515_rte | 50546 | 76290 | 6.09e+01 | N.D. | N.D. | N.D. | N.D. |
case8387_pegase | 78748 | 118702 | 6.51e+01 | N.D. | N.D. | N.D. | N.D. |
case9241_pegase | 85568 | 130826 | 6.62e+01 | N.D. | N.D. | N.D. | N.D. |
case9591_goc | 83572 | 130588 | 8.49e+01 | N.D. | N.D. | N.D. | N.D. |
case10000_goc | 76804 | 112352 | 5.75e+01 | N.D. | N.D. | N.D. | N.D. |
case10480_goc | 96750 | 150874 | 8.37e+01 | N.D. | N.D. | N.D. | N.D. |
case13659_pegase | 117370 | 170588 | 9.50e+01 | N.D. | N.D. | N.D. | N.D. |
case19402_goc | 179562 | 281733 | 1.93e+02 | N.D. | N.D. | N.D. | N.D. |
case24464_goc | 203374 | 313641 | 1.67e+02 | N.D. | N.D. | N.D. | N.D. |
case30000_goc | 208624 | 307752 | 3.54e+02 | N.D. | N.D. | N.D. | N.D. |

Package Details

```
[54578032] ADNLPModels v0.3.3
[b6b21f68] Ipopt v1.0.2
[4076af6c] JuMP v1.1.1
[961ee093] ModelingToolkit v8.14.0
[f4238b75] NLPModelsIpopt v0.9.1
[01bcebdf] Nonconvex v2.0.3
[bf347577] NonconvexIpopt v0.4.1
[429524aa] Optim v1.7.0
[7f7a1694] Optimization v3.6.1
[fd9f6733] OptimizationMOI v0.1.1
[c36e90e8] PowerModels v0.19.5
```

Ipopt was run with the default linear solver, MUMPS 5.4.1.

These upstream issues [1][2] appear to be the root cause blocking scaling of the ModelingToolkit approach to larger problems. It looks like they will be resolved in future Julia versions (testing was done on Julia v1.7.2).

Are you including the compilation time here? Yeah, the RGF issues are one thing, and the other is that we need to release SymbolicIR.

MTK also doesn't do `structural_simplify` on OptimizationProblem right now, which can be a major win because then it solves a much smaller problem. @YingboMa was working on that.

Are you including the compilation time here?

Good question. In these runtime numbers I try to avoid capturing artifacts from one-time overheads, such as JIT time. In my experience, with sufficient effort, these can be avoided in production deployment settings.

In all of these implementations, before I start collecting data, `solve_opf("$(@__DIR__)/data/pglib_opf_case5_pjm.m")` is executed first to compile the relevant code execution paths. Then `solve_opf(...)` is run again on the file where runtime data is being collected.
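The warm-up-then-measure pattern described here can be sketched generically (the `warmed_runtime` helper is illustrative, not the actual collection script):

```
# Run the solver once on a small case to trigger JIT compilation
# (result discarded), then time the case of interest so one-time
# compilation overhead is excluded from the measurement.
function warmed_runtime(solve_fn, warmup_input, target_input)
    solve_fn(warmup_input)
    return @elapsed solve_fn(target_input)
end
```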

You can have a look at exactly how the data collection is done in this script. It is not immediately obvious that pre-compilation is done there, but if you look into the specific implementation files you will find it; for example, rosetta-opf/optimization.jl at cli · lanl-ansi/rosetta-opf · GitHub

MTK also doesn't do `structural_simplify` on OptimizationProblem right now, which can be a major win because then it solves a much smaller problem. @YingboMa was working on that.

Cool, looking forward to seeing the next iteration.

Thanks for this excellent source of test cases. For Nonconvex, I would be interested in the results using ReverseDiff, as well as SparseDiffTools instead of Symbolics.

See Using other AD packages · Nonconvex.jl and Sparse Jacobian or Hessian · Nonconvex.jl for the relevant docs sections.

@mohamed82008, if you could show me how these lines should be changed in nonconvex.jl to test these approaches, then I can do testing on some example network datasets.

these lines should be changed in nonconvex.jl

For SparseDiffTools, just drop the `symbolic = true`:

```
result = Nonconvex.optimize(
    model,
    IpoptAlg(),
    NonconvexCore.getinit(model);
    options = IpoptOptions(; first_order=false, sparse=true),
)
```

For ReverseDiff, you can do:

```
using AbstractDifferentiation, ReverseDiff

rd_model = abstractdiffy(model, AD.ReverseDiffBackend())
result = Nonconvex.optimize(
    rd_model,
    IpoptAlg(),
    NonconvexCore.getinit(model);
    options = IpoptOptions(; first_order=false),
)
```


I don't make use of tape compilation yet in Nonconvex, but ReverseDiff is often quite fast even without compiling a tape for each function.
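For context, tape compilation in ReverseDiff looks roughly like this (a standalone sketch of ReverseDiff's own API, not of Nonconvex internals):

```
using ReverseDiff

f(x) = sum(abs2, x) + prod(x)
x = rand(3)

# Record the operations of f once into a tape, then compile it so that
# repeated gradient evaluations skip the recording step entirely.
tape  = ReverseDiff.GradientTape(f, x)
ctape = ReverseDiff.compile(tape)

g = similar(x)
ReverseDiff.gradient!(g, ctape, x)  # reuses the compiled tape
```

One caveat: a compiled tape assumes the control flow of `f` does not depend on the input values.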

Also, how can I run all these problems for Nonconvex locally?