Optim.jl also performs AD on the constraints (via ForwardDiff.jl). If I understand correctly, support for nonlinear constraints in GalacticOptim.jl is done via NLSolversBase.jl (a direct dependency of Optim).
Aaaah that's nice. Is that based on GitHub - JuliaDiff/AbstractDifferentiation.jl: An abstract interface for automatic differentiation, or is it independent of GalacticOptim anyway? I guess something like this makes sense in general.
I think among the packages that tick all the boxes, the differences will mostly come down to the philosophy of each package. Let me explain the main philosophy behind Nonconvex.jl. In Nonconvex.jl, I am generally committed to the following:
- Vectors, arrays, structs, tuples, named tuples, and dictionaries are all fair game as decision variables. So you can write your objective and constraint functions to take these data structures as inputs. Flattening and unflattening happens behind the scenes in a Zygote-compatible way.
- Zygote is used as the glue and default AD package, but other AD packages can be used via https://github.com/JuliaNonconvex/NonconvexUtils.jl.
- No attempt to unify options between solvers is made. Diversity in solvers' options and option names is celebrated.
- Special function traits are passed in as flags, which give the solver package hints to handle some functions differently. For example, I label expensive constraints as expensive so that surrogate-assisted solvers approximate only those constraints.
- Dead simple extensibility for implementing or wrapping new solvers. Most solver wrappers are 100-200 lines of mostly copied and pasted code.
- No attempt to shy away from heavy AD dependencies is made. For instance, I depend on Zygote in the core package to kill the argument there and then.
- Every solver wrapper is its own package, so you do not have to pull in GPL dependencies because of a GPL-licensed solver you are not interested in.
- Exotic optimisation problems such as non-convex semidefinite programming, bi-level optimisation, non-convex chance-constrained optimisation, etc. are some of my main personal interests, and I have a detailed plan for implementing them in a user-friendly way, although I need to find more time to spend on that.
I think the above points more or less summarise my approach with Nonconvex.jl so far. Due to the limited number of users so far, I have had quite a lot of flexibility to re-imagine non-convex optimisation software and implement the API I wanted to use. Of course it's far from perfect even by my own standards and it probably has some bugs, but I think it's a good package overall.
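For concreteness, here is a minimal sketch of the basic API, adapted from my recollection of the Nonconvex.jl documentation (solver loading via `Nonconvex.@load`, here NLopt's MMA). It shows the plain vector case; the struct/dict flattening described above happens behind the scenes:

```julia
using Nonconvex
Nonconvex.@load NLopt

# Objective and a parameterised inequality constraint g(x) <= 0.
f(x) = sqrt(x[2])
g(x, a, b) = (a * x[1] + b)^3 - x[2]

model = Model(f)
addvar!(model, [0.0, 0.0], [10.0, 10.0])          # bounds on the two variables
add_ineq_constraint!(model, x -> g(x, 2, 0))
add_ineq_constraint!(model, x -> g(x, -1, 1))

# Gradients come from Zygote by default; no derivative oracles are written.
alg = NLoptAlg(:LD_MMA)
r = optimize(model, alg, [1.0, 1.0], options = NLoptOptions())
r.minimum, r.minimizer
```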
ProximalAlgorithms.jl is missing, and in your taxonomy it would be (it supports constraints, but only the ones you can "easily" project onto, in the sense that there is a closed-form projection).
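As a rough sketch of what that looks like (names follow my recollection of the ProximalAlgorithms.jl quickstart and may differ by version); the indicator `IndBox` plays the role of an "easily projectable" constraint, since its prox is exactly the projection:

```julia
using ProximalAlgorithms, ProximalOperators

# Hypothetical data for a box-constrained least-squares problem.
A, b = randn(30, 10), randn(30)
f = LeastSquares(A, b)      # smooth term
g = IndBox(0.0, 1.0)        # indicator of [0, 1]^n: prox(g) is the projection

alg = ProximalAlgorithms.FastForwardBackward(maxit = 1_000, tol = 1e-6)
x, iters = alg(x0 = zeros(10), f = f, g = g)
```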
That is not referenced anywhere in the code: https://github.com/SciML/GalacticOptim.jl/blob/master/src/function/forwarddiff.jl. We only hook into the overloads Optim.jl wants in the Optim.jl calls, because of course you have to when calling Optim.jl.
GalacticOptim.jl came first. We need to refactor a bit and reduce the code size by using AbstractDifferentiation in the near future, but that shouldn't come with a major API change and would be mostly under the hood. GalacticOptim.jl and DiffEqSensitivity.jl are two (of many) reasons for the project.
Yes, you can use ForwardDiff for the gradient and Hessian of the objective, the Jacobian of the constraints, and the Hessian contributions of the constraints.
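For reference, a minimal sketch along the lines of the constrained Rosenbrock example in the GalacticOptim.jl tests; the `cons`/`lcons`/`ucons` keyword names and `AutoForwardDiff` reflect my reading of the test suite and may differ across versions:

```julia
using GalacticOptim, Optim

rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2
x0 = zeros(2)
p = [1.0, 100.0]

# One nonlinear constraint x[1]^2 + x[2]^2 <= 0.8; the gradient, objective
# Hessian, and constraint Jacobian are all generated with ForwardDiff.
cons = (x, p) -> [x[1]^2 + x[2]^2]
optf = OptimizationFunction(rosenbrock, GalacticOptim.AutoForwardDiff(); cons = cons)
prob = OptimizationProblem(optf, x0, p; lcons = [-Inf], ucons = [0.8])
sol = solve(prob, IPNewton())
```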
I believe another distinction could be made regarding how solvers handle large-scale problems.
For instance, do they require the computation and storage of large matrices, such as (possibly sparse) Hessians and Jacobians, and their factorizations? Or are they matrix-free / factorization-free?
Yes, that would be a good one to add. While we can do sparse AD on the Julia side with SparseDiffTools.jl, whether a solver accepts sparse matrices or a Krylov method is up to the solver itself.
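As a rough illustration of the sparse-AD side of this, a sketch with a hypothetical tridiagonal residual; `matrix_colors` and `forwarddiff_color_jacobian!` are the SparseDiffTools.jl entry points I am aware of:

```julia
using SparseDiffTools, SparseArrays

# In-place residual with a tridiagonal sparsity structure (made-up example).
function f!(y, x)
    n = length(x)
    y[1] = x[1]^2 - 2x[2]
    for i in 2:n-1
        y[i] = x[i-1] + x[i]^2 - 2x[i+1]
    end
    y[n] = x[n-1] + x[n]^2
    return nothing
end

n = 100
x = rand(n)
# Declare the sparsity pattern, colour it, and fill the Jacobian with a small
# number of ForwardDiff sweeps (one per colour) instead of n.
jac_pattern = spdiagm(-1 => ones(n - 1), 0 => ones(n), 1 => ones(n - 1))
colors = matrix_colors(jac_pattern)
J = copy(jac_pattern)
forwarddiff_color_jacobian!(J, f!, x, colorvec = colors)
```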
Thanks everyone for your input on this! I have confirmed that GalacticOptim and Optim also meet the basic requirements and have added them to the list!
At first it was not clear if GalacticOptim supported NL constraints, but an example I found in the package's tests proved sufficient for explaining the API.
At first it was not clear to me how Optim supported AD. Then I noticed that TwiceDifferentiable and TwiceDifferentiableConstraints use ForwardDiff under the hood when explicit oracle functions are not provided; the implementations can be found in NLSolversBase.
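For example, something along these lines should exercise that path (a sketch; the positional autodiff argument of TwiceDifferentiableConstraints follows my reading of NLSolversBase and may differ by version):

```julia
using Optim

f(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2
x0 = [0.2, 0.2]

# No gradient/Hessian oracles are given: ForwardDiff fills them in.
df = TwiceDifferentiable(f, x0; autodiff = :forward)

# One nonlinear constraint 0 <= x[1]^2 + x[2]^2 <= 0.5 plus box bounds;
# the constraint Jacobian and Hessian are likewise generated by ForwardDiff.
con!(c, x) = (c[1] = x[1]^2 + x[2]^2; c)
lx, ux = [-0.5, -0.5], [0.5, 0.5]
lc, uc = [0.0], [0.5]
dfc = TwiceDifferentiableConstraints(con!, lx, ux, lc, uc, :forward)

res = optimize(df, dfc, x0, IPNewton())
```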
Yeah we should add a tutorial on this. Thanks for the feedback.
Same for FrankWolfe.jl, which supports arbitrary once-differentiable objectives and structured constraints (any compact constraint set over which one can easily optimize a linear function). This includes all MOI-defined sets and sets with closed-form linear minimizers.
I think this one would be , correct? From a quick look I didn't see obvious examples of how to use AD.
AD is irrelevant for Frank-Wolfe because the user provides a gradient function; they can define it by hand or through AD, and exploit Jacobian sparsity or not.
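To illustrate, a rough sketch of that workflow (the `ProbabilitySimplexOracle` and `frank_wolfe` names are taken from the FrankWolfe.jl README; the gradient is built here with ForwardDiff purely as an example, a hand-written closed form works the same way):

```julia
using FrankWolfe, ForwardDiff, LinearAlgebra

# Minimise ||x - p||^2 over the probability simplex, a structured set exposed
# through a cheap linear minimisation oracle (LMO) rather than explicit constraints.
n = 100
p = rand(n); p ./= sum(p)
f(x) = norm(x - p)^2

# User-supplied gradient oracle; here produced with ForwardDiff.
grad!(storage, x) = ForwardDiff.gradient!(storage, f, x)

lmo = FrankWolfe.ProbabilitySimplexOracle(1.0)
x0 = FrankWolfe.compute_extreme_point(lmo, zeros(n))   # a feasible vertex

result = FrankWolfe.frank_wolfe(f, grad!, lmo, x0; max_iteration = 1_000)
```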
Fair point. I have noticed that a number of Julia packages for optimization and nonlinear equation solving expect the user to provide the derivative oracles (both the Jacobian and the Hessian in the optimization case). In this survey I tended to exclude these cases, as one of my basic requirements is an AD system. As mentioned in the original post here,
Nice comparison, thanks for the great overview and thanks for including Manopt!
I think the classification is correct, but two small comments on the constraints:
- We are working towards a RALM (Riemannian Augmented Lagrangian Method), so it will hopefully be available soon.
- The idea of optimising over a manifold is, to some extent, to avoid constraints in the first place, i.e. to rephrase the constrained optimisation problem as an unconstrained one on the manifold. This for sure does not yet make constraints obsolete (see 1); it's just that constraints on manifolds have only been investigated for the last (very) few years.
Just to give a small update here (it took a while longer than we thought): there are now constraints in Manopt.jl that can be used, see Exact Penalty Method · Manopt.jl and Augmented Lagrangian Method · Manopt.jl.
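For anyone curious, a hypothetical sketch of what a constrained call might look like (the `g`/`grad_g` keyword names and the `(M, p)` argument convention follow my reading of the Manopt.jl docs and may differ between versions):

```julia
using Manopt, Manifolds, LinearAlgebra

# Minimise a quadratic cost on the sphere subject to one inequality constraint,
# handled by the augmented Lagrangian method.
M = Sphere(2)
A = Symmetric(randn(3, 3))
f(M, p) = p' * A * p
grad_f(M, p) = project(M, p, 2 .* (A * p))        # Riemannian gradient via projection

g(M, p) = [p[1] - 0.5]                            # require p[1] <= 0.5
grad_g(M, p) = [project(M, p, [1.0, 0.0, 0.0])]   # one gradient per constraint

p0 = [0.0, 0.0, 1.0]                              # feasible starting point on the sphere
q = augmented_Lagrangian_method(M, f, grad_f, p0; g = g, grad_g = grad_g)
```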
Thanks for the update @kellertuer! Might you be interested in making a PR to GitHub - lanl-ansi/rosetta-opf: AC-OPF Implementations in Various NLP Modeling Frameworks showing how Manopt can be used to encode this AC-OPF problem? If so, I can benchmark it in that context.
This is great news! Did you compare your implementation to Percival.jl? AugLag has many variants with wildly different performance.
Chris hinted that Optimization.jl might have improved recently, so we could also revisit that: A plan for a more automated Optimization.jl · Issue #397 · SciML/Optimization.jl · GitHub
I will check, but this month is a little busy with exams, so it might take a while.
edit: Is there some overview of where to find the objective function and its constraints? Or is it a whole library of problems? Ah! In dark mode the black formulae do not have great contrast.
I am also not so sure how well a package designed for optimisation on manifolds, developed by just a very few people, would perform on Euclidean constrained optimisation - so maybe it is not so useful to invest a few days to try that? So far your benchmark seems quite large, but also not that easy to implement a test for?