Survey of Non-Linear Optimization Modeling Layers in Julia

I am doing a survey of the NLP modeling layers in Julia to see which might be applicable to the kinds of problems that I regularly solve. I have the following basic requirements for the modeling layer (a small toy example of the kind of problem I mean follows the list):

  • Support for non-convex objective functions
  • Support for a system of non-convex equality and inequality constraint functions (the equality constraints usually cannot be expressed explicitly as a manifold)
  • Support for polynomial and transcendental functions (e.g. x^2*y^3, sin(x))
  • Some kind of automatic differentiation system (so I don’t have to write derivative oracles by hand)
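For concreteness, here is a toy problem with all of these ingredients (made up purely for illustration, not from a real application), written with JuMP's nonlinear interface, which handles the derivatives automatically:

using JuMP, Ipopt

model = Model(Ipopt.Optimizer)
@variable(model, x, start = 1.0)
@variable(model, y, start = 2.0)

# non-convex objective mixing polynomial and transcendental terms
@NLobjective(model, Min, x^2 * y^3 + sin(x))

# non-convex equality constraint (no explicit manifold parameterization)
@NLconstraint(model, x^2 - y^2 + sin(x * y) == 1.0)

# non-convex inequality constraint
@NLconstraint(model, x * y + cos(x) <= 2.0)

optimize!(model)
value(x), value(y)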

Here is what I found so far (in alphabetical order),

If I have made an error in the characterization of any of the listed packages, corrections are greatly appreciated.

So far, it seems that ADNLPModels, GalacticOptim, JuMP, Nonconvex and Optim are the packages that currently support all of my requirements. However, if you know of some other Julia package that might be able to solve such problems, I would very much like to hear about it.

Note: original post has been revised based on info from this discussion.


You’re missing GalacticOptim.jl, which probably has the most coverage via:

https://galacticoptim.sciml.ai/dev/


And wide AD support:

Optim has IPNewton, which supports NL constraints: Optim.jl
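For example, a minimal sketch following the constrained-optimization section of the Optim docs (box constraints only here; nonlinear function constraints go through TwiceDifferentiableConstraints as well):

using Optim

f(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2  # Rosenbrock
x0 = [0.0, 0.0]

# objective gradient and Hessian via ForwardDiff
df = TwiceDifferentiable(f, x0; autodiff = :forward)

# simple box constraints -0.5 <= x_i <= 0.5
dfc = TwiceDifferentiableConstraints([-0.5, -0.5], [0.5, 0.5])

res = optimize(df, dfc, x0, IPNewton())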


What does the “global constrained” column mean? And why does MOI not support that?


We haven’t wrapped it yet.

I guess I don’t understand the distinction between global and local. It’s up to the solver to figure that out, not MathOptInterface.

Which algorithms does Optim rely on for global constrained and unconstrained optimization?

Yes, MOI has both algorithms, which is why it has both boxes checked. Flux does not, so it only has one of them checked.

https://galacticoptim.sciml.ai/dev/optimization_packages/optim/#Global-Optimizer

I think he means the very last column, which is not checked for MOI.

Thanks! Now I see the difference in terminology. In MOI/JuMP, the difference between global and local is the capacity of a solver to certify (of course, there might be numerical issues) that the solution is a global optimum or only a local optimum. For GalacticOptim, it seems more about how the feasible space is explored.

In GalacticOptim, MOI is a package that is wrapped, and that wrapper does not support the constraints right now so it’s not checked.

@mohamed82008, thanks for the tip about Optim. In this example I did not see how to combine what is shown here with an AD approach for the Jacobian and Hessian. Do you know of an example doing this?

@ChrisRackauckas, I will give GalacticOptim a try going to Ipopt through the MOI backend, unless you suggest a different one. What AD system do you recommend for sparse, large-scale problems? AutoModelingToolkit sounds like the best choice from the docs you posted, correct?


You would probably need to do your own AD when defining gradient/jacobian/hessian functions. @pkofod can correct me if I am wrong. (sorry for the ping)
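Concretely, “your own AD” could look something like the sketch below. This is only a sketch: it assumes the TwiceDifferentiableConstraints signature from the constrained example in the Optim docs, with ForwardDiff used by hand for the constraint Jacobian and the constraint part of the Hessian of the Lagrangian.

using Optim, ForwardDiff

f(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2
x0 = [0.0, 0.0]
df = TwiceDifferentiable(f, x0; autodiff = :forward)  # objective derivatives via AD

# one nonlinear inequality constraint: x[1]^2 + x[2]^2 <= 0.25
c(x) = [x[1]^2 + x[2]^2]
con_c!(out, x) = (out .= c(x); out)
con_jac!(J, x) = (J .= ForwardDiff.jacobian(c, x); J)  # "your own AD" for the constraint Jacobian
function con_hess!(h, x, λ)                            # constraint term of the Lagrangian Hessian
    h .+= λ[1] .* ForwardDiff.hessian(z -> c(z)[1], x)
    h
end

lx, ux = Float64[], Float64[]  # no box constraints on x
lc, uc = [-Inf], [0.25]
dfc = TwiceDifferentiableConstraints(con_c!, con_jac!, con_hess!, lx, ux, lc, uc)

res = optimize(df, dfc, x0, IPNewton())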


OK, an AD system is a hard requirement for me. I will wait to hear from @pkofod to confirm the status, but I am updating the original post to reflect the new info.

@ChrisRackauckas, do you have an example of how to use GalacticOptim with constraint functions? I reviewed these docs,

https://galacticoptim.sciml.ai/stable/tutorials/intro/
https://galacticoptim.sciml.ai/stable/API/optimization_problem/
https://galacticoptim.sciml.ai/stable/API/optimization_function/

There seems to be a hint that constraints are specified through the cons argument to OptimizationFunction, but I could not find a specification of what this argument should be.

Also while reviewing these docs,

https://galacticoptim.sciml.ai/stable/API/optimization_function/#Defining-Optimization-Functions-Via-AD

I noticed that AutoForwardDiff is the only AD system that says it supports constraints, so it seems like I should use that one instead of AutoModelingToolkit?

Not necessarily. Each has its own advantages. MTK will scalarize the equations but will generate really fast code. It won’t scale in compile time the best, but for scalar-heavy code that is big and sparse it’s really good, if it compiles in time. Otherwise ReverseDiff with tape compilation is good with similar properties, but it can segfault if the tape gets too long. If the code is heavy in linear algebra, Zygote is a good bet. Tracker is kind of an in-between Zygote-ish thing that can work in some cases where Zygote doesn’t. Forward-mode doesn’t scale as well.
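As a standalone illustration of the tape-compilation idea (plain ReverseDiff usage, not tied to GalacticOptim; the objective here is made up):

using ReverseDiff

# scalar-heavy objective; note that control flow must not depend on the input
# values, otherwise a compiled tape can silently give wrong results
f(x) = sum(abs2, x) + sin(x[1]) * x[2]^3

x = rand(100)
tape  = ReverseDiff.GradientTape(f, x)   # record the operations once
ctape = ReverseDiff.compile(tape)        # compile the tape for reuse

g = similar(x)
ReverseDiff.gradient!(g, ctape, x)       # fast repeated gradient evaluations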

Yes, we should probably add a cons diff overload to MTK. It’s only like 10 lines.

@ChrisRackauckas, I found an example in the tests here, https://github.com/SciML/GalacticOptim.jl/blob/master/test/rosenbrock.jl#L30

Not sure if I understand the question, but here is a simple example of AD using the Optim ecosystem:

using NLsolve
import NLsolve.NLSolversBase: OnceDifferentiable, TwiceDifferentiable

# Beale function:
B(x) = (1.5 - x[1] + x[1].*x[2]).^2 + (2.25 - x[1] + x[1].*x[2].^2).^2 + (2.625 - x[1] + x[1].*x[2].^3).^2
# Himmelblau function:
HM(x) = (x[1].^2 + x[2] - 11).^2 + (x[1]+ x[2].^2 - 7).^2
# Rastrigin function:
RS(x) = 10*2 + x[1].^2 + x[2].^2 - 10*cos(2*pi*x[1]) - 10*cos(2*pi*x[2]);
# Rosenbrock function:
R(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2

## Define the test function:
testfun = RS

## Auto-differentiation:
x0 = [10.0; 10.0] # initial point
# Without the autodiff keyword, TwiceDifferentiable defaults to finite differences;
# autodiff = :forward makes it use ForwardDiff for the gradient and Hessian.
dfn = TwiceDifferentiable(testfun, x0; autodiff = :forward);

# Find the zeros of the gradient and record the iterate states
# (dfn.df is the in-place gradient g!(G, x); the Jacobian of the gradient is
# approximated by finite differences since only the residual function is given):
base_solver_results = nlsolve(dfn.df, x0,
        method = :newton,
        show_trace = true, store_trace = true, extended_trace = true)

# The gradient of the test function can also be wrapped as a vector residual
# function (arguments: in-place residual f!, seed for x, cache for the residual):
x_init = zero(x0)
testDiffFun = OnceDifferentiable(dfn.df, x_init, x_init);
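For completeness, a sketch (using the variables defined above) of handing the wrapped residual to nlsolve. The name wrapped_results is arbitrary, and since no Jacobian or autodiff option was supplied to OnceDifferentiable, the Jacobian is approximated by finite differences internally:

# solve grad(testfun)(x) = 0 starting from x0 using the wrapped residual
wrapped_results = nlsolve(testDiffFun, x0, method = :newton, show_trace = true)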