Hi,
I’m reviving the topic because I was solving the same problem (how to share a complex calculation between NLopt’s objective and constraint functions), but with an additional twist: using ForwardDiff to compute the gradients automatically.
I’ve placed my detailed example in a Gist. Here is the summary; please let me know what you think of this approach.
Also, while writing this post, I came across the open discussion NLopt.jl #128 “Easier auto-differentiation support?”. Since it seems this was never finalized, maybe my example can be useful to other people who want to use NLopt + ForwardDiff for their constrained black-box optimization problem.
(Final word: can a forum moderator fix the typo in the thread title?)
Example of NLopt+ForwardDiff to only call the “simulator” once
(simulator: the expensive function which jointly computes the objective and the constraints)
The objective and constraints of this example come from the “Rosen-Suzuki” toy problem, which I found by chance in the SAS IML documentation (discovering that Matlab-like environment at the same time); the documentation itself cites Hock & Schittkowski (1981), “Test Examples for Nonlinear Programming Codes”. The optimum is -44 at [0, 1, 2, -1].
using NLopt, ForwardDiff, DiffResults
f(x) = x[1]^2 + x[2]^2 + 2x[3]^2 + x[4]^2 - 5x[1] - 5x[2] - 21x[3] + 7x[4]
g1(x) = -8 + x[1]^2 + x[2]^2 + x[3]^2 + x[4]^2 + x[1] - x[2] + x[3] - x[4]
g2(x) = -10 + x[1]^2 + 2x[2]^2 + x[3]^2 + 2x[4]^2 - x[1] - x[4]
g3(x) = -5 + 2x[1]^2 + x[2]^2 + x[3]^2 + 2x[1] - x[2] - x[4]
Now let’s pretend these are all computed in one master function simulate_vec(x):
simcount = 0 # global counter of simulator calls
function simulate_vec(x)
    global simcount += 1
    return [f(x), g1(x), g2(x), g3(x)]
end
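The point of bundling everything in simulate_vec is that a single ForwardDiff pass over it returns the full 4×4 Jacobian of all outputs at once: row 1 is ∇f and rows 2–4 are ∇g1, ∇g2, ∇g3. As a quick layout check of my own (not part of the optimization, and it does count as one simulator call):

# Quick check of the Jacobian layout: rows correspond to the outputs of
# simulate_vec, columns to the components of x.
J = ForwardDiff.jacobian(simulate_vec, [2.0, 1.0, 1.0, 1.0])
size(J) # (4, 4): J[1, :] is ∇f, J[2:4, :] are ∇g1, ∇g2, ∇g3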
Now, here is the generator function for the objective and constraint functions, which share a common cache:
function cached_objcons_single()
    x_prev = fill(NaN, 4)       # last input passed to the simulator
    val_prev = fill(NaN, 4)     # cached output values [f, g1, g2, g3]
    Jac_prev = fill(NaN, 4, 4)  # cached Jacobian of simulate_vec
    result = DiffResults.JacobianResult(zeros(4), zeros(4)) # (output, input)

    function runsim(x)
        x_prev[:] = x
        result = ForwardDiff.jacobian!(result, simulate_vec, x)
        val_prev[:] = DiffResults.value(result)
        Jac_prev[:] = DiffResults.jacobian(result)
    end

    function myf(x::Vector, grad::Vector)
        if x != x_prev
            # update cache
            runsim(x)
        end
        if length(grad) > 0
            grad[:] = Jac_prev[1, :]
        end
        return val_prev[1]
    end

    function mygvec(result::Vector, x::Vector, grad::Matrix)
        if x != x_prev
            # update cache
            runsim(x)
        end
        if length(grad) > 0
            # Watch out for the transpose, due to the convention mismatch between
            # NLopt (n × m, one column per constraint) and ForwardDiff (m × n,
            # one row per output). Without it we unfortunately get no error,
            # but the convergence is, of course, poor.
            grad[:] = transpose(Jac_prev[2:4, :])
        end
        result[:] = val_prev[2:4]
    end

    return myf, mygvec
end
Then create the cached objective and constraint functions:
myf_scache, mygvec_scache = cached_objcons_single();
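As a quick sanity check (my own test snippet, not required for the optimization), evaluating the objective and then the constraints at the same point should trigger only one simulator call:

# Sanity check of the shared cache (test names are mine): the second call at
# the same x should be served from the cache, so simcount ends at 1.
simcount = 0
xtest = [2.0, 1.0, 1.0, 1.0]
fgrad = zeros(4)
myf_scache(xtest, fgrad)          # runs the simulator once
gval = zeros(3)
ggrad = zeros(4, 3)               # NLopt convention: n × m
mygvec_scache(gval, xtest, ggrad) # same x: no new simulator call
@show simcount                    # expected: 1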
And finally build the NLopt Opt structure and solve:
# Solver choice: SLSQP, MMA
opt = Opt(:LD_SLSQP, 4)
#opt = Opt(:LD_MMA, 4)
opt.xtol_rel = 1e-6
# Set objective and constraints
opt.min_objective = myf_scache
inequality_constraint!(opt, mygvec_scache, [0., 0., 0.])
# Starting point
x0 = [2., 1., 1., 1.] # ⚠ SLSQP can get stuck when starting at [1,1,1,1]
# Solve and display
simcount = 0 # reset simulation call counter
(minf,minx,ret) = optimize(opt, x0)
numevals = opt.numevals # the number of function evaluations
println("f* = $minf after $numevals iterations (returned $ret)")
println("xopt = $minx")
println("simulator calls: $simcount ($(simcount/numevals) per iteration)")
As a result, it’s interesting to note that on this example there are 0.8 simulator calls per iteration with SLSQP (3.1 without caching) and 0.87 with MMA (4.0 without caching). This is slightly below what I was expecting (1 call per iteration), and I suppose it relates to how iterations are counted. Is it possible that some algorithms call the objective or the constraints several times at the same point?
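For reference, the “without caching” comparison corresponds to a variant along these lines (a sketch with my own names, not verbatim from the Gist), where the objective and the vector constraint each run ForwardDiff on simulate_vec independently, so each evaluation of either one costs a simulator call:

# Non-cached variant for comparison (sketch, names are mine): each wrapper
# runs its own ForwardDiff pass over simulate_vec.
function myf_nocache(x::Vector, grad::Vector)
    res = DiffResults.JacobianResult(zeros(4), zeros(4))
    res = ForwardDiff.jacobian!(res, simulate_vec, x)
    if length(grad) > 0
        grad[:] = DiffResults.jacobian(res)[1, :]
    end
    return DiffResults.value(res)[1]
end

function mygvec_nocache(result::Vector, x::Vector, grad::Matrix)
    res = DiffResults.JacobianResult(zeros(4), zeros(4))
    res = ForwardDiff.jacobian!(res, simulate_vec, x)
    if length(grad) > 0
        grad[:] = transpose(DiffResults.jacobian(res)[2:4, :])
    end
    result[:] = DiffResults.value(res)[2:4]
end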