NLopt - Same complex calculations in objective function and nonlinear constraints

Hey guys,
I need a hint on how to efficiently solve a nonlinear optimization problem with the NLopt package when my objective function contains some complex calculations and my nonlinear constraint contains the same complex calculations, something like the following:

using NLopt

function myfunc(x::Vector, grad::Vector)
    if length(grad) > 0
        # gradient calculation
    end
    Y = SomeComplexCalculations(x)
    return UseYForObjective(Y)
end

function myconstraint(x::Vector, grad::Vector)
    if length(grad) > 0
        # gradient calculation
    end
    Y = SomeComplexCalculations(x)
    return UseYForConstraint(Y)
end

Of course, I could do it as shown above, but that looks very inefficient to me.
I also thought about making Y a global variable, but I am not sure if that’s possible at all or whether I might get into trouble during the optimization process.
So I wonder if there is another way of passing data calculated by the objective function to the constraints.
Thanks for any suggestion or comment in advance!


Yes, it looks like it is the way to go.

Which one is the way to go, in your opinion?
The approach that I posted the code example for or the approach with global variables?

I am asking because I am unsure whether global variables would really work at all. I am not familiar with the internals of NLopt, but I see a potential problem if I call SomeComplexCalculations(x) only in myfunc: that would imply myconstraint is called with the same vector x until myfunc is called again. However, if myconstraint were called more than once with different x during the optimization process, and myfunc were only called after a suitable x fulfilling the condition had been found, I would get into trouble.

I am not an expert, but I think the way to go is to cache the result of the last calculation in two variables, x_cached and y_cached. Then, before performing the expensive computation, you check whether the vector x is the same as the x_cached for which you have already computed Y and stored it in y_cached.

using NLopt

function SomeComplexCalculations(x)
    println("complex calculation with $x")
    return x.^2
end


function myoptim()
    function myfunc(x::Vector, grad::Vector)
        if length(grad) > 0
            grad[1] = 0
            grad[2] = 0.5/sqrt(x[2])
        end
        if x_cached != x
            println("objfunc: not cached      $x")
            x_cached[:] = x
            y_cached[:] = SomeComplexCalculations(x)
        else
            println("objfunc: using cached $x")
        end
        return sqrt(x[2]) # toy objective; a real problem would use y_cached here
    end

    function myconstraint(x::Vector, grad::Vector, a, b)
        if length(grad) > 0
            grad[1] = 3a * (a*x[1] + b)^2
            grad[2] = -1
        end
        if x_cached != x
            println("constraint: not cached $x")
            x_cached[:] = x
            y_cached[:] = SomeComplexCalculations(x)
        else
            println("constraint: using cached $x")
        end
        return (a*x[1] + b)^3 - x[2] # likewise, a real constraint would use y_cached
    end

    opt = Opt(:LD_MMA, 2)
    x_cached = fill(NaN, 2) # NaN never compares equal, so the first call always recomputes
    y_cached = fill(NaN, 2)

    opt.lower_bounds = [-Inf, 0.]
    opt.xtol_rel = 1e-4

    opt.min_objective = myfunc
    inequality_constraint!(opt, (x,g) -> myconstraint(x,g,2,0), 1e-8)
    inequality_constraint!(opt, (x,g) -> myconstraint(x,g,-1,1), 1e-8)

    (minf,minx,ret) = optimize(opt, [1.234, 5.678])
    numevals = opt.numevals # the number of function evaluations
    println("got $minf at $minx after $numevals iterations (returned $ret)")
end

I was surprised to see that the objective function is computed first and then the constraints, but this may have to do with the choice of the algorithm.


That looks like a smart solution.
Thanks for your support!

IIRC, NLopt always evaluates the objective before the constraints.


In my ignorance, I assumed that one may want to check whether the solution is feasible before computing the objective value.

Even if the point is infeasible, the objective value is still important to determine the next optimization step.


The risk is sampling a point that does not belong to the objective function’s domain, e.g. a negative number when f(x) = \sqrt{x}.


NLopt guarantees that the function will never be evaluated outside the feasible set for box constraints, but not for arbitrary nonlinear constraints. (This is true for most nonlinear-programming algorithms that I’m aware of.) So if you have \sqrt{x}, you should use a box constraint x \ge 0 and you will be fine.

(For general constraints, this may not even be possible, e.g. you may have a pair of inequality constraints h(x) \le 0 and h(x) \ge 0, which means the feasible set is the nonlinear manifold h(x) = 0, and it is not practical to stay exactly on such a manifold in general.)


I think you are right, but I am wondering if some simple heuristic could remedy this. E.g., if the feasible set D is convex, x \in D, and y \notin D, one could find a z = \alpha x + (1 - \alpha) y \in D using bisection. (This is just a random idea; maybe it has been tried and discarded.)
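
A rough sketch of that bisection idea, assuming a hypothetical feasibility predicate isfeasible for the convex set D (this is just an illustration, not from any package):

# Given feasible x and infeasible y, find z = α x + (1 − α) y that is feasible
# and as close to y as the iteration budget allows. Convexity of D guarantees
# the bisection invariant holds.
function bisect_to_feasible(isfeasible, x, y; maxiter = 50)
    z(α) = α .* x .+ (1 - α) .* y
    α_feas, α_infeas = 1.0, 0.0   # z(1) = x is feasible, z(0) = y is not
    for _ in 1:maxiter
        m = (α_feas + α_infeas) / 2
        if isfeasible(z(m))
            α_feas = m            # feasible: move closer to y
        else
            α_infeas = m          # infeasible: back off toward x
        end
    end
    return z(α_feas)              # last point known to be feasible
end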

One could certainly devise methods for this. Assuming that the feasible set has a nonempty interior (no equality constraints) and one starts with a feasible point, you could simply shrink the size of the trust region (or whatever method is used to limit step sizes) until the next step is feasible.

However, this may come at a price in convergence rate, since information from function evaluations outside the feasible set may be used by an algorithm to accelerate convergence, e.g. the CCSA algorithm uses this information. Moreover, other algorithms like SLSQP don’t assume that the feasible set has a nonempty interior. (It would be pretty easy to add this as a flag for CCSA, however.) More radical modifications might be needed for penalty methods such as the augmented-Lagrangian approach.

Typically, however, if you have a constraint that must be satisfied in order to evaluate your objective, I would suggest that you impose it as a box constraint or build it into your parameterization of the problem (e.g. if you have \sqrt{x}, parameterize x as y^2 so that it can never be negative). This gives you maximum flexibility in your choice of algorithms.
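
For concreteness, here is a minimal sketch of that y^2 parameterization in NLopt; the toy objective \sqrt{x} + (x - 1)^2 is my own invention, not from the thread:

using NLopt

# Minimize sqrt(x) + (x - 1)^2, which requires x ≥ 0.
# Substituting x = y^2 makes the objective well defined for every y.
function myfunc_y(y::Vector, grad::Vector)
    x = y[1]^2
    if length(grad) > 0
        # chain rule: d/dy f(y^2) = f'(x) * 2y
        # (start away from y = 0, where the gradient blows up)
        fprime = 0.5 / sqrt(x) + 2 * (x - 1)
        grad[1] = fprime * 2 * y[1]
    end
    return sqrt(x) + (x - 1)^2
end

opt = Opt(:LD_MMA, 1)
opt.xtol_rel = 1e-6
opt.min_objective = myfunc_y
(minf, miny, ret) = optimize(opt, [1.0])
println("x* = $(miny[1]^2), f* = $minf ($ret)")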


I am not sure how this generalizes when, for instance, the objective value is f(\sqrt{x}) + g(\sqrt{y}) + h(\sqrt{x-y}), where the domain is the triangle with vertices (0,0), (1,0), (0,1). A change of variables does not help, I guess.

EDIT: The last term should have been h(\sqrt{1-x-y}) to form a triangle…

For your example, I think the domain is x \ge y \ge 0, which can be transformed to (y, z) \ge 0 with x = y + z.
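
In code form, that substitution for the pre-edit example might look like the following (f, g, and h are the hypothetical functions from the post above; gradients omitted, so a derivative-free algorithm is assumed):

using NLopt

# x = y + z turns the domain x ≥ y ≥ 0 into the box y ≥ 0, z ≥ 0:
# √x = √(y + z) and √(x − y) = √z are then defined everywhere in the box.
function objective_yz(u::Vector, grad::Vector)
    y, z = u[1], u[2]
    return f(sqrt(y + z)) + g(sqrt(y)) + h(sqrt(z))
end

opt = Opt(:LN_COBYLA, 2)        # derivative-free, so grad is never filled
opt.lower_bounds = [0.0, 0.0]   # box constraints replace x ≥ y ≥ 0
opt.min_objective = objective_yz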

But yes, simplexes (and polyhedra in general) are tricky to transform to a box because you get a singularity at one or more vertices. Sometimes you can exclude these vertices (e.g. if the function is continuous, just evaluate it at all vertices and arrange the transformation so that the singularities fall on the vertices which cannot be optimal).

Admittedly, these transformations do look a bit tedious and it would be nice to have a method that is clean and pre-packaged and handles everything automatically, but since there isn’t one, it is generally worthwhile to think about transformations.


Hi,

I’m reviving the topic because I was solving the same problem (how to share a complex calculation between NLopt’s objective and constraint functions), but with an additional twist: using ForwardDiff to compute gradients automatically.

I’ve placed my detailed example in a Gist. Here is the summary. Please let me know what you think of this approach.

Also, while writing this post, I came across the open discussion NLopt.jl #128 Easier auto-differentiation support?. Since it seems this was never finalized, maybe my example can be used by other people who wish to use NLopt+ForwardDiff for their constrained black-box optimization problem.

(Final word: can a forum moderator fix the typo in the thread title?)

Example of NLopt+ForwardDiff to only call the “simulator” once

(simulator: the expensive function which jointly computes the objective and the constraints)

The objective and constraints of this example come from a toy problem, “Rosen-Suzuki”, which I found by chance in the SAS IML documentation (and I discovered this Matlab-like environment at the same time), and which itself cites Hock & Schittkowski (1981), “Test Examples for Nonlinear Programming Codes”. The optimum is -44 at [0, 1, 2, -1].

using NLopt, ForwardDiff, DiffResults

f(x) = x[1]^2 + x[2]^2 + 2x[3]^2 + x[4]^2 - 5x[1] - 5x[2] - 21x[3] + 7x[4]
g1(x) = -8 + x[1]^2 + x[2]^2 + x[3]^2 + x[4]^2 + x[1] - x[2] + x[3] - x[4]
g2(x) = -10 + x[1]^2 + 2x[2]^2 + x[3]^2 + 2x[4]^2 - x[1] - x[4]
g3(x) = -5 + 2x[1]^2 + x[2]^2 + x[3]^2 + 2x[1] - x[2] - x[4]

Now let’s pretend these are all computed in one master function simulate_vec(x):

simcount = 0 # global counter of simulator calls

function simulate_vec(x)
    global simcount += 1
    return [f(x), g1(x), g2(x), g3(x)]
end

Now the generator function for the objective and constraint functions which share a common cache:

function cached_objcons_single()
    x_prev = fill(NaN, 4)
    val_prev = fill(NaN, 4)
    Jac_prev = fill(NaN, 4, 4)
    result = DiffResults.JacobianResult(zeros(4), zeros(4)) # (output, input)
    
    function runsim(x)
        x_prev[:] = x
        result = ForwardDiff.jacobian!(result, simulate_vec, x);
        val_prev[:] = DiffResults.value(result)
        Jac_prev[:] = DiffResults.jacobian(result)
    end
    
    function myf(x::Vector, grad::Vector)
        if x != x_prev
            # update cache
            runsim(x)
        end
        if length(grad) > 0
            grad[:] = Jac_prev[1,:]
        end
        return val_prev[1]
    end

    function mygvec(result::Vector, x::Vector, grad::Matrix)
        if x != x_prev
            # update cache
            runsim(x)
        end
        if length(grad) > 0
            # Watch out for the transpose, due to the convention mismatch
            # between NLopt and ForwardDiff. Without it we unfortunately get
            # no error, but the convergence is, of course, poor.
            grad[:] = transpose(Jac_prev[2:4,:])
        end
        result[:] = val_prev[2:4]
    end
    
    return myf, mygvec
end

Then create the cached objective and constraints functions:

myf_scache, mygvec_scache = cached_objcons_single();

And finally build the NLopt Opt structure and solve:

# Solver choice: SLSQP, MMA
opt = Opt(:LD_SLSQP, 4)
#opt = Opt(:LD_MMA, 4)

opt.xtol_rel = 1e-6

# Set objective and constraints
opt.min_objective = myf_scache
inequality_constraint!(opt, mygvec_scache, [0., 0., 0.])

# Starting point
x0 = [2, 1., 1., 1.] # ⚠ SLSQP can get stuck when starting at [1,1,1,1]

# Solve and display
simcount = 0 # reset simulation call counter
(minf,minx,ret) = optimize(opt, x0)
numevals = opt.numevals # the number of function evaluations
println("f* = $minf after $numevals iterations (returned $ret)")
println("xopt = $minx")
println("simulator calls: $simcount ($(simcount/numevals) per iteration)")

As a result, it’s interesting to note that on this example there are 0.8 simulator calls per iteration for SLSQP (3.1 without caching) and 0.87 simulator calls per iteration for MMA (4.0 without caching). This is slightly below what I was expecting (1 call per iteration), and I suppose it relates to the way iterations are defined. Is it possible that some algorithms call the objective or the constraints several times at the same point?

The way I would do it is as follows:

  1. Change the objective to a constraint in the formulation, e.g. min f(x) becomes min c subject to f(x) <= c (see the sketch after this list).
  2. Use https://github.com/mohamed82008/Nonconvex.jl to define a “block function” that returns a vector for all the constraints simultaneously.
  3. You can additionally tell Nonconvex to use ForwardDiff to get the Jacobian of this function by defining an adjoint rule. See the docs.
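
To illustrate step 1, here is a minimal sketch of the epigraph reformulation in plain NLopt, with a hypothetical smooth objective of my own (Nonconvex’s actual API differs; see its docs):

using NLopt

# min f(x) becomes min c subject to f(x) - c ≤ 0, with the extra variable c
# appended to the decision vector u = [x; c].
n = 2
f(x) = (x[1] - 1)^2 + (x[2] - 2)^2
∇f(x) = [2 * (x[1] - 1), 2 * (x[2] - 2)]

opt = Opt(:LD_SLSQP, n + 1)
opt.xtol_rel = 1e-8

# The objective is now linear: just c, the last variable.
opt.min_objective = function (u, grad)
    if length(grad) > 0
        grad .= 0
        grad[end] = 1
    end
    return u[end]
end

# The old objective reappears as the constraint f(x) - c ≤ 0.
inequality_constraint!(opt, function (u, grad)
    x = u[1:n]
    if length(grad) > 0
        grad[1:n] = ∇f(x)
        grad[end] = -1
    end
    return f(x) - u[end]
end, 1e-8)

(minf, minu, ret) = optimize(opt, [0.0, 0.0, 10.0])
println("f* = $minf at x = $(minu[1:n]) ($ret)")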

Thanks for the feedback! I had seen your post announcing Nonconvex (and the interesting subsequent discussion), but I haven’t taken the time to dig into it yet. Around the same time, somebody else pointed me to Julia Smooth Optimizers: too many things to read!

I had not thought about moving the objective into a constraint. This indeed removes the need for caching, since NLopt supports vector constraints. However, I suspect this requires further investigation to see how each algorithm reacts. This is a standard trick in linear or conic programming (e.g. to transform a QP into an SOCP, if I remember correctly), but I’m not sure that all NLP solvers would react well to this apparent absence of an objective.

About Nonconvex, I was about to say that for my use case I need ForwardDiff (*), but I see deep in the docs that using ForwardDiff instead of Zygote is possible. Perhaps you can adapt your Getting Started (in the item “The use of Zygote.jl for automatic differentiation (AD) of the objective and constraint functions. Specifying analytic gradients is also possible…”) to reflect this. This also reminds me that I have to investigate ChainRules as well…

(*) Or, more accurately: I’m pretty happy with ForwardDiff, very unsatisfied with ReverseDiff, and I’ve yet to investigate more AD packages like Zygote. I have fewer than ten inputs and outputs.


That’s fair. For MMA, moving the objective to the constraints won’t make a significant difference, since the objective and constraints are handled the same way. According to the docs, SLSQP uses a linear approximation for the constraints and a quadratic one for the objective, so it will behave differently. Ipopt (through Nonconvex) shouldn’t be significantly different in terms of performance, since it solves the optimality conditions as a nonlinear system.
