Large-Scale Nonlinear Convex Optimization - Performance

Hi everybody, I am trying to solve a large-scale nonlinear convex problem with linear inequality constraints: 550 unknowns and 25,000 linear inequality constraints. I would like your recommendations on which solver is best suited for this problem, and whether there are any obvious performance flaws in my current formulation.

The Mathematical Problem is the following:

\min_{x_{i}\geq 0 , y_{j}} \sum_{i=1}^{I}x_{i}C_{i} + \sum_{j=1}^{J}a_{j}{y_{j}}^{1-\sigma}

\text{s.t.}\;\; y_{j} \leq \tau_{ij}\left(\frac{w_{i}}{\varepsilon_{i}}+x_{i}\right) \quad \forall i,j

with \sigma>1.

In the MWE below I am using JuMP with Ipopt. It seems to me that the y_{j}^{1-\sigma} terms prevent me from using Convex.jl or a more powerful commercial solver like Gurobi or Mosek. One option would be to linearize these terms and add extra constraints, but that would add J additional variables and constraints, and it is not clear to me that a convex solver handles this well. At this point the performance of my MWE is disappointing: I would need roughly a 100x speedup to make it feasible for my application.

Are there any other solvers (open-source or commercial) that you would recommend?

using JuMP, Ipopt, LinearAlgebra, Random, Distributions, BenchmarkTools

const Ι = 500
const J = 50
const σ = 5

#Create a vector of productivities

const z=exp.(rand(Normal(0,1),Ι))

const β=exp.(rand(Normal(0,1),J))./10

C = ones(Ι).*100000000


# Create a Matrix of Distances

function distmat(J,Ι)
    Distances = zeros(J,Ι)
    Random.seed!(7777)
    coordinates_market = 100*rand(J,2)
    coordinates_plant = 100*rand(Ι,2)
    for j = 1:J, l=1:Ι
        Distances[j,l] = sqrt((coordinates_market[j,1]-coordinates_plant[l,1])^2+(coordinates_market[j,2]-coordinates_plant[l,2])^2)
    end
    return 1 .+ Distances./100
end

const τ = distmat(J,Ι)

function solving_max_pi_constraints_dual(C,β,τ,z)
    (J,Ι) = size(τ)
    dual_profits = Model(Ipopt.Optimizer)
    #set_optimizer_attribute(dual_profits, "nlp_scaling_method", "none")
    #set_silent(dual_profits)
    a = ((σ-1)/σ)^(σ-1)/(σ).*β.^(σ)
    @variable(dual_profits, x[1:Ι] >= 0)
    @variable(dual_profits, y[1:J] >= 0)
    @expression(dual_profits, part1, x'*C)
    @NLexpression(dual_profits, part2, sum(a[j]*y[j]^(1-σ) for j=1:J))
    @NLobjective(dual_profits, Min, part1 + part2)

    for i=1:Ι, j=1:J
        @constraint(dual_profits, τ[j,i]/z[i] + τ[j,i]*x[i] - y[j] >= 0)
    end

    optimize!(dual_profits)

    return objective_value(dual_profits), value.(x), value.(y)
end

(sol,xsol,ysol) = @time solving_max_pi_constraints_dual(C,β,τ,z)

Total seconds in IPOPT                               = 6.624

EXIT: Optimal Solution Found.
  6.693756 seconds (9.70 M allocations: 235.620 MiB, 0.57% gc time)


Since the exponent is negative, y^(1-σ) is convex for y > 0 if I’m not mistaken. That doesn’t mean Convex.jl will necessarily accept that objective function though; it is kind of picky in my (very very limited) experience.
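That claim is easy to sanity-check: for \sigma > 1 the second derivative of y^{1-\sigma} is \sigma(\sigma-1)\,y^{-\sigma-1} > 0 on y > 0. A throwaway numerical midpoint-convexity check (not part of the model, just an illustration):

```julia
σ = 5
f(y) = y^(1 - σ)            # the objective term in question, here y^(-4)

# Midpoint convexity: f((a+b)/2) ≤ (f(a) + f(b))/2 must hold for convex f.
pts = [0.1, 0.5, 1.0, 2.0, 10.0]
is_midpoint_convex = all(f((a + b) / 2) <= (f(a) + f(b)) / 2 for a in pts, b in pts)
# is_midpoint_convex == true
```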

Yes, that is my understanding as well. But still, it is a very simple convex problem with linear constraints. Can you recommend a solver that might handle it efficiently?


The following assumes that you’ve already eliminated common performance issues (see performance tips) such as not timing in global scope, not timing compilation, etc…

Performance-wise, the best thing to do for convex problems (as yours) is to convert to conic form and pass it to a conic solver such as Mosek (commercial) or Hypatia (open-source).

There are mainly two ways to do that:

  • Manually write your problem in conic form
  • Formulate your problem with Convex.jl, which will reformulate it into conic form before passing it to a conic solver.

The only part of your problem that needs attention is that nonlinear term y^{1-\sigma} in the objective. This is of the form y^{p}, where p = 1 - \sigma <0, which can be represented with a power cone. As far as I can see, everything else is linear.

Power cones are supported in JuMP.

Summing up:

  1. Introduce a variable t in your model
  2. Add the constraint t \geq y^{1-\sigma} using a power cone, and substitute y^{1 - \sigma} with t in the objective
  3. Solve with a conic solver
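For reference, the epigraph constraint t \geq y^{1-\sigma} corresponds to the three-dimensional power cone \mathcal{P}(\alpha) = \{(x_1,x_2,x_3) : x_1^{\alpha} x_2^{1-\alpha} \geq |x_3|,\; x_1,x_2 \geq 0\} with \alpha = 1/\sigma and the point (t, y, 1): t^{1/\sigma} y^{(\sigma-1)/\sigma} \geq 1 is exactly t \geq y^{1-\sigma}. A plain-Julia check of that equivalence (no solver; the names here are illustrative):

```julia
σ = 5
α = 1 / σ

# Membership test for the 3-d power cone P(α):
# (x1, x2, x3) ∈ P(α)  ⟺  x1 ≥ 0, x2 ≥ 0, x1^α * x2^(1-α) ≥ |x3|
in_power_cone(x1, x2, x3) = x1 >= 0 && x2 >= 0 && x1^α * x2^(1 - α) >= abs(x3)

y = 2.0
t_above = 1.1 * y^(1 - σ)   # slightly above the boundary t = y^(1-σ) → feasible
t_below = 0.5 * y^(1 - σ)   # below the boundary → cut off by the cone

in_power_cone(t_above, y, 1.0)   # true
in_power_cone(t_below, y, 1.0)   # false
```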

Amazing! This is exactly what I was looking for. Thanks so much. I’ll code it and let you know the results.

If you can’t use Convex.jl, Ipopt is a good option. Your problem has a lot of inequality constraints, which are notably more difficult to handle than equality constraints. In that case, interior-point methods (such as Ipopt) are usually better suited than active-set methods.

Solving the problem with Mosek using the code below improves performance a lot. Still, if you see any potential performance improvements I would very much appreciate them, since reducing computation time is critical for me right now.

using JuMP, Ipopt, LinearAlgebra, Random, Distributions, BenchmarkTools, MosekTools

const Ι = 500
const J = 50
const σ = 5

#Create a vector of productivities
const z = exp.(rand(Normal(0,1),Ι))
const β = exp.(rand(Normal(0,1),J))./10
C = ones(Ι).*100000000

# Create a Matrix of Distances
function distmat(J,Ι)
    Distances = zeros(J,Ι)
    Random.seed!(7777)
    coordinates_market = 100*rand(J,2)
    coordinates_plant = 100*rand(Ι,2)
    for j = 1:J, l = 1:Ι
        Distances[j,l] = sqrt((coordinates_market[j,1]-coordinates_plant[l,1])^2+(coordinates_market[j,2]-coordinates_plant[l,2])^2)
    end
    return 1 .+ Distances./100
end

const τ = distmat(J,Ι)

function solving_max_pi_constraints_dual_mosek(C,β,τ,z)
    (J,Ι) = size(τ)
    dual_profits = Model(optimizer_with_attributes(Mosek.Optimizer, "QUIET" => false, "INTPNT_CO_TOL_DFEAS" => 1e-7))
    set_silent(dual_profits)
    a = ((σ-1)/σ)^(σ-1)/(σ).*β.^(σ)
    @variable(dual_profits, x[1:Ι] >= 0)
    @variable(dual_profits, y[1:J] >= 0)   # epigraph variables: y[j] >= t[j]^(1-σ)
    @variable(dual_profits, t[1:J] >= 0)   # plays the role of the original y
    @objective(dual_profits, Min, x'*C + a'*y)

    for i=1:Ι, j=1:J
        @constraint(dual_profits, τ[j,i]/z[i] + τ[j,i]*x[i] - t[j] >= 0)
    end

    for j = 1:J
        @constraint(dual_profits, [y[j], t[j], 1] in MOI.PowerCone(1/σ))
    end

    optimize!(dual_profits)

    return objective_value(dual_profits), value.(x), value.(t)
end

(sol1,xsol1,ysol1) = @btime solving_max_pi_constraints_dual(C,β,τ,z)
5.730 s (8908488 allocations: 220.48 MiB)

(sol2,xsol2,ysol2) = @btime solving_max_pi_constraints_dual_mosek(C,β,τ,z)
210.809 ms (1108880 allocations: 85.55 MiB)

So the improvement is huge with respect to Ipopt.

If the printed output after the optimization says the following, does this mean that there is not a big margin for improving performance?

Optimizer terminated. Time: 0.23

If most of the time is spent in Mosek (as it seems to be the case), there isn’t much room to improve the rest of the code, indeed. Thus, if the current approach is still not fast enough for you, you would have to look into alternative models or alternative algorithms.

The one thing I can see that might improve the speed here, is that your model seems to have very large coefficients. For instance, you set C =10^{8}. If you can reduce that value it might help Mosek converge a bit faster (fewer iterations).

You could try the substitution \tilde{x} = \sqrt{C} \cdot x, so that the objective term becomes \sqrt{C} \cdot \tilde{x} and the linear constraints involve the term \frac{1}{\sqrt{C}}\tilde{x}. Improvements, if any, wouldn’t be dramatic though.
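As a quick sanity check on that substitution (with a made-up value for x; C = 10^8 is the coefficient from the model): the objective value is unchanged, while the coefficient magnitude the solver sees drops from 10^8 to 10^4.

```julia
C = 1.0e8             # large objective coefficient from the model
x = 3.2e-7            # some value of the original variable (made up for illustration)

x_tilde = sqrt(C) * x # rescaled variable:  x̃ = √C · x

old_term = C * x              # original objective term, coefficient 1e8
new_term = sqrt(C) * x_tilde  # rescaled objective term, coefficient 1e4

old_term ≈ new_term   # true: the objective value is unchanged
sqrt(C) == 1.0e4      # true: the coefficient is now four orders of magnitude smaller
```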

Yes, I think I can improve the scaling of the magnitudes in the model.