Performance of computing the expectation of a slow function

Hi! I am trying to optimize this section of code. There are two functions: the first solves a linear programming problem over a convex set, and the second runs a Monte Carlo experiment over S trials of the first. I can see how the second function would be easy to parallelize; however, this snippet is part of a larger loop that I am already parallelizing. Any help would be very much appreciated!

using JuMP, LinearAlgebra, Random, Distributions, BenchmarkTools, MosekTools

const Ι = 250
const J = 50
const σ = 5
const α = 1.0
const S = 500

# Create a matrix of productivities (one column per simulation draw)

const z=exp.(rand(Normal(0,1),Ι,S))

const β=exp.(rand(Normal(0,1),J))./10

C = ones(Ι).*10


# Create a Matrix of Distances

function distmat(J,Ι)
    Distances = zeros(J,Ι)
    Random.seed!(7777)
    coordinates_market = 100*rand(J,2)
    coordinates_plant = 100*rand(Ι,2)
    for j = 1:J, l=1:Ι
        Distances[j,l] = sqrt((coordinates_market[j,1]-coordinates_plant[l,1])^2+(coordinates_market[j,2]-coordinates_plant[l,2])^2)
    end
    return 1 .+ Distances./100
end

const τ = distmat(J,Ι)

# Solve the conic program for one draw of the productivity vector z
function f1(C,β,τ,z)
    (J,Ι) = size(τ)
    opt = Model(optimizer_with_attributes(Mosek.Optimizer, "QUIET" => false, "INTPNT_CO_TOL_DFEAS" => 1e-7))
    set_silent(opt)
    a = ((σ-1)/σ)^(σ-1)/(σ).*β.^(σ)
    @variable(opt, x[1:Ι] >= 0)
    @variable(opt, y[1:J] >= 0)
    @variable(opt, t[1:J] >= 0)
    @objective(opt, Min, x'*C .+ a'*y)

    for i = 1:Ι, j = 1:J
        @constraint(opt, τ[j,i]/z[i] + τ[j,i]*x[i] - t[j] >= 0)
    end

    for j = 1 : J
        @constraint(opt, [y[j],t[j],1] in MOI.PowerCone(1/σ))
    end    

    optimize!(opt)

    return objective_value(opt), value.(x), value.(t)

end 

# Monte Carlo: average the objective value and the x-solution of f1 over the S columns of z
function f2(Cap,β,τ,z)
    (J,Ι) = size(τ)
    Simul = size(z)[2]
    Eπ = 0.0 
    Eν = zeros(Ι)
    for s = 1 : Simul
        temp = f1(Cap,β,τ,z[:,s])./Simul
        Eπ += temp[1]
        Eν += temp[2]
    end  
    func = Eπ .- α*sum(Cap) 
    foc = Eν .- α
    
    return func, foc

end    

Test_Vec = 100 .*rand(Ι)

@btime f1(Test_Vec,β,τ,z)

@btime f2(Test_Vec,β,τ,z)


Am I right in saying that all the LPs share the same feasible set but have different objective functions? In that case, you could use a point that you already know to be feasible as the initial point of each LP (it may be costly for the solver to determine a feasible point on its own).
I would solve one LP at the beginning, figure out a feasible point (for example the solution of that LP), then solve all the other LPs in parallel starting from this feasible point.
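
For example, in JuMP you can pass starting values with set_start_value. A minimal, untested sketch of what I mean (whether the conic solver actually exploits the primal start is solver-dependent):

# Untested sketch of the warm-start idea: solve one instance first,
# then reuse its solution as the starting point for the other instances.
obj0, x0, t0 = f1(Test_Vec, β, τ, z[:, 1])

# ...inside the model for another draw, before calling optimize!
# (x and t are the JuMP variables of that model):
set_start_value.(x, x0)
set_start_value.(t, t0)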

Hi @cvanaret, actually what varies across simulations s is not the objective function but the first set of constraints, i.e., the z[i] in the expression below.

for i = 1:Ι, j = 1:J
    @constraint(opt, τ[j,i]/z[i] + τ[j,i]*x[i] - t[j] >= 0)
end

So I think I cannot apply your suggestion, unless I pass the solution for the minimum of z in each dimension, which might be worth trying.

OK, sorry for my mistake. In this case, I would apply my suggestion to the dual problem, a (usually equivalent) LP in which the variables and constraints are exchanged. That way, the feasible set is fixed and you can exploit warm starting (starting from a known initial point).
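
If you want to try that route, Dualization.jl can form the dual automatically, so you don't have to derive it by hand. A rough, untested sketch (dual_optimizer wraps the solver so that Mosek receives the dual problem):

using Dualization   # extra package, not in the original code

# Build the primal model exactly as in f1, but let Mosek solve its dual:
opt = Model(dual_optimizer(Mosek.Optimizer))
set_silent(opt)
# ...same variables, objective, and constraints as in f1...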

Another thing that I see is that this loop is responsible for about 40% of the computation time. Is there any way to speed this up?

for i = 1:Ι, j = 1:J
    @constraint(opt, τ[j,i]/z[i] + τ[j,i]*x[i] - t[j] >= 0)
end

Any ideas, @odow?

I haven’t tested this, but:

Add a new variable:

@variable(opt, z_inv[1:Ι])
for i = 1:Ι, j = 1:J
    @constraint(opt, τ[j,i] * z_inv[i] + τ[j,i]*x[i] - t[j] >= 0)
end

And then instead of constructing a new model every time, just fix the value of z_inv:

for i in 1:Ι
    fix(z_inv[i], 1 / z[i]; force = true)
end

@odow thanks so much! It takes about the same time. If you have any other ideas I would really appreciate it.

Are there any other suggestions on improving the performance of this loop of constraints?

for i = 1:Ι, j = 1:J
    @constraint(opt, τ[j,i]/z[i] + τ[j,i]*x[i] - t[j] >= 0)
end

I think what odow suggested is that you create one JuMP model that you re-use in the inner loop of your f2. By doing this, you avoid re-creating the model and that I×J constraint loop that's taking a big part of your time.

Things would look something like

function f2(...)
    ...  # prepare your data (no changes)
    
    # Build the model only once!
    model = build_conic_model(...)  # <-- this would use odow's trick
    z_inv = model[:z_inv]
    
    # /!\ to execute this loop in parallel, you'd need as many models as you have threads
    for s in 1:Simul
        # update the model by fixing z_inv to the current draw 1/z[i,s]
        for i in 1:Ι
            fix(z_inv[i], 1 / z[i, s]; force = true)
        end
        optimize!(model)
        ...  # record objective value and solution
    end
    ...  # post-simulation stuff

    return ...
end

In the snippet above, the build_conic_model function would do almost the same as f1, except you only build the model, i.e., you don’t call optimize! yet.
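
To make that concrete, here is roughly what I have in mind (untested sketch; build_conic_model, z_inv, and the fix trick follow odow's suggestion above, and the constants σ and α are the ones defined at the top of your script):

# Untested sketch: build the model once, with z_inv as variables that get fixed later
function build_conic_model(C, β, τ)
    (J, Ι) = size(τ)
    opt = Model(Mosek.Optimizer)
    set_silent(opt)
    a = ((σ-1)/σ)^(σ-1)/(σ) .* β.^σ
    @variable(opt, x[1:Ι] >= 0)
    @variable(opt, y[1:J] >= 0)
    @variable(opt, t[1:J] >= 0)
    @variable(opt, z_inv[1:Ι])          # placeholder for 1/z[i,s], fixed before each solve
    @objective(opt, Min, x'*C + a'*y)
    @constraint(opt, [i = 1:Ι, j = 1:J], τ[j,i]*z_inv[i] + τ[j,i]*x[i] - t[j] >= 0)
    @constraint(opt, [j = 1:J], [y[j], t[j], 1] in MOI.PowerCone(1/σ))
    return opt
end

function f2(Cap, β, τ, z)
    (J, Ι) = size(τ)
    Simul = size(z, 2)
    Eπ = 0.0
    Eν = zeros(Ι)
    opt = build_conic_model(Cap, β, τ)   # build the model only once
    z_inv, x = opt[:z_inv], opt[:x]
    for s in 1:Simul
        for i in 1:Ι
            fix(z_inv[i], 1 / z[i, s]; force = true)   # only the data changes
        end
        optimize!(opt)
        Eπ += objective_value(opt) / Simul
        Eν += value.(x) ./ Simul
    end
    func = Eπ - α*sum(Cap)
    foc  = Eν .- α
    return func, foc
end

optimize! is then called repeatedly on the same model object; only the values that z_inv is fixed to change between the S solves.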


Thanks so much @mtanneau! I'll try this. Just a quick follow-up on parallelizing that inner loop: why would I need as many models as threads? Would it work if Simul > the number of threads?

The original code you wrote creates a new problem from scratch, solves it, and returns the solution.

Odow's trick consists of keeping a single JuMP model in memory throughout the simulation loop (where you iterate over s). You cannot just @threads that loop anymore, because every thread would be mutating the same JuMP object.

Therefore, if you want to have, say, 8 threads solving 8 problems in parallel (one per thread), you need 8 different JuMP objects (also one per thread).

I see. Is there any way out?

There’s no fundamental issue here: if you want to solve X problems in parallel, you need to have X problems (and as many JuMP models) in the first place. Note that using X threads will increase your memory requirements by a factor of about X.

This is not specific to JuMP; it's a common issue in multi-threaded applications. I believe the term for it is a race condition: you cannot have two different threads execute mutating operations on the same object at the same time.
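
For instance, a rough, untested way to set that up (reusing the hypothetical build_conic_model from the sketch above) is to build one model per thread and let each thread handle its own share of the simulation draws:

# Untested sketch: one JuMP model per thread, each thread solves its own subset of draws
using Base.Threads

function f2_threaded(Cap, β, τ, z)
    (J, Ι) = size(τ)
    Simul = size(z, 2)
    nt = nthreads()
    models = [build_conic_model(Cap, β, τ) for _ in 1:nt]   # one model per thread
    Eπ = zeros(nt)                                          # per-thread accumulators
    Eν = zeros(Ι, nt)
    @threads for k in 1:nt
        opt = models[k]
        z_inv, x = opt[:z_inv], opt[:x]
        for s in k:nt:Simul                                 # thread k takes draws k, k+nt, ...
            for i in 1:Ι
                fix(z_inv[i], 1 / z[i, s]; force = true)
            end
            optimize!(opt)
            Eπ[k] += objective_value(opt) / Simul
            Eν[:, k] += value.(x) ./ Simul
        end
    end
    func = sum(Eπ) - α*sum(Cap)
    foc  = vec(sum(Eν, dims = 2)) .- α
    return func, foc
end

Each thread only ever touches its own model and its own accumulator slot, so there is no shared mutable state between threads.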
