JuMP "getvalue" suddenly very slow

I’m using JuMP with a fairly large model (the largest component is a dict of @NLexpressions of size 10,254,747).

Usually this works fine, and JuMP performs well. Recently, I’ve tried to do a ‘second stage’ optimization in which I extract the optimal values of these expressions, do some simple computations with them and create a new objective function using the computed values.

When I try to extract the expressions at the optimum from the first run, however, the program seems to take forever or get stuck or something. After that (and after restarting julia and my computer entirely) getvalue() in general seems to be super super slow. Things that used to take milliseconds take multiple seconds and it adds up.

Is there some caching issue that could be causing this? There must be something…

The code I’m using is attached. It’s long and has lots of components, so I’ll highlight the important parts:
The @NLexpression I was referring to is defined on lines 152-156

@NLexpression(
        m,
        nu[auc in auction_arr, n in 1:auc.num_bidders,  t=1:auc.T],
        (auc.bidders[n].b_data[t] - (alpha[auc,n]*auc.optb_denom[t]) - (auc.optb_num_main[t]*inv_gamma[auc.project_type_id]) - auc.optb_num_coeff[t]*auc.bidders[n].score)
)

My attempt to extract it is on lines 269-272:

nu_dict = Dict{Any, Any}()
for auc in auction_arr, n in 1:auc.num_bidders, t=1:auc.T
    nu_dict[auc.bidders[n]] = [getvalue(nu[auc, n ,t]) for t = 1:auc.T]
end

I realize that my code isn’t written to maximally optimize speed or RAM, but I’m surprised that it freezes julia entirely. And I’m extra confused as to why the effect seems to persist after I restart julia… Any help would very much appreciated.

UPDATE: Upon having written this all out, I realize that it is a silly pursuit to extract my ‘NLexpression’ rather than just reconstruct it with the parameters that I’m actually estimating. Face -> Palm.

That said, my getvalue() is performing EXCEEDINGLY slowly – without doing this step at all – and I have no idea why :confused:

Full script below (sorry it’s a bit messy)

using JuMP
using Ipopt
using DataFrames, CSV
using Parameters
# using Distributions

type Bidder
    bidder_id::String
    b_data::Array{Float64}
    score::Float64
    est_nu::Array{Float64}
end

type MinimalAuction
    contract_no::Int64
    project_type_id::Int64
    num_bidders::Int64
    T::Int64
    bidders::Array{Bidder}
    optb_denom::Array{Float64}
    optb_num_coeff::Array{Float64}
    optb_num_main::Array{Float64}
    q_a::Array{Float64}
    q_a_model::Array{Float64}
    q_o::Array{Float64}
    c::Array{Float64}
    item_ids::Array{Int64}
end

type Item_Instance
    contract_no::Int64 ## maps auctions the item is included in to the item's position
    seq_item_no::Int64
    auction::MinimalAuction
    num_obs::Int64
end

function constructMinimalAuction(auction_data::DataFrames.DataFrame, bidder_data::DataFrames.DataFrame, dollar_scale::Int64, num_sims::Int64)

    contract_no = auction_data[:contract_no][1]
    project_type_id = auction_data[:project_type_id][1]

    q_a_model = convert(Array{Float64},auction_data[:q_at_model])
    q_e = convert(Array{Float64},auction_data[:q_ot])
    q_a = convert(Array{Float64},auction_data[:q_at])
    c_o = convert(Array{Float64},auction_data[:office_unit_price]/dollar_scale)
    sigma_t = convert(Array{Float64},auction_data[:sigma_t])
    
    item_ids = convert(Array{Float64},bidder_data[:item_id_sequential])
    
    T = length(q_e)
    sigma_sq = sigma_t .* sigma_t
    
    sum_qesq_sigsq = sum((q_e[t]^2) / sigma_sq[t] for t in 1:T)
    sum_qe_co = sum((q_e[t]*c_o[t]) for t in 1:T)
    sum_qe_qa_sigsq = sum((q_e[t] * q_a_model[t]) / sigma_sq[t] for t in 1:T)

    denom = [c_o[t] - (q_e[t]/(sigma_sq[t]*sum_qesq_sigsq))*sum_qe_co for t in 1:T]
    num_coeff = [q_e[t]/(sigma_sq[t]*sum_qesq_sigsq) for t in 1:T]
    num_main = [(q_a_model[t]/sigma_sq[t]) - (num_coeff[t]*sum_qe_qa_sigsq) for t in 1:T]

    num_bidders = ncol(bidder_data) - 2 ## Accounting for new 'generic item ids'

    bidders = [Bidder(names(bidder_data)[i],
            convert(Array{Float64},bidder_data[:,i]/dollar_scale),
            (((bidder_data[:,i])' * q_e) /dollar_scale),
             zeros(T)) for i = 3:(num_bidders+2)]

    bidders = sort!(bidders, by = x -> x.score)


    return MinimalAuction(contract_no, project_type_id, num_bidders, T, bidders, denom, num_coeff, num_main, q_a, q_a_model, q_e, c_o, item_ids)

end
# Note: Only Bridge contracts here
contract_list = [ 30163, 30230, 31050, 31063, 31082, 31084, 31172, 31200, 31267, 31292, 31296, 31348, 31353, 31356, 31358, 31359, 32029, 32045, 32052]

num_contracts = length(contract_list)
auction_arr = Vector{MinimalAuction}(num_contracts)
item_dict = Dict{Int64, Any}()
item_dict_length = Dict{Int64, Int64}()
auc_arr_contract_nos = Vector{Int64}(num_contracts)
project_type_dict = Dict{Int64,Int64}() #dictionary mapping project type ids to the number of contracts of that type

for i = 1:num_contracts
    contract = contract_list[i]
    example_project = CSV.read("../data/september_bridge_auctions/sample_project_$contract.csv")
    example_bidders = CSV.read("../data/september_bridge_auctions/sample_project_bidders_$contract.csv")


    auction_arr[i] = constructMinimalAuction(example_project, example_bidders, 1000, 1)
    auc_arr_contract_nos[i] = auction_arr[i].contract_no
    auc_project_type_id = auction_arr[i].project_type_id
    
    try
    project_type_dict[auc_project_type_id] = project_type_dict[auc_project_type_id] + 1
    catch error
       if isa(error, KeyError)
        # project type id not found
        project_type_dict[auc_project_type_id] = 1
       end
    end
    
    for t in 1:auction_arr[i].T
        item_id = auction_arr[i].item_ids[t]
        try
            push!(item_dict[item_id], Item_Instance(auction_arr[i].contract_no , t, auction_arr[i], (auction_arr[i].num_bidders) ))
            catch error
               if isa(error, KeyError)
#                    println("No value for item_dict[$item_id]")
                    item_dict[item_id] = Set([Item_Instance(auction_arr[i].contract_no , t, auction_arr[i], (auction_arr[i].num_bidders))])
               end
            end
        end
end

item_counts = [auc.T for auc in auction_arr]
max_T = maximum(item_counts)
bidder_counts = [auc.num_bidders for auc in auction_arr]
max_num_bidders = maximum(bidder_counts)

item_numobs_dict = Dict{Int64, Int64}()

for item_id in keys(item_dict)
    num_item_obs = 0
    for item_inst in item_dict[item_id]
        num_item_obs = num_item_obs + item_inst.num_obs
    end
    item_numobs_dict[item_id] = num_item_obs
end

num_unique_items = length(keys(item_dict))
num_auctions = length(auction_arr)

println("There are: ", num_auctions, " auctions here.")
num_nus = sum( (sum(t for t=1:auc.T) * auc.num_bidders) for auc in auction_arr)

## JuMP Model
solver = IpoptSolver()
m = Model(solver=solver)

@variables m begin
    inv_gamma[pt in keys(project_type_dict)], (start = 10)
    alpha[auc in auction_arr, n in 1:auc.num_bidders], (start = 0.95)
end

@NLexpression(
        m,
        nu[auc in auction_arr, n in 1:auc.num_bidders,  t=1:auc.T],
        (auc.bidders[n].b_data[t] - (alpha[auc,n]*auc.optb_denom[t]) - (auc.optb_num_main[t]*inv_gamma[auc.project_type_id]) - auc.optb_num_coeff[t]*auc.bidders[n].score)
)

@NLexpression(
        m,
        moment1,
        sum( sum( (sum((nu[item_inst.auction, n, item_inst.seq_item_no]) for n in 1:item_inst.auction.num_bidders)/item_inst.auction.num_bidders) for item_inst in item_dict[item])^2 / item_numobs_dict[item] for item in keys(item_dict))/num_unique_items
        )
		
@NLexpression(
        m,
        moment2,
		sum( sum((nu[auc,n,t]) * auc.q_o[t] for t in 1:auc.T)^2 / (auc.T * auc.num_bidders) for auc in auction_arr, n in 1:auc.num_bidders )/num_auctions        
		)

@NLexpression(
        m,
        moment3,
		sum( sum((nu[auc,n,t]) * auc.q_a[t] for t in 1:auc.T)^2 / (auc.T * auc.num_bidders) for auc in auction_arr, n in 1:auc.num_bidders )/num_auctions         
		)
		
@NLexpression(
        m,
        moment4,
        sum( (nu[auc,n,t]) for auc in auction_arr, n=1:auc.num_bidders, t = 1:auc.T )^2/num_nus
        )

@NLobjective(
            m,
            Min,
            moment1 +
            moment2 +
            moment3 +
			moment4
    )

status = JuMP.solve(m)

## See results

answer = getobjectivevalue(m)
println("answer: ", answer)

for pt in keys(project_type_dict)
    gamma_min = 1.0/getvalue(inv_gamma[pt])
    println("project type: ", pt, "; gamma: ", gamma_min)

end


## One test for how sensible an answer is, is what "markup" it implies for the bidder. 
## Anything with absolute value < 1 (meaning less than 100% markup or loss) is reasonable-ish
## Costs are defined as alpha * c (c is a base cost vector) so if alpha \approx 0, this model is crazy

function getBidderMarkup(b, q_a, c, alpha)
    bidder_cost = (alpha*c)' * q_a
    bidder_rev = b' * q_a
    
    markup = (bidder_rev - bidder_cost)/bidder_cost

    return (bidder_cost, bidder_rev, markup)
end

mkps = [getBidderMarkup(bidder_bids[n], auction.q_a, auction.c, alpha_min[n]) for n=1:I]
for auc in auction_arr, n=1:auc.num_bidders
     alpha_min = getvalue(alpha[auc,n])
     mkp = getBidderMarkup(auc.bidders[n].b_data, auc.q_a, auc.c, alpha_min)
     println("Contract: ", auc.contract_no, "Bidder: ", n)
     println(n, ": ", mkp)
 end

N_I = 0
for auc in auction_arr, i in range(1,auc.num_bidders)
    alpha_min = getvalue(alpha[auc,i])
    println("alpha: ", alpha_min)

    N_I = N_I + 1
end

winning_mkps = [1.2 for i in 1:length(auction_arr)]
num_bidders = [0 for i in 1:length(auction_arr)]

alphas = [1.2 for i = 1:N_I]
gammas = [1.2 for i = 1:N_I]
mkps = [1.2 for i = 1:N_I]
num_bidders = [0 for i = 1:N_I]
contract_nos = [0 for i = 1:N_I]
bidder_ids = ["a" for i = 1:N_I]


i = 0
for auc in auction_arr, n in 1:auc.num_bidders
    i= i+1
    alpha_min = getvalue(alpha[auc,n])
    mkp_tuple = getBidderMarkup(auc.bidders[n].b_data, auc.q_a, auc.c, alpha_min)
    gamma_min = 1.0/getvalue(inv_gamma[auc.project_type_id])

    alphas[i] = alpha_min
    mkps[i] = mkp_tuple[3]
    num_bidders[i] = auc.num_bidders
    contract_nos[i] = auc.contract_no
    bidder_ids[i] = auc.bidders[n].bidder_id
    gammas[i] = gamma_min
end

mkp_df = DataFrame(gamma = gammas, c_n_i = alphas, mkp = mkps, num_bids = num_bidders, contract_no = contract_nos, bidder_id = bidder_ids)

CSV.write("../output/bridge_mkps_gmm_ipopt_sigmamodel_multigamma_unweightedgmm_08292018_try.csv", mkp_df)


### Weighting
nu_dict = Dict{Any, Any}()
for auc in auction_arr, n in 1:auc.num_bidders, t=1:auc.T
    nu_dict[auc.bidders[n]] = [getvalue(nu[auc, n ,t]) for t = 1:auc.T]
end

item_nuvar_dict = Dict{Int64, Float64}()

for item in (keys(item_dict))
    innersum_sq = 0
    outersum_sq = 0
    for item_inst in item_dict[item]
        innersum = 0
        for n in 1:item_inst.auction.num_bidders
            innersum = innersum + nu_dict[item_inst.auction.bidders[n]][item_inst.seq_item_no]
        end
        innersum_sq = innersum_sq + (innersum^2/item_inst.auction.num_bidders)
        outersum_sq = outersum_sq + innersum/item_inst.auction.num_bidders
    end
    outersum_sq = outersum_sq^2
    var = innersum_sq - outersum_sq
#     println("var: ", var)
    item_nuvar_dict[item] = var
#     println("inner sum: ", innersum_sq, " outer sum: ", outersum_sq)
end

bidder_mom2_dict = Dict{Any, Any}()
bidder_mom3_dict = Dict{Any, Any}()

bidder_mom2_dict = Dict{Any, Any}()
bidder_mom3_dict = Dict{Any, Any}()

innersum = 0.0
outersum = 0.0
for auc in auction_arr, n in 1:auc.num_bidders
    innersum2 = Base.sum( ( nu_dict[auc.bidders[n]][t] * auc.q_o[t])^2 for t = 1:auc.T)/ auc.T
    outersum2 = Base.sum( nu_dict[auc.bidders[n]][t] * auc.q_o[t] for t = 1:auc.T)^2/ auc.T
    
    innersum3 = Base.sum( (nu_dict[auc.bidders[n]][t] * auc.q_a[t])^2 for t = 1:auc.T)/ auc.T
    outersum3 = Base.sum( nu_dict[auc.bidders[n]][t] * auc.q_a[t] for t = 1:auc.T)^2/ auc.T
    
    var2 = innersum2 - outersum2
    var3 = innersum3 - outersum3

#     println("var3: ", var3)
    bidder_mom2_dict[auc.bidders[n]] = var2
    bidder_mom3_dict[auc.bidders[n]] = var3

end


var4 = sum( (nu_dict[auc.bidders[n]][t]^2) for auc in auction_arr, n=1:auc.num_bidders, t = 1:auc.T ) - sum( (nu_dict[auc.bidders[n]][t]) for auc in auction_arr, n=1:auc.num_bidders, t = 1:auc.T )^2

@NLexpression(
        m,
        moments1_weight,
        sum( 
            (sum((nu[item_inst.auction, n, item_inst.seq_item_no]) for item_inst in item_dict[item], n in 1:item_inst.auction.num_bidders)^2)/
            item_nuvar_dict[item] for item in keys(item_dict))/num_unique_items
        )
		
@NLexpression(
        m,
        moments2_weight,
        sum(
            sum( 
                (sum(nu[auc,n,t] * auc.q_o[t] / auc.T for t in 1:auc.T)^2)/bidder_mom2_dict[auc.bidders[n]] for n in 1:auc.num_bidders ) / auc.num_bidders 
            for auc in auction_arr)/num_auctions
        )
		
@NLexpression(
        m,
        moments3_weight,
        sum(
            sum( 
                (sum(nu[auc,n,t] * auc.q_a[t] / auc.T for t in 1:auc.T)^2)/bidder_mom3_dict[auc.bidders[n]] for n in 1:auc.num_bidders ) / auc.num_bidders 
            for auc in auction_arr)/num_auctions
        )
		
@NLexpression(
        m,
        moments4_weight,
        sum( (nu[auc,n,t]/var4) for auc in auction_arr, n=1:auc.num_bidders, t = 1:auc.T )^2
        )


@NLobjective(
            m,
            Min,
            moments1_weight +
            moments2_weight +
            moments3_weight +
			moments4_weight
    )
	
status = JuMP.solve(m)

answer = getobjectivevalue(m)
println("answer: ", answer)

for pt in keys(project_type_dict)
    gamma_min = 1.0/getvalue(inv_gamma[pt])
    println("project type: ", pt, "; gamma: ", gamma_min)

end

winning_mkps = [1.2 for i in 1:length(auction_arr)]
num_bidders = [0 for i in 1:length(auction_arr)]

alphas = [1.2 for i = 1:N_I]
gammas = [1.2 for i = 1:N_I]
mkps = [1.2 for i = 1:N_I]
num_bidders = [0 for i = 1:N_I]
contract_nos = [0 for i = 1:N_I]
bidder_ids = ["a" for i = 1:N_I]


i = 0
for auc in auction_arr, n in 1:auc.num_bidders
    i= i+1
    alpha_min = getvalue(alpha[auc,n])
    mkp_tuple = getBidderMarkup(auc.bidders[n].b_data, auc.q_a, auc.c, alpha_min)
    gamma_min = 1.0/getvalue(inv_gamma[auc.project_type_id])

    alphas[i] = alpha_min
    mkps[i] = mkp_tuple[3]
    num_bidders[i] = auc.num_bidders
    contract_nos[i] = auc.contract_no
    bidder_ids[i] = auc.bidders[n].bidder_id
    gammas[i] = gamma_min
end

mkp_df = DataFrame(gamma = gammas, c_n_i = alphas, mkp = mkps, num_bids = num_bidders, contract_no = contract_nos, bidder_id = bidder_ids)

CSV.write("../output/bridge_mkps_gmm_ipopt_sigmamodel_multigamma_weightedgmm_08292018_try.csv", mkp_df)

Upon having written this all out, I realize that it is a silly pursuit to extract my ‘NLexpression’ rather than just reconstruct it with the parameters that I’m actually estimating. Face → Palm.

That said, my getvalue() is performing EXCEEDINGLY slowly – without doing this step at all – and I have no idea why :confused:

Okay, I think I understand what’s going on. It seems that getvalue() for expressions does some very heavy stuff – computing, storing, etc. all of which is really, really, really inefficient.

Therefore, every call to getvalue() for expressions takes forever and slows down the whole machine. When I eliminate all uses of this and use getvalue() only for parameters of the model, everything seems to run like it’s supposed to. whew!

1 Like