I need to decide between JuMP and Python-MIP for a large MILP. Python-MIP has some impressive benchmarks on its website and provides more features, such as lazy constraints and callbacks with the CBC solver. But here I am more interested in speed than features.
So far, I have only tried JuMP, and I wonder why model generation is slow on a particular MILP: it takes more time to generate the model than for the solver to solve it. I have not put together a full example, only a simple model containing the types of constraints that cause the performance bottleneck. On the actual problem, where the dataset is much larger, the slowdown is noticeable.
The input data is contained in various CSV files and some user inputs, so the data() function generates data and saves it in dataframes.
The constraints should be easy to follow from the code and comments; otherwise, I can provide more explanation.
The simple_model(…) function already takes quite a few arguments, and on the actual problem the number of arguments for this function and for the main() function is much higher (running over 4 lines!). This is largely due to my inexperience with Julia.
Any help with speeding up the code and reducing the number of function arguments will be much appreciated.
using DataFrames, JuMP, Random
function main()
    # In the actual problem, these are data files passed as arguments to main()
    df1, df2, controls_min_expenses, controls_max_allowance, exclude_vec = data()
    model = simple_model(df1, df2, controls_min_expenses, controls_max_allowance, exclude_vec)
    print(model) # or optimize the model
end
function simple_model(df1, df2, controls_min_expenses, controls_max_allowance, exclude_vec)
    model = Model()
    @variable(model, x[i=1:nrow(df1)] >= 0) # most variables are binary in the actual problem

    # Constraint for minimum expense limit:
    # sum(x * expenses where PARENT is P1 or P5) / sum(total expenses where PARENT is P1 or P5)
    #     >= minimum expenses share in controls_min_expenses.LIMIT
    for i in 1:nrow(controls_min_expenses)
        index_parent_limit = findall(p -> p == controls_min_expenses.PARENT[i], df1.PARENT)
        id_parent_limit_index_df1 = df1.IDENTITY[index_parent_limit]
        parent_index = findall(id -> id ∈ id_parent_limit_index_df1, df1.IDENTITY)
        @constraint(model, sum(x[j] * df1.EXPENSES[j] for j ∈ parent_index) >= sum(df1.EXPENSES[j] for j ∈ parent_index) * controls_min_expenses.LIMIT[i])
    end

    # Constraint ensuring that all elements in a group are equal
    get_GROUP = unique(df2.GROUP)
    @variable(model, group_restrict[1:length(get_GROUP)])
    for i in 1:length(get_GROUP)
        index = findall(g -> g == get_GROUP[i], df2.GROUP)
        ids = df2.IDENTITY[index]
        id_index_df1 = findall(id -> id ∈ ids, df1.IDENTITY)
        @constraint(model, [j ∈ id_index_df1], x[j] - group_restrict[i] == 0)
    end

    # Constraint to exclude if CHILD is C1 or C5 (as given by the vector exclude_vec in data()), and it is not part of a GROUP
    for i in 1:length(exclude_vec)
        id_temp2 = filter(row -> row.CHILD == exclude_vec[i], df1).IDENTITY
        for j in eachindex(id_temp2)
            if isempty(findall(id -> id == id_temp2[j], df2.IDENTITY))
                index_df1 = findall(id -> id == id_temp2[j], df1.IDENTITY)
                fix.(x[index_df1], 0; force = true)
            end
        end
    end
    return model
end
function data()
    Random.seed!(0)
    df1 = DataFrame(INDEX = collect(1:10), IDENTITY = string.("ID", collect(1:10)), NAME = randstring.(rand(5:10, 10)),
        PARENT = ["P1","P1","P1","P2","P3","P4","P3","P5","P5","P6"], CHILD = ["C1","C1","C2","C2","C3","C3","C4","C5","C5","C5"],
        GRAND_CHILD = ["GC1","GC1","GC1","GC2","GC3","GC3","GC4","GC4","GC5","GC6"], EXPENSES = rand(10), ALLOWANCE = rand(10))
    # Data for groupings
    df2 = DataFrame(IDENTITY = [df1.IDENTITY[1], df1.IDENTITY[3], df1.IDENTITY[4], df1.IDENTITY[5], df1.IDENTITY[7], df1.IDENTITY[8], df1.IDENTITY[10]],
        GROUP = ["Privileged", "Privileged", "Upper", "Upper", "Privileged", "Working", "Working"])
    # Data for constraints setting relative limits
    controls_min_expenses = DataFrame(PARENT = ["P1","P5"], LIMIT = [0.2, 0.7])
    controls_max_allowance = DataFrame(GROUP = "Privileged", MAX_ALLOW = 0.8)
    # Data for excluding if CHILD is C1 or C5, and they are not part of a GROUP
    exclude_vec = ["C1", "C5"]
    return df1, df2, controls_min_expenses, controls_max_allowance, exclude_vec
end
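
For reference, every findall call in simple_model scans a whole column, and the `∈` tests against a vector are linear searches as well, so the work grows with (number of rows) × (number of groups or IDs). One alternative I am considering is to build the lookups once, in a single pass. Below is a minimal sketch of that idea using plain vectors in place of the dataframe columns. The helper name group_indices and the variable names are hypothetical, and I have not tested whether this removes the bottleneck on the real data.

```julia
# Hypothetical helper: map each distinct label to the row indices where it occurs,
# using a single pass over the column instead of one findall per label.
function group_indices(labels)
    groups = Dict{eltype(labels), Vector{Int}}()
    for (row, label) in pairs(labels)
        # get! creates an empty index vector the first time a label is seen
        push!(get!(() -> Int[], groups, label), row)
    end
    return groups
end

# Stand-ins for df1.PARENT and df1.IDENTITY from data()
parents = ["P1","P1","P1","P2","P3","P4","P3","P5","P5","P6"]
identities = string.("ID", 1:10)

rows_by_parent = group_indices(parents)

# IDENTITY -> row index lookup, replacing findall(id -> id == ..., df1.IDENTITY)
row_of = Dict(id => row for (row, id) in pairs(identities))

println(rows_by_parent["P1"])
println(row_of["ID7"])
```

With such lookups, the loop over controls_min_expenses could index rows_by_parent[controls_min_expenses.PARENT[i]] directly, and the group constraint could map each df2.IDENTITY through row_of, assuming IDENTITY values are unique in df1.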