After using JuMP mostly for smaller, nonlinear problems, I did some testing on larger LP/MILP models and noticed that problem creation can take a significant performance hit when there are many sparse variables (a common use case for us at work). A demonstration of the issue is below.

While this is fairly straightforward to work around for constraints, I found the creation of sparse variables a little cumbersome, so I started playing around with some helper functions to ease creating sparse variables and working with variables defined over tuples.

I would love to get advice on the best/most elegant way to do this, or alternatively suggestions for improvements on the ones I started working on.
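For context, by "working around for constraints" I mean building constraints directly from the list of valid tuples instead of filtering the full domain. A minimal sketch of that idea (the names `ts`, `x`, and `con` are just illustrative, not from my helper package):

```julia
using JuMP

m = Model()
ts = [(1, 1, 1), (2, 2, 2)]
# Anonymous variables stored in a Dict keyed by the sparse tuples,
# so only the tuples in ts ever get a variable.
x = Dict(t => @variable(m, base_name = "x$t") for t in ts)
# Constraints are likewise created only over the tuples that actually exist.
@constraint(m, con[t in ts], x[t] <= 1)
```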

To elaborate on the performance issue, consider the following example. Maybe not the best terminology, but "naive" means creating variables over the full domain with a conditional, while "sparse" means creating only the needed variables efficiently:

```julia
using BenchmarkTools
using JuMP
using SparseHelper  # my own helper module providing sparsehelper
function test_naive(N=10)
    m = Model()
    ts = [(1, 1, 1), (N, N, N)]
    # Mark the valid index tuples so the @variable condition can filter on them
    valid = Dict(t => true for t in ts)
    @variable(m, x[i = 1:N, j = 1:N, k = 1:N; haskey(valid, (i, j, k))])
    return m
end
function test_sparse(N=10)
    m = Model()
    ts = [(1, 1, 1), (N, N, N)]
    # sparsehelper builds nested index sets covering exactly the tuples in ts
    I, J, K = sparsehelper(ts, 3)
    @variable(m, x[i = I, j = J[i], k = K[i, j]])
    return m
end
function test_both()
    Ns = range(10, step = 10, stop = 200)
    naive = Float64[]
    sparse = Float64[]
    for n in Ns
        push!(naive, @belapsed test_naive($n))
        push!(sparse, @belapsed test_sparse($n))
    end
    return Ns, naive, sparse
end
N,naive,sparse = test_both()
using Plots
plot(N,naive,label="Naive")
plot!(N,sparse,label="Sparse")
```
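For reference, `sparsehelper` above comes from my helper code. A minimal sketch of what something along those lines might look like for the 3-index case (this is just one possible implementation, and the container types are my own choice): it builds nested index collections so that `@variable(m, x[i = I, j = J[i], k = K[i, j]])` creates exactly the variables for the listed tuples.

```julia
# Sketch of a sparsehelper-style function (hypothetical implementation,
# 3-index tuples only): returns I, J, K such that iterating
# i in I, j in J[i], k in K[i, j] visits exactly the tuples in ts.
function sparsehelper(ts, n::Int)
    @assert n == 3 "this sketch handles 3-index tuples only"
    I = unique(t[1] for t in ts)
    J = Dict{Int,Vector{Int}}()
    K = Dict{Tuple{Int,Int},Vector{Int}}()
    for (i, j, k) in ts
        push!(get!(J, i, Int[]), j)          # j-values seen for each i
        push!(get!(K, (i, j), Int[]), k)     # k-values seen for each (i, j)
    end
    # Deduplicate in case several tuples share the same prefix
    foreach(unique!, values(J))
    foreach(unique!, values(K))
    return I, J, K
end
```

Note that `K[i, j]` works on a tuple-keyed `Dict` because `Base` forwards multi-argument `getindex` on an `AbstractDict` to a single tuple key.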