Working efficiently with sparse variables in JuMP

After using JuMP mostly for smaller, nonlinear problems, I did some testing on larger LP/MILP models and noticed that problem creation can easily take a performance hit when there are many sparse variables (a common use case for us at work); see the demonstration of the issue below.

While this is fairly straightforward to work around for constraints, I found the creation of sparse variables a little cumbersome, and started playing around with some helper functions to ease the creation of sparse variables and to work with variables defined over tuples.

I would love to get advice on the best/most elegant way to do this, or alternatively suggestions for improvements on the ones I started working on.

To elaborate on the performance issue, consider the following example. The terminology may not be the best, but “naive” creates variables over the full domain with a conditional, while “sparse” efficiently creates only the needed variables:

using BenchmarkTools
using JuMP
using SparseHelper

function test_naive(N=10)
    m = Model()
    ts = [(1,1,1),(N,N,N)]
    valid = Dict(t => true for t in ts)
    @variable(m,x[i=1:N,j=1:N,k=1:N ; haskey(valid,(i,j,k))])
    return m
end

function test_sparse(N=10)
    m = Model()
    ts = [(1,1,1),(N,N,N)]
    I,J,K = sparsehelper(ts,3)
    @variable(m,x[i=I,j=J[i],k=K[i,j]])
    return m
end

function test_both()
    Ns = range(10,step=10,stop=200)
    naive = []
    sparse = []
    for n in Ns
        ntime = @belapsed test_naive($n)
        push!(naive,ntime)
        stime = @belapsed test_sparse($n)
        push!(sparse,stime)
    end
    return Ns,naive,sparse
end

N,naive,sparse = test_both()

using Plots
plot(N,naive,label="Naive") 
plot!(N,sparse,label="Sparse")

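For reference, sparsehelper builds nested index collections from the list of tuples. A minimal sketch of such a helper for the three-index case (named sparsehelper3 here to keep the example self-contained; the actual SparseHelper code also takes the dimension as an argument) could look like:

```julia
# Sketch of a three-index sparse helper: from a list of (i, j, k) tuples,
# build the first-level index set I, plus lookup tables J and K such that
# J[i] lists the valid j's for each i, and K[i, j] the valid k's for (i, j).
function sparsehelper3(tuples)
    I = unique(t[1] for t in tuples)
    J = Dict{Int,Vector{Int}}()
    K = Dict{Tuple{Int,Int},Vector{Int}}()
    for (i, j, k) in tuples
        push!(get!(J, i, Int[]), j)
        push!(get!(K, (i, j), Int[]), k)
    end
    # Deduplicate, in case several tuples share the same leading indices.
    foreach(unique!, values(J))
    foreach(unique!, values(K))
    return I, J, K
end
```

With ts = [(1,1,1),(10,10,10)], this gives I = [1, 10], J[1] == [1], and K[(10, 10)] == [10], matching how I, J, K are consumed by @variable(m, x[i=I, j=J[i], k=K[i,j]]) above.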


We’re aware of the issue, and your work-around is our suggested one. If you search Discourse you can probably find similar questions.

@variable(model, x[i=1:N, j=1:N; cond(i, j)])

effectively expands to

for i = 1:N
    for j = 1:N
        if cond(i, j)
            add_variable(model)
        end
    end
end

The conditional is in the innermost loop, so it is evaluated N³ times, which is why you’re getting polynomial behavior.

If you have time, the documentation could be expanded to warn about potential performance implications, along with the suggested work-around (pre-process the conditional to create lists of tuples): https://www.juliaopt.org/JuMP.jl/v0.20.0/variables/#variable_sparseaxisarrays-1


Thanks, I agree this should be documented; otherwise people might get the wrong impression of JuMP’s capabilities. I’ll give the documentation a shot when I get the time.

Are there any plans to add more convenient syntax for this, e.g. iteratively creating variables? The need to work with tuples or construct the necessary dictionaries might put newcomers off, even if it’s not really that difficult. Getting the dictionaries for the sparse construction right was a bit tricky for more than two indices.

To clarify, our suggested work-around is the following:

using JuMP
model = Model()
A = 1:3
B = 1:3
C = 1:5
cond(a, b, c) = a + b + c == 5
S = [(a, b, c) for a in A for b in B for c in C if cond(a, b, c)]
@variable(model, x[S])
x[(1, 1, 3)]

Note that you need the additional parentheses when indexing, since the container is indexed by the tuples in S.

In the long-term, we should investigate ways to improve the performance.

Right, sorry if my last post was confusing. Suggestion for adding to docs in this PR.

The second point is more of a wish-list item. Using a list of tuples certainly works, but isn’t as “nice” with the extra tuple/parentheses. And as demonstrated in the OP, it is already possible to create only the needed variables, so a more convenient syntax would be nice, I think.

The reason I suggest adding variables iteratively is that in my use case I will have to add all constraints iteratively anyway, and it is already possible to add bounds iteratively; the only missing part is creating the variables. This would be a natural approach coming from Mosel, where you first declare the variable space and then create only the variables you will actually need. The “;” syntax probably feels natural if you are used to GAMS’ “$” notation, but for variable creation in this use case it is a bit awkward: there isn’t really a condition to be evaluated, you typically just use the indices for which you have data (read from CSV or a database).

As I commented in the PR, another alternative is to create your own dictionary:

x = Dict()
for it in complex_list()
    if complex_condition(it)
        i, j, v = complex_indices(it)
        x[i, j, v] = @variable(model)
        # Optional: if you care about pretty printing.
        set_name(x[i, j, v], "x[$i, $j, $v]")
    end
end

This seems analogous to “first declare the variable space, and then create only the variables you will actually need” and doesn’t require any extra syntax from JuMP.


Thanks, that’s very close to what I’m looking for. However, it doesn’t seem to give feature parity with the technique in the OP, as the variables are not registered with the model. Is there a way around that?

model = Model()
N = 10
S = [(1,1,1),(N,N,N)]
# slow as it evaluates conditional N^3 times:
@variable(model, x1[i=1:N,j=1:N,k=1:N; (i,j,k) in S]) 
# fast (linear in dimension of S):
@variable(model, x2[S])
# avoid using tuples by constructing dictionary (also fast):
x3 = Dict()
for (i,j,k) in S
    x3[i,j,k] = @variable(model)
    # Optional, if you care about pretty printing:
    set_name(x3[i,j,k], "x[$i,$j,$k]")
end
show(model)

A JuMP Model
Feasibility problem with:
Variables: 6
Model mode: AUTOMATIC
CachingOptimizer state: NO_OPTIMIZER
Solver name: No optimizer attached.
Names registered in the model: x1, x2

Yes, based on the example here you can do as follows:

x3 = model[:x3] = Dict()
for (i,j,k) in S
    x3[i,j,k] = @variable(model)
    # Optional, if you care about pretty printing:
    set_name(x3[i,j,k], "x[$i,$j,$k]")
end

The semantic difference compared with the @variable constructor is that model[:x3] = Dict() won’t throw an error if there’s an existing object named :x3. Maybe we should have a “safer” interface for this.
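For example, such a “safer” registration could be sketched as follows. Here register_container! is a hypothetical helper, not part of JuMP; it just reproduces the name check that @variable performs, using object_dictionary:

```julia
using JuMP

# Hypothetical helper, not part of JuMP: attach `container` to `model` under
# `name`, refusing to overwrite an already-registered object of that name.
function register_container!(model::Model, name::Symbol, container)
    haskey(object_dictionary(model), name) &&
        error("An object of name $name is already attached to this model.")
    model[name] = container
    return container
end

model = Model()
x3 = register_container!(model, :x3, Dict())
for (i, j, k) in [(1, 1, 1), (10, 10, 10)]
    x3[i, j, k] = @variable(model, base_name = "x3[$i,$j,$k]")
end
```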

Great! That should work, thanks a lot!

It is perhaps a bit ironic that this is basically a way to bypass the error I ran into in my very first attempt to do this with @variable, though. While I do understand that creating a variable x would be considered an error if you already have x, y, and z defined, I wouldn’t consider creating x_{1,2} an error just because x_{1,1} is already defined. Maybe checking at that level of granularity is not practical in JuMP, but perhaps optionally disabling that check would make sense?

Using anonymous variables is precisely what’s recommended if you don’t want an error when creating two variables with the same symbolic name.

You should read @variable(m, x[S]) as creating a set of variables named x indexed over S. You can’t have two sets of variables with the same name (as far as JuMP is aware of the name), and the index sets must be concretely specified when the variables are created so that JuMP can create x using the correct container type.

I see; the point I was trying to make was that, from a user’s point of view, it’s not a logical error to add another variable with the same name but a different index, so it might be a bit puzzling when this generates an error (for implementation reasons).

I’ll stop whining about this, since there are good work-arounds. But the bottom line is that this is a rather common use case in my experience, so having a performant and easy syntax in the vein of @variable(model, x[i,j,k] for (i,j,k) in S) would be even better (not considering how difficult that would be to implement).
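In the meantime, a dictionary comprehension gets fairly close to that wished-for syntax. A sketch, assuming S is a list of index tuples as in the earlier examples: only the variables indexed by S are created, keyed by their index tuples:

```julia
using JuMP

model = Model()
S = [(1, 1, 1), (2, 3, 4)]

# One anonymous variable per index tuple in S, collected in a Dict.
x = Dict((i, j, k) => @variable(model, base_name = "x[$i,$j,$k]")
         for (i, j, k) in S)
```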