JuMP : Solving an optimisation problem with a large matrix of variables

Hi all

I have an optimisation of the following form.

# Size
rows = 6000
cols = 10

# Create model with solver
model = Model(with_optimizer(GLPK.Optimizer))

# Create binary variables
@variable(model, X[1:rows, 1:cols], Bin)

# Objective - maximise V
V = rand(rows, cols)
@objective(model, Max, sum(X .* V));

# Constraint - Each row must only have one entry
for i in 1:rows
    @constraint(model, sum(X[i, :]) <= 1)

# Constraint - Each column sums to between min and max
minvols = rand(200:400, rows)
maxvols = rand(600:800, cols)
for j in 1:cols
    @constraint(model, minvols[j] <= sum(X[:, j]) <= maxvols[j])

# Solve
@time optimize!(model)

My problem is that when the data set gets much larger - e.g. if rows is around 500,000 - the solver takes far too long.

Does anyone have any advice on how to approach a problem like this?

1 Like

With rows = 500_000 and cols = 10, you have 5 million binary variables. That’s a tricky problem.

You could try Cbc.jl, or, if you are an academic, a commercial solver like CPLEX.jl.

I solve problems this big on a regular basis for my job. It is notoriously difficult to predict how long it will take to solve such problems, and they are NP-hard in the general case. We use Gurobi which is excellent, though it is a commercial solver. The open source options are disappointing, but Cbc is probably worth a shot (you may have to do a lot of tweaking of it).

It’s also usually helpful to identify which step is taking a long time: is it the root relaxation, or branch and bound? Sometimes it’s even the generation of cutting planes before the branch and bound step that takes a long time.


Also, just couldn’t help myself, one of the things I love about JuMP and Julia is how easy it is to write efficient problems that look just like they do on paper:

const N = 5*10^5
const M = 10

m = Model()

𝟏 = ones(N)
𝟙 = ones(M)

@variable(m, X[1:N, 1:M], Bin)

@constraint(m, X*𝟙 .≤ 1)

mₗ = rand(200:400, M)
mᵤ = rand(600:800, M)
@constraint(m, mₗ .≤ X'*𝟏 .≤ mᵤ)

V = rand(N, M)
@objective(m, Max, sum(X .* V))

This benefit really adds up when you have big complicated problems, and can easily read them.


Thank you all! much appreciated.

I did find that Cbc was quicker. And I also found that the relaxed problem is much quicker and often equal to the binary problem. I did this by modifying the variables to:

@variable(model, 0 <= X[1:rows, 1:cols] <= 1)

I see this used to be possible pre v0.19 by setting relaxation=True within the solve function. And there is a feature request: