Slow JuMP model creation

I am trying to solve a very large (N = 100M variables) and very sparse (density ~ 1/N) QUBO problem using CPLEX, interfaced via JuMP. To this end, I'm using the following function:

using JuMP, CPLEX, LinearAlgebra

function solve_with_CPLEX(Q)
  L = size(Q, 1)
  model = direct_model(CPLEX.Optimizer())
  set_time_limit_sec(model, 3600 * 3)
  set_optimizer_attribute(model, "CPXPARAM_Threads", 8)
  @variable(model, x[i ∈ collect(1:L)], Bin)
  @objective(model, Min, x.data' * triu(Q) * x.data)
  tts = @elapsed optimize!(model)
  value.(x.data), tts
end

It works well for small problems, but the time needed just to build the model (without the optimize! call) grows very rapidly; see the plot below.

If the extrapolation is to be believed, a 100M-variable problem would take over 5 months just to build via JuMP!
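(For reference, the build times were measured roughly along these lines; the sizes below are illustrative, not the exact script behind the plot:)

using JuMP, CPLEX, LinearAlgebra, SparseArrays

# Time only the model construction, without the optimize! call.
for N in 10 .^ (4:7)
  Q = sprand(Float64, N, N, 1 / N)
  t = @elapsed begin
    model = direct_model(CPLEX.Optimizer())
    @variable(model, x[i ∈ collect(1:N)], Bin)
    @objective(model, Min, x.data' * triu(Q) * x.data)
  end
  println("N = $N: build time $t s")
end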

My question, then: am I doing something wrong, or is this a limitation of the JuMP/CPLEX interface?

Hi @jakubp, welcome to the forum :smile:

How much memory do you have on your computer? What is the output of import Pkg; Pkg.status()?

When I run with N = 10^6, I get a few seconds, which is much less than the expected 10^3 seconds from your plot.

julia> using JuMP

julia> import CPLEX

julia> import LinearAlgebra

julia> import SparseArrays

julia> function main(Q)
           L = size(Q, 1)
           model = direct_model(CPLEX.Optimizer()) 
           x = @variable(model, [1:L], Bin)
           @objective(model, Min, x' * LinearAlgebra.triu(Q) * x)
           return model
       end
main (generic function with 1 method)

julia> N = 1_000_000
1000000

julia> Q = SparseArrays.sprand(Float64, N, N, 1 / N);

julia> @time main(Q);
  3.450667 seconds (22.14 M allocations: 1.243 GiB, 32.28% gc time, 0.75% compilation time)

julia> @time main(Q);
  2.724309 seconds (22.10 M allocations: 1.241 GiB, 26.65% gc time)

Thanks for the reply :smile:
Memory is not the problem; I have around 1TB. But I think I have identified the culprit. You see, I omitted one detail from my MWE. My QUBO problems come from Ising models, so the solve_with_CPLEX function actually looks like this:

function solve_with_CPLEX(J,h)
  Q = to_qubo(J,h)
  L = size(Q, 1)
  model = direct_model(CPLEX.Optimizer()) 
  set_time_limit_sec(model, 3600 * 3)
  set_optimizer_attribute(model, "CPXPARAM_Threads", 8)
  @variable(model, x[i ∈ collect(1:L)], Bin)
  @objective(model, Min, x.data' * triu(Q) * x.data)
  tts = @elapsed optimize!(model)
  value.(x.data), tts
end

where

function to_qubo(J::AbstractMatrix{T}, h::AbstractVector{T}) where {T}
  # Row sums of the coupling matrix, used in the linear (diagonal) shift.
  b = dropdims(sum(J, dims = 2), dims = 2)
  # Quadratic couplings scale by 4 under the spin-to-binary substitution.
  Q = T(4) .* J
  # Linear terms go on the diagonal, since x_i^2 == x_i for binary x.
  Q[diagind(Q)] .+= T(2) .* (h .- b)
  Q
end
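(For context: this is the usual spin-to-binary substitution s = 2x - 1. Assuming J is symmetric with zero diagonal,

$$\sum_{i<j} J_{ij} s_i s_j + \sum_i h_i s_i = \sum_{i<j} 4 J_{ij} x_i x_j + \sum_i 2 (h_i - b_i)\, x_i + \text{const}, \qquad b_i = \sum_j J_{ij},$$

which is exactly Q = 4J with 2(h - b) added to the diagonal, matching the triu(Q) objective above.)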

And it turns out that the line Q[diagind(Q)] .+= T(2) .* (h .- b) is extremely slow:

julia> @time Q = 4 .* J;
  0.001265 seconds (8 allocations: 8.392 MiB)

julia> @time Q[diagind(Q)] .+= Float64(2) .* (h .- b);
  6.221120 seconds (12 allocations: 6.480 MiB)

julia> @time Q = 4 .* J;
  0.001216 seconds (8 allocations: 8.392 MiB)

julia> @time Q[diagind(Q)] += Float64(2) .* (h .- b);
  0.017774 seconds (200.03 k allocations: 33.788 MiB)
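
I believe the broadcasted .+= form is slow because it updates the diagonal of the SparseMatrixCSC one element at a time, and each scalar insertion that creates a new stored entry shifts the matrix's internal buffers (roughly O(nnz) work per entry), whereas += lowers to a single bulk getindex/setindex pair. A sketch of an alternative that avoids the in-place update entirely, by building the diagonal shift as its own sparse matrix (to_qubo_fast is just a placeholder name):

using SparseArrays, LinearAlgebra

function to_qubo_fast(J::AbstractMatrix{T}, h::AbstractVector{T}) where {T}
  b = dropdims(sum(J, dims = 2), dims = 2)
  # Add the diagonal shift as a sparse diagonal matrix in one pass,
  # instead of inserting entries into Q's sparsity pattern one by one.
  T(4) .* J + spdiagm(0 => T(2) .* (h .- b))
end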

Now my estimate for the 100M-variable problem is a much more reasonable 17 minutes.
Thanks again for the help!

Nice. I'll leave a link to