I added 1 Million constraints to an LP unwittingly

This topic shares an experience that a newcomer has probably not run into before (me included).
Put simply, I'm solving a disjoint bilinear program via RLT (the reformulation-linearization technique), and I didn't realize that the resulting number of constraints would be (almost) prohibitive.
I used JuMP to model the problem, and I think it took more than an hour just to add the whole set of constraints. This is also the first time I've watched the barrier method run in the context of an LP.
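For readers unfamiliar with RLT: the blow-up comes from multiplying constraints pairwise. Here is a minimal sketch with toy random data (not my actual model): given A1 * x .>= b1 with m1 rows and A2 * y .>= b2 with m2 rows, RLT multiplies row i of the first by row j of the second and replaces the product x * y' with a lifted matrix W, so you end up with m1 * m2 linear constraints.

import JuMP
m1, n1, m2, n2 = 201, 101, 100, 170  # toy sizes
A1, b1 = rand(m1, n1), rand(m1)      # x-side constraints: A1 * x .>= b1
A2, b2 = rand(m2, n2), rand(m2)      # y-side constraints: A2 * y .>= b2
model = JuMP.Model()
JuMP.@variable(model, x[1:n1])
JuMP.@variable(model, y[1:n2])
JuMP.@variable(model, W[1:n1, 1:n2]) # lifted matrix standing in for x * y'
for i in 1:m1, j in 1:m2             # m1 * m2 = 20100 RLT constraints
    # (A1[i, :]' * x - b1[i]) * (A2[j, :]' * y - b2[j]) >= 0, linearized via W
    JuMP.@constraint(model,
        A1[i, :]' * W * A2[j, :] - b2[j] * (A1[i, :]' * x) -
        b1[i] * (A2[j, :]' * y) + b1[i] * b2[j] >= 0
    )
end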

julia> lpr
lpr_lMD
├ solver: Gurobi
├ objective_sense: MAX_SENSE
│ └ objective_function_type: JuMP.AffExpr
├ num_variables: 17169
├ num_constraints: 1492050
│ └ JuMP.AffExpr in MOI.GreaterThan{Float64}: 1492050
└ Names registered in the model
  └ :cross_multrix, :g_x_DJ, :g_x_Dv, :lpr_DI, :lpr_DJ, :lpr_Dv, :lpr_g

julia> solve_to_normality(lpr);
Gurobi Optimizer version 12.0.1 build v12.0.1rc0 (win64 - Windows 11.0 (26100.2))

CPU model: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz, instruction set [SSE2|AVX|AVX2|AVX512]        
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads

Optimize a model with 1492050 rows, 17169 columns and 13609742 nonzeros
Model fingerprint: 0x7d117878
Coefficient statistics:
  Matrix range     [3e-01, 1e+02]
  Objective range  [5e-02, 9e-01]
  Bounds range     [0e+00, 0e+00]
  RHS range        [3e-01, 5e+03]
Presolve removed 17338 rows and 0 columns
Presolve time: 11.55s
Presolved: 17169 rows, 1474881 columns, 13592573 nonzeros

Concurrent LP optimizer: primal simplex and barrier
Showing barrier log only...

Ordering time: 4.36s

Barrier statistics:
 AA' NZ     : 6.388e+07
 Factor NZ  : 1.474e+08 (roughly 1.8 GB of memory)
 Factor Ops : 1.687e+12 (roughly 10 seconds per iteration)
 Threads    : 3

                  Objective                Residual
Iter       Primal          Dual         Primal    Dual     Compl     Time
   0   5.71120126e+05  0.00000000e+00  6.05e+02 0.00e+00  3.95e+00    28s
   1   4.37893245e+05 -3.45183484e+02  4.47e+02 1.30e+01  2.71e+00    28s
   2   2.67897039e+05 -4.37239984e+02  2.42e+02 1.25e+01  1.67e+00    29s
   3   1.46906844e+05 -3.67646021e+02  1.24e+02 9.69e+00  9.33e-01    29s
   4   8.86396264e+04 -1.69332027e+02  7.03e+01 6.12e+00  5.43e-01    30s
   5   4.22819423e+04  8.78804336e+01  3.07e+01 2.29e+00  2.40e-01    40s
   6   2.02877786e+04  1.68938661e+02  1.15e+01 1.08e+00  9.31e-02    50s
   7   1.22500334e+04  2.35499963e+02  6.03e+00 3.12e-01  4.86e-02    60s
   8   5.81647899e+03  3.56104339e+02  2.32e+00 4.26e-13  2.00e-02    71s
   9   2.64678975e+03  7.30204234e+02  6.48e-01 3.13e-13  6.08e-03    81s
  10   2.34347636e+03  8.45237120e+02  4.75e-01 3.13e-13  4.55e-03    92s
  11   2.57521485e+03  1.28886597e+03  3.42e-01 3.98e-13  3.51e-03   112s
  12   2.83304679e+03  1.76408230e+03  2.55e-01 3.13e-13  2.77e-03   140s
  13   2.81240430e+03  2.07655283e+03  2.22e-01 3.98e-13  2.33e-03   175s
  14   2.88608928e+03  2.21103174e+03  2.10e-01 3.41e-13  2.21e-03   206s
  15   2.96672730e+03  2.48805003e+03  1.82e-01 3.13e-13  1.88e-03   233s
  16   3.04588895e+03  2.78311597e+03  1.56e-01 3.13e-13  1.55e-03   261s
  17   3.15331798e+03  3.06080233e+03  1.31e-01 2.77e-13  1.23e-03   290s
  18   3.18703776e+03  3.09122050e+03  1.23e-01 3.20e-13  1.17e-03   316s
  19   3.25205976e+03  3.27818355e+03  1.10e-01 3.06e-13  9.81e-04   337s
  20   3.34053747e+03  3.35671306e+03  9.11e-02 2.77e-13  8.24e-04   358s
  21   3.39744592e+03  3.49154561e+03  7.98e-02 3.77e-13  6.76e-04   381s
  22   3.48352309e+03  3.59061737e+03  6.00e-02 3.06e-13  4.86e-04   408s
  23   3.53725932e+03  3.66655784e+03  4.87e-02 3.48e-13  3.68e-04   438s
  24   3.57674623e+03  3.67790503e+03  4.11e-02 3.84e-13  3.16e-04   470s
  25   3.61864454e+03  3.71529940e+03  3.29e-02 3.93e-13  2.43e-04   506s
  26   3.64873013e+03  3.73499349e+03  2.71e-02 3.29e-13  1.96e-04   539s
  27   3.68860535e+03  3.75640763e+03  1.98e-02 3.28e-13  1.39e-04   578s
  28   3.74475396e+03  3.77121455e+03  1.01e-02 3.67e-13  7.64e-05   623s
  29   3.76230460e+03  3.78202237e+03  7.06e-03 3.47e-13  5.27e-05   661s
  30   3.79956188e+03  3.79464092e+03  1.00e-03 4.19e-13  1.27e-05   689s
  31   3.80517002e+03  3.80253731e+03  1.20e-05 3.71e-13  1.88e-06   718s
  32   3.80496948e+03  3.80487918e+03  5.52e-08 4.14e-13  6.11e-08   746s
  33   3.80494594e+03  3.80494560e+03  9.26e-10 4.74e-13  2.33e-10   774s
  34   3.80494581e+03  3.80494581e+03  3.71e-09 4.12e-13  2.33e-13   808s

Barrier solved model in 34 iterations and 807.94 seconds (421.92 work units)
Optimal objective 3.80494581e+03

Crossover log...

    7121 DPushes remaining with DInf 1.9624121e-05               809s
    3512 DPushes remaining with DInf 0.0000000e+00               810s
     505 DPushes remaining with DInf 0.0000000e+00               816s
     285 DPushes remaining with DInf 0.0000000e+00               821s
       0 DPushes remaining with DInf 0.0000000e+00               825s

  708357 PPushes remaining with PInf 1.0329409e-03               826s
  637301 PPushes remaining with PInf 0.0000000e+00               830s
  558277 PPushes remaining with PInf 0.0000000e+00               836s
  474805 PPushes remaining with PInf 0.0000000e+00               840s
  322196 PPushes remaining with PInf 0.0000000e+00               845s
  185897 PPushes remaining with PInf 0.0000000e+00               850s
   44324 PPushes remaining with PInf 0.0000000e+00               855s
    1108 PPushes remaining with PInf 0.0000000e+00               862s
       0 PPushes remaining with PInf 0.0000000e+00               862s

  Push phase complete: Pinf 0.0000000e+00, Dinf 1.0490706e-10    862s


Solved with barrier
Iteration    Objective       Primal Inf.    Dual Inf.      Time
  715481    3.8049458e+03   0.000000e+00   0.000000e+00    865s

Solved in 715481 iterations and 864.92 seconds (470.05 work units)
Optimal objective  3.804945806e+03

User-callback calls 718074, time in user-callback 0.45 sec

julia> JuMP.solution_summary(lpr)
solution_summary(; result = 1, verbose = false)
├ solver_name          : Gurobi
├ Termination
│ ├ termination_status : OPTIMAL
│ ├ result_count       : 1
│ ├ raw_status         : Model was solved to optimality (subject to tolerances), and an optimal solution is available.
│ └ objective_bound    : 3.80495e+03
├ Solution (result = 1)
│ ├ primal_status        : FEASIBLE_POINT
│ ├ dual_status          : FEASIBLE_POINT
│ ├ objective_value      : 3.80495e+03
│ └ dual_objective_value : 3.80495e+03
└ Work counters
  ├ solve_time (sec)   : 8.64936e+02
  ├ simplex_iterations : 715481
  ├ barrier_iterations : 34
  └ node_count         : 0

And a question for me now is:
is there a way to add the 1 million constraints in a lazy fashion? (Note that this is an LP.)

There is a way through callbacks: Solver-independent Callbacks · JuMP

Yes, I know there are callbacks for MIPs. But I'm not sure whether there is a callback method for a pure LP.
It appears that Gurobi doesn't support callbacks for a pure LP (does it?)

Oh, good point. I don’t know actually! I had always assumed it did, but never tried it.

I found Lazy and LazyConstraints in Gurobi's docs.
The former can be used outside of a callback, while the latter seems to be the opposite. The disquieting fact is that both carry the note "Only affects MIP models", whereas mine is a pure LP.

For my specific application here, I guess I need the Lazy attribute. Maybe I should set all 1 million constraints to Lazy with a value of 1.

Since mine is an LP, maybe I need to add a dummy_decision variable to promote it to a MIP, like this:

import JuMP, Gurobi; GRB_ENV = Gurobi.Env();
let
    model = JuMP.direct_model(Gurobi.Optimizer(GRB_ENV))
    JuMP.@variable(model, x)
    dummy_decision = JuMP.@variable(model, binary = true, upper_bound = 0)
    JuMP.@constraint(model, c[i = 1:987], x >= log(i));
    JuMP.MOI.set.(JuMP.backend(model), Gurobi.ConstraintAttribute("Lazy"), JuMP.index.(c), 1);
    JuMP.@objective(model, Min, x)
    JuMP.set_attribute(model, "Presolve", 0) # To make the evidence 🍅 visible
    JuMP.optimize!(model) # see the following Logging
end
Set parameter Presolve to value 0
Gurobi Optimizer version 12.0.1 build v12.0.1rc0 (win64 - Windows 11.0 (26100.2))

CPU model: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz, instruction set [SSE2|AVX|AVX2|AVX512]
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads

Non-default parameters:
Presolve  0

Optimize a model with 987 rows, 2 columns and 987 nonzeros
Model fingerprint: 0xaaaf0af2
Variable types: 1 continuous, 1 integer (1 binary)
Coefficient statistics:
  Matrix range     [1e+00, 1e+00]
  Objective range  [1e+00, 1e+00]
  Bounds range     [0e+00, 0e+00]
  RHS range        [7e-01, 7e+00]
Found heuristic solution: objective 6.8946700
Extracted 987 lazy constraints # 🍅
Variable types: 1 continuous, 1 integer (1 binary)

Explored 0 nodes (0 simplex iterations) in 0.00 seconds (0.00 work units)
Thread count was 8 (of 8 available processors)

Solution count 1: 6.89467 

Optimal solution found (tolerance 1.00e-04)
Best objective 6.894670039415e+00, best bound 6.894670039415e+00, gap 0.0000%

User-callback calls 34, time in user-callback 0.00 sec

It's still strange. Why doesn't Gurobi support lazy cuts in a pure LP?

This example is adapted from Gurobi.jl; do you think the example there is proper, @odow? The model there is an LP, which is supposedly immune to the Lazy attribute setting.

Gurobi does not support lazy constraints in an LP.

You can use an iterative cutting plane method, where you iteratively call optimize! and then @constraint. Gurobi will automatically warm-start the problem from the previous solution.
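For a finite explicit pool of cuts, the loop might look like this rough sketch (the function name solve_with_row_generation! and the pool representation rows/rhs are placeholders, not from your model):

using JuMP, Gurobi
# Keep the million cuts outside the model; add only the violated ones.
# Cut i is `rows[i]' * x >= rhs[i]`, with `rows` a vector of coefficient vectors.
function solve_with_row_generation!(model, x, rows, rhs; atol = 1e-6)
    added = falses(length(rhs))
    while true
        optimize!(model)      # Gurobi warm-starts from the previous solution
        xv = value.(x)
        done = true
        for i in eachindex(rhs)
            if !added[i] && rows[i]' * xv < rhs[i] - atol
                @constraint(model, rows[i]' * x >= rhs[i])
                added[i] = true
                done = false
            end
        end
        done && return
    end
end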

This example is adapted from Gurobi.jl; do you think the example there is proper?

It just demonstrates how to set the variable, constraint, and model attributes. It isn’t intended to be a proper example.

Here is one more question. I have two vectors, r and c, and one non-square matrix X:
r is a 101-element SparseArrays.SparseVector{Float64, Int64} with 101 stored entries
c is a 170-element SparseArrays.SparseVector{Float64, Int64} with 3 stored entries
X is a 101×170 Matrix{JuMP.AffExpr}
I now need to write the constraint @constraint(model, r' * X * c >= 0).
Which style should I choose to maximize performance (assume I need to add 1 million constraints of this type, and that 101 and 170 both grow)?

  1. @constraint(model, transpose(r) * X * c >= 0)
  2. @constraint(model, transpose(r) * (X * c) >= 0)
  3. @constraint(model, dot(r, X, c) >= 0)
  4. @constraint(model, tr(c * transpose(r) * X) >= 0)
  5. @constraint(model, sm(r * transpose(c), X) >= 0), where sm is here.
  6. should I use transpose() or ' (adjoint)?

And this list could be enlarged further, because there are many equivalent forms; a small benchmark harness is sketched below.
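One way to decide empirically is to time the candidates directly. A sketch with random data of the stated sparsity (BenchmarkTools is an assumption here, and only forms 1, 2, and the adjoint variant of 6 are timed):

using JuMP, SparseArrays, BenchmarkTools
model = Model()
@variable(model, x[1:101, 1:170])
X = 1.0 .* x                      # Matrix{JuMP.AffExpr}, as in the question
r = sprand(101, 1.0)              # ~101 stored entries, like my r
c = sprand(170, 3 / 170)          # ~3 stored entries, like my c
@btime transpose($r) * $X * $c;   # form 1
@btime transpose($r) * ($X * $c); # form 2
@btime $r' * $X * $c;             # form 6: adjoint (== transpose for real data)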

It takes about 4 seconds on my laptop to add 7k constraints with JuMP.
Is this slow?

for r in eachrow(A1)
    ite += 1
    println("ite = $ite <= 201, time_elapsed = $(time() - t0)")
    JuMP.@constraint(lpr, [c in eachrow(A2)], transpose(r) * cross_multrix * c >= 0)
end
ite = 1 <= 201, time_elapsed = 0.01900005340576172
ite = 2 <= 201, time_elapsed = 14.52999997138977 # the first one is a bit denser, taking 14 seconds
ite = 3 <= 201, time_elapsed = 18.377000093460083
ite = 4 <= 201, time_elapsed = 22.23099994659424
ite = 5 <= 201, time_elapsed = 26.15400004386902

To reproduce this procedure, please see the code in this post.

Have you benchmarked the different forms? I would assume this depends on the sparsity of r and c.

I would do:

using JuMP, SparseArrays
model = Model()
@variable(model, x[1:101, 1:170])
X = 1.0 .* x
r = SparseArrays.sprand(101, 0.1)
c = SparseArrays.sprand(170, 0.1)
# @constraint(model, r' * X * c >= 0)
(r_I, r_V), (c_I, c_V)  = SparseArrays.findnz(r), SparseArrays.findnz(c)
@constraint(
    model, 
    sum(
        r_v * X[r_i, c_i] * c_v for
        (c_i, c_v) in zip(c_I, c_V), (r_i, r_v) in zip(r_I, r_V)
    ) >= 0
)

Yes, this use of zip is very apt. We have to write this procedure manually because the generic LinearAlgebra fallbacks don't exploit the sparsity of r and c here. With your style, I added 1434738 constraints within 12.5 seconds.

ite = 0
t0 = time()
for i in eachindex(eachrow(A1))
    ite += 1
    println("ite = $ite <= 201, time_elapsed = $(time() - t0)")
    for j in eachindex(eachrow(A2))
        r, c = A1[i, :], A2[j, :]
        JuMP.@constraint(lpr,
            sum(rv * cv * cross_multrix[ri, ci] for
                (ri, rv) in zip(r.nzind, r.nzval),
                (ci, cv) in zip(c.nzind, c.nzval)
            ) >= 0
        )
    end
end # ite = 201 <= 201, time_elapsed = 12.47000002861023

I made a slight modification to make it look better. :slightly_smiling_face:

I'm afraid there will be a nontrivial time overhead in the back-and-forth between Gurobi and JuMP if we alternate optimize! and @constraint. The odds are that it could be inferior to delivering all constraints to Gurobi at once (in my application, the set of constraints is explicit and finite). (My guess.)

You would typically add only the constraints that are violated in each iteration. Assuming only a small fraction of the constraints are ever relevant, the iterative approach is faster.

But yes, the true performance is problem dependent. You’ll have to experiment with your particular problem.

I think Lazy is somewhat pointless compared with LazyConstraints.
The latter is useful when we have infinitely many (implicit) cuts, e.g. emulating an SDP constraint (see the while-loop in this example).
If we have finitely many explicit cuts (e.g., in the context of this topic), then there are 2 possibilities:

  1. the number of cuts is small, e.g. 10 cuts
  2. the number of cuts is huge, e.g. 1 million

It is manifestly pointless to use Lazy for merely 10 cuts.
Now suppose we submit 1 million explicit cuts to Gurobi with the Lazy attribute set. What does that achieve?
Shouldn't Gurobi itself decide which constraints to deem "lazy" in order to accelerate the solution procedure? After all, it is a black-box solver. It's weird.

This doesn't make sense either. Suppose I have 1 million explicit lazy constraints. Then the solution procedure would have to look like this:

optimize!(model)                     # solve with the current cut set
x_val = value.(x)
for i in 1:1_000_000
    if is_violated(constr[i], x_val) # placeholder predicate: check cut i
        re_add(constr[i])            # placeholder: put cut i back into the model
    end
end
# repeat the above procedure until no cut is violated

You see that in every "callback" round, we have to loop over the entire huge collection of lazy constraints.
That is not necessarily faster than the purely static method (without any "callback" procedure like the code block above).

In most cases you can construct a clever oracle that can generate the set of violated cuts more efficiently than enumerating them.

Something like this:

# min y st: y >= f(x), where f(x) := (x - 1)^2
using JuMP, Gurobi
model = Model(Gurobi.Optimizer)
@variable(model, x)
@variable(model, y >= 0)
@objective(model, Min, y)
f(x) = (x - 1)^2
∇f(x) = 2 * (x - 1)
while true
    optimize!(model)
    assert_is_solved_and_feasible(model)
    xv, yv = value(x), value(y)
    if yv >= f(xv) - 1e-6
        break
    end
    @constraint(model, y >= f(xv) + ∇f(xv) * (x - xv))
end