Gurobi lazy constraint issue

Hi,

I’m implementing a MIP cut generation algorithm on JuMP+Gurobi using MOI.LazyConstraintCallback.

Theoretically, the algorithm is correct. However, running tests on a large batch of instances and cross-checking with another pure MIP, 1 out of the 500 instances yields a slightly wrong answer.

I’ve tested by hand and my check does detect the optimal integer point as infeasible, the constraint also cuts it off. But upon closer inspection, it appears that Gurobi accepts it as incumbent without giving me the opportunity to reject it. The call-back function is in fact not called at all.

A particularity(?) of this issue is that this incumbent is within 0.01% of the LB.

I’ve tried to produce a MWE to no avail, but I’d gladly share the source code in private if necessary.

A portion of the log :

    Nodes    |    Current Node    |     Objective Bounds      |     Work
 Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time

[...]

10624   940     cutoff   40       270.35700  270.32800  0.01%  17.7   20s
 17174   989  270.32800   55    1  270.35700  270.32800  0.01%  15.5   25s
 23761  1275     cutoff   41       270.35700  270.32800  0.01%  14.6   30s
 30166  1346  270.32800   44    2  270.35700  270.32800  0.01%  14.0   35s
 36625  1444  270.32800   31    7  270.35700  270.32800  0.01%  13.7   40s
 43289  1411  270.32800   49    3  270.35700  270.32800  0.01%  13.4   45s
 50075  1464  270.32800   36    2  270.35700  270.32800  0.01%  13.2   50s
*51325     2              38     270.3280000  270.32800  0.00%  13.1   50s

The optimal objective value should be 270.357.

Based on this @miles.lubin reply

Gurobi could ignore the lazy constraint if it’s satisfied within Gurobi’s tolerances.

from this thread, the program would’ve ended earlier if it were a tolerance issue.

Do you have any insights?

Many thanks!

Hi @marouane-f, welcome to the forum.

It’s hard to say anything without a reproducible example. This might be a bug in Gurobi, or it might be a bug in your code.

Have you verified that the final solution does indeed violate the lazy constraint that you would have added by more than the feasibility tolerance?

You should contact Gurobi support: https://support.gurobi.com/hc/en-us. They will be able to help you, and they should be interested in this case since, assuming things are okay in your code, it does look like a bug where the lazy constraint is not called.

cc @torressa may be able to help

2 Likes

Hi @odow, thanks!

The final solution does violate my lazy constraint by more than the feasibility tolerance (it’s a knapsack constraint with integer coefficients/RHS).

The issue is that the solution of objective value 270.328 is accepted as incumbent and the program ends without ever entering the check in the callback.

I will reach out to Gurobi support and keep this thread updated.

Thank you for your reply,

1 Like

Update #1

Disabling solver cuts solves the issue at the expense of runtime (and using all threads as opposed to only 1 before)

After ~100s, the LB is correctly updated to 270.357.

    Nodes    |    Current Node    |     Objective Bounds      |     Work
 Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time

[...]

 555629   155  270.32800   63    1  270.35700  270.32800  0.01%  10.3  110s

Optimal solution found (tolerance 1.00e-04)
Best objective 2.703570000000e+02, best bound 2.703570000000e+02, gap 0.0000%




Update #2

It has nothing to do with the cuts, but with the number of threads. Running (n>1)-threaded, with or without solver cuts yields a correct solution, but (weirdly) at ~10x more runtime.

1 Like

Are you setting the LazyConstraints parameter to 1?
This is shown in the Gurobi.jl Callbacks section

The solver independent LazyConstraintCallback does this automatically.

1 Like

The solver independent LazyConstraintCallback does this automatically.

Thanks! I didn’t know this.

@marouane-f Can you share a snippet of the callback you are using?
Are you adding the Lazy cuts when the node status is CALLBACK_NODE_STATUS_INTEGER?
Also, you may want to try setting the PreCrush parameter to 1.

The lazy constraints get called at both mipnode and mipsol.

Yes, I think that is fine. But I guess it is up to the users whether they add lazy cuts when CALLBACK_NODE_STATUS_INTEGER (MIPSOL) or elsewhere.

1 Like

Hi @torressa, thank you for your reply.

As @odow pointed out, both PreCrush and LazyConstraints are set to 1.

Yes, I’m only adding the lazy cuts at integer nodes. The issue persists using fractional nodes as well.

Below is a concise snippet,

function lazy_constraint_callback(cb_data)
    status = callback_node_status(cb_data, model)
     
    # [...] some preproc

    if status == MOI.CALLBACK_NODE_STATUS_INTEGER
        vars = callback_value.(cb_data, ...)
        my_data = ... # proc some data
        if some_condition_on(vars)
            for _ in my_data
                constr = @build_constraint(...) # clique inequality`

                MOI.submit(model, MOI.LazyConstraint(cb_data), constr)
            end
        end
        
    end 
end

The program does evaluate multiple (infeasible) nodes whose objective value is 270.328 and adds the necessary cut, except for the last one.

__

I was wrong in saying that

… the call-back function is in fact not called at all .

I did some more debugging and it turns out that some_condition_on() is not verified at the node at which the program ends (it should hold). I retrieved it using if obj_value == 270.328 && !some_condition_on(var_at_last_cb)

What I cannot make sense of now is that the variable values
var_at_last_cb = callback_value.(cb_data, some_var) at the supposedly optimal node differ from the values of
var_at_opt = value.(some_var) provided after reaching optimality and exiting the program.

And obviously, some_condition_on(var_at_opt) holds while some_condition_on(var_at_last_cb) does not.

Am I missing something here?

I will continue doing more checks and keep this updated. Thanks a lot!

I have seen this before due to many non-violated lazy cuts being added in every node.
It suffices to add a single lazy cut that cuts off the current solution. You can also add them for non-integer solutions.

Are you checking that the solution is violated by any of the lazy cuts you are adding?

To help debug this, for a small failing instance, could you write the model to a file (I think you can attach it here) and create a list of lazy constraints added in an txt file that we can use? e.g.

x1 + ... < RHS
x2 + ... < RHS2

That way we don’t need to use your callback function but just read the cuts from the txt file and then use a simpler callback to add one of these in turn once when they are violated.

Whilst doing that, I think I figured it out. It’s a numerical precision issue.

To make it simple, some_condition_on(vars) boils down to a geq expression g(x) >= N for some g and N.

Running it single-threaded, it happens that at that node deep in the tree, and for N=10, the retrieved value in the callback is (literally) g(x) = 9.999999999999999.

That’s why the condition does not hold and the program exits. This is not encountered for any (n>1)-threaded runs as mentioned previously.

My model is a MIP with cont. and 0-1 vars so I believe numerical issues are bound to happen… but is this normal behavior with regards to the number of threads?

Thank you,

1 Like

To complete my answer, I only add violated lazy cuts constraints. But I would like to know what makes adding many non-violated lazy cuts problematic?

From my understanding, Gurobi stores constraints in a pool and adds them when they become active.

Is it a memory management issue since it has to check them at every possible incumbent?

What kind of consequences does this have?

Many thanks again!

Running it single-threaded, it happens that at that node deep in the tree, and for N=10, the retrieved value in the callback is (literally) g(x)=9.999999999999999.

This is within the default feasibility and integer tolerance (FeasibilityTol and IntFeasTol resp). You can add a tolerance to your checks, or tighten them.
See also Why does Gurobi sometimes return non-integral values for integer variables?

My model is a MIP with cont. and 0-1 vars so I believe numerical issues are bound to happen… but is this normal behavior with regards to the number of threads?

The solution path will change with a different number of threads (starting from presolve), which in turn may lead to seeing different solutions in the callback (ultimately, the optimal solution is the same).

To complete my answer, I only add violated lazy cuts constraints. But I would like to know what makes adding many non-violated lazy cuts problematic?

As long you are always adding at least one constraint that is violated it is fine. As you say they are stored in a pool and added if they are violated (and not already present or dominated by one of our standard cuts). Other than saving time in the callback, it’s just a good recommendation for debugging as it can be hard to see what’s going on otherwise.

1 Like

I see!

I am still trying to understand why the numerical precision issue only appears on single-threaded runs.

I also checked over a few runs the variable values retrieved in the callbacks, and they are consistently strictly integer except for the 1-threaded one.

I went through this Gurobi webinar on Parallelism in LP/MIP but I fail to infer an explanation. Do you have an idea?

Thank you for the insights.

I am still trying to understand why the numerical precision issue only appears on single-threaded runs.

You should never make any assumptions about when the solver might return 9.99999999 instead of 10. It may happen at any time, and it may depend on the random seed, order of variables and constraints, etc.

When checking feasibility, you must always use a tolerance. See, for example, the 1e-6 in the callback documentation: Solver-independent Callbacks · JuMP

1 Like