Why doesn't the solver reach the optimal solution?

HenriDeh · June 29, 2026, 4:46pm

Hello, I am working on improving a formulation of an optimization problem with JuMP and have encountered an issue that has been blocking me for several days. After extensive debugging and manual verification, I am reasonably confident that the formulation itself is correct. I now suspect the issue may be numerical, but I am unsure.

Below is the REPL output from three runs:

“Without DD”: the base formulation, which finds the optimal solution.
“With DD”: the formulation augmented with additional constraints intended to reduce the search space.
“With DD (optimal w start values)”: same as above, but initialized with optimal values for a subset of variables w, obtained from the “without DD” run.

In the first “with DD” run, the solver finds a worse solution (this is a maximization problem), which could suggest that the added constraints are invalid. However, when providing the optimal w values as a warm start, the solver reaches a better solution. This seems to rule out a feasibility issue.

My current suspicion is that the issue is related to the lazy constraints callback. The added constraints are indicator constraints modeled using a big-M formulation:

z \leq c^\top y^{(j)} + (1-\lambda_j) M, \forall j
d(w)^\top y \geq d(w)^\top y^{(j)} + \epsilon - \lambda_j M, \forall j

Here, d(w)^\top y is linearized internally, and z is the epigraph variable for the objective. During branch-and-bound, the callback identifies solutions y^{(j)} that fail a robustness check at integer nodes and adds these constraints in a Kelley cutting-plane style.

The binary variables $\lambda$ act as indicators: they activate one constraint or the other depending on whether a strict inequality condition is satisfied.

Given the use of both a large M and a small \epsilon (10^{-4}), I suspect that numerical issues related to this formulation may be causing the behavior I observe, but Gurobi’s output is not flagging any.

Any suggestions on how to diagnose or mitigate this would be greatly appreciated. I am running out of ideas.

=== Without DD ===
Set parameter OutputFlag to value 1
Predicted feature coefficients: [0.33, 0.66, 0.05, -0.04, 0.64]
Set parameter OutputFlag to value 1
Set parameter LazyConstraints to value 1
Gurobi Optimizer version 13.0.2 build v13.0.2rc1 (win64 - Windows 11+.0 (26200.2))

CPU model: Intel(R) Core(TM) Ultra 7 155U, instruction set [SSE2|AVX|AVX2]
Thread count: 12 physical cores, 14 logical processors, using up to 14 threads

Non-default parameters:
TimeLimit  30
LazyConstraints  1

Optimize a model with 195 rows, 395 columns and 930 nonzeros (Max)
Model fingerprint: 0x2f84e83d
Model has 35 linear objective coefficients
Variable types: 40 continuous, 355 integer (355 binary)
Coefficient statistics:
 Matrix range     [5e-03, 6e+00]
 Objective range  [2e-02, 3e-01]
 Bounds range     [8e-01, 3e+00]
 RHS range        [3e+00, 7e+00]

Presolve time: 0.00s
Presolved: 195 rows, 395 columns, 930 nonzeros
Variable types: 40 continuous, 355 integer (355 binary)
Found heuristic solution: objective 2.4156729
Found heuristic solution: objective 3.2507686

Root relaxation: objective 3.883178e+00, 72 iterations, 0.00 seconds (0.00 work units)

   Nodes    |    Current Node    |     Objective Bounds      |     Work
Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time

    0     0    3.88318    0    3    3.25077    3.88318  19.5%     -    0s
H    0     0                       3.5070046    3.88318  10.7%     -    0s
H    0     0                       3.5862280    3.88318  8.28%     -    0s
    0     0    3.77162    0    3    3.58623    3.77162  5.17%     -    0s
    0     0    3.62513    0    5    3.58623    3.62513  1.08%     -    0s
    0     0    3.60859    0    3    3.58623    3.60859  0.62%     -    0s
H    0     0                       3.6085581    3.60859  0.00%     -    0s

Cutting planes:
 Cover: 3

Explored 1 nodes (105 simplex iterations) in 0.05 seconds (0.01 work units)
Thread count was 14 (of 14 available processors)

Solution count 5: 3.60856 3.58623 3.507 ... 2.41567

Optimal solution found (tolerance 1.00e-04)
Best objective 3.608558110543e+00, best bound 3.608592531863e+00, gap 0.0010%

User-callback calls 169, time in user-callback 0.04 sec
Finished in 0.088021 seconds

=== With DD, no start_value ===
Set parameter OutputFlag to value 1
Set parameter OutputFlag to value 1
Set parameter LazyConstraints to value 1
Gurobi Optimizer version 13.0.2 build v13.0.2rc1 (win64 - Windows 11+.0 (26200.2))

CPU model: Intel(R) Core(TM) Ultra 7 155U, instruction set [SSE2|AVX|AVX2]
Thread count: 12 physical cores, 14 logical processors, using up to 14 threads

Non-default parameters:
TimeLimit  30
LazyConstraints  1

Optimize a model with 416 rows, 225 columns and 1847 nonzeros (Max)
Model fingerprint: 0x13de0b8e
Model has 35 linear objective coefficients
Variable types: 165 continuous, 60 integer (60 binary)
Coefficient statistics:
 Matrix range     [5e-03, 6e+00]
 Objective range  [2e-02, 3e-01]
 Bounds range     [8e-01, 3e+00]
 RHS range        [3e+00, 7e+00]

Presolve removed 36 rows and 5 columns
Presolve time: 0.00s
Presolved: 380 rows, 220 columns, 1758 nonzeros
Variable types: 160 continuous, 60 integer (60 binary)

Root relaxation: objective 3.011084e+00, 236 iterations, 0.00 seconds (0.00 work units)

   Nodes    |    Current Node    |     Objective Bounds      |     Work
Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time

    0     0    3.01108    0   10          -    3.01108      -     -    0s
    0     0    3.00233    0   15          -    3.00233      -     -    0s
H    0     0                       2.7511007    3.00099  9.08%     -    0s
    0     0    3.00099    0   16    2.75110    3.00099  9.08%     -    0s
    0     0    3.00099    0   15    2.75110    3.00099  9.08%     -    0s
    0     0    3.00043    0   16    2.75110    3.00043  9.06%     -    0s
    0     0    2.99906    0   18    2.75110    2.99906  9.01%     -    0s
    0     0    2.99865    0   20    2.75110    2.99865  9.00%     -    0s
    0     0    2.99858    0   17    2.75110    2.99858  9.00%     -    0s
    0     0    2.99831    0   20    2.75110    2.99831  8.99%     -    0s
    0     0    2.99798    0   20    2.75110    2.99798  8.97%     -    0s
    0     0    2.99796    0   20    2.75110    2.99796  8.97%     -    0s
    0     2    2.99796    0   20    2.75110    2.99796  8.97%     -    0s
H   29    35                       2.8368077    2.92023  2.94%  20.7    0s
*   72     6              12       2.8764041    2.88488  0.29%  12.8    0s

Cutting planes:
 Gomory: 3
 Cover: 10
 MIR: 5
 RLT: 1

Explored 113 nodes (1427 simplex iterations) in 0.13 seconds (0.07 work units)
Thread count was 14 (of 14 available processors)

Solution count 3: 2.8764 2.83681 2.7511 

Optimal solution found (tolerance 1.00e-04)
Best objective 2.876404113481e+00, best bound 2.876404113481e+00, gap 0.0000%

User-callback calls 512, time in user-callback 0.03 sec
Finished in 0.1344096 seconds

=== With DD, optimal w start_value ===
Set parameter OutputFlag to value 1
Set parameter OutputFlag to value 1
Set parameter LazyConstraints to value 1
Gurobi Optimizer version 13.0.2 build v13.0.2rc1 (win64 - Windows 11+.0 (26200.2))

CPU model: Intel(R) Core(TM) Ultra 7 155U, instruction set [SSE2|AVX|AVX2]
Thread count: 12 physical cores, 14 logical processors, using up to 14 threads

Non-default parameters:
TimeLimit  30
LazyConstraints  1

Optimize a model with 416 rows, 225 columns and 1847 nonzeros (Max)
Model fingerprint: 0x41b88368
Model has 35 linear objective coefficients
Variable types: 165 continuous, 60 integer (60 binary)
Coefficient statistics:
 Matrix range     [5e-03, 6e+00]
 Objective range  [2e-02, 3e-01]
 Bounds range     [8e-01, 3e+00]
 RHS range        [3e+00, 7e+00]

User MIP start produced solution with objective 3.69862 (0.35s)
Loaded user MIP start with objective 3.69862

Presolve removed 36 rows and 5 columns
Presolve time: 0.00s
Presolved: 380 rows, 220 columns, 1758 nonzeros
Variable types: 160 continuous, 60 integer (60 binary)

Root relaxation: objective 3.883014e+00, 229 iterations, 0.00 seconds (0.00 work units)

   Nodes    |    Current Node    |     Objective Bounds      |     Work
Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time

    0     0    3.88301    0    3    3.69862    3.88301  4.99%     -    0s
    0     0    3.73417    0    3    3.69862    3.73417  0.96%     -    0s
    0     0    3.71520    0    2    3.69862    3.71520  0.45%     -    0s

Cutting planes:
 Cover: 2
 RLT: 1

Explored 1 nodes (240 simplex iterations) in 0.36 seconds (0.01 work units)
Thread count was 14 (of 14 available processors)

Solution count 1: 3.69862 

Optimal solution found (tolerance 1.00e-04)
Best objective 3.698624979138e+00, best bound 3.698659508269e+00, gap 0.0009%

User-callback calls 212, time in user-callback 0.35 sec
Finished in 0.7573941 seconds

langestefan · June 29, 2026, 5:03pm

I don’t know why, but you’re definitely solving different models in each case.

HenriDeh · June 29, 2026, 5:40pm

The first fingerprint should be different from the second and third ones, right ?

But yes, the only difference between 2 and 3 is whether this line is commented out :
JuMP.set_start_value.(master_model[:w], collect(Main.w_opt))

I also just noticed that the objective value of run 3 is in fact higher than the base model.

I’ll add that the algorithm used to converge to the same solution with or without the DD cuts when it was implemented as an iterative algorithm, where the lazy constraints were added after solving the master problem using the indicator syntax $\lambda --> {const}, as opposed to the current callback implementation.

HenriDeh · June 29, 2026, 7:17pm

I’m getting truly mad here. Here are some additional tests I made

Solving the linear relaxation of the two models without the callback (that’s the only change: no callback is added to the model and integrality is relaxed) gives the same objective value for both. The same as the no-DD root relaxation (3.88…).

So I figured that perhaps the presence of the callback changes something at the root node presolve. Yet, when the callback is present and the Presolve and Cuts attributes of Gurobi are set to 0, the root relaxations no longer match: 3.88 for the no-DD model, 3.01 for the DD model.

This doesn’t make any sense to me. The presence of a callback shouldn’t change the root relaxation value. Right ? Unless the callback is called at the root node? The DD constraints do make the root relaxation integer.

sandyspiers · June 30, 2026, 12:20am

The following advice is only from my experience (I think I remember reading in the docs, but I havent gone back to find)

With LazyConstraints=1 (automatically set when you add a callback, you can see it in the log), Gurobi will avoid making some precrushes at the root in case you need those variables later in your callback. Also, the callback can be called at or before the root if the heuristic finds a solution. So yes, a callback can change the reported root relaxation value if the problem is a MIP.

Have your tried reducing the feasibility tolerance? This would help in determining if the warm start is actually feasible with the extra constraints (and hence if those constraints are valid).

HenriDeh · June 30, 2026, 6:01pm

Hi, thank you for replying.

While initially replying to you I tried to increase the big M value and it turned out it was too tight.
I thought I had it set to a tight but valid value, I was wrong it seems.

odow · June 30, 2026, 10:41pm

Hi all, you seem to have solved this without me

Yes, lazy constraints + big-M are easy to get wrong. There can be bugs in Gurobi, but the most likely answer is always that either there is a bug in your callback, or your model isn’t modelling what you think it’s modelling.

I’ll also drop a link to this part of the docs: Tolerances and numerical issues · JuMP

HenriDeh · July 1, 2026, 3:14am

Yes, it was sneaky to detect because injecting the known optimal feasible solution as a warm start would not trigger the callback since, well, the solution was feasible. The big M being too small would thus go undetected.

Topic		Replies	Views
I added 1 Million constraints to an LP unwittingly Optimization (Mathematical) question , jump , examples , gurobi , experience	36	1082	April 13, 2025
Gurobi lazy constraint issue Optimization (Mathematical) gurobi , lazy-constraint	15	791	May 11, 2024
Benders Decomposition: Callback function to be called when Master problem reaches optimal at that iteration (in JuMP's example, it is called@ any feasible sub-optimal integer point) Optimization (Mathematical) question	10	3157	March 30, 2017
Warning, invalid warm-start basis discarded Optimization (Mathematical) question , jump	18	4045	March 19, 2020
Multiple solve, same script Optimization (Mathematical) jump , gurobi	10	600	November 7, 2023

Why doesn't the solver reach the optimal solution?

Related topics