Force model to discard basis

I am wondering if there is a JuMP/MOI (that is, solver-agnostic) way to discard the simplex basis that a model implicitly carries after an LP has been solved for the first time. My problem is that I have a large model that strongly benefits from presolve, and I have to solve it many times with small changes. The first solve (without any basis loaded) runs fast, but the subsequent solves get much slower, because once a basis is available the solver's behavior (common to many solvers) is to disable presolve.

What changes? If it is just the RHS or the objective, solvers should be able to warm-start efficiently. If you change the constraint matrix, the basis is usually invalidated anyway and the solver has to refactorize.

Not a great solution, but if the first run is fast, could you just rebuild the model from scratch every time? Otherwise, is the difference due to the underlying algorithm? You could force the solver to always use barrier, for example.

I unfix some variables each iteration. I am sure the first solve, the fast one, does not have a basis, because I see the presolve log; and I am sure the other runs reuse the basis, because they no longer show the presolve log (and no configuration was changed in the meantime), and basis and presolve are mutually exclusive in Gurobi. So I am sure this change both does not invalidate the basis and does not allow the solver to "efficiently start" (I am not sure the latter is even possible in Gurobi; at least the manual seems to indicate that any basis will immediately disable presolve).

I cannot rebuild the model because this would take more time, and I have no way to rebuild it with exactly the variables I want (except by building it with all of them and finding which ones to delete each iteration, which is a very clunky process).

In models with somewhat fewer variables, where the disabled presolve did not hurt as much, changing the algorithm to barrier makes a >60-second run hit the 1-hour time limit, because those runs benefited from the basis start. I literally have a LaTeX table with those times because I presented it to my advisor last week.

The only ideal solution would be to have basis and presolve working simultaneously, but that is a solver limitation.

I think I will have to throw out some hundreds of lines of code and restructure it so that it grows the model, instead of creating the full model and then solving a subset of the columns. This will probably create some serious indexing problems for me and cost me a week of rework.

You could try: https://www.gurobi.com/documentation/9.0/refman/c_reset.html

Yes. And I probably will, to test the impact of presolve on some solutions before talking to my advisor. However: (1) this requires me to jump through a few extra hoops, because my code tries to be solver-agnostic; (2) I was not sure this was implemented in Gurobi.jl, as the API is "implemented if needed", but if you are suggesting it, I suppose it is. Ultimately and unfortunately, however, I think I will not escape a large code rewrite if I intend to provide an honest reimplementation of the method I am comparing against.

Theoretically, passing the JuMP.backend of a JuMP.direct_model to this undocumented Gurobi.reset_model! should do the trick, I suppose? Well, it will become obvious very fast if it turns out to be a "deletes all variables and constraints" kind of reset. :sweat_smile:
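For reference, a minimal sketch of what I have in mind, assuming `Gurobi.reset_model!` accepts the low-level `Gurobi.Model` stored in the optimizer's `inner` field (both are undocumented internals and may differ between Gurobi.jl versions):

```julia
using JuMP, Gurobi

model = direct_model(Gurobi.Optimizer())
# ... build the model, then:
optimize!(model)   # first solve: presolve runs, a basis is created

# Undocumented internals: `backend(model)` is the Gurobi.Optimizer, and its
# `inner` field should be the low-level Gurobi.Model wrapping the C pointer.
# A reset should discard solution information (including the basis) while
# keeping the model itself, so the next solve would run presolve again.
Gurobi.reset_model!(backend(model).inner)
optimize!(model)
```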


Would it be possible to share some successive logs?

If I understand correctly, you’re initially building a model, solving it, then un-fixing some variables? Does the latter correspond to, e.g., changing a variable’s bounds from 0 <= x <= 0 to 0 <= x <= 1?

More than time, the number of simplex iterations would be a metric to look at.

Several things could happen:

  • Some of the presolve reductions may be invalidated by your change of bounds.
    For instance, if you had a constraint x1 + x2 == 1 plus 0 <= x1 <= 0, then x2 would get fixed to 1 during presolve. Un-fixing x1 means you have to un-fix x2 as well. Not sure how solvers handle that.
  • If all the modifications you’re doing are relaxing variable bounds / adding variables (in a column-generation fashion), make sure you use primal simplex (cf the Method parameter); Gurobi’s default is to use dual simplex.
    You can view this by looking at the log: if the primal bound goes down, you’re doing primal simplex, and if the dual bound goes up, it’s dual simplex.
  • If you are deleting variables… this may be slow in itself anyway
  • It could also be that the subsequent model is just harder than the initial one.
    One way to check would be to export the first and subsequent models (you can use Gurobi.write_model), then solve all of them from scratch.
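The two suggestions above can be sketched in JuMP; the parameter name is Gurobi-specific, and this is a quick sketch rather than tested code:

```julia
using JuMP, Gurobi

model = Model(Gurobi.Optimizer)
# Gurobi's Method parameter: -1 automatic (default), 0 primal simplex,
# 1 dual simplex, 2 barrier. The default picks dual simplex for LPs.
set_optimizer_attribute(model, "Method", 0)   # force primal simplex

# ... build and solve each iterate, then export it so it can later be
# re-solved from scratch for comparison (a solver-agnostic alternative
# to Gurobi.write_model):
write_to_file(model, "iterate_1.mps")
```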

Thank you both, odow and mtanneau, for your attention. I have written, discarded, and rewritten this post because of the maddening spiral of debugging. This post is already long, but it is considerably shorter than its previous version. I am posting it because there is a small chance it gives some insight to someone hitting the same problem.

If I understand correctly […]

You understood correctly. I use JuMP.fix(var, 0.0; force = true), and afterwards I unfix the variables and use set_lower_bound/set_upper_bound to set their bounds back to the old values.
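Concretely, the pattern looks like this (a minimal sketch with a single variable; the bounds have to be saved by hand because `fix(...; force = true)` deletes them):

```julia
using JuMP

model = Model()
@variable(model, 0 <= x <= 7)

lb, ub = lower_bound(x), upper_bound(x)  # remember the original bounds
fix(x, 0.0; force = true)                # replaces both bounds with x == 0

# ... solve the restricted LP ...

unfix(x)                                 # removes the fixing constraint
set_lower_bound(x, lb)                   # restore the original bounds
set_upper_bound(x, ub)
```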

Some of the presolve reductions may be invalidated by your change of bounds.

From what I see in the logs, and what is explained in the Gurobi manual, no presolve phase happens after the first LP solve because an implicit basis is being given.

If all the modifications you’re doing are relaxing variable bounds / adding variables (in a column-generation fashion), make sure you use primal simplex (cf the Method parameter); Gurobi’s default is to use dual simplex.

I tried it, with… terrible results. I can provide times/logs if someone is interested, but the takeaway seems to be: my problem/formulation has considerable primal degeneracy, in the sense that trying to solve the LP relaxation caused messages like 'introducing perturbation', 'switching to devex', and 'switching to steepest edge', and took at least an order of magnitude longer than dual simplex.

If you are deleting variables… this may be slow in itself anyway

In this step (a successive solution of growing LPs), I am not deleting variables. However, if it is not too much of a bother, I would like you to expand on why it would be slow. I thought my PRs to Gurobi.jl had solved most of the slowness problems with delete.

It could also be that the subsequent model is just harder than the initial one.

This is kind of a fact. I say 'kind of' because, yes, every "new" LP has all the not-fixed-to-zero variables of the "last" LP, and some of the previously fixed-to-zero variables are now not fixed to zero (often with bounds like 0 <= x <= k for some integer k between 1 and 10). However, that alone does not explain my problems; I will explain more below.

I have to reproduce a method from the literature that has a phase where it starts with a restricted LP (which does not solve a relaxation of my problem, but a relaxation of a restricted version of it) and grows it into an LP that actually solves the relaxation of my problem (every time I say relaxation, I mean the integer/continuous relaxation).

Because it is much more convenient, the way I used to do this was: generate the full model (which I kind of need anyway), fix all variables outside the restricted problem to zero, and do the "growing LP" by computing which variables I should unfix next (using the duals of the constraints obtained by solving the current LP).
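In pseudocode, the loop looks roughly like this (`select_variables_to_unfix` is a hypothetical stand-in for the paper's pricing rule over the duals; this is a sketch of my setup, not the actual code):

```julia
# Grow the restricted LP by repeatedly solving it and un-fixing the
# variables that the duals indicate as attractive.
function grow_lp!(model, fixed_vars, original_bounds, constraints)
    while true
        optimize!(model)
        duals = dual.(constraints)
        # Hypothetical pricing step: pick fixed variables worth un-fixing.
        to_unfix = select_variables_to_unfix(duals, fixed_vars)
        isempty(to_unfix) && break      # nothing attractive left: done
        for x in to_unfix
            unfix(x)
            lb, ub = original_bounds[x] # restore the pre-fixing bounds
            set_lower_bound(x, lb)
            set_upper_bound(x, ub)
            delete!(fixed_vars, x)
        end
    end
    return model
end
```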

So, in the original paper, from 2016, they solve an instance of my problem in 17 seconds, which I solved in 80 seconds. A lot of things happened: I thought I had discovered a bug in my implementation and 'fixed' it, which reduced the initial number of variables of the LP model by a whole lot. However, the time went from 80 seconds to 11 minutes on that instance. In the end, I was wrong about the bug; the model should be started with a whole lot more variables than I thought necessary, so I was doing almost the right thing before the "fix". What I needed to correct was the initial number of variables in a single LP solve followed by a single MIP solve, before the whole LP-growing phase. However, to solve the 'basis disables presolve' problem, I actually rewrote a good deal of the code and made it start with a subset of the variables and grow the model by adding them (not just unfixing). This reduced the time from 80 seconds to 24~28s.

If you are deleting variables… this may be slow in itself anyway

This was meant as a general observation :slight_smile: The worst-case scenario is when the variables are not all deleted at once, and other modifications happen in between subsequent variable deletions. That would likely force an update of the model, but that's not the case here.
In any case, when a variable is deleted (not just fixed to zero), this may force the solver to discard the current basis and possibly restart from scratch. As a rule of thumb, changes to bounds/objective/right-hand side are OK; changing the matrix can be dangerous.

they solve an instance of my problem in 17 seconds, that I solved it in ~~80 seconds~~ 24~28s.

TBH, given the differences in implementation, machine, and solver (they use CPLEX in that paper), I would not say that a 4x time difference is significant. You can probably get that much performance variation just by changing the solver's seed, especially if your problems are very degenerate. I've seen ~20% performance variation (for pure LP) just depending on whether other jobs are running on the machine!

the takeaway seems to be: my problem/formulation has considerable primal degeneracy

This is relevant only if you're just solving LPs and don't need the basis information: the barrier algorithm handles degenerate problems much better than simplex. That can hold even when simplex is warm-started (I have some column-generation instances where cold-start barrier is much faster than warm-start simplex). If you don't need the basis, switch off crossover.
BTW, if you are adding/unfixing many variables, say more than a few hundred, the previous basis may not be that good anyway. That makes barrier even more competitive.


Thanks for the continued explanation @mtanneau.

Yes, maybe that example with the 80s -> 11m -> 24s instance was not the best. It was the one on my mind because I used it as a thermometer during the modifications, exactly because it should not take much time if things were running "fast enough". Another instance, off the top of my head, takes ~2m now, and was taking 35~45m both before the first "fix" (at that point the problem was the disabled presolve) and after it (at that point the problem was the increased number of iterations). But, yes, the performance variation from the seed is greater than I would like. I do not know if I will be able to consider it deeply in the paper I am writing right now, but for my thesis I hope to have more repetitions and a section on it.

Also,

I’ve seen some ~20% performance variation (for pure LP) just depending on whether other jobs are running on the machine!

In my masters, I had some instances of a very memory-intensive problem (using an ad hoc algorithm, not a solver) taking more than double the time if I was solving 4 instances in parallel on the machine instead of having one run alone. Since then, I do not parallelize my experiments.

the barrier algorithm is much better against degenerate problems than simplex.

I already stumbled on this. I will have a section in my paper in which I present a table and explain why I decided to use barrier for solving the root node relaxation, but not for the iterative LP-growing procedure. As I said before in this thread (weeks ago; it would be unfair to expect you to remember):

changing the algorithm to barrier makes a >60 seconds run hit the 1 hour time limit, because these runs benefited from the basis start. I literally have a LaTeX table with those times because I presented it to my advisor last week.

If you don’t need the basis, switch off crossover.

I think I need the basis, because I need the dual values of the constraints, right?

BTW, if you are adding/unfixing many variables, say, > few hundreds, the previous basis may not be that good anyway. That makes barrier even more competitive.

Unfortunately, this does not hold here. I have many instances where I add tens of thousands of variables per LP solve, and the process gets much slower if I use barrier. The basis seems to help even in such cases.

changing the algorithm to barrier makes a >60 seconds run hit the 1 hour time limit

:sweat_smile: I had indeed forgotten about that

Barrier is also called “primal-dual interior-point” => as the name suggests, you can get the duals from barrier. :slight_smile: The only case where Gurobi will error is if you have an infeasible or unbounded problem with crossover deactivated and you query the solution. (FYI, Mosek and Tulip will return primal or dual infeasibility certificates.)
However, without crossover, the solution will be an interior point: you can reasonably expect it to be denser than a basic solution, and that it will have small non-zero components (e.g., between 1e-12 and 1e-8).
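A quick sketch of what that looks like in JuMP (the parameter names are Gurobi-specific; untested here):

```julia
using JuMP, Gurobi

model = Model(Gurobi.Optimizer)
set_optimizer_attribute(model, "Method", 2)     # barrier
set_optimizer_attribute(model, "Crossover", 0)  # skip crossover: no basis

@variable(model, x >= 0)
@constraint(model, con, x >= 1)
@objective(model, Min, x)
optimize!(model)

# Duals are available from the interior-point solution even without a
# basis, though the values are interior ones (slightly off the vertex).
dual(con)
```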