Large-scale model using Gurobi crashes the machine

For the record, I still have not found the reason why my server crashes when solving some large models using JuMP + Gurobi, but I have experienced it dozens of times.
Here is the situation. Each time I construct a JuMP model with about 200B variables, with about three instances (each takes about 250 GB using direct mode; the total memory consumption is within the machine's 1.5 TB limit), the machine crashes and no traces are left in the system or Gurobi logs. It seems completely random: I cannot control it, and I don't know when it will happen, why it happens, or how to fix it. I have to reboot the machine, since neither SSH nor a direct cable connection is available when this happens.
It's killing me! Has anybody else encountered the same issue?
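Here is roughly how each instance is set up, in case it helps (a minimal sketch of direct mode; the tiny variable block is a placeholder, not the real model):

```julia
using JuMP, Gurobi

# Direct mode: JuMP talks to the Gurobi C API without a caching layer,
# so each instance's memory footprint tracks the underlying Gurobi model.
model = direct_model(Gurobi.Optimizer())
@variable(model, x[1:1_000] >= 0)  # placeholder; the real models are vastly larger
```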

I encountered similar issues a while ago, but they were not related to Gurobi or JuMP and were likely a Julia issue (I was able to build and solve the model effectively once I migrated to another language).
I’m guessing this is during the model-building phase. Have you tested other solver interfaces?

Gurobi.jl wraps the 32-bit Gurobi API, so you cannot have more than 2_147_483_647 variables.
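You can see where that ceiling comes from in plain Julia (the guard function below is a hypothetical pattern, not part of Gurobi.jl):

```julia
# Gurobi.jl passes variable and constraint indices to the C API as Cint
# (32-bit signed integers), so the hard ceiling is:
typemax(Cint)  # 2_147_483_647

# Hypothetical guard to run before attempting to build a huge model:
function check_variable_count(n_vars::Integer)
    if n_vars > typemax(Cint)
        error("$n_vars variables exceeds the 32-bit Gurobi.jl limit of $(typemax(Cint))")
    end
    return n_vars
end
```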

Normally the model building is smooth; the machine crashes during the solving process.

@odow Are there any future plans for Gurobi.jl to extend this limit? Gurobi itself supports 64-bit indexing and does not limit the number of variables. Another question: does this mean the underlying floating-point precision is restricted to 32 bits, or is it something else?

There are no plans.

It doesn't make sense to try to solve a problem with that many variables. Even if you could build it, Gurobi is unlikely to solve it.

What is your application? How can you interpret a problem with 10^11 decision variables?

@odow I constructed a problem with 8760 time slots and over 1000 locations. The high spatiotemporal resolution results in a very large variable space, which is necessary for me to capture the temporal dynamics and precise spatial locations.
Since Gurobi supports at most 2147483647 variables, I think I do need to reduce the resolution.

So you have 8.76e6 time/location slots. But that doesn't explain how you then have an additional ~1e4 variables for each of the time/location pairs?

The 1e11 figure is hypothetical. The most common situation is about 1.9e8 variables.
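Putting those numbers together (a rough back-of-the-envelope sketch; the per-pair counts are inferred from the figures in this thread, not measured):

```julia
time_slots = 8_760                   # hourly resolution over one year
locations  = 1_000
pairs      = time_slots * locations  # 8_760_000 time/location pairs

1.9e8 / pairs  # ≈ 22 variables per pair in the common 1.9e8-variable case
1e11 / pairs   # ≈ 11_400 variables per pair would be needed to reach 1e11
```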

To clarify the future plans: there are GRBX routines which allow more than 2e9 non-zeros in the constraint matrix: GRBXloadmodel - Gurobi Optimization. But they still don’t support more than typemax(Cint) variables or constraints.

In that case you’re likely hitting the limit of the number of non-zeros in the constraint matrix.
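To make that concrete (a rough sketch; the average density per variable is a guess, and if your JuMP version has lp_matrix_data you can measure the real count with SparseArrays.nnz on the returned A matrix):

```julia
# Estimate constraint-matrix nonzeros and compare against the 32-bit limit.
n_vars        = 190_000_000           # ~1.9e8 variables, as reported above
nnz_per_var   = 12                    # hypothetical average column density
estimated_nnz = n_vars * nnz_per_var  # 2_280_000_000
estimated_nnz > typemax(Cint)         # true: exceeds 2_147_483_647
```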

Thanks for the reply. This is of great help. Now I finally know where the boundary is.
