Thanks for all your fast replies!
I’ll try to get back to all the important points and update with what I’ve been able to dig into further:
Does that time (< 2s) makes a difference in your case?
The two seconds themselves do not matter here, but I am looking to build an energy systems model that will be run iteratively with live data updates and has a pretty tight response window: new data arrives every 15 minutes, so there are only a few minutes for constructing + solving. Since I can only partially influence the time Gurobi takes to solve the problem (I can construct the model in a way that makes the solve faster, but beyond that there is not much I can do), it really matters whether constructing the model takes 30s or 3 minutes (see also my answer below regarding the Python code).
Playing with my new favourite toy
Is there anything I could do with that JET report? Is there anything “wrong” with the way I construct the variables / constraints?
So what would be the preferable way of comparing the time here? I’ll start with some ideas:
- using the same solver in both Python & Julia
- measuring not only the time to build the model, but end-to-end, including the time to pass the model to the solver and solving time
- looking at the (self-reported?) time the solver itself needs, and subtracting it from the overall time
- trying to exclude the start-up time of the processes, and compilation time?
- I’m using the same solver in Python & Julia (Gurobi, or sometimes GLPK on test machines where I don’t have access to a Gurobi licence).
- I’m always a bit reluctant to include solving time, because it can vary a lot: one implementation may happen to represent the specific model in a way that makes the solve faster, and that does not generalize to larger or other models. But @odow’s idea of setting a time limit seems to mostly circumvent this (as far as I understand), and I’ll be using it from now on (see the sketch after this list).
- Start-up, compilation, and similar times are always excluded, otherwise the comparison would not be fair (one wouldn’t include Cython compilation times either, so there is no reason to include precompilation times).
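Concretely, the end-to-end timing I have in mind looks roughly like the sketch below (using my own `create_model` function and GLPK; as far as I know, `solve_time` is JuMP’s accessor for the solver-reported runtime):

```julia
using JuMP, GLPK

function timed_run(n)
    model = create_model(n)                # build the model (my own function)
    set_optimizer(model, GLPK.Optimizer)
    optimize!(model)                       # includes passing the model to the solver
    return model
end

timed_run(2)                                   # warm-up on a tiny instance to exclude compilation
t_total = @elapsed (model = timed_run(100))    # end-to-end time for the real size
t_solver = solve_time(model)                   # solver-reported time, could be subtracted
```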
[…] Some of the python packages can build a symbolic model without populating the data […]
I am using a customized version of Pyomo that strips out all the non-linear machinery (there is a lot of unnecessary time lost to its generality) and optimizes a few things here and there for better performance after cythonizing the whole project. So I was using “create + write to file” as the comparison (since that is recommended by some papers), but after reading your input and the drawbacks that write_to_file comes with, I’ve changed it to a fairer comparison (see below).
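For reference, the old “create + write to file” comparison looked roughly like this on the JuMP side (a sketch; `create_model` is my own model-building function and `model.lp` is just an example file name):

```julia
using JuMP

model = @time create_model(100)           # time to construct the model
@time write_to_file(model, "model.lp")    # time to serialize it to an LP file
```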
We’ve done benchmarks with a time limit of 1ms.
I’ve now changed my code to adopt that idea, just with a limit of 1s (because I can only pass an integer time limit to my GLPK test code in the Python implementation). Previously I had only looked at the “model creation time” plus the time it takes to write the model to a file (since some papers suggest that as a fair comparison), but as you wrote in that other post, that can be misleading. This change greatly improved the comparison in favor of JuMP.
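On the JuMP side, the 1s limit can be set either through the generic interface or (if I’m not mistaken) via GLPK’s raw `tm_lim` parameter in milliseconds; a sketch:

```julia
using JuMP, GLPK

model = create_model(100)                          # my model-building function
set_optimizer(model, GLPK.Optimizer)
set_time_limit_sec(model, 1.0)                     # generic JuMP time limit
# set_optimizer_attribute(model, "tm_lim", 1000)   # alternative: raw GLPK parameter (ms)
@time optimize!(model)                             # the timed "solve" with the 1s cap
```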
But a few things still irritate me, which could all be related to garbage collection (?):
Timings for various sizes n are:

| n    | time to construct (Pyomo) | time to construct (JuMP) | time to "solve" (Pyomo) | time to "solve" (JuMP) |
|------|---------------------------|--------------------------|-------------------------|------------------------|
| 100  | 11s                       | 4s                       | 90s                     | 9s                     |
| 200  | 22s                       | 7s                       | 182s                    | 18s                    |
| 500  | 52s                       | 22s                      | 450s                    | 97s                    |
| 1000 | 104s                      | 41s                      | 813s                    | 340s                   |
| 2000 | 204s                      | 117s                     | 1654s                   | 677s                   |
While Pyomo seems to scale roughly linearly, JuMP shows a big jump in observed times for larger n. With the factors now going from about 10:1 down to less than 2:1 for total time (Pyomo:JuMP), I am still unsure how it would scale with more complicated models. I again took a look at the average share of total time spent on GC:
| n    | GC % of total time (model creation) | GC % of total time (model "solve") |
|------|-------------------------------------|------------------------------------|
| 100  | 40%                                 | 20%                                |
| 200  | 30%                                 | 20%                                |
| 500  | 40%                                 | 60%                                |
| 1000 | 40%                                 | 80%                                |
| 2000 | 55%                                 | 80%                                |
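For completeness, the GC share can be read straight off `@timed` (a sketch; the named fields require Julia ≥ 1.5, and `create_model` is my own function):

```julia
stats = @timed create_model(100)              # returns value, time, bytes, gctime, ...
gc_share = 100 * stats.gctime / stats.time    # percentage of wall time spent in GC
```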
It seems like garbage collection heavily influences this. It happens less (to some extent) in the Python implementation. Why? Because there I actively pause the garbage collector and only allow it at very specific times: for a model consisting of n blocks, it is easy to approximate the memory size of each block without garbage collection and to only “clean up” when it is actually necessary.
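Translated to Julia, that strategy would look something like the sketch below; `build_block!`, `bytes_per_block`, and `budget` are hypothetical placeholders for my block constructor and the memory estimates:

```julia
using JuMP

# Sketch: pause the GC while building, and only collect when the estimated
# memory use of the blocks added so far exceeds a budget.
function create_model_manual_gc(n_blocks; bytes_per_block = 50 * 1024^2,
                                budget = 8 * 1024^3)
    model = Model()
    GC.enable(false)                      # pause automatic collection
    used = 0
    try
        for b in 1:n_blocks
            build_block!(model, b)        # hypothetical: add variables/constraints for block b
            used += bytes_per_block       # rough estimate of memory allocated so far
            if used > budget              # only "clean up" when the estimate demands it
                GC.enable(true); GC.gc(); GC.enable(false)
                used = 0
            end
        end
    finally
        GC.enable(true)                   # always re-enable the GC
    end
    return model
end
```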
I’ve again tested with a disabled GC by doing:
GC.gc()                                  # start from a clean slate
GC.enable(false)                         # pause automatic collection during construction
@time (model = create_model(size))
GC.enable(true)
GC.gc()                                  # collect everything allocated while building
GC.enable(false)                         # pause again for the timed "solve"
@time (solve_model!(model))
GC.enable(true)
This leads to the following results (larger n omitted because I only have 30GB of RAM available during testing; Pyomo timings are repeated unchanged for reference; time to "solve" again refers to passing the model to GLPK with a time limit of 1s):

| n    | time to construct (Pyomo) | time to construct (JuMP) | time to "solve" (Pyomo) | time to "solve" (JuMP) |
|------|---------------------------|--------------------------|-------------------------|------------------------|
| 100  | 11s                       | 2s                       | 90s                     | 7s                     |
| 200  | 22s                       | 5s                       | 182s                    | 13s                    |
| 500  | 52s                       | 12s                      | 450s                    | 34s                    |
| 1000 | 104s                      | 22s                      | 813s                    | 70s                    |
One can see that now, with the GC disabled, JuMP scales essentially linearly with problem size.
This leads again to the question: is there a way to configure the GC, or to influence how often and when its sweeps (or however it works internally) are done?