Optimal model creation (and garbage collection)

Thanks again for the helpful input! I’ve run new tests and was at least partially successful. Reporting back with the new information below :smiley: Short TLDR at the end of this post.

> Using direct-mode should greatly decrease the GC pressure.

I had (naively, as it turns out) assumed that this could be a hint that the way JuMP works with “bridges” was the culprit. So I used the following to test again: model = JuMP.Model(GLPK.Optimizer; add_bridges=false), now constructing my constraints in a natively supported way:

@constraint(model, [i in I, t in T[2:end]], x[i, t] - x[i, t-1] <= 10)
@constraint(model, [i in I, t in T[2:end]], x[i, t-1] - x[i, t] <= 10)
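
As a sanity check that these scalar affine constraints really are supported natively (so that dropping the bridges is safe), one can ask MOI directly - a minimal sketch, where the concrete function/set types are my assumption of what applies here:

import MathOptInterface as MOI
import GLPK

# ask the solver whether it natively supports affine-expression <= constant
# constraints; `true` means no bridge is needed for the constraints above
MOI.supports_constraint(
    GLPK.Optimizer(),
    MOI.ScalarAffineFunction{Float64},
    MOI.LessThan{Float64},
)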

This resulted in similar timings and GC usage, with model creation even taking a bit more time. So I discarded that thought and went on to test direct mode as proposed, which gives the following timings:

     | time to construct |  time to "solve"  |
-----+---------+---------+---------+---------+
n    | normal  | direct  | normal  | direct  |
-----+---------+---------+---------+---------+
100  | 4s      | 7s      | 9s      | 1s      |
200  | 7s      | 15s     | 18s     | 2s      |
500  | 22s     | 49s     | 97s     | 6s      |
1000 | 41s     | 180s    | 340s    | 12s     |
2000 | 117s    | 660s    | 677s    | 29s     |

and the GC usage (note that with direct mode, “solve” no longer suffers from any garbage collection):

     | GC % of total time (normal)    | GC % of total time (direct) |
-----+----------------+---------------+-----------------------------+
n    | model creation | model "solve" | model creation              |
-----+----------------+---------------+-----------------------------+
100  | 40%            | 20%           | 35%                         |
200  | 30%            | 20%           | 35%                         |
500  | 40%            | 60%           | 50%                         |
1000 | 40%            | 80%           | 70%                         |
2000 | 55%            | 80%           | 85%                         |

This shows that

  1. additional computational effort is now spent on “creating” the model (to be expected, since construction and copying to the solver now happen in a single step instead of the separate steps before), and
  2. using direct mode is a bit faster overall (total time of 689s vs. 793s for n=2000), but for bigger models garbage collection still heavily impacts performance.
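
As an aside, one way to measure such GC shares (a sketch - the exact timing harness I used is not shown here) is via @timed, which reports both total wall time and the time spent in GC:

# hedged sketch: compute the "GC % of total time" numbers from @timed
stats = @timed create_model(1000)  # or any of the construction variants above/below
gc_share = stats.gctime / stats.time
println("GC: ", round(100 * gc_share; digits = 1), "% of total time")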

But there is one really important difference: most of the computational time now happens during model creation, a step that is mostly “user controlled”, compared to the model solve, which previously took a large chunk of total time and is effectively “hidden” inside JuMP. This can be used to manually control when the garbage collector is allowed to run. I have changed model construction to the following:

using JuMP
import GLPK

function create_model(size)
    T = 1:10000

    # disable GC and prepare a direct-mode model
    GC.enable(false)
    model = JuMP.direct_model(GLPK.Optimizer())
    set_time_limit_sec(model, 0.001)  # tiny time limit so "solve" returns almost immediately

    # accumulate the objective across chunks; setting it inside the loop
    # would overwrite it each iteration with only the current chunk's terms
    obj = AffExpr(0.0)

    for start in 1:100:size
        I = start:(start + 99)
        x = @variable(model, [I, T], lower_bound = 0, upper_bound = 100)
        @constraint(model, [i in I, t in T[2:end]], x[i, t] - x[i, t-1] <= 10)
        @constraint(model, [i in I, t in T[2:end]], x[i, t-1] - x[i, t] <= 10)
        for i in I, t in T
            add_to_expression!(obj, x[i, t])
        end

        # after each chunk of 100 rows, run a single incremental GC pass
        GC.enable(true)
        GC.gc(false)
        GC.enable(false)
    end
    @objective(model, Min, obj)

    # re-enable GC and return the model
    GC.enable(true)
    return model
end

So what am I doing here? I’m splitting model creation up into chunks of 100 “blocks” each. While a chunk is being built, the GC is disabled; between chunks I run a single garbage-collection pass. This can be adapted to specific needs (collecting more or less often). The new timings are (with “auto gc” being the direct mode timings from before):

     | time to construct   | time to "solve"     |
-----+-----------+---------+-----------+---------+
n    | manual gc | auto gc | manual gc | auto gc |
-----+-----------+---------+-----------+---------+
100  | 6s        | 7s      | 1s        | 1s      |
200  | 12s       | 15s     | 3s        | 2s      |
500  | 28s       | 49s     | 7s        | 6s      |
1000 | 65s       | 180s    | 15s       | 12s     |
2000 | 122s      | 660s    | ?s        | 29s     |

which shows a massive improvement, with larger model sizes now scaling in the same way as smaller ones.

[Why is there a “?”? I could not successfully pass the model with n=2000 to the solver, because it ran out of memory. Even doing GC.gc(true) before calling optimize!(model) resulted in an error. I am not that unsettled, though, since I’m only using about 40GB of RAM currently, and any “real” machine this would be running on would be much bigger in that regard.]
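
For reference, that handover attempt looks like this (a sketch, with n=2000 as in the failing case):

# build the model with manual GC control, then hand it to GLPK
model = create_model(2000)
GC.gc(true)       # one full collection before passing the model to the solver
optimize!(model)  # for n=2000 this still ran out of memory on my machine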

A look at the GC % confirms this (while it can still be seen that garbage collection gets more expensive with bigger models, and that it still makes up a huge chunk of total computational time - around one fourth):

n    | GC % (manual gc) | GC % (auto gc) |
-----+------------------+----------------+
100  | 15%              | 35%            |
200  | 15%              | 35%            |
500  | 20%              | 50%            |
1000 | 25%              | 70%            |
2000 | 25%              | 85%            |

Current learnings:

  • Calling GC.gc() requires the GC to be enabled (via GC.enable(true)) if it was turned off manually before - disabling it and then just calling it does nothing (see the small sketch after this list). Overall it’s really difficult to find any kind of information regarding the Julia garbage collector.
  • Using direct mode reduces total time as long as garbage collection is not the predominant slowdown.
  • Shifting the main garbage-collection pressure into a part of the code that can be controlled more easily (as in “not purely JuMP internal”) makes it possible to simply turn the GC off for some time.
  • Disabling bridges did not help at all (maybe I did something wrong there?).
  • I am still unsure whether this approach will scale to much larger models, since an increasing GC % can still be observed.
  • I am unsure why the case with n=2000 actually runs out of memory, since changing the chunk size does not influence that. It feels like some portion of memory is never freed correctly when doing this manually, but …
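
Regarding the first point, a minimal sketch of the pitfall (this matches what I observed: a collection request is simply ignored while the GC is disabled):

GC.enable(false)
GC.gc()           # no-op: the collection request is ignored while GC is off
GC.enable(true)
GC.gc(false)      # now an (incremental) collection pass actually runs
GC.enable(false)  # ...and the GC can be switched off again afterwards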

Any other tips from a JuMP perspective? I will try to dig further into how the GC actually works in Julia (and consult the main Julia discourse, since I assume that’s a better fit for this topic), because I do not think “wasting” 25% of the total computational time on garbage collection is a good target to aim for.
