Reducing (time spent in) garbage collection

I’m using the code posted below to create a simple JuMP model of variable size. Depending on the size and on how I construct the problem, somewhere between 20% and 80% of the total time is spent in garbage collection (problems of a relevant size consistently spend > 80% in GC). For more detailed timings see this post.

To work around this I disabled the garbage collector and am triggering it manually at constant intervals. This

  • reduces the time spent in GC to around 25% of the total time, but
  • seems to break something, since I now run out of memory AFTER re-enabling the garbage collector, calling GC.gc(true) and passing the model to a solver (with automatic GC this does not happen, so the model plus whatever memory the solver needs normally fits into RAM easily)

I have found some posts on garbage collection (none of which reached a conclusion relevant here) and close to no real information online on how the GC can be influenced from the outside, besides turning it on and off. Is there any way to parameterize the intervals it normally runs at, etc.?
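The only startup-level knob I’m aware of (an assumption on my part, and it requires Julia 1.9 or newer) is the heap-size hint, which asks the runtime to collect before the heap grows past a given size:

```julia
# Assumption: Julia 1.9+. The hint is set at startup, not from within a session:
#
#     julia --heap-size-hint=400G script.jl
#
# From inside a session the live heap can at least be inspected:
println("live heap: ", round(Base.gc_live_bytes() / 2^30; digits=2), " GiB")
```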

Is there anything “non JuMP related” that I can do to improve this? Is there an efficient way to track the current amount of allocated memory, and can that be used to trigger a full GC only when it is really necessary?

Basically, I want to reduce total computational time by keeping garbage collection to a minimum. If I know the machine has 512 GB of RAM, I could risk using 500 GB of it and only trigger a single GC when I hit that mark, then repeat.
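Something like the following sketch is what I have in mind. The 0.97 budget and the helper name `maybe_gc!` are placeholders, and `Sys.total_memory()` / `Sys.free_memory()` measure machine-wide usage rather than this process alone:

```julia
# Sketch only: run a full collection manually once memory use crosses a
# budget. Note: total - free is MACHINE-WIDE usage, not just this process.
function maybe_gc!(budget_fraction::Float64 = 0.97)
    total = Sys.total_memory()
    used  = total - Sys.free_memory()
    if used > budget_fraction * total
        was_enabled = GC.enable(true)   # GC.enable returns the previous state
        GC.gc(true)                     # single full collection
        GC.enable(was_enabled)
    end
    return nothing
end
```

This would replace the unconditional GC.enable/GC.gc/GC.enable dance inside the block loop below.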

Current code used to construct the JuMP model, with a garbage collection pass every 100 “blocks”:

using JuMP
using GLPK

"""
	Create a simple JuMP model with a specified size
"""
function create_model(size)
    T = 1:10000
    T2 = 2:10000

    # Run one full collection up front, then take manual control of the GC.
    GC.gc(true)
    GC.enable(false)
    model = JuMP.direct_model(GLPK.Optimizer())
    set_time_limit_sec(model, 0.001)

    for start in 1:100:size
        I = start:(start+99)
        x = @variable(model, [I, T], lower_bound=0, upper_bound=100)
        @constraint(model, [i in I, t in T2], x[i, t] - x[i, t-1] <= 10)
        @constraint(model, [i in I, t in T2], x[i, t-1] - x[i, t] <= -10)
        # Note: each call to @objective replaces the previous objective.
        @objective(model, Min, sum(x[i, t] for t in T for i in I))

        # Re-enable the GC just long enough for one incremental collection.
        GC.enable(true)
        GC.gc(false)
        GC.enable(false)
    end

    GC.enable(true)
    return model
end


"""
	Solve the model (due to the time limit this effectively just passes it to GLPK)
"""
function solve_model!(model)
	JuMP.optimize!(model)
end


for i in [100, 200, 500]
    println("Size: $i")

    @time (model = create_model(i))
    @time solve_model!(model)
end
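For measuring, the GC share can also be read programmatically instead of off `@time`’s printout, via the `gctime` field that `@timed` returns (shown here with a stand-in allocation-heavy workload rather than `create_model`):

```julia
# `@timed` returns a NamedTuple; `gctime` is the seconds spent in GC
# during the call. Stand-in workload instead of create_model(i):
stats = @timed [rand(1000) for _ in 1:10_000]
println("GC fraction: ", round(100 * stats.gctime / stats.time; digits=1), "%")
```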

Edit: changed using T[2:end] in the loop to preparing T2 = 2:10000 and using that instead, as correctly suggested by @Jeff_Emanuel.

This makes a copy. Try using a view instead, or better, just initialize a T2 = 2:10000.


Thanks for pointing out that stupid mistake, I’ve fixed it now (and I’ll edit it into the initial post so as not to confuse future readers).

However, this does not change the timing or GC usage in a meaningful way for larger problem sizes (n \geq 1000): only a few seconds of difference, and still approximately 25% of the total time spent in garbage collection.