I need to conduct extensive memory and performance testing on a memory-intensive function that creates mutable structs, temporary arrays, and more, even though most of the variables are type-annotated. Running a single instance of the function works fine; my 16 GB of RAM can handle it. However, the problem arises when I repeatedly call this function to measure execution time and memory consumption in various sections of the code. For this I’m using TimerOutputs, since adapting a codebase of this size to BenchmarkTools would be too cumbersome.
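For context, the instrumentation looks roughly like this (a minimal sketch; `heavy_section` is a hypothetical stand-in for my real code):

```julia
using TimerOutputs

const to = TimerOutput()

# Each section of interest is wrapped in @timeit, which records both
# elapsed time and allocations under that label.
@timeit to "heavy section" result = heavy_section(input)  # hypothetical call

print_timer(to)  # per-section summary at the end of the run
```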
The issue I’m facing is that memory consumption accumulates between function calls, eventually crashing my Ubuntu 20.04 system. Consequently, I’m limited in how many tests I can run before hitting a certain memory threshold, even though each test runs fine on its own.
Here’s an example of what my code looks like:
```julia
# Resident set size of the current process in MiB (Linux-only, via `ps`).
memuse() = parse(Int, split(read(`ps -p $(getpid()) -o rss`, String))[2]) / 1024

nrepeats = 10
log = Dict()  # shadows Base.log, which is fine for this script

inputs1 = ProblemInput(1, 1000)
inputs2 = ProblemInput(2, 100)
problem1 = Problem(handle1, inputs1)
problem2 = Problem(handle2, inputs2)
problems = [problem1, problem2]

for i in eachindex(problems)
    for input in problems[i].input
        processed_input = process(input)
        for j in 1:nrepeats
            mem = memuse()
            if mem > 1.0e4                              # above ~10 GB: try to release memory
                GC.gc(true)                             # full collection
                ccall(:malloc_trim, Cvoid, (Cint,), 0)  # ask glibc to return freed pages to the OS
                mem = memuse()                          # re-measure after collection
                if mem > 1.3e4                          # still above ~13 GB; more than this will crash
                    error("Memory consumption too high!")
                end
            end
            obj = problems[i].handle(processed_input)   # the memory-intensive call
            save_measures(i, obj.data, log)
            obj = nothing                               # drop the reference so the GC can reclaim it
        end
    end
end
```
In this code snippet, I iterate over a set of problems and inputs, repeatedly calling each problem’s `handle` function to measure its performance. These `handle` functions do a significant amount of memory-intensive work, most of it from temporary `SArray`s. The persistent data (e.g., the `Problem`s and `ProblemInput`s) does not contribute significantly; those structures allocate very little compared to what happens inside `handle`. Unfortunately, refactoring the code at this point is not ideal, as it is designed to closely match a Python and a MATLAB version for direct performance comparison.
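To give a sense of the allocation pattern, a heavily simplified, hypothetical stand-in for one of the `handle` functions would look something like:

```julia
using StaticArrays

# Hypothetical stand-in: each call heap-allocates a large temporary array
# of SVectors, mimicking the per-call allocations of the real `handle`.
function handle_toy(n)
    tmp = [@SVector(rand(3)) for _ in 1:n]  # big temporary, dead after the call
    return (data = sum(tmp),)               # only `data` is kept
end
```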
The issue at hand is that memory allocation accumulates even across iterations of the innermost `for` loop, i.e., across repeated calls of the same `handle` function. To illustrate: if a single call allocates 1 GB of memory and `nrepeats = 10`, the final memory consumption reaches around 10 GB. Moreover, when the problem changes, the memory allocated by previous problems still lingers, so even with smaller inputs the program eventually crashes when I conduct numerous repeats. The `memuse` guard in the code above exists only to avoid having to restart the computer.
So, my primary question is: is there a way to prompt Julia or Ubuntu to promptly release the memory allocated during these function calls? I’ve tried `GC.gc(true)` and `ccall(:malloc_trim, Cvoid, (Cint,), 0)`, but their effect is very limited, typically freeing only around 100 MB, even when the function allocates up to 2 GB of memory on the largest input. Is there any solution beyond refactoring my code?
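For completeness, the only reliable workaround I can think of, which I’d rather avoid because of the per-call overhead, is to isolate each call in a throwaway worker process so the OS reclaims everything when the worker exits. A rough sketch, assuming the function and its dependencies can be loaded on the worker:

```julia
using Distributed

# Run f(args...) on a freshly spawned worker, then kill the worker so the
# operating system reclaims all the memory the call allocated.
function run_isolated(f, args...)
    pid = only(addprocs(1))
    try
        # f and everything it needs must be available on the worker,
        # e.g. via `@everywhere using MyPackage` (hypothetical package name).
        return remotecall_fetch(f, pid, args...)
    finally
        rmprocs(pid)  # worker exits; its heap goes back to the OS
    end
end
```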
If you need more specific details or if my explanation is insufficient, please let me know, and I can provide further information or the complete code.