I have a function that generates a number of graphs in a for loop and extracts some features from each. Once a graph has been created and its features extracted, its information is not needed in the next iteration of the loop. However, it seems that the memory allocated for the adjacency matrix is not freed, resulting in an OOM error when creating multiple large graphs.
using Graphs

function test(n::Int64, e::Int64, samples::Int64)
    for i in 1:samples
        g = erdos_renyi(n, e)     # random graph with n vertices and e edges
        A = adjacency_matrix(g)   # extracted features; not needed after this iteration
        d = degree(g)
    end
end
This is the basis of my code. When benchmarking this function, the memory allocation doubles when the number of iterations (samples) is doubled.
Any help to allocate only the memory needed for one iteration would be much appreciated!
Did you check that 1 sample does not cause an OOM error? In your code there is nothing that should keep the garbage collector from freeing the memory from the previous iteration.
using Graphs, BenchmarkTools

function test(n::Int64, e::Int64, samples::Int64)
    for i in 1:samples
        g = erdos_renyi(n, e)
        A = adjacency_matrix(g)
        d = degree(g)
    end
end

n = 1000
m = 10000
s1 = 10
s2 = 20

test(n, m, s1)            # warm-up call to trigger compilation
@btime test($n, $m, $s1)
@btime test($n, $m, $s2)
yields the following results:
11.513 ms (50135 allocations: 12.07 MiB)
23.079 ms (100281 allocations: 24.14 MiB)
It makes sense that there are some allocations, given the way the Graphs.jl package works, but I would expect the allocations to correspond to one iteration of the loop only…
This is expected behaviour. In general, pre-allocating outputs helps (see Performance Tips · The Julia Language). Unfortunately, I’m not aware of how to do that with Graphs.jl.
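For reference, the general pattern from the Performance Tips looks like this. It is only a sketch with a hypothetical compute/compute! pair, not a Graphs.jl API: the caller owns the output buffer and reuses it across iterations.

using LinearAlgebra

# Allocating version: a fresh output vector is created on every call.
compute(A, x) = A * x

# Pre-allocated version: writes into a caller-owned buffer via mul!.
compute!(out, A, x) = mul!(out, A, x)

A = rand(100, 100)
x = rand(100)
out = similar(x)            # allocated once
for _ in 1:1_000
    compute!(out, A, x)     # no per-iteration allocation
end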
Regarding the OOM error, the increase of the reported allocations does not imply that something is not freed between iterations. The number is the total memory allocated during the call, not the amount of memory required for the computation.
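A quick way to see this (a toy sketch; exact numbers vary by machine): each iteration's vector is garbage before the next one is created, yet the reported total grows with the iteration count.

using BenchmarkTools

function alloc_loop(k)
    for _ in 1:k
        v = zeros(1_000_000)   # ~8 MB Float64 vector, dead as soon as the iteration ends
    end
end

@btime alloc_loop(1)    # reports ~8 MB allocated in total
@btime alloc_loop(10)   # reports ~10x that total, but peak live memory is still ~8 MB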
It seems kind of far-fetched to eliminate the loop from an MWE which is intended to show a problem with multiple iterations, doesn’t it?
Ignoring that, the fact that i isn't used is not a problem, since erdos_renyi produces a different random graph in each iteration. In principle, a sufficiently smart compiler could decide that there is no need to compute A and d and eliminate them itself, but I don't think Julia can determine that the called functions are free of side effects.
This may be an issue in the package. You can try to minimize the chances of getting an OOM error by calling GC.gc() after each iteration, as sketched below. Yet, if for some reason the reference is not freed inside the package function, this won't work.
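Concretely, applied to the loop from the original post:

using Graphs

function test(n::Int64, e::Int64, samples::Int64)
    for i in 1:samples
        g = erdos_renyi(n, e)
        A = adjacency_matrix(g)
        d = degree(g)
        GC.gc()   # force a full collection between samples
    end
end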
This is another thing. If the function you are calling within the loop allocates, that is expected. But that should not result in a OOM error because GC should be called. In principle this is not a sign of a problem, nor is necessarily related to the problem above.
Hi there @GeoKou, Graphs.jl maintainer speaking. Indeed, Graphs.jl doesn’t offer many opportunities for pre-allocation. But in your simple example, if all you really need is the adjacency matrix of an ER graph, I strongly suggest you generate it by hand, and update it in-place.
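A minimal sketch of that suggestion (fill_er_adjacency! is an illustrative helper, not a Graphs.jl function; it rejection-samples e distinct edges into a Bool buffer that is allocated once and reused, assuming e ≤ n*(n-1)/2):

function fill_er_adjacency!(A::Matrix{Bool}, e::Int)
    fill!(A, false)                  # reset the buffer from the previous sample
    n = size(A, 1)
    added = 0
    while added < e                  # rejection-sample e distinct undirected edges
        u, v = rand(1:n), rand(1:n)
        if u != v && !A[u, v]
            A[u, v] = true
            A[v, u] = true           # keep the matrix symmetric
            added += 1
        end
    end
    return A
end

function test_inplace(n::Int, e::Int, samples::Int)
    A = Matrix{Bool}(undef, n, n)    # adjacency buffer, allocated once
    d = Vector{Int}(undef, n)        # degree buffer, allocated once
    for _ in 1:samples
        fill_er_adjacency!(A, e)
        sum!(d, A)                   # row sums of the adjacency matrix are the degrees
    end
end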
@lmiq can you elaborate on what you mean by “reference freed inside the package function”? Not sure how we can improve this on our side
For example, if g were mutable and growing at each call to the function, or if some memory were leaking through improper use of low-level unsafe memory management. Yet I think that's unlikely. I cannot test now, but we would have to confirm that memory is actually being filled up from iteration to iteration there.
Memory being allocated is normal, but that shouldn't normally result in OOM errors in the loop if it doesn't happen in a single call of the function.
Here I don’t see any scale up of the memory. I’m not sure if there isn’t a misconception here. One iteration of that loop allocates ~1.2Mb. If one performs 10 iterations, it allocates ~12Mb, as expected, that seems perfectly consistent.
What may be causing confusion is that maybe the OP thinks that those 12Mb (or greater) are simultaneously allocated, which they are not.
In my tests there is no build-up of memory usage. For instance, with the n and m provided, the function uses 3% of my laptop's memory, independently of the number of iterations.
Maybe the OP has seen an OOM error for systems that are too large for the computer's memory to handle even a single iteration (or very few iterations), such that garbage collection does not run often enough. That is probably solved by adding GC.gc() at the end of each iteration, if the computer runs one iteration fine.
That was my first thought: from what I've read, I was expecting the memory to be freed before an OOM occurs. However, I tried many times without the GC.gc() call and the same OOM error occurs, whereas everything is OK when GC.gc() is called.
This should not be the required solution. If one needs to do this, something is going wrong inside Julia. Would you mind posting an issue on Julia's GitHub issue tracker, including the code in enough detail that the devs can reproduce the problem?