Memory sizing and GC behavior


#1

I have a function that allocates a lot, so every trial spends some time in GC. Is there any way to tune the system so that it has more heap memory and doesn’t have to GC so often?

BenchmarkTools.Trial:
  memory estimate:  852.61 MiB
  allocs estimate:  59101
  --------------
  minimum time:     1.566 s (35.72% GC)
  median time:      1.609 s (35.45% GC)
  mean time:        1.609 s (35.82% GC)
  maximum time:     1.754 s (40.24% GC)
  --------------
  samples:          19
  evals/sample:     1

#2

You can disable the garbage collector for a particular block of code with gc_enable(false) and then re-enable it with gc_enable(true), and you can even force garbage collection with gc(). (Since Julia 0.7 these are spelled GC.enable(false), GC.enable(true) and GC.gc().) That said, I’ve not personally found any cases where doing so actually helped my code.
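
For reference, a minimal sketch of what that looks like, using the Julia 1.x names GC.enable and GC.gc rather than the older gc_enable and gc. The helper `without_gc` is a hypothetical name I made up for illustration, not something in Base, and the workload is just a placeholder:

```julia
# Hypothetical helper (not part of Base): run `f` with the GC switched off,
# restoring the previous GC state afterwards even if `f` throws.
function without_gc(f)
    was_enabled = GC.enable(false)   # GC.enable returns the previous state
    try
        return f()
    finally
        GC.enable(was_enabled)
    end
end

# Placeholder allocation-heavy workload.
total = without_gc() do
    sum(sum(rand(1000)) for _ in 1:100)
end

GC.gc()  # optionally force a collection once the block is done
```

The try/finally is there so that an exception inside the block doesn't leave the collector disabled for the rest of the session.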


#3

Allocations may be an indicator of suboptimal code, in which case reducing them is worth doing in itself, but if some allocation is inherent to the algorithm, you can’t do much about it.

It is hard to say more without an MWE, but I would try StaticArrays (first choice) or pre-allocated buffers if possible.
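
To illustrate what I mean by a pre-allocated buffer, here is a rough sketch. The sizes and the loop are made up, since I don't know what your actual computation looks like; the point is only that the output matrix is allocated once and then reused by the in-place mul! from LinearAlgebra:

```julia
using LinearAlgebra

A = rand(200, 200)
B = rand(200, 200)
C = similar(A)          # output buffer, allocated once outside the hot loop

for _ in 1:10
    mul!(C, A, B)       # writes A*B into C instead of allocating a new matrix
end
```

Whether this helps depends on how much of the 852 MiB comes from temporaries that can be replaced by in-place operations like this.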


#4

@rdeits If I disable GC completely then wouldn’t it be possible to run out of system memory causing a crash?

@Tamas_Papp I have a very large SharedArray that totals about 30 GiB in size. I need to take a subset of the matrix (up to half of its size) and perform a series of matrix computations on it. I could use a view, but it makes the computation slower. Pre-allocation is a decent idea, but I would have to wrap it to support variable sizes. Know any good package for that?


#5

No. In my experience coding these directly is the path of least resistance.

(For 30 GiB objects, please disregard the suggestion for StaticArrays :smile:)
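
Something along these lines is what I would write by hand. It is only a sketch: `Workspace` and `matrix!` are hypothetical names, and it assumes the subset can be copied into one flat buffer that grows on demand and is handed out as a matrix-shaped view:

```julia
# Hypothetical type for illustration: a flat buffer that grows when needed
# and hands out matrix-shaped views of its first m*n elements.
struct Workspace{T}
    data::Vector{T}
end
Workspace{T}() where {T} = Workspace{T}(T[])

function matrix!(ws::Workspace, m::Integer, n::Integer)
    # Grow the backing vector only if the request does not fit yet.
    length(ws.data) < m * n && resize!(ws.data, m * n)
    return reshape(view(ws.data, 1:m*n), m, n)
end

ws = Workspace{Float64}()
X = matrix!(ws, 3, 4)   # first call allocates 12 elements
fill!(X, 1.0)
Y = matrix!(ws, 2, 2)   # smaller request reuses the same memory
```

Note that every matrix handed out aliases the same memory, so a previous result is overwritten as soon as the next one is filled. If you need several live at once, keep one workspace per buffer.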


#6

Yes, and what is worse, this could be slower, at least in theory. [Slower as in throughput. I would only consider disabling GC in soft-real-time situations, where performance may then be more even, but I’m not really convinced of that either, for the reason below. In that case it would be best to get rid of heap allocations altogether; then GC isn’t a problem and need not be disabled.]

Julia’s GC is generational. With the GC on, you could be freeing memory in the younger generation that is likely to be in cache, so new allocations should go there. Disabling GC, you have no choice other than allocating more and more memory, eventually allocating more than fits in cache; this would happen before running out of memory, but would make your program slower.
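
Since this is all "in theory", the safest thing is to measure on your own workload. A small sketch, assuming Julia 1.5 or later (where @timed returns a named tuple); `work` is a hypothetical stand-in for the real function:

```julia
# Hypothetical allocation-heavy workload, standing in for the real function.
work() = sum(sum(rand(10_000)) for _ in 1:200)

work()                      # warm up so compilation is not included in the timing
stats = @timed work()       # named tuple with time, gctime, bytes, ...

println("elapsed:         ", stats.time, " s")
println("GC time:         ", stats.gctime, " s")
println("bytes allocated: ", stats.bytes)
```

Running the same measurement with the collector enabled and disabled should show whether the GC share of the runtime is really what is costing you.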