When creating a slice of a vector in a for loop, it seems like Julia creates a new vector each time which then needs to be dealt with by the garbage collector. In my setting, sometimes the GC can significantly slow down the program. As a MWE:
function f_vec(data::Vector{Float64}; n=1000)
    out = fill(0.0, n)
    for i in 1:n
        r = rand(1:length(data)-101)
        x = data[r:r+100]   # allocates a fresh 101-element Vector on every iteration
        out[i] = sum(x)
    end
    out
end
@benchmark f_vec(data) setup=(data=randn(10000))
# BenchmarkTools.Trial: 10000 samples with 1 evaluation.
#  Range (min … max):  103.800 μs …  1.461 ms  ┊ GC (min … max): 0.00% … 80.05%
#  Time  (median):     111.400 μs              ┊ GC (median):    0.00%
#  Time  (mean ± σ):   131.199 μs ± 80.903 μs  ┊ GC (mean ± σ):  6.53% ± 10.09%
#  [histogram omitted]
#  104 μs          Histogram: log(frequency) by time          584 μs <
#  Memory estimate: 882.94 KiB, allocs estimate: 1001.
The median of that timing is fine; however, the max time with 80% GC creates problems for me. For example, if I change to n=10_000_000, the run takes 30 seconds instead of the 1 second implied by the median time. I can fix this by changing to x = @view data[r:r+100], but in my more specific setting I cannot guarantee that the slices will always be contiguous (they might be a random set of indices).
So my main question is, why does Julia create a new vector each time for x and then need to collect it? Is there a way to specify that I want to overwrite the same place in memory? Would that help? Is there another way to solve this?
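For concreteness, here is the @view variant mentioned above (a sketch; the name f_view is mine):

```julia
# Same as f_vec, but x is a SubArray wrapping data -- no copy is made,
# so the loop body does not allocate and the GC has nothing to collect.
function f_view(data::Vector{Float64}; n=1000)
    out = fill(0.0, n)
    for i in 1:n
        r = rand(1:length(data)-101)
        x = @view data[r:r+100]   # view into data, not a new Vector
        out[i] = sum(x)
    end
    out
end
```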
Yes, though as I understand it, doing so can sometimes be slower, since the slice is not contiguous in memory. In my setting, it might still be worth it to avoid GC.
function f(data::Vector{Float64}; n=1000)
    out = fill(0.0, n)
    x = Vector{Float64}(undef, 101)   # reusable buffer, allocated once
    for i in 1:n
        r = rand(1:length(data)-101)
        x .= @view data[r:r+100]      # without @view, the RHS slice still allocates
        out[i] = sum(x)
    end
    out
end
Only if you want to reuse the same buffer in other functions, i.e. if you use this pattern in several places. Otherwise not really. And I'm still not sure it's any better than just using x = @view ....
is the way to go. There's normally no cost to creating the view; it should just directly retrieve the data points into the buffer.
But if you subsequently iterate over the buffer just a single time, for example with sum(buffer), then the copy is a waste of time, and directly summing the view, sum(@view data[inds]), is as good as it gets: you traverse the data once, summing as you go. No need to traverse the data twice when once will do.
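A minimal sketch of that trade-off (data, inds, and buf are hypothetical names):

```julia
data = randn(10_000)
inds = rand(1:10_000, 101)   # non-contiguous indices

# One pass: sum the view directly -- no copy at all.
s1 = sum(@view data[inds])

# Two passes: copy into a buffer, then sum.
# Only worth it if the buffer is reused for several computations.
buf = Vector{Float64}(undef, length(inds))
copyto!(buf, @view data[inds])
s2 = sum(buf)

s1 ≈ s2   # the two approaches agree
```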
If the indices are non-contiguous and you will access them multiple times, collecting them in a buffer might be worth it.
Based on your example, a view seems to be exactly what you want.
PS. If you have an array of indices (instead of a range), it might be worthwhile to forgo the index array entirely, and just compute the desired indices/elements one by one in a loop (either to store into a buffer for subsequent calculations, or to directly calculate something like a sum with them). There's no need to bend over backwards to avoid writing your own performance-critical inner loops.
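A sketch of what that could look like, assuming (hypothetically) that the indices follow a stride-3 pattern and a sum is the goal; strided_sum is a made-up helper:

```julia
# Instead of materializing inds = [r, r+3, r+6, ...] and indexing with it,
# compute each index on the fly inside the loop -- zero allocations.
function strided_sum(data::Vector{Float64}, r::Int)
    s = 0.0
    for k in 0:100
        s += data[r + 3k]   # index computed as needed, no index array
    end
    s
end
```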