Reusing buffer arrays by providing them as arguments to functions

jonas-schulze · March 16, 2019, 5:22pm

I have a function that needs a Vector{Int} for buffering, which I would like to reuse when I call that function over and over again in a loop. My idea was to create that vector outside and clearing it before every use. I had hoped that empty! doesn’t change the vector’s capacity (similar to C++) and would have constant complexity (omit calling Int destructors) without allocations (simply overwriting some length attribute).

However, empty! isn’t “free”:

julia> tmp = Int[];

julia> @time empty!(tmp);
  0.000036 seconds (4 allocations: 160 bytes)

How do I prevent allocation when clearing buffers?
What is the proper way to reuse those buffers?

A more elaborate example: consider computing the prime factorization of an integer

function factorize(n,
                   primes = Primes(2_000_000),
                   factors = Int[],
                   exponents = Int[])
  n >= 4 || return [n], [1]

  limit = floor(Int, sqrt(n))
  for p in primes
    p <= limit || break
    n % p == 0 || continue

    push!(factors, p)
    push!(exponents, 0)
    while n % p == 0
      exponents[end] += 1
      n ÷= p
      n % p == 0 || break
      exponents[end] += 1
      n ÷= p
      limit ÷= p
    end
    n != 1 || break
  end

  if n != 1
    push!(factors, n)
    push!(exponents, 1)
  end

  return factors, exponents
end

which is called very often when looking for the first number x = bar(n) having at least 500 divisors:

function foo()
  primes = Primes(2_000_000)

  # buffers
  f = Array{Int}(undef, length(primes)÷2)
  e = similar(f)

  n = 1
  x = bar(n)
  _, e = factorize(x, f, e)
  while numdivisors(e) < 500
    empty!(f)
    empty!(e)
    n += 1
    x = bar(n)
    _, e = factorize(t, primes, f, e)
  end
  x
end

An alternative would be to rewrite factorize to be used like for (f, e) in primefactors(n, primes) but that would be a rather big undertaking. Another way would be to replace all the push!es and do all the index operations manually (plus returning the length in some form as resize! isn’t free either).

tkluck · March 16, 2019, 5:29pm

However, empty! isn’t “free”

It actually is; you’re seeing some overhead from using tmp as a global variable. Here’s a better way to benchmark:

julia> tmp = Int[];

julia> using BenchmarkTools

julia> @btime empty!($tmp)
  3.054 ns (0 allocations: 0 bytes)
0-element Array{Int64,1}

kristoffer.carlsson · March 16, 2019, 5:52pm

Also, running @show itself in global scope will allocate.

bennedich · March 16, 2019, 8:01pm

This might be how it’s currently implemented, but is that behavior documented? If not, I’d be quite wary of relying on that.

I would instead do something similar to your final suggestion (indexed operations), but instead of doing it all manually, create a small wrapper around an array and use that as a reusable buffer. Indexed operations are also slightly faster than push!, since push! does a library ccall for each invocation. Usually that’s not something you need to worry about, but since you’re concerned about allocations and buffering in the first place, it sounds like you’re after maximum performance.

You can read about the performance issue associated with push! here, and on that page there’s also a sample implementation of a wrapped array (PushVector) that might serve as inspiration.

Before refactoring too much though – please make sure that you’ve benchmarked your code and established that these allocations are actually a bottleneck and the right thing to optimize. I’m a bit concerned that you fret over 160 bytes allocated which doesn’t even seem to be in a hot code path.

jonas-schulze · March 16, 2019, 9:08pm

Thanks for the hints! I was wrong about where the allocations came from. After some proper profiling I discovered that return factors, exponents was the issue, so I refactored my code:

function factorize(n, primes = Primes(2_000_000))
  f = Int[]
  e = Int[]
  factorize!(f, e, n, primes)
  return f, e
end

function factorize!(factors, exponents, n, primes)
  # deleted: n >= 4 || return [n], [1]
  if n < 4
    push!(factors, n)
    push!(exponents, 1)
    return
  end

  # ...

  # deleted: return factors, exponents
  nothing
end

jonas-schulze · March 17, 2019, 9:43am

No, it’s not documented I think. Calling empty! eventually leads to jl_array_del_at_end(a, n - dec, dec, n); (in src/array.c). My guess was that it behaved as I assumed originally, but I didn’t research how STORE_ARRAY_LEN is used/resolved.

Yes I am, I really enjoy squeezing all the performance I can get out of my code (without “hacking” too much ). I’m learning the language while solving some Project Euler riddles.

Your link mentions some effort porting arrays to Julia. As I understand it, the whole runtime is written in C. Are there any plans to port the runtime to Julia?

Nice! All that is left is

@inline function Base.setindex!(v::PushVector, val, i)
  @boundscheck checkbounds(v, i)
  @inbounds v.v[i] = val
end

function Base.empty!(v::PushVector)
    v.l = 0
end

function Base.similar(v::PushVector)
  PushVector(similar(v.v), 0)
end

which leads to a ~5% performance increase in my example (measured after the ~50% increase due to reusing the buffers). In another riddle I earned ~18% by using PushVectors.

Topic		Replies	Views
I want to write a better performance array, don't know if there is a ready-made one? Performance	3	491	January 14, 2021
Shrinking and expanding vectors New to Julia	4	1384	November 28, 2017
Gc and resize! General Usage question	12	870	December 7, 2018
What is the most idiomatic way to allocate a (reusable) buffer for a function? Performance	10	375	October 12, 2023
How to preallocate and reuse buffers for oft-repeated computation? General Usage question , performance , memory-allocation , profiling , speed-optimization	15	1520	July 12, 2023

Reusing buffer arrays by providing them as arguments to functions

Related topics