Avoid calling push! in Base.find

I don’t understand why the current find function fills a zero-length array using push! and then creates another array and copies the data into it. I wrote this version, which avoids calling push!, and it’s faster than the current implementation. Should I make a PR with it? Best,

julia> using BenchmarkTools

julia> function find2(testf::Function, A)
           len = 0
           inds = Base._index_remapper(A)
           out = Array(Int, length(inds))  # pre-allocate for the worst case
           for (i, a) in enumerate(A)
               if testf(a)
                   len += 1
                   out[len] = inds[i]
               end
           end
           resize!(out, len)
           return out
       end
find2 (generic function with 1 method)

julia> @benchmark find(x -> x > 0.5, rand(10000))
BenchmarkTools.Trial: 
  memory estimate:  244.61 kb
  allocs estimate:  17
  --------------
  minimum time:     130.360 μs (0.00% GC)
  median time:      143.276 μs (0.00% GC)
  mean time:        152.181 μs (3.08% GC)
  maximum time:     694.662 μs (76.48% GC)
  --------------
  samples:          10000
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%

julia> @benchmark find2(x -> x > 0.5, rand(10000))
BenchmarkTools.Trial: 
  memory estimate:  156.41 kb
  allocs estimate:  4
  --------------
  minimum time:     79.431 μs (0.00% GC)
  median time:      83.703 μs (0.00% GC)
  mean time:        89.274 μs (2.98% GC)
  maximum time:     558.110 μs (72.61% GC)
  --------------
  samples:          10000
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%

This implementation pre-allocates an index array of the same size as the whole input array. The Base.find function only allocates storage proportional to the size of its output.

Of course, in cases like your test case where the output array is half the size of the input array, this doesn’t save any memory, but in cases where the find output is much smaller it makes a difference.
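To make the trade-off concrete, here is a minimal sketch of the push!-based strategy that Base.find roughly follows (the real implementation also handles non-linear indexing via the same index remapping as above, which this sketch omits): the output array starts empty and grows only as matches are found, so storage stays proportional to the output.

```julia
# Sketch of the incremental (push!-based) strategy, assuming a
# linearly indexable input. Allocation grows with the number of
# matches, not with length(A).
function find_push(testf::Function, A)
    out = Int[]                      # starts empty
    for (i, a) in enumerate(A)
        testf(a) && push!(out, i)    # grows only on a match
    end
    return out
end

find_push(x -> x > 2, [1, 3, 0, 5])  # returns [2, 4]
```

push! grows the backing buffer geometrically, so the amortized cost per match is O(1), at the price of occasional reallocation-and-copy.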

Note that when you benchmark, you should interpolate the arguments of the function you are benchmarking, for example:

julia> A = rand(10^6);

julia> @benchmark find(x -> x < 0.01, $A)
BenchmarkTools.Trial: 
  memory estimate:  334.03 kb
  allocs estimate:  16
  --------------
  minimum time:     931.427 μs (0.00% GC)
  median time:      975.280 μs (0.00% GC)
  mean time:        1.004 ms (1.67% GC)
  maximum time:     2.772 ms (50.97% GC)
  --------------
  samples:          4975
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%

julia> @benchmark find2(x -> x < 0.01, $A)
BenchmarkTools.Trial: 
  memory estimate:  7.63 mb
  allocs estimate:  2
  --------------
  minimum time:     870.728 μs (0.00% GC)
  median time:      926.463 μs (0.00% GC)
  mean time:        1.021 ms (11.63% GC)
  maximum time:     2.344 ms (27.54% GC)
  --------------
  samples:          4890
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%

In this case, there is only a 7% performance penalty to the Base.find strategy of incremental allocation of the output, and it saves a factor of 20 in memory allocation.
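For concreteness, the 7.63 mb figure above is exactly the worst-case pre-allocation: find2 always allocates one Int per input element regardless of how many matches there are (assuming 64-bit Ints):

```julia
# Rough arithmetic behind find2's memory estimate for a 10^6-element input:
n = 10^6
bytes = n * sizeof(Int)      # one Int slot per input element
bytes / 2^20                 # ≈ 7.63 MiB, matching the benchmark above
```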

Yes, it’s true! find2 is only a win when there are a lot of true elements, or for short arrays where the extra allocation isn’t a problem. Thanks!