How to benchmark append!?

lmiq · September 25, 2020, 3:56pm

I am trying to benchmark a code that calls append, but the results do not seem to make sense. The minimal working example which I do not understand is:

julia> using BenchmarkTools
julia> x = [0]; y = zeros(100); @btime append!($x,$y)
  376.261 ns (0 allocations: 0 bytes)
235230201-element Array{Int64,1}:
 0
 0
...

As you can see, @btime returns an enormous vector.

This is part of me trying to benchmark append! vs. push! in a simple example:

using BenchmarkTools
using Test

# using push!
function f!(x)
  for i in 1:100
    push!(x,i)
  end
end

# using append!
function g!(x)
  xlength = length(x)
  append!(x,Vector{Int64}(undef,100))
  for i in 2:101
    x[i] = i-1
  end
end

# creating a new vector
function h(x)
  y = Vector{Int64}(undef,length(x)+100)
  for i in 1:length(x)
    y[i] = x[i]
  end
  for i in 2:101
    y[i] = i-1
  end
  return y
end

xf = [0] ; f!(xf)
xg = [0] ; g!(xg)
xh = [0] ; y = h(xh)
@test xf == xg == y

print("push!  :"); xf = [0]; @btime f!($xf)
print("append!:"); xg = [0]; @btime g!($xg)
print("new vec:"); xh = [0]; @btime h($xh)

With results:

push!  :  587.733 ns (0 allocations: 0 bytes)
append!:  437.322 ns (1 allocation: 896 bytes)
new vec:  79.581 ns (1 allocation: 896 bytes)

(I do not understand either how can push not allocate any memory)

My expectation was that append! would be the fastest alternative here, but it seems that the benchmarking does not make sense, considering the enormous ammount of allocations it is implying.

Sukera · September 25, 2020, 4:06pm

You’re constantly pushing into the same vector, it doesn’t get reset because of the interpolation.

kristoffer.carlsson · September 25, 2020, 4:07pm

You use the same array over and over. It’s not because of the interpolation, it wouldn’t get reset anyway.

lmiq · September 25, 2020, 4:09pm

I thought that the interpolation was exactly to make a copy of the input variable and avoid that…

And then, how can I benchmark that, without including into the function the first allocation?

edit:

@btime f!(copy($x))

does the trick? edit: of course not, then the copy is included in the benchmark.

tomerarnon · September 25, 2020, 4:10pm

You can use setup =

julia> begin
       print("push!  :"); @btime f!(xf) setup = xf=[0]
       print("append!:"); @btime g!(xf) setup = xf=[0]
       print("new vec:"); @btime h(xf) setup = xf=[0]
       print("resize :"); @btime k!(xf) setup = xf=[0]
       end

push!  :  821.942 ns (0 allocations: 1.86 KiB)
append!:  287.010 ns (1 allocation: 2.81 KiB)
new vec:  127.507 ns (1 allocation: 896 bytes)
resize :  166.051 ns (0 allocations: 1.59 KiB)

where k! is

function k!(x)
    N = length(x)
    resize!(x, N + 100)
    for i in N .+ (1:100)
        x[i] = i
    end
    x
end

I’m surprised allocating a new vector is actually fastest… Maybe there’s a lesson to be learned here.

lmiq · September 25, 2020, 4:12pm

Exactly, that is why I ended up in these examples.

(and good point, resize is another option here).

Using copy to input the function at the benchmark, and including the option of resizing, my benchmark is:

push!  :  550.066 ns (7 allocations: 2.22 KiB)
append!:  199.501 ns (3 allocations: 1.77 KiB)
new vec:  106.343 ns (2 allocations: 992 bytes)
resize :  88.237 ns (2 allocations: 912 bytes)

That resizing is fastest and push the slowest is expected. What I did not expect is that using append is so much slower than even creating a new vector. I would expect it to be almost identical to the resizing option, as I am appending a undef array.

Code:

using BenchmarkTools
using Test

const N = 100

# using push!
function f!(x)
  for i in 1:N
    push!(x,i)
  end
end

# using append!
function g!(x)
  xlength = length(x)
  append!(x,Vector{Int64}(undef,N))
  for i in 2:N+1
    x[i] = i-1
  end
end

# creating a new vector
function h(x)
  y = Vector{Int64}(undef,length(x)+N)
  for i in 1:length(x)
    y[i] = x[i]
  end
  for i in 2:N+1
    y[i] = i-1
  end
  return y
end

# Resizing
function m!(x)
  resize!(x,length(x)+N)
  for i in 2:N+1
    x[i] = i-1
  end
end

xf = [0] ; f!(xf)
xg = [0] ; g!(xg)
xh = [0] ; y = h(xh)
xm = [0] ; m!(xm)
@test xf == xg == y == xm

print("push!  :"); xf = [0]; @btime f!(copy($xf))
print("append!:"); xg = [0]; @btime g!(copy($xg))
print("new vec:"); xh = [0]; @btime h(copy($xh))
print("resize :"); xm = [0]; @btime m!(copy($xm))

rdeits · September 25, 2020, 4:18pm

You also need to set evals=1 (see: BenchmarkTools setup isn't run between each iteration? - #4 by rdeits ) otherwise you will get unpredictable behavior due to re-use of the setup value between evaluations.

With that change, using resize is the fastest option, which is what you would expect:

julia> begin
              print("push!  :"); @btime f!(xf) setup = xf=[0] evals = 1
              print("append!:"); @btime g!(xf) setup = xf=[0] evals = 1
              print("new vec:"); @btime h(xf) setup = xf=[0] evals = 1
              print("resize :"); @btime k!(xf) setup = xf=[0] evals = 1
              end
push!  :  520.000 ns (6 allocations: 2.13 KiB)
append!:  106.000 ns (2 allocations: 1.67 KiB)
new vec:  89.000 ns (1 allocation: 896 bytes)
resize :  75.000 ns (1 allocation: 816 bytes)

lmiq · September 25, 2020, 4:23pm

Great, updated.

Yet append is effectively slower than creating a new vector. Is that expected?

rdeits · September 25, 2020, 4:30pm

The difference here is pretty small, but I don’t think it’s fair to say that append is slower than creating a new vector because your “append” benchmark also creates a new vector. You are doing:

append!(x,Vector{Int64}(undef,100))

which first creates a new vector of length 100 and then appends that to x. That means you’re doing all of the work to create a new vector and all the work to resize x to hold that new vector. It’s not surprising that this is slower than only creating a new vector and not appending it to x.

Your results should be different if you were to instead append an existing vector to x.

lmiq · September 25, 2020, 4:36pm

good point

Juan · November 17, 2021, 3:54am

Why is append! faster than push! ?

Elrod · November 17, 2021, 6:06am

append! resizes the array just once.
Adding a sizehint! would help, but push! would still add non-inlined function calls into C-code on each call.

Juan · November 17, 2021, 11:12am

Then, any reason to use push! instead of append! ?

Append also has the advantage of being able to add several rows simultaneously.

lmiq · November 17, 2021, 11:43am

append! and push! do different things. append! appends (…) one array to another array with the same dimensions, i. e.

julia> x = [1,2]
2-element Vector{Int64}:
 1
 2

julia> append!(x,[3,4])
4-element Vector{Int64}:
 1
 2
 3
 4

while push! adds one element to an array, with the same type as the type of element of the array:

julia> x = [1,2]
2-element Vector{Int64}:
 1
 2

julia> push!(x,3)
3-element Vector{Int64}:
 1
 2
 3

for instance, you cannot do, above, push!(x,[3,4]) , because you cannot add an array to a vector of integers.

The original question here was only examining how these options behaved in the generation of a new array.

Topic		Replies	Views
Interesting micro-benchmark General Usage	8	701	May 24, 2019
Push! and insert! very slow? (julia 1.11) (EDIT: No; just one-time allocations.) Performance	21	376	November 12, 2024
Unexpected garbage collection when using append! Performance question , garbage-collection	20	617	November 4, 2022
Benchmarking a function with side effects General Usage benchmark , benchmarktools	2	791	October 17, 2020
Computational time of appending an array and mutating an array Performance	4	329	November 8, 2022

How to benchmark append!?

Related topics