Performance of zeros() vs. Array{T}()?

In the following function, I see a 1.56x speedup when one input is created with `zeros()` instead of `Array{T}()`, which seems odd to me. When I use `@btime`, though, I get the same timings.

``````
julia> function vander(v, x, N::Int)
           M = length(x)
           if N > 0
               v[:,1] .= 1          # first column: all ones
           end
           if N > 1
               for i = 2:N
                   v[:,i] = x       # remaining columns start as copies of x
               end
               accumulate(v, v)     # in-place running product across columns
           end
           return v
       end
vander (generic function with 1 method)

julia> function accumulate(input, output)   # user-defined, not a method of Base.accumulate
           M, N = size(input)
           for i = 2:N
               for j = 1:M
                   output[j,i] *= input[j,i-1]
               end
           end
       end
accumulate (generic function with 1 method)
``````

Now compare the timings of `f()` and `g()`:

``````
julia> function f()
           M, N = 10^8, 4
           x = rand(M)
           v = Array{Float64}(undef,M,N) # <-----
           t = @elapsed vander(v, x, N)  # only the call to vander is timed
       end
f (generic function with 1 method)

julia> f()
1.2380960570000001

julia> function g()
           M, N = 10^8, 4
           x = rand(M)
           v = zeros(M,N)  # <-----
           t = @elapsed vander(v, x, N)  # only the call to vander is timed
       end
g (generic function with 1 method)

julia> g()
0.77949281
``````

The overall function timing is telling you the whole story here.

For large uninitialized arrays, the operating system will sometimes lie to you and give you back a pointer to some space without actually having done any of the dirty work of allocating it. `zeros` pays that cost up front when it writes zero to every element. The uninitialized `Array` constructor can sometimes defer that cost to the first time you write to the array (or even to each page).
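One way to observe this is to time the allocation and the first full write separately. The following is only a minimal sketch: `alloc_then_touch` and `uninit` are hypothetical helper names that do not appear in the thread, the array size is an assumption (smaller than the original `10^8`), and the actual numbers depend on the operating system, the allocator, and the Julia version.

```julia
# Hypothetical helper (not from the thread): time the allocation and the
# first full write separately for a given allocator function.
function alloc_then_touch(alloc, M, N)
    t_alloc = @elapsed v = alloc(M, N)   # ask for the memory
    t_touch = @elapsed fill!(v, 1.0)     # first write to every element
    return t_alloc, t_touch
end

# Uninitialized allocation, wrapped so it has the same call shape as `zeros`.
uninit(M, N) = Array{Float64}(undef, M, N)

# Warm up so compilation time is not included in the measurements.
alloc_then_touch(uninit, 10, 2)
alloc_then_touch(zeros, 10, 2)

M, N = 10^6, 4   # assumed size; smaller than the 10^8 used in the thread
for (name, alloc) in (("undef", uninit), ("zeros", zeros))
    t_alloc, t_touch = alloc_then_touch(alloc, M, N)
    println(name, ": alloc = ", t_alloc, " s, first write = ", t_touch,
            " s, total = ", t_alloc + t_touch, " s")
end
```

If the deferred-allocation explanation applies on your system, the `undef` case should tend to show a cheap allocation followed by a more expensive first write, while `zeros` shifts more of the cost into the allocation step. Increasing `M` usually makes the difference easier to see.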


Many thanks, I was thinking about something like this.

Am I reading this wrong? It seems to me that the timing indicated that the code was faster when the array `v` was created with `zeros`. Only the function `vander` was timed. So why was it faster? `vander` initializes `v`. Why would it matter how `v` was initialized (or not) before it was passed to `vander`?

EDIT: By the way, I am getting a speedup of 2.5x when using an array initialized to 0.0 instead of an uninitialized one.

``````
Julia Version 0.7.0
Commit a4cb80f3ed (2018-08-08 06:46 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i7-6650U CPU @ 2.20GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, skylake)
``````

Sorry, I just got it. @mbauman is saying that the time is spent either up front (initialized array) or later (uninitialized array). My timings of the entire `f()` or `g()` confirm that. Interesting…
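For anyone who wants to reproduce the end-to-end comparison, here is a rough sketch that times allocation and computation together. It assumes a compact stand-in for `vander` called `powers!` and a wrapper called `total_time`; both names, and the reduced array size, are hypothetical and not from the posts above.

```julia
# Compact stand-in for the thread's `vander` (hypothetical name): column i
# ends up holding x.^(i-1), and every element of `v` is written once.
function powers!(v, x)
    v[:, 1] .= 1
    for i in 2:size(v, 2)
        @views v[:, i] .= v[:, i-1] .* x
    end
    return v
end

# Time allocation and computation together, as one unit.
function total_time(alloc, M, N)
    x = rand(M)
    return @elapsed begin
        v = alloc(M, N)
        powers!(v, x)
    end
end

# Uninitialized allocation with the same call shape as `zeros`.
uninit(M, N) = Array{Float64}(undef, M, N)

# Warm up so compilation time is not included in the measurements.
total_time(uninit, 10, 4)
total_time(zeros, 10, 4)

M, N = 10^6, 4   # assumed size; the thread used M = 10^8
println("undef, end to end: ", total_time(uninit, M, N), " s")
println("zeros, end to end: ", total_time(zeros, M, N), " s")
```

Because the timed region now includes the allocation, the cost of actually obtaining the memory is counted in both cases, whether it is paid when `zeros` fills the array or when the first write reaches the uninitialized one. A single `@elapsed` sample is noisy, so repeated runs are needed before drawing conclusions.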