I am asking this question in an effort to improve my understanding of computer science, not to complain about Juliaβs performance ![]()
I understand that, in principle, if we know that we are going to be filling an vector with exactly 1000 entries, it is better to simply declare it as an a 1000-element vector at the outset than to start with [] and push.
I also understand that if we know that our vector will have approximately/at least 1000 entries, but arenβt certain, then we can improve the performance of the push operations by using sizehint!(). In practice, it seems like this is mainly used for dictionaries, heaps, and other higher-level data structures rather than vectors of numbers.
My mental model of how sizehint!() works is as follows:
-
If I call
a = zeros(Float64, 1000), Julia allocates 1000 blocks of memory, fills them with zeros, and then βlabelsβ them with the variable nameaand a header that says βais a 1000-element vector whose entries are stored over here.β -
If I call
a = Float64[]; sizehint!(a, 1000), Julia allocates 1000 blocks of memory, builds a βfenceβ around them that says that other variables shouldnβt have their values stored here, and then βlabelsβ this with the variable nameaand a header that says βais a 0-element vector whose entries are stored over here.βNow if I call
push!(a, 0.0)1000 times, the memory needed for the push is already βright next door,β so without iterating over the whole ofa, Julia can simply increment its size and write the value of 0.5 to the end block in O(1)-time.
If my mental model is correct, then the function bar() below should be faster than the function foo(), because while both functions allocate m Float64-sized blocks, the first one needlessly fills them with zeros before overwriting them to be 0.5, whereas the second does not. The only additional computation required by bar() (in my mental model) is incrementing the size of a by 1 at each i, but this cannot be substantially more expensive than writing a bunch of zeros to the memory.
julia> function foo(m)
a = zeros(Float64, m)
for i in 1:m
a[i] = 0.5
end
end
foo (generic function with 1 method)
julia> function bar(m)
a = Float64[]
sizehint!(a, m)
for i in 1:m
push!(a, 0.5)
end
end
bar (generic function with 1 method)
But the opposite is the case: bar() is nearly 4.5 times faster.
julia> using BenchmarkTools
julia> @benchmark foo(50)
BenchmarkTools.Trial: 10000 samples with 972 evaluations.
Range (min β¦ max): 66.278 ns β¦ 567.468 ns β GC (min β¦ max): 0.00% β¦ 84.62%
Time (median): 83.435 ns β GC (median): 0.00%
Time (mean Β± Ο): 90.296 ns Β± 45.138 ns β GC (mean Β± Ο): 4.83% Β± 8.40%
ββββββ β
β
ββββββββββββββββββββββ
β
β
ββ
βββββββββββββββββββββββββββββββββ
β
66.3 ns Histogram: log(frequency) by time 471 ns <
Memory estimate: 496 bytes, allocs estimate: 1.
julia> @benchmark bar(50)
BenchmarkTools.Trial: 10000 samples with 264 evaluations.
Range (min β¦ max): 294.227 ns β¦ 5.683 ΞΌs β GC (min β¦ max): 0.00% β¦ 94.45%
Time (median): 300.496 ns β GC (median): 0.00%
Time (mean Β± Ο): 325.789 ns Β± 255.622 ns β GC (mean Β± Ο): 4.16% Β± 5.02%
βββββ ββ β
βββββββ
ββ
ββββββββββ
β
ββββββββ
βββ
β
ββ
ββ
ββ
ββββββ
ββ
βββββ
β
ββ
ββ
βββββ β
294 ns Histogram: log(frequency) by time 495 ns <
Memory estimate: 512 bytes, allocs estimate: 2.
Whatβs going on here?
(The results are similar if you write rand() instead of 0.5.)