Yes, the array is heap-allocated. The problem is having arrays heap-allocated and garbage-collected over and over inside performance-critical functions and loops. If you preallocate the arrays and only reuse them in the critical parts of the code, that is fine.
Playing with this a little: in the example below, the variable under study is the aux array, which in one case is a static array and in the other is a standard array allocated once before the loop.
First, using a static array, with a trick (passing the size as a Val{N} argument) that lets the compiler specialize on the size of the array, which is an argument of the function:
julia> using StaticArrays, BenchmarkTools  # BenchmarkTools provides @btime
julia> function g(x,::Val{N}) where N
           aux = zeros(SVector{N,Float64})
           for i in 1:length(x)-N # critical loop
               aux += @view(x[i:i+N-1])
           end
           return sum(aux)
       end
g (generic function with 3 methods)
julia> @btime g($x,Val(10));
25.066 μs (0 allocations: 0 bytes)
julia> @btime g($x,Val(100));
74.327 μs (0 allocations: 0 bytes)
julia> @btime g($x,Val(1000));
848.970 μs (0 allocations: 0 bytes)
julia> @btime g($x,Val(5000));
5.077 ms (0 allocations: 0 bytes)
So far so good: no allocations happen anywhere. But if you run this you will notice the very long compilation time for the largest values of N (the timings above do not show it, because @btime excludes compilation).
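To see why the compilation cost grows with N: every distinct Val(N) is its own type, so each new N makes Julia compile a fresh specialization of g. A minimal Base-only sketch of that mechanism follows. Here sum_zeros is a hypothetical stand-in for g, not something from StaticArrays, and the timings will vary with the machine and the Julia version.

```julia
# Hypothetical stand-in for g: the length N is a type parameter, so the
# compiler generates separate code for every N the function is called with.
sum_zeros(::Val{N}) where {N} = sum(ntuple(_ -> 0.0, Val(N)))

# Val(10) and Val(100) are values of different types.
@show typeof(Val(10)) == typeof(Val(100))

# The first call with a given N includes compilation; the second does not.
t_first  = @elapsed sum_zeros(Val(200))
t_second = @elapsed sum_zeros(Val(200))
@show t_first t_second
```

Calling it again with a different N, say Val(201), should pay the compilation cost once more, which is the overhead g shows for large sizes.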
Now let us simplify things and just use a heap-allocated aux:
julia> function f(x,n)
           aux = zeros(n)
           for i in 1:length(x)-n # critical loop
               aux .+= @view(x[i:i+n-1])
           end
           return sum(aux)
       end
f (generic function with 2 methods)
julia> @btime f($x,10);
87.694 μs (1 allocation: 144 bytes)
julia> @btime f($x,100);
137.613 μs (1 allocation: 896 bytes)
julia> @btime f($x,1000);
956.585 μs (1 allocation: 7.94 KiB)
julia> @btime f($x,5000);
3.895 ms (2 allocations: 39.11 KiB)
There are allocations now (of course, because aux is being allocated). There is an important performance difference for small sizes (roughly 3.5x at N = 10), but for larger sizes the heap-allocated array can perform better (in this example the crossover is somewhere above N = 1000, but I do not think that is typical). There is no compilation overhead in this second case either.
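Going back to the first point about preallocation: the one remaining allocation in f can be moved out of the hot path by letting the caller own the buffer. This is only a sketch. The name f! and the buffer argument are my own, and the input x below is an assumed vector whose size was chosen purely for illustration.

```julia
# Hypothetical in-place variant of f: the caller allocates `aux` once and
# reuses it across calls, so the function itself does not create an array.
function f!(aux, x)
    n = length(aux)
    fill!(aux, 0)                    # reset the buffer instead of allocating
    for i in 1:length(x)-n           # same critical loop as in f
        aux .+= @view(x[i:i+n-1])
    end
    return sum(aux)
end

x = rand(10_000)     # assumed input, for illustration only
aux = zeros(100)     # allocated once, outside the critical code
f!(aux, x)           # can be called repeatedly with the same buffer
```

Benchmarking it with @btime f!($aux, $x) should then report no allocations, while keeping the no-compilation-overhead property of the standard array.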