Pre-allocating array efficiency

miguelborrero · October 11, 2024, 8:59pm

Hi there,

my understanding was, in vague terms, that pre-allocating an array by specifying size and then populating the elements of that array was more efficient than defining an array without doing so. To test this I just benchmarked examples of these two methods:

function no_preallocation()
    mat = []
    for i in 1:1000
        push!(mat, rand(1000))
    end
end

with benchmark results:

and

function preallocation()
    mat = Array{Float64}(undef, 1000, 1000)
    for i in 1:1000
        mat[i, :] = rand(1000)
    end
end

with benchmark:

so it seems that my intuition was wrong.

Oscar_Smith · October 11, 2024, 9:05pm

The comparison should be to

function preallocation()
    mat = Array{Float64}(undef, 1000)
    for i in 1:1000
        mat[i] = rand(1000)
    end
end

Your comparison is copying the vectors.

stevengj · October 11, 2024, 9:38pm

Note also that you are comparing two totally different data structures (an 1d array of 1d arrays ≠ a 2d array). And by including rand(1000) calls in your benchmark, a big chunk of the timing is due to generating random numbers.

I think you mean Array{Vector{Float64}}(undef, 1000)

DNF · October 11, 2024, 10:33pm

Your first, non-pre-allocated benchmark, uses untyped collections: mat = []. Make sure to use a type annotation, like

mat = Vector{Float64}[]

Your second benchmark does not properly pre-allocate. A new vector is allocated on each iteration. Try something like

mat[:, i] .= rand.()

Warning: untested. But note the changed dimensionality, work along columns, not rows.

Or even better, do

mat .= rand.()  # or rand!(mat)

removing the loop.

Also, much of the time is used on the rand call itself.

Benny · October 11, 2024, 11:57pm

Helps to count the allocations in your code:

function no_preallocation()
    mat = []  # allocates Vector{Any}
    for i in 1:1000 # 1000 times
        # push! reallocates mat for some resizes
        # rand(1000) allocates 1000-element Vector{Float64}
        push!(mat, rand(1000))
    end
end
# at least 1001 allocations, your @benchmark shows 1006

function preallocation()
    mat = Array{Float64}(undef, 1000, 1000) # self-explanatory
    for i in 1:1000 # 1000 times
        mat[i, :] = rand(1000) # allocates 1000-element Vector{Float64}
    end
end
# at least 1001 allocations, your @benchmark shows 1002

So you didn’t significantly save allocations overall, you trimmed a few resizing allocations in push!. As others have said, your data structures and element types also differ, so even putting the similar allocation count aside, it is not a fair comparison for performance either.

Also worth mentioning that rand(1000) has gone from 1 allocation in v1.10.5 to 3 allocations in v1.11.0, albeit with less memory overall, due to the changes to Array’s implementation.

miguelborrero · October 12, 2024, 4:52pm

I see, I think the different data structures effect was causing the issue. Once I corrected for this the preallocation benchmark improved the alternative. I thought they where the same structure. Thanks a lot!

Topic		Replies	Views
Preallocating Vector of Vectors Performance memory-allocation , arrays	25	3071	October 7, 2021
Why does a vector with 10 times more elements takes 2x-5x less time to pre-allocate? Performance question	14	370	October 30, 2024
Temporary pre-allocated array within function is slower than non-preallocated? New to Julia array	3	420	September 15, 2021
Performance of resize! vs pre-allocating with zeros(...) Performance	9	1111	April 29, 2022
Different ways to initialize an array of specified type General Usage	10	1560	July 29, 2020

Pre-allocating array efficiency

Related topics