Best Practice for Initializing an Array Prior to for Loop

highflyer737 · October 16, 2020, 5:01pm

Hello,

Is there a best practice for initializing arrays prior to for loops / does it matter? (I’ve seen a few similar posts but I don’t think there’s any that explain the differences, or when you might want to use one method or the other.)

E.g. is there a difference between, say:

a = fill(0.0, 10)

and

a = Vector{Float64}(undef, 10)

prior to, something like:

for i in eachindex(a)
    a[i] = some calculation
end

Thanks,

Dave

stillyslalom · October 16, 2020, 5:12pm

I typically use zeros(dims...), which takes a bit longer than undef, but allows simpler control flow when doing, e.g., a[i] += <some calculation>. For simple calculations, I’ll use array comprehensions:

a = [<some calculation> for i in 1:10]

For more-complicated calculations, you can use map with a do-block:

a = map(1:10) do aᵢ
    <some calculation>
end

Comprehension/map-ing has an additional advantage: it determines the output’s eltype and shape for you, which is great for some of Julia’s messier parametric types - who wants to type Array{MArray{Tuple{3}, Float64, 1, 3}}(undef, 10)?
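
As a rough sketch of the eltype and shape inference mentioned above (the calculations are placeholders chosen for illustration, not taken from the thread), the following uses only Base:

```julia
# Comprehension: element type and shape are inferred from the expression.
a = [i^2 / 2 for i in 1:10]

# map with a do-block: same idea, convenient for multi-line bodies.
b = map(1:10) do i
    x = i^2
    x / 2
end

# The element type follows whatever the body returns, here a tuple.
c = [(i, sqrt(i)) for i in 1:3]

# Two iterators separated by a comma give a matrix, so the shape is inferred too.
m = [i + j for i in 1:2, j in 1:3]

println(typeof(a))
println(a == b)
println(eltype(c))
println(size(m))
```

None of the four containers needed an explicit type annotation or an `undef` constructor call.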

jonathanBieler · October 16, 2020, 6:18pm

When using fill, you go over the whole array in memory and fill it with your value, which takes more time. It doesn’t matter for small arrays, but for large ones it can be significant:

julia> @time Vector{Float64}(undef, 100_000_000);
  0.000017 seconds (2 allocations: 762.940 MiB)

julia> @time fill(0.0, 100_000_000);
  0.229996 seconds (2 allocations: 762.940 MiB)

(not sure why the allocation numbers are the same?)

The downside of Vector{Float64}(undef, n) is that your array will contain arbitrary values (whatever happens to be in memory), so if you use these values by mistake it can lead to hard-to-find bugs.

Personally, I use undef mainly when I have a vector of objects more complicated than numbers that I don’t want to initialize yet (e.g. Array{Matrix,1}(undef, 20)).
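
A minimal sketch of that use case, assuming a concrete element type `Matrix{Float64}` (the post above uses the abstract `Matrix`) and a made-up fill rule for the loop:

```julia
# Reserve 3 slots for matrices without constructing any matrix yet.
mats = Vector{Matrix{Float64}}(undef, 3)

# For element types stored as references, an unassigned slot can be detected.
println(isassigned(mats, 1))

for i in eachindex(mats)
    mats[i] = zeros(i, i)   # each slot gets its own, differently sized matrix
end

println(all(i -> isassigned(mats, i), eachindex(mats)))
println(size.(mats))
```

Reading an unassigned slot of such a vector throws an `UndefRefError`, which is easier to notice than the silent garbage values you get from `Vector{Float64}(undef, n)`.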

Sukera · October 16, 2020, 8:08pm

One disadvantage of using fill is that the array is filled with the same object, whereas a comprehension fills it with distinct objects:

julia> mutable struct example f::Int end

julia> a = fill(example(1), 3)
3-element Array{example,1}:
 example(1)
 example(1)
 example(1)

julia> b = [ example(1) for _ in 1:3 ]
3-element Array{example,1}:
 example(1)
 example(1)
 example(1)

julia> a[1] === a[2]
true

julia> b[1] === b[2]
false

julia> a[1].f = 3
3

julia> a
3-element Array{example,1}:
 example(3)
 example(3)
 example(3)

julia> b[1].f = 3
3

julia> b
3-element Array{example,1}:
 example(3)
 example(1)
 example(1)

This is because the expression in the comprehension is evaluated for each loop iteration, whereas the argument to fill is only evaluated once (before the call).
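
The same effect shows up with arrays of arrays, which is probably the most common way to run into it. A small sketch (variable names are illustrative only):

```julia
shared   = fill(zeros(2), 3)            # zeros(2) is evaluated once; all 3 slots reference it
distinct = [zeros(2) for _ in 1:3]      # zeros(2) is evaluated 3 times

# Mutate the first inner vector of each container.
shared[1][1]   = 1.0
distinct[1][1] = 1.0

println(shared)     # every inner vector shows the change
println(distinct)   # only the first inner vector shows the change
```

For immutable values such as `0.0` the distinction does not matter, so `fill(0.0, 10)` is safe.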

As for best practice - each version has its (dis)advantages, so use whichever version suits the representation of your problem best. I usually go with comprehensions if the initialization is a long piece of code (which I then put into its own function). When all the function does is initialize an array, I usually go with the undef version if the initialization is a little bit more complicated, and with zeros if I was going to start by zeroing the memory anyway.

Sukera · October 16, 2020, 8:12pm

The allocation numbers are the same because in both cases, only the backing memory is allocated. Each Float64 is 8 bytes, and 800_000_000 bytes ≈ 762.94 MiB.
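
The arithmetic can be checked in a few lines. This sketch counts only the element payload, so it may differ in the last digit from what `@time` reports:

```julia
n = 100_000_000
bytes = n * sizeof(Float64)   # 8 bytes per Float64
mib = bytes / 2^20            # 1 MiB = 2^20 bytes

println(bytes)
println(mib)
```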

aachener · October 18, 2020, 1:00am

Thanks for the explanation.

I tried your code on my machine using Julia 1.5.2, and the following result is interesting.

Why is initializing the array with Vector(undef, 10^8) so much slower than with Vector{Float64}(undef, 10^8)?

julia> @time Vector(undef,10^8);
  0.366178 seconds (2 allocations: 762.940 MiB, 11.75% gc time)

julia> @time Vector{Float64}(undef,10^8);
  0.072459 seconds (2 allocations: 762.940 MiB, 97.68% gc time)

julia> @time fill(0.0,10^8);
  0.481662 seconds (2 allocations: 762.940 MiB, 14.65% gc time)

jling · October 18, 2020, 1:03am

Vector(undef, 10^8) is an Array{Any,1} with its elements undefined, not a Vector{Float64}.
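
A quick way to see the difference, sketched with a small length. A `Vector{Any}` stores references rather than raw numbers; as far as I understand, Julia clears those references on allocation so that unassigned slots can be detected, which would account for the extra time.

```julia
v_any = Vector(undef, 3)            # no element type given: elements are of type Any
v_f64 = Vector{Float64}(undef, 3)   # concrete element type: raw 8-byte floats

println(typeof(v_any))
println(isassigned(v_any, 1))       # reference slots start out unassigned
println(isassigned(v_f64, 1))       # bits-type slots always hold some value
```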

BrunoVasco · April 16, 2021, 4:32pm

Array comprehensions are generally the most elegant solution in Julia. It’s easy to overlook them when we are thinking in for loops and sequential steps, the things we would have done to make it work in Fortran or C… but Julia was made to work better with things like that.
Sometimes I forget, but thanks @stillyslalom for helping me remember it!

a = [<some calculation> for i in 1:10]

That’s a hell of a one-liner!
