 # Best Practice for Initializing an Array Prior to for Loop

Hello,

Is there a best practice for initializing arrays prior to for loops, or does it not matter? (I’ve seen a few similar posts, but I don’t think any of them explain the differences, or when you might want to use one method or the other.)

E.g. is there a difference between, say:

`a = fill(0.0, 10)`

and

`a = Vector{Float64}(undef, 10)`

prior to something like:

    for i in eachindex(a)
        a[i] = <some calculation>
    end

Thanks,

Dave

I typically use `zeros(dims...)`, which takes a bit longer than `undef`, but allows simpler control flow when doing, e.g., `a[i] += <some calculation>`. For simple calculations, I’ll use array comprehensions:

```julia
a = [<some calculation> for i in 1:10]
```

For more-complicated calculations, you can use `map` with a `do`-block:

```julia
a = map(1:10) do aᵢ
    <some calculation>
end
```

Comprehension/`map`-ing has an additional advantage: it determines the output’s eltype and shape for you, which is great for some of Julia’s messier parametric types - who wants to type `Array{MArray{Tuple{3}, Float64, 1, 3}}(undef, 10)`?
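To illustrate the point about eltype and shape, here is a minimal sketch. The function `calc` is a hypothetical stand-in for `<some calculation>`, not something from the posts above:

```julia
# Hypothetical element-wise calculation; stands in for "<some calculation>".
calc(i) = (i, sqrt(i))

a = [calc(i) for i in 1:10]          # comprehension over one range
b = map(calc, 1:10)                  # same result via map
m = [i * j for i in 1:2, j in 1:3]   # two iterators produce a 2×3 matrix

# Neither the element type nor the shape was written out by hand.
@show eltype(a)
@show size(m)
```

Nothing here required spelling out `Vector{Tuple{Int64, Float64}}` or the matrix dimensions; both follow from the expression and the iterators.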


When using `fill`, you go over the whole array in memory and fill it with your value, which takes more time. It doesn’t matter for small arrays, but for large ones it can be significant:

```julia
julia> @time Vector{Float64}(undef, 100_000_000);
  0.000017 seconds (2 allocations: 762.940 MiB)

julia> @time fill(0.0, 100_000_000);
  0.229996 seconds (2 allocations: 762.940 MiB)
```

(not sure why the allocation numbers are the same?)

The downside of `Vector{Float64}(undef, n)` is that your array will contain arbitrary values (whatever happens to be in memory), so if you use these values by mistake it can lead to hard-to-find bugs.
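A minimal sketch of when each choice is safe (the size `n` and the `k * i` calculation are made up for illustration): accumulating with `+=` reads the old value, so it needs a defined starting point, while plain assignment never reads the uninitialized contents.

```julia
n = 5

# Accumulation reads acc[i] before writing, so start from zeros.
acc = zeros(n)
for i in eachindex(acc), k in 1:3
    acc[i] += k * i          # illustrative calculation only
end

# Every element is assigned before it is ever read, so undef is fine here.
out = Vector{Float64}(undef, n)
for i in eachindex(out)
    out[i] = 6.0 * i         # same totals, written in one step
end

@show acc == out
```

Swapping `zeros(n)` for `Vector{Float64}(undef, n)` in the first loop would still run, but the results would depend on whatever was in memory.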

Personally, I use the `undef` constructor mainly when I have a vector of objects more complicated than numbers that I don’t want to initialize yet (e.g. `Array{Matrix,1}(undef, 20)`).


One disadvantage of using `fill` is that the array is filled with the same object, whereas a comprehension fills it with distinct objects:

```julia
julia> mutable struct example f::Int end

julia> a = fill(example(1), 3)
3-element Array{example,1}:
 example(1)
 example(1)
 example(1)

julia> b = [ example(1) for _ in 1:3 ]
3-element Array{example,1}:
 example(1)
 example(1)
 example(1)

julia> a[1] === a[2]
true

julia> b[1] === b[2]
false

julia> a[1].f = 3
3

julia> a
3-element Array{example,1}:
 example(3)
 example(3)
 example(3)

julia> b[1].f = 3
3

julia> b
3-element Array{example,1}:
 example(3)
 example(1)
 example(1)
```

This is because the expression in the comprehension is evaluated for each loop iteration, whereas the argument to `fill` is only evaluated once (before the call).
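The same thing tends to bite with arrays of arrays. A small sketch, with names (`shared`, `distinct`) chosen just for this example:

```julia
# fill evaluates zeros(2) once, so all three slots hold the same vector.
shared = fill(zeros(2), 3)

# The comprehension evaluates zeros(2) on every iteration.
distinct = [zeros(2) for _ in 1:3]

shared[1][1] = 1.0
distinct[1][1] = 1.0

@show shared      # the change is visible through every slot
@show distinct    # only the first inner vector changed
```

For immutable values such as `0.0` this sharing is harmless, which is why `fill(0.0, 10)` behaves as expected.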

As for best practice: each version has its (dis)advantages, so use whichever suits the representation of your problem best. I usually go with comprehensions if the initialization is a long piece of code (which I then put into its own function). When all the function does is initialize an array, I usually go with the `undef` version if the initialization is a little more complicated, and with `zeros` if I was going to start by zeroing the memory anyway.


The allocation numbers are the same because in both cases, only the backing memory is allocated. Each `Float64` is 8 bytes, and 800_000_000 bytes ≈ 762.94 MiB.
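The arithmetic can be checked directly. This is only a sketch of the calculation; the variable names are mine, and I am assuming the small remaining difference from what `@time` prints comes from per-array bookkeeping rather than the element data:

```julia
n = 100_000_000

bytes = n * sizeof(Float64)   # 8 bytes per element
mib = bytes / 2^20            # 1 MiB = 2^20 bytes

println(bytes, " bytes is about ", round(mib; digits = 2), " MiB")
```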


Thanks for the explanation.

I tried your code on my machine using Julia 1.5.2, and the following result is interesting.

Why is initializing the array with a bare `Vector(undef, ...)` so much slower than with `Vector{Float64}(undef, ...)`?

```julia
julia> @time Vector(undef,10^8);
  0.366178 seconds (2 allocations: 762.940 MiB, 11.75% gc time)

julia> @time Vector{Float64}(undef,10^8);
  0.072459 seconds (2 allocations: 762.940 MiB, 97.68% gc time)

julia> @time fill(0.0,10^8);
  0.481662 seconds (2 allocations: 762.940 MiB, 14.65% gc time)
```

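`Vector(undef, 10^8)` is an `Array{Any,1}` with all elements undefined. Its slots hold references rather than raw `Float64` values, and those references start out unassigned.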
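You can see the difference between the two constructors on a tiny array. A minimal sketch (the names `v` and `w` are just for illustration):

```julia
v = Vector(undef, 3)            # no element type given, so it defaults to Any
w = Vector{Float64}(undef, 3)   # concrete bits-type elements

@show eltype(v)
@show isassigned(v, 1)   # slots of an Any vector start out unassigned
@show isassigned(w, 1)   # a Float64 slot always holds some (arbitrary) value
```

Reading `v[1]` before assigning it throws an `UndefRefError`, whereas reading `w[1]` silently returns whatever bits were in memory.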
