Is there a best practice for initializing arrays prior to for loops / does it matter? (I’ve seen a few similar posts but I don’t think there’s any that explain the differences, or when you might want to use one method or the other.)
I typically use zeros(dims...), which takes a bit longer than undef, but allows simpler control flow when doing, e.g., a[i] += <some calculation>. For simple calculations, I’ll use array comprehensions:
a = [<some calculation> for i in 1:10]
For more-complicated calculations, you can use map with a do-block:
a = map(1:10) do aᵢ
<some calculation>
end
Comprehension/map-ing has an additional advantage: it determines the output’s eltype and shape for you, which is great for some of Julia’s messier parametric types - who wants to type Array{MArray{Tuple{3}, Float64, 1, 3}}(undef, 10)?
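To make the eltype/shape point concrete, here is a minimal sketch using only Base. The function `calc` is a hypothetical stand-in for "some calculation", not something from the posts above:

```julia
# Hypothetical per-element calculation returning a mixed-type tuple.
calc(i) = (i, i^2 / 2)

# Comprehension: the eltype and shape are worked out from the expression.
a = [calc(i) for i in 1:10]

# map with a do-block gives the same result and suits multi-line bodies.
b = map(1:10) do i
    half_square = i^2 / 2
    (i, half_square)
end

# Two iterators in a comprehension produce a matrix without spelling out dims.
m = [i * j for i in 1:3, j in 1:4]

println(typeof(a))
println(size(m))
```

Neither `a` nor `m` needed an explicit `Array{...}(undef, ...)` call, which is the convenience being described.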
When using fill, you go over the whole array in memory and fill it with your value, which takes more time than leaving it uninitialized. It doesn't matter for small arrays, but for large ones it can be significant.
(I'm not sure why the allocation numbers are the same in both cases, though.)
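A rough way to try this comparison yourself, using only the Base macros `@elapsed` and `@allocated` (the original poster may well have used a benchmarking package instead). The size `n` here is an assumption chosen to run quickly and is much smaller than what the thread discusses:

```julia
n = 1_000_000  # assumed size; deliberately small so the sketch runs quickly

# Warm-up calls so that compilation time is not counted below.
fill(0.0, 1)
Vector{Float64}(undef, 1)

# Wall-clock time for a single call of each (noisy; repeat for real measurements).
t_fill  = @elapsed fill(0.0, n)
t_undef = @elapsed Vector{Float64}(undef, n)

# Bytes allocated by each call.
bytes_fill  = @allocated fill(0.0, n)
bytes_undef = @allocated Vector{Float64}(undef, n)

println("fill:  ", t_fill, " s, ", bytes_fill, " bytes")
println("undef: ", t_undef, " s, ", bytes_undef, " bytes")
```

Single-shot timings like these are only indicative; a proper benchmark would run each call many times.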
The downside of an undef Vector is that your array will contain arbitrary values (whatever happens to be in memory), so if you use these values by mistake it can lead to hard-to-find bugs.
Personally, I use an undef Vector mainly when I have a vector of objects more complicated than numbers that I don't want to initialize yet (e.g. Array{Matrix,1}(undef, 20)).
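As a small sketch of that use case: for element types that are stored by reference, such as matrices, the undef slots are unassigned rather than garbage, and `isassigned` can check them. The variable name `mats` and the sizes are made up for illustration:

```julia
# Reserve three slots for matrices without constructing any matrix yet.
mats = Vector{Matrix{Float64}}(undef, 3)

# Slots of a reference-typed array start out unassigned; reading one
# before assignment throws an UndefRefError.
println(isassigned(mats, 1))

# Fill each slot later, once the actual matrices are known.
for i in eachindex(mats)
    mats[i] = zeros(i, i)
end

println(all(i -> isassigned(mats, i), eachindex(mats)))
```

For plain numeric element types like Float64 there is no such check, which is where the "whatever is in memory" caveat applies.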
One disadvantage of using fill is that the array is filled with the same object, whereas a comprehension fills it with distinct objects:
julia> mutable struct example f::Int end
julia> a = fill(example(1), 3)
3-element Array{example,1}:
example(1)
example(1)
example(1)
julia> b = [ example(1) for _ in 1:3 ]
3-element Array{example,1}:
example(1)
example(1)
example(1)
julia> a[1] === a[2]
true
julia> b[1] === b[2]
false
julia> a[1].f = 3
3
julia> a
3-element Array{example,1}:
example(3)
example(3)
example(3)
julia> b[1].f = 3
3
julia> b
3-element Array{example,1}:
example(3)
example(1)
example(1)
This is because the expression in the comprehension is evaluated for each loop iteration, whereas the argument to fill is only evaluated once (before the call).
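The same aliasing shows up with any mutable value, not just custom structs. A minimal sketch with nested vectors (the names `a` and `b` just mirror the session above):

```julia
# fill evaluates zeros(2) once, so all three entries refer to one inner vector.
a = fill(zeros(2), 3)

# The comprehension evaluates zeros(2) on every iteration: three separate vectors.
b = [zeros(2) for _ in 1:3]

# Mutate the first inner vector of each.
a[1][1] = 5.0
b[1][1] = 5.0

println(a)  # every entry of `a` reflects the change
println(b)  # only the first entry of `b` changed
```

With immutable values such as numbers this distinction does not matter, which is why `fill(0.0, n)` is safe.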
As for best practice - each version has its (dis)advantages, so use whichever version best suits the representation of your problem. I usually go with comprehensions if the initialization is a long piece of code (which I then put into its own function). When all the function does is initialize an array, I usually go with the undef version if the initialization is a little more complicated, and with zeros if I was going to start by zeroing the memory anyway.
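To put the three options side by side, here is a small sketch that computes the same result each way. The function `contribution` and the sizes are hypothetical placeholders:

```julia
# Hypothetical stand-in for "some calculation".
contribution(i, j) = i * j

n = 5

# zeros: natural when elements are accumulated with +=.
acc = zeros(Int, n)
for i in 1:n, j in 1:3
    acc[i] += contribution(i, j)
end

# undef: fine when every element is assigned exactly once before being read.
out = Vector{Int}(undef, n)
for i in 1:n
    out[i] = sum(contribution(i, j) for j in 1:3)
end

# Comprehension: the initialization lives in the expression itself.
comp = [sum(contribution(i, j) for j in 1:3) for i in 1:n]

println(acc == out == comp)
```

Which one reads best depends on whether the loop accumulates, assigns once, or can be written as a single expression.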
The allocation numbers are the same because in both cases, only the backing memory is allocated. Each Float64 is 8 bytes, and 800_000_000 bytes is about 762.94 MiB.
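The arithmetic can be checked directly. The element count of 100_000_000 is an assumption inferred from the 800_000_000 bytes quoted above; it is not stated explicitly in the thread:

```julia
n = 100_000_000                # assumed element count (800_000_000 bytes / 8)
bytes = n * sizeof(Float64)    # sizeof(Float64) is 8
mib = bytes / 2^20             # 1 MiB = 2^20 bytes

println(bytes, " bytes is about ", round(mib, digits = 2), " MiB")
```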
Array comprehensions are generally the most elegant solution in Julia. It's easy to not think about them when we are thinking in for loops and sequential steps and the things we would have done to make it work in Fortran or C… but Julia was made to work better with things like that.
Sometimes I forget, but thanks @stillyslalom for helping me remember it!