How does Vector{Int64}(undef, 20) differ in a multi-threaded environment?

I’m running a program that allocates an array of 20 integers, which I want to be initialized with zeros. So to start off with, I did this:

myfunc(Vector{Int64}(undef, 20))

This seemed to work, since the array came back filled with zeros. But when I did the following, the arrays were filled with arbitrary numbers, which led to overflows, even when the thread count was 1:

Threads.@threads for i = 1:Threads.nthreads()
  myfunc(Vector{Int64}(undef, 20))
end

I now know that I should use zeros(Int64, 20) instead of allocating the array directly. But just for my information, how does multithreading affect whether calling the Vector constructor allocates zeros vs allocating garbage?

It doesn’t, except that what happens to be in memory might be different. Initializing an array with undef gives you whatever happens to be in that memory, which could be zeros but isn’t guaranteed to be, with or without threads. If you keep doing it in either situation, you’ll get non-zeros at some point.
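For what it's worth, here is a minimal sketch of the threaded loop with explicit zero initialization. The body of myfunc is a made-up stand-in (the original was not posted), and the results vector is only there so the outcome can be inspected; I'm assuming myfunc accumulates into its buffer, which is the kind of code that breaks on uninitialized memory.

```julia
# Hypothetical stand-in for the poster's myfunc: it adds into the buffer,
# so the result is only meaningful if the buffer starts out as all zeros.
function myfunc(buf::Vector{Int64})
    for i in eachindex(buf)
        buf[i] += i
    end
    return sum(buf)
end

# One result slot per iteration; each iteration writes only its own slot.
results = zeros(Int64, Threads.nthreads())

Threads.@threads for i in 1:Threads.nthreads()
    # zeros(...) guarantees the contents; Vector{Int64}(undef, 20) does not.
    results[i] = myfunc(zeros(Int64, 20))
end
```

With zeroed buffers every iteration returns the same value regardless of the thread count. With undef buffers the result would depend on whatever the allocator handed back.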

For example, I just did that on my machine with a single thread and got this:

julia> Vector{Int64}(undef, 20)
20-element Vector{Int64}:
  2
 21
 23
 25
 26
 40
 43
 44
 46
 48
 49
 54
 55
 60
 61
 80
 82
 84
 85
 86

Then I did it again and got this:

julia> v = Vector{Int64}(undef, 20)
20-element Vector{Int64}:
 5337514192
 4717709280
 4717709280
 4356005896
 4356005896
 4356005896
 4717709280
 4717709280
 5337857376
 5337857664
 5337856848
 4717709280
 4717709280
 4356005896
 4356005896
 4356005896
 4356005896
 4356005896
 4356005896
 4356005896

There happen to be large chunks of memory that are all zeros, so you will often get zeros back, but it’s just luck.

To be clear, undef is an UndefInitializer():

julia> undef
UndefInitializer(): array initializer with undefined values

By creating an undef array, you are explicitly asking for an array whose contents are left uninitialized, so it holds whatever happens to be in memory. Sometimes zeros are just what happens to be there.
Use zeros if you want zeros.

julia> zeros(Int, 20)
20-element Vector{Int64}:
 0
 0
 0
 0
 0
 0
 0
 0
 0
 0
 0
 0
 0
 0
 0
 0
 0
 0
 0
 0
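As a rough sketch of the options (the variable names are only for illustration): zeros, fill, and fill! on an undef array all give defined contents, while a bare undef array is only appropriate when every element is written before it is read.

```julia
a = zeros(Int64, 20)                     # allocated and set to zero
b = fill(0, 20)                          # same idea, works for any fill value
c = fill!(Vector{Int64}(undef, 20), 0)   # allocate uninitialized, then overwrite

# undef is fine when every element is assigned before it is ever read:
d = Vector{Int64}(undef, 20)
for i in eachindex(d)
    d[i] = i^2
end
```

The last pattern is the usual reason to reach for undef at all: it skips a fill pass that would be overwritten anyway.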

I did something like the following long ago and found it instructive for seeing how often you can expect to get all-zero memory when using undef for Int64 arrays:

function sim(n = 10, n_reps = 1_000_000)
    all_zeros = 0
    sum_zeros = 0

    for _ in 1:n_reps
        x = Array{Int}(undef, n)
        count_zeros = 0
        for i in 1:n
            count_zeros += (x[i] == 0)
        end
        all_zeros += (count_zeros == n)
        sum_zeros += count_zeros
    end

    (all_zeros / n_reps, sum_zeros / (n_reps * n))
end

all_zeros_freq, elementwise_zero_freq = sim()

On my laptop, opening a REPL and then running that whole block (including redefining the function) many times gradually lowers the frequency of all-zero memory as more and more dirty memory gets reused. The first round produces something like (0.997094, 0.9973374), but I can get it down below 90% if I keep copying and pasting the same code.