Array{Float64}(undef, X) gives wrong results

JosephPollacco · September 20, 2019, 9:36am

I am writing a complex hydrological model in Julia.

https://github.com/joseph-pollacco/HyCatch-1D/tree/master/JULIA

The problem is in the file WaterBalance.jl.

When I run the model from the Julia REPL, I only rarely get the correct results; most of the time the results do not make sense (e.g. NaN in the computed Array).

I found that the problem was caused by initializing an empty array such as:

WaterBalance = Array{Float64}(undef, 50000)

I needed to perform a simple loop.

∑Evaporation = Array{Float64}(undef, 50000)
function GLOBAL()
    for iT = 2:N_iT
        …
        ∑Evaporation[iT] = ∑Evaporation[iT-1] + discret.ΔZ[N_iZevapo] * Qevaporation[iT] * ΔT[iT]
    end
end

I required that

∑Evaporation[1] = 0.0

For reasons I do not understand, the solution I found was to initialize the Array with zeros:

WaterBalance = zeros(Float64, 50000)

So my understanding is that there is a bug in Array.

Tamas_Papp · September 20, 2019, 9:42am

While that is theoretically possible, it is much more likely that there is a bug in your code — you fail to overwrite some elements, and an Array{T}(undef, ...) can have random memory contents.

Isolating an MWE would be helpful; try to simplify your problem so that you can post it here.
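
One way to look for elements that are never overwritten, as a minimal sketch: temporarily allocate with a NaN sentinel instead of undef, run the computation, and list the indices that still hold NaN. The helper name unwritten_indices and the toy loop are hypothetical and not taken from the model, and the trick assumes the computation never legitimately produces NaN.

```julia
# Hypothetical helper: indices that still hold the NaN sentinel after a run.
unwritten_indices(a::AbstractVector{<:AbstractFloat}) = findall(isnan, a)

n = 5
acc = fill(NaN, n)          # sentinel instead of Array{Float64}(undef, n)

# Toy loop standing in for the real model: it starts at 2,
# so acc[1] is never written.
for i in 2:n
    acc[i] = Float64(i)
end

println(unwritten_indices(acc))   # prints [1]
```

Once the unwritten indices are known, the allocation can go back to undef or zeros as appropriate.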

greg_plowman · September 20, 2019, 11:17am

Array{Float64}(undef, 50000) does not initialise the elements. That’s the undef part.

simeonschaub · September 21, 2019, 9:44am

Just put exactly this (∑Evaporation[1] = 0.0) before your loop?

tamasgal · September 21, 2019, 10:44am

Exactly. Your routine starts with iT = 2 (from iT=2:N_iT) and then reads the first element through ∑Evaporation[iT-1].

As @Tamas_Papp and @greg_plowman pointed out, undef will not initialise anything in your array, so you will get a random piece of memory.
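
The effect can be sketched with a stripped-down version of the loop. The function name cumulative! and the input values are made up for illustration; only the loop structure follows the original post.

```julia
# Hypothetical stand-in for the accumulation loop in the original post.
function cumulative!(acc::Vector{Float64}, increments::Vector{Float64})
    for i in 2:length(acc)
        acc[i] = acc[i-1] + increments[i]   # reads acc[1] on the first pass
    end
    return acc
end

increments = [0.0, 1.0, 2.0, 3.0]

# With undef, acc[1] holds whatever was in memory, and every later
# element inherits it through the running sum.
bad = cumulative!(Array{Float64}(undef, 4), increments)

# Setting the first element explicitly makes the result well defined.
good = Array{Float64}(undef, 4)
good[1] = 0.0
cumulative!(good, increments)
println(good)   # prints [0.0, 1.0, 3.0, 6.0]
```

The contents of bad are deliberately not printed, since they depend on whatever happened to be in memory.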

In fact, you can see below that the probability of getting a zero in an undef Array of length 100000 on my machine is around 6% (YMMV, and memory management makes this not-so-predictable), so I guess that roughly reflects your observations:

When I run the model from the Julia REPL, I only rarely get the correct results; most of the time the results do not make sense (e.g. NaN in the computed Array).

julia> sum(iszero.(Array{Float64}(undef, 100000)))  # in a fresh REPL session
6325

Note that executing this again indicates some kind of caching:

julia> sum(iszero.(Array{Float64}(undef, 100000)))
100000

julia> sum(iszero.(Array{Float64}(undef, 100000)))
100000

JosephPollacco · September 22, 2019, 11:45pm

Thanks for your answer. I did initialize before the loop.

∑Evaporation[1] = 0.0

But it did not sort out the problem. Sometimes the code ran fine and other times it did not. The only solution was to initialize with zeros.

JosephPollacco · September 22, 2019, 11:49pm

Thanks for providing a valid explanation of what I observed. So what are the advantages of using:

Array{Float64}(undef, 50000)

compared to using the traditional:

WaterBalance = zeros(Float64, 50000)

Oscar_Smith · September 22, 2019, 11:53pm

The first is faster if you are going to overwrite every element anyway, for example when you start with some nonzero value.

tamasgal · September 23, 2019, 6:50am

If you create an array with undef “elements”, you basically instruct your computer to allocate that amount of memory without caring what’s inside it. If you use zeros, however, you tell it to fill the memory with zeros after the allocation (you initialise it).

Initialisation to a specific value takes an extra amount of time and there are cases where this might be performance relevant.

It depends on your algorithm whether you explicitly need initial values or not. If you only use the array to dump values into it, and you are sure that by design you will never read an element before it has been written (which would lead to unexpected values/behaviour), you can go with undef and squeeze a bit more performance out of the computer. This however is clearly not true for your routine above.
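
As a rough sketch of a case where undef is safe, here every element is written before anything reads it. The function name squares is hypothetical.

```julia
# Hypothetical example: each element is assigned before it is ever read,
# so the uninitialised contents never leak into the result.
function squares(n::Int)
    out = Vector{Float64}(undef, n)
    for i in eachindex(out)
        out[i] = i^2        # write only, no read of out[i]
    end
    return out
end

println(squares(4))   # prints [1.0, 4.0, 9.0, 16.0]
```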

Here are some measurements with BenchmarkTools:

julia> using BenchmarkTools

julia> @btime Array{Float64}(undef, 50000);
  1.318 μs (2 allocations: 390.70 KiB)

julia> @btime zeros(Float64, 50000);
  11.076 μs (2 allocations: 390.70 KiB)

JosephPollacco · September 23, 2019, 8:08am

Thanks Tamas, you beautifully explained the cons/pros of using undef.
