# Occasionally NaNs when using similar()

I am occasionally (but not always!) getting NaNs when using the similar command on a Float array.

Here is a MME:

``````using Random: seed!

function run_weird_test()
seed!(0) # this really shouldn't do anything... but here just in case
h = [-0.0096, 0.8534, 2.1379, -0.7959]
cnt = 0
for _ = 1:100000
dh = similar(h).*0
if any(isnan.(dh))
cnt += 1
end
end
@show cnt
end
run_weird_test()
``````

I typically get cnt between 10 and 30. So this is not very often, but is quite problematic when it does happen.

If I replace the dh line with

``````dh = zeros(eltype(h),size(h)).*0
``````

then everything works fine, so I think that is the solution.

However, I am still puzzled as to what similar is doing, and after quite a long time trying to find where the NaNs came from and narrowing it down to this, I figured I might as well ask in case someone knows.

Edit: Best current guess is that the random memory slots have NaN (or Inf) already stored there. Maybe that is a strong reason to not use similar in generalâ€¦

The elements of the array created with similar contains undefined Float64 numbers.
NaN is a Float64, so you just get by chance the Float64 representation of NaN in your similar array.
NaN*0 again is NaN.
You may create the similar array with:

``````zeros(Float64,size(h))
``````

EDIT: Ah yes, you found the solution by your own

5 Likes

Using it is fine, but you need to initialize the elements. It does just create a similar array, not similar element values (which doesnâ€™t even makes sense writing it)

Thanks for confirming thatâ€™s whatâ€™s happening! Yes, itâ€™s good to know when we need to initializeâ€¦

Always use ?:

``````help?> similar
search: similar

similar(array, [element_type=eltype(array)], [dims=size(array)])

Create an uninitialized mutable array with the given element type and size, based upon the given source array.
``````
4 Likes

Yes, this is just not the right way to create an array of zeros. `similar(h)` can contain `NaN`s, and it can also contain `Inf`, and both `Inf * 0` and `NaN * 0` equals `NaN`.

Furthermore, `similar(h) .* 0` allocates two arrays and performs multiplication on every element, which is expensive.

If you want to initialize arrays with a certain value, you can have a look at `zeros`, `ones`, and, probably most useful: `fill`. These just create arrays and put values in them, without doing arithmetic.

``````ones(N) .* 3.0
``````

you would write

``````fill(3.0, N)
``````
6 Likes

Or something like:

``````dh = zeros(eltype(h), size(h))
``````

To use whatâ€™s already â€śencodedâ€ť in `h`.

2 Likes

Or just

``````dh = zero(h)
``````

(notice `zero`, not `zeros`).

The nice thing about `zero` (singular) is that it preserves type

``````julia> d = Diagonal(ones(5))
5Ă—5 Diagonal{Float64,Array{Float64,1}}:
1.0   â‹…    â‹…    â‹…    â‹…
â‹…   1.0   â‹…    â‹…    â‹…
â‹…    â‹…   1.0   â‹…    â‹…
â‹…    â‹…    â‹…   1.0   â‹…
â‹…    â‹…    â‹…    â‹…   1.0

julia> zeros(eltype(d), size(d))
5Ă—5 Array{Float64,2}:
0.0  0.0  0.0  0.0  0.0
0.0  0.0  0.0  0.0  0.0
0.0  0.0  0.0  0.0  0.0
0.0  0.0  0.0  0.0  0.0
0.0  0.0  0.0  0.0  0.0

julia> zero(d)
5Ă—5 Diagonal{Float64,Array{Float64,1}}:
0.0   â‹…    â‹…    â‹…    â‹…
â‹…   0.0   â‹…    â‹…    â‹…
â‹…    â‹…   0.0   â‹…    â‹…
â‹…    â‹…    â‹…   0.0   â‹…
â‹…    â‹…    â‹…    â‹…   0.0
``````
4 Likes

I was about to write `zeros([f(h) for f in (eltype, size)]...)` to avoid multiple instances of `h` but your `zero(h)` is the winner

2 Likes

Good to know! Nicely enough, this also works for OffsetArrays (which is what I need for my application) and it maintains the offsets.

I guess that the reason you used `similar` was exactly to preserve type of both container and element type?

Because in that case, the `zeros` and `fill` suggestions were probably not sufficient. The alternatives then are, to my knowledge, either `zero` (singular), or

``````dh = similar(h)
dh .= val
``````

where `val` is whatever value you need, which could be a 0 or 3.2, or whatever.

Edit: Oops. Actually, `dh .= val` is not so great, because it tries to fill the array completely. Actually, `zero` is my best suggestion.

1 Like

Similarly, you can use the `one` (singular) function.

But Iâ€™m not sure what the best way is to create a structured array filled with a certain value. For example, is there a better way to do this:

`````` # the exact nature of the array `h` is not known when the next line is called
h = Diagonal(ones(10))
dh = one(h) .* 3.2  # <- is there a better way?
``````

?

1 Like

In fact the floating point standard was specifically designed so that instead of multiplying by 0.0 (which gets confused by NaN and Inf) and promotes integer types to float types, you can multiply by false. So

``````h .* false
``````

should have the same result as `zero(h)` for most types.

3 Likes

Thanks, I did not know that.

Is that mandated by IEEE-754? Would you happen to have a reference to the part of the standard where this is mentioned?

Iâ€™m not sure if `one` always preserves type and if this is a missing method:

``````julia> dh = LowerTriangular(zeros(2,2))
2Ă—2 LowerTriangular{Float64,Array{Float64,2}}:
0.0   â‹…
0.0  0.0

julia> one(dh)
2Ă—2 Array{Float64,2}:
1.0  0.0
0.0  1.0
``````

While `one(T)` returns a value of type `T`, one might end up calling `one(x::AbstractArray)` if a specific method is not defined for that array type.

Sorry, I meant Juliaâ€™s standard, since IEEE-754 only defines the outcome when both operands are floating point numbers. Details of operations that create floating points are thus language specific. As per IEEE Std 754-2008 (Clause 7.1),

Language standards should specify defaults in the absence of any explicit user specification, governing:

• Whether any particular flag exists (in the sense of being testable by non-programmatic means such as debuggers) outside of scopes in which a program explicitly sets or tests that flag.
• When flags have scope greater than within an invoked function, whether and when an asynchronous event, such as raising or lowering it in another thread or signal handler, affects the flag tested within that invoked function.

So, in particular, the Julia language specifies that multiplication of an `AbstractFloat` by `false` produces a signed `0.0`. In particular the language itself specifies a new scope to which none of the floating point error flags are propagated.

Also notable, is that by IEEE Std 754-2008 (Clause 11), we only earn reproducible floating-point results with programs that

â€“ Do not use signaling NaNs.
â€“ <snipped>
â€“ Do not depend on quiet NaN propagation, payloads, or sign bits.
â€“ Do not depend on the underflow and inexact exceptions and flags.

So Juliaâ€™s specification is consistent with Reproducible floating-point results. Within a reproducible program, the legal forms of branching on `isnan` are quite limited.

3 Likes

Thanks for the detailed explanation!