I am occasionally (but not always!) getting NaNs when using the similar command on a Float array.
Here is an MWE:
using Random: seed!
function run_weird_test()
    seed!(0) # this really shouldn't do anything... but here just in case
    h = [-0.0096, 0.8534, 2.1379, -0.7959]
    cnt = 0
    for _ = 1:100000
        dh = similar(h) .* 0
        if any(isnan.(dh))
            cnt += 1
        end
    end
    @show cnt
end
run_weird_test()
I typically get cnt between 10 and 30. So this is not very often, but is quite problematic when it does happen.
If I replace the dh line with
dh = zeros(eltype(h),size(h)).*0
then everything works fine, so I think that is the solution.
However, I am still puzzled as to what similar is doing, and after quite a long time trying to find where the NaNs came from and narrowing it down to this, I figured I might as well ask in case someone knows.
Edit: Best current guess is that the random memory slots have NaN (or Inf) already stored there. Maybe that is a strong reason to not use similar in general…
The elements of an array created with similar contain uninitialized Float64 values: whatever bits happened to be in that memory.
NaN is a valid Float64 bit pattern, so by chance some of those leftover bits are read as NaN in your similar array.
NaN*0 again is NaN.
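The arithmetic can be checked directly. A minimal sketch, where the array `bad` is only an illustrative stand-in for what uninitialized memory might happen to contain:

```julia
# Zero times a non-finite Float64 is NaN, so multiplying
# arbitrary memory contents by 0 does not reliably give zeros.
bad = [NaN, Inf, -Inf, 1.5]   # assumed contents, for illustration only
result = bad .* 0

@show result          # the first three entries are NaN, the last is 0.0
@show isnan.(result)
```

Only the finite entry ends up as zero; the NaN and both infinities all become NaN.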
You may create the similar array with:
Using it is fine, but you need to initialize the elements. It just creates a similar array, not similar element values (which wouldn't even make sense).
help?> similar
search: similar
similar(array, [element_type=eltype(array)], [dims=size(array)])
Create an uninitialized mutable array with the given element type and size, based upon the given source array.
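As a rough illustration of that signature, here is a sketch of the three call forms, followed by an explicit initialization. The variable names are arbitrary, and the contents of each array before `fill!` should be treated as garbage:

```julia
h = [-0.0096, 0.8534, 2.1379, -0.7959]

a = similar(h)                    # Vector{Float64}, length 4, contents arbitrary
b = similar(h, Int)               # Vector{Int}, length 4, contents arbitrary
c = similar(h, Float32, (2, 3))   # 2x3 Matrix{Float32}, contents arbitrary

# Overwrite every element before reading any of them.
fill!(a, 0.0)
fill!(b, 0)
fill!(c, 0)

@show typeof(a) typeof(b) typeof(c)
```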
Yes, this is just not the right way to create an array of zeros. similar(h) can contain NaNs, and it can also contain Inf, and both Inf * 0 and NaN * 0 equal NaN.
Furthermore, similar(h) .* 0 allocates two arrays and performs multiplication on every element, which is expensive.
If you want to initialize arrays with a certain value, you can have a look at zeros, ones, and, probably most useful: fill. These just create arrays and put values in them, without doing arithmetic.
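A short sketch of those three constructors, using the `h` from the original question (the value 3.2 is just an example):

```julia
h = [-0.0096, 0.8534, 2.1379, -0.7959]

a = zeros(eltype(h), size(h))   # every element is 0.0
b = ones(eltype(h), size(h))    # every element is 1.0
c = fill(3.2, size(h))          # every element is 3.2

@show a b c
```

Each call allocates a single array and writes the value into it, so no leftover memory contents are ever read.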
I guess that the reason you used similar was exactly to preserve type of both container and element type?
Because in that case, the zeros and fill suggestions were probably not sufficient. The alternatives then are, to my knowledge, either zero (singular), or
dh = similar(h)
dh .= val
where val is whatever value you need, which could be a 0 or 3.2, or whatever.
Edit: Oops. Actually, dh .= val is not so great, because it tries to fill the array completely. Actually, zero is my best suggestion.
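For a plain array, `zero` can be sketched like this. The Float32 input is only chosen to make the preserved element type visible:

```julia
h = Float32[-0.0096, 0.8534, 2.1379, -0.7959]

dh = zero(h)   # same container type, element type, and size as h, all zeros

@show typeof(dh)
@show dh
```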
In fact the floating point standard was specifically designed so that instead of multiplying by 0.0 (which gets confused by NaN and Inf, and promotes integer types to float types), you can multiply by false. So
h .* false
should have the same result as zero(h) for most types.
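A small sketch comparing the two multiplications. The arrays here are made up for illustration:

```julia
h = [NaN, Inf, -0.0096, 2.1379]

by_zero  = h .* 0       # NaN and Inf both turn into NaN
by_false = h .* false   # every entry becomes a (possibly signed) zero

@show by_zero
@show by_false

# Multiplying by false also keeps an integer element type,
# whereas multiplying by 0.0 promotes to Float64.
ints = [1, 2, 3]
@show eltype(ints .* 0.0)
@show eltype(ints .* false)
```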
Sorry, I meant Julia’s standard, since IEEE-754 only defines the outcome when both operands are floating point numbers. Details of operations that create floating points are thus language specific. As per IEEE Std 754-2008 (Clause 7.1),
Language standards should specify defaults in the absence of any explicit user specification, governing:
Whether any particular flag exists (in the sense of being testable by non-programmatic means such as debuggers) outside of scopes in which a program explicitly sets or tests that flag.
When flags have scope greater than within an invoked function, whether and when an asynchronous event, such as raising or lowering it in another thread or signal handler, affects the flag tested within that invoked function.
So, in particular, the Julia language specifies that multiplication of an AbstractFloat by false produces a signed 0.0. In particular the language itself specifies a new scope to which none of the floating point error flags are propagated.
Also notable is that, by IEEE Std 754-2008 (Clause 11), we only get reproducible floating-point results with programs that
– Do not use signaling NaNs.
– <snipped>
– Do not depend on quiet NaN propagation, payloads, or sign bits.
– Do not depend on the underflow and inexact exceptions and flags.
So Julia’s specification is consistent with Reproducible floating-point results. Within a reproducible program, the legal forms of branching on isnan are quite limited.
Thank you @GunnarFarneback and @Palli for the clarification on the rationale behind similar().
I want a function which takes a variable x, and outputs another array of the same size and same element types, but without any NaNs or Infs.
I guess I can’t justify a strong use-case for it, but really, it is the randomness of the presence of NaNs which bothers me. Granted, it is my fault that I’m using poorly instantiated variables directly for computation/comparison (not intentionally of course), but I was stumped on how my code would run correctly two out of three times, and keep running indefinitely the third time. I’m far from an expert on Floats, and am unsure if my next statement makes sense, but I’d like similar to output either all NaNs or no NaNs, consistently.
For now, I’ll stick to using zeros and fill functions for my use cases (which don’t care about the time taken for the instantiation of arrays).
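One way such a function could look is sketched below. The name `zeroed_like` is hypothetical and not part of Base. It simply combines `similar` with `fill!`, so every element is written before it is ever read:

```julia
# Hypothetical helper (not in Base): same size and element type as x,
# but with every element explicitly set to zero.
zeroed_like(x::AbstractArray) = fill!(similar(x), zero(eltype(x)))

x = [NaN, Inf, 1.0, -2.5]
y = zeroed_like(x)

@show y
@show all(isfinite, y)
```

For ordinary arrays, `zero(x)` should give an equivalent result. The explicit version just makes the initialization step visible.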
Maybe it’s time to rename similar to dis_similar? Its docs should maybe mention that NaNs and Infs are possible, though that possibility is something people should know about floats anyway. It’s only surprising if you don’t know floats well and/or the purpose of similar.
It’s already been explained here: it’s an optimization (to not actually fill memory, which is an O(n) operation, unlike the “instant” similar).
I proposed a new module called Unsafe, to get rid of ALL operations of Julia that are unsafe, i.e. only have access to them after you do using Unsafe. Possibly similar belongs there… but it’s a very common pattern you see used in Julia code, and should not be used unless you know why, and you properly initialize somehow (all of the array, or at least, parts you later access).
When is this ever helpful? Note I thought you meant signed as in +0.0, but to clarify for others, you can also get:
julia> -NaN * false # so not good for initializing an array; also for -0.0 * false or -Inf * false
-0.0