What is the fastest way to check for Inf or NaN in an array? This has surprisingly become the bottleneck in one of my codes, and I’m looking for the fastest possible strategy.
I noticed that running sum on an array, then checking whether the output is finite, is faster than doing any(isnan.(x)) on the array. I found this weird, although I realize that sum is extremely optimized and any probably is not. It is also because my arrays will contain very few NaNs/Infs, so any will have to search through the entire array.
In fact, just using sum instead of any is actually more efficient on a very sparse array in general:
fast_any(x) = sum(x) > 0
x = zeros(Bool, 10000)
@btime any(x) # 3.120 μs
@btime fast_any(x) # 287.138 ns
A whole order of magnitude… (I note that when the array is not sparse, any is the clear winner. But for an array with very few true values, sum is faster, which I find odd.)
Here are my current implementations of Inf/NaN checkers, with their speeds given below:
function simple_check(x)
    any(isnan.(x)) || any(.!isfinite.(x))
end
function simd_check(x, n)
    isnan_or_inf = Array{Bool, 1}(undef, n)
    @inbounds @simd for j=1:n
        isnan_or_inf[j] = isnan(x[j]) || !isfinite(x[j])
    end
    any(isnan_or_inf)
end
function sum_check(x)
    s = sum(x)
    isnan(s) || !isfinite(s)
end
The performances are (using x = randn(Float32, 1000), i.e., no non-finite values):
julia> @btime simple_check(x)
649.153 ns (6 allocations: 8.81 KiB)
julia> @btime simd_check(x, 1000)
558.973 ns (1 allocation: 1.06 KiB)
julia> @btime sum_check(x)
90.423 ns (0 allocations: 0 bytes)
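For reference, the allocations reported for simple_check come from the broadcasts, which build temporary Bool arrays before any runs. A minimal sketch of an allocation-free equivalent is below. The name simple_check_noalloc is made up for illustration. It relies on isfinite returning false for NaN as well as for Inf and -Inf, so the separate isnan test is not needed. I have not timed it here.

```julia
# Hypothetical helper: same answer as simple_check, without temporary arrays.
# isfinite(NaN), isfinite(Inf) and isfinite(-Inf) are all false, so a single
# predicate covers every non-finite case.
simple_check_noalloc(x) = any(!isfinite, x)
```

Passing the predicate to any directly, instead of broadcasting first, lets it test elements one at a time and stop at the first hit.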
Now, obviously I don’t actually need to compute the sum of this array, but using the sum operation is significantly faster than the others! (Note that summing an Inf or NaN will result in another non-finite value.)
Is there a faster way to check for NaNs and Infs? Note that my arrays will seldom actually contain one, so the method should be optimized for that case.
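One direction that might fit the "almost never non-finite" case is to scan the array in fixed-size chunks: inside a chunk the loop has no branches, so the compiler is free to vectorize it, and the early exit only happens between chunks. The sketch below is an untested idea in terms of speed. The function name all_finite_chunked and the chunk size of 256 are assumptions chosen for illustration, and the best chunk size would have to be found by benchmarking.

```julia
# Hypothetical sketch: branch-free inner loop, early exit between chunks.
function all_finite_chunked(x::Array{T}; chunk::Int = 256) where {T<:AbstractFloat}
    chunk > 0 || throw(ArgumentError("chunk must be positive"))
    n = length(x)
    start = 1
    while start <= n
        stop = min(start + chunk - 1, n)
        bad = false
        # OR-reduce the per-element test; no early return inside the loop.
        @inbounds @simd for j in start:stop
            bad |= !isfinite(x[j])
        end
        bad && return false   # a non-finite value was seen in this chunk
        start = stop + 1
    end
    return true
end
```

Unlike the sum trick, this never adds the values together, so large finite entries cannot overflow into a false alarm.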
Edit #1: Updated times:
julia> @btime all(isfinite, x) setup=x=randn(Float32, 1000)
321.970 ns (0 allocations: 0 bytes)
julia> @btime isfinite(sum(x)) setup=x=randn(Float32, 1000)
80.913 ns (0 allocations: 0 bytes)
and, if overflows are an issue:
julia> @btime isfinite(sum(x .* 0)) setup=x=randn(Float32, 1000)
229.604 ns (1 allocation: 4.06 KiB)
(since 0 * Inf = NaN), so sum is still faster…
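The allocation in the overflow-safe timing comes from x .* 0 building a temporary array. A possible variant, sketched here with a made-up name and not benchmarked, passes the multiplication to sum as a function so that nothing is materialized:

```julia
# Hypothetical helper: multiplying by zero maps every finite value to
# (plus or minus) zero and every Inf or NaN to NaN, so the sum cannot
# overflow. Returns true when a non-finite entry is present.
sum_check_no_overflow(x) = !isfinite(sum(xi -> xi * zero(xi), x))
```

For example, with huge = fill(floatmax(Float32), 1000), the plain isfinite(sum(huge)) is false because the sum overflows to Inf, while sum_check_no_overflow(huge) correctly reports that no non-finite entries are present.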