Fastest way to check for Inf or NaN in an array?

Here are my best attempts. One is just #48462 (which should become standard in v1.10); the other is based on bit twiddling:

isfinite_v1_10(x::AbstractFloat) = !isnan(x - x)
isfinite_bittwiddle(x::T) where T<:Base.IEEEFloat = reinterpret(Unsigned,x) & Base.exponent_mask(T) != Base.exponent_mask(T)

using BenchmarkTools
x = randn(1000);
# x[345] = -Inf # enable if desired
@btime mapreduce(isfinite,&,x)
@btime mapreduce(isfinite_v1_10,&,x)
@btime mapreduce(isfinite_bittwiddle,&,x)
# all around 230ns on v1.8.0

@btime mapfoldl(isfinite,&,x)
@btime mapfoldl(isfinite_v1_10,&,x)
@btime mapfoldl(isfinite_bittwiddle,&,x)
# all around 125ns on v1.8.0

Note that mapfoldl was significantly faster for me (on v1.8) than mapreduce. mapreduce sometimes performs poorly (possibly #48129, but maybe something different). None of the three isfinite variants was faster than the others, so my re-definitions gained nothing.
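As a sanity check on the two re-definitions, here is a small sketch that compares them against Base's isfinite on a handful of special values. The first relies on x - x being 0 for finite x and NaN otherwise (Inf - Inf is NaN); the second tests whether all exponent bits are set, which is the IEEE encoding of Inf and NaN. The sample values are my own choice, not from the benchmark above.

```julia
# Definitions copied from the post above.
isfinite_v1_10(x::AbstractFloat) = !isnan(x - x)
isfinite_bittwiddle(x::T) where T<:Base.IEEEFloat =
    reinterpret(Unsigned, x) & Base.exponent_mask(T) != Base.exponent_mask(T)

# Special values plus a few ordinary ones (illustrative choice).
samples = [0.0, -0.0, 1.5, -2.0, floatmax(Float64), nextfloat(0.0), Inf, -Inf, NaN]

for v in samples
    @assert isfinite_v1_10(v) == isfinite(v)
    @assert isfinite_bittwiddle(v) == isfinite(v)
end

# The same check for the narrower IEEE types.
for T in (Float16, Float32)
    for v in T[0, 1, Inf, -Inf, NaN]
        @assert isfinite_v1_10(v) == isfinite(v)
        @assert isfinite_bittwiddle(v) == isfinite(v)
    end
end
```

If every assertion passes silently, both variants behave like isfinite on these inputs, so any timing difference would be purely about code generation.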

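And here’s my best hand-written (read: ugly) fma version. It relies on 0 * x being 0 for finite x but NaN for Inf or NaN, so a single non-finite element turns the accumulator into NaN. Hopefully LoopVectorization could do something similar, but I didn’t want to mess with it.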

function allfinite(x)
	unroll = 16
	V = Val(unroll)
	z = zero(eltype(x))
	# begin vectorized segment
	zz = ntuple(_->z, V)
	ss = zz
	startinds = eachindex(x)[begin:unroll:end-unroll+1]
	for i in startinds
		xx = ntuple(j -> @inbounds(x[i+j-1]), V)
		ss = fma.(zz, xx, ss)
	end
	ss == zz || return false
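	# begin the tail; startinds is empty when length(x) < unroll,
	# so startinds[end] would throw for short arrays
	s = z
	tailstart = isempty(startinds) ? firstindex(x) : last(startinds) + unroll
	for i in tailstart:lastindex(x)
		s = fma(z, @inbounds(x[i]), s)
	end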
	return s == z
end

@btime allfinite($x)
# 70-90ns, depending on the need to check the tail
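To make the accumulation trick concrete, here is a scalar sketch of the same idea without the unrolling. The name allfinite_scalar is hypothetical and only for illustration; it is not meant to be fast, just to show why comparing the accumulator with zero at the end is enough.

```julia
# Scalar form of the accumulation trick: 0 * v is 0 for finite v,
# but NaN when v is Inf, -Inf, or NaN, and NaN then sticks in the sum.
function allfinite_scalar(x)   # hypothetical name, illustration only
    z = zero(eltype(x))
    s = z
    for v in x
        s = fma(z, v, s)
    end
    return s == z   # NaN == 0 is false
end

@assert fma(0.0, 3.0, 0.0) == 0.0
@assert isnan(fma(0.0, Inf, 0.0))
@assert isnan(fma(0.0, NaN, 0.0))
@assert isnan(fma(0.0, 1.0, NaN))   # a poisoned accumulator stays NaN

@assert allfinite_scalar([1.0, -2.0, 3.5])
@assert !allfinite_scalar([1.0, -Inf, 3.5])
@assert !allfinite_scalar([NaN, 2.0])
```

The unrolled version does the same thing with 16 independent accumulators, which is what lets the compiler emit the packed fma instructions.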

This looks like the winner. I can’t imagine a way to do this any faster except by unrolling more (which has its drawbacks). The inner loop in the @code_native output is just

.LBB0_4:                                # %L295
                                        # =>This Inner Loop Header: Depth=1
        vfmadd231pd     (%rax,%rdx,8), %ymm0, %ymm3 # ymm3 = (ymm0 * mem) + ymm3
        vfmadd231pd     32(%rax,%rdx,8), %ymm0, %ymm4 # ymm4 = (ymm0 * mem) + ymm4
        vfmadd231pd     64(%rax,%rdx,8), %ymm0, %ymm1 # ymm1 = (ymm0 * mem) + ymm1
        vfmadd231pd     96(%rax,%rdx,8), %ymm0, %ymm2 # ymm2 = (ymm0 * mem) + ymm2
        addq    $16, %rdx
        cmpq    %rdx, %rcx
        jne     .LBB0_4

If you don’t expect to find any non-finite values, I wouldn’t bother checking for an early exit, except maybe once before the tail.
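If non-finite values are common enough that an early exit matters, one middle ground is to test the accumulator once per block rather than once per element. The sketch below only illustrates that control flow; allfinite_blocked and the blocksize keyword are hypothetical names, the default of 256 is an arbitrary assumption, and the plain inner loop stands in for the unrolled kernel, so I have not benchmarked it.

```julia
# Hypothetical helper: check in blocks so a non-finite value near the
# start is reported without scanning the whole array.
# Assumes blocksize >= 1.
function allfinite_blocked(x::AbstractVector{T}; blocksize::Int = 256) where {T<:AbstractFloat}
    z = zero(T)
    lo = firstindex(x)
    while lo <= lastindex(x)
        hi = min(lo + blocksize - 1, lastindex(x))
        s = z
        for i in lo:hi
            s = fma(z, @inbounds(x[i]), s)
        end
        s == z || return false   # early exit once per block, not per element
        lo = hi + 1
    end
    return true
end

y = collect(1.0:1000.0)
@assert allfinite_blocked(y)
y[3] = Inf
@assert !allfinite_blocked(y)   # returns after the first block
```

A larger block means fewer branches but more wasted work after the first bad value, so the right size depends on how often you expect to hit one.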
