Multiplying by booleans faster than if/else

Avoiding the branch lets the compiler emit SIMD instructions that process several elements of the input arrays at once. The conventional way to write this is something like y = ifelse(B[i], x[i]^2, 0.0). Because ifelse is an ordinary function, both value arguments are evaluated before one of them is selected, so no branch is needed. For straight-line code like this, it’s also worth putting @inbounds on the loop or trying LoopVectorization:

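using LoopVectorization  # provides @avxt (called @tturbo in newer releases)

function f4(x::Array{T,1}, B) where {T}
    n = length(x)
    y = zero(T)
    # ifelse selects a value without branching, so the loop body stays straight-line
    @avxt for i = 1:n
        y = ifelse(B[i], x[i]^2, zero(T))
    end
    return y
end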
julia> @btime f4($x, $B)
  369.307 ns (0 allocations: 0 bytes)
0.0
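
For comparison, here is a minimal sketch of the same idea in plain Julia, without LoopVectorization. It assumes the workload is a masked sum of squares, which is my guess at the intent rather than something stated above, and the function names (sum_branch, sum_ifelse, sum_mul) are made up for illustration. I haven't benchmarked these, so treat any speed difference as something to measure on your own data.

```julia
# Hypothetical helpers; a masked sum of squares is assumed as the workload.

# Branching version: the `if` inside the loop can get in the way of vectorization.
function sum_branch(x::AbstractVector{T}, B::AbstractVector{Bool}) where {T}
    y = zero(T)
    for i in eachindex(x, B)
        if B[i]
            y += x[i]^2
        end
    end
    return y
end

# Branch-free version: both value arguments of `ifelse` are computed,
# then one of them is selected.
function sum_ifelse(x::AbstractVector{T}, B::AbstractVector{Bool}) where {T}
    y = zero(T)
    @inbounds @simd for i in eachindex(x, B)
        y += ifelse(B[i], x[i]^2, zero(T))
    end
    return y
end

# Boolean-multiply version: `true * v` is `v` and `false * v` is zero.
function sum_mul(x::AbstractVector{T}, B::AbstractVector{Bool}) where {T}
    y = zero(T)
    @inbounds @simd for i in eachindex(x, B)
        y += B[i] * x[i]^2
    end
    return y
end

x = collect(1.0:8.0)
B = [true, false, true, false, true, false, true, false]

# All three agree on this input: 1 + 9 + 25 + 49 = 84
println(sum_branch(x, B), " ", sum_ifelse(x, B), " ", sum_mul(x, B))
```

Note that @simd allows the floating-point additions to be reordered, so on general inputs the last two can differ from the branching version in the final bits.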