abs2(Float64) takes longer than abs2(Complex64)?

performance

#1
function test1(u,N)
    for _ in 1:N
        abs2(u)
    end
end

u = ones(Complex64, 1000000)
v = ones(1000000)
@time test1(u, 1000)
@time test1(v, 1000)
julia> @time test1(u,1000)
  1.428069 seconds (4.00 k allocations: 3.725 GB, 5.69% gc time)

julia> @time test1(v,1000)
  2.382491 seconds (4.00 k allocations: 7.451 GB, 11.91% gc time)

This seems strange to me: why doesn't abs2(complex) take longer than abs2(float)? It's more arithmetic work (and more memory access).


#2

I see much closer timings than yours. Note that abs2(Complex64) (i.e. abs2(Complex{Float32})) is not necessarily slower than abs2(Float64), especially since the operation is cheap compared to the memory access anyway. Replacing u with ones(Complex128, 1000000) does make it consistently slower than v for me.
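
To make the memory-access point concrete, here is a rough sketch that counts the bytes one out-of-place abs2 pass reads and writes for each element type. It assumes Julia 1.x, where Complex64 and Complex128 are spelled ComplexF32 and ComplexF64 and the array call is written abs2.(x). The helper name traffic is made up for illustration.

```julia
# `traffic` is a hypothetical helper, not part of Julia.
# It returns the bytes read (input array) and written (output array)
# by one out-of-place abs2 pass over n elements of type T.
function traffic(::Type{T}, n::Integer) where {T}
    x = ones(T, n)
    y = abs2.(x)          # allocates a new array of real elements
    return (read = sizeof(x), written = sizeof(y))
end

for T in (ComplexF32, ComplexF64, Float64)
    t = traffic(T, 1_000_000)
    println(T, ": read ", t.read, " bytes, wrote ", t.written, " bytes")
end
```

Under these assumptions a Complex{Float32} input is the same size as a Float64 input, while a Complex{Float64} input is twice as large, which fits with only the Complex128 case being consistently slower.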


#3
julia> abs2(ones(Complex64,2))
2-element Array{Float32,1}:
 1.0
 1.0

So the output array is half the size in the first case (Float32 vs. Float64), which is what you see in the allocation figures. Since this code is memory-bound, that explains the timing difference.
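
As a quick check of the allocation figures, the sketch below measures the output arrays directly. It assumes Julia 1.x spelling (ComplexF32 for Complex64, abs2.(u) for the array call); the variable names out_u and out_v are only for illustration.

```julia
n = 1_000_000
u = ones(ComplexF32, n)   # what the thread calls Complex64
v = ones(Float64, n)

out_u = abs2.(u)          # Vector{Float32}, 4 bytes per element
out_v = abs2.(v)          # Vector{Float64}, 8 bytes per element

println(eltype(out_u), " output: ", sizeof(out_u), " bytes")
println(eltype(out_v), " output: ", sizeof(out_v), " bytes")

# Total output allocated over 1000 calls, in units of 2^30 bytes,
# for comparison with the numbers printed by @time in the first post.
println(1000 * sizeof(out_u) / 2^30)
println(1000 * sizeof(out_v) / 2^30)
```

The two totals come out at roughly 3.725 and 7.451, which lines up with the 3.725 GB and 7.451 GB reported above, so essentially all of the allocation is the result array.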