Type instability of plus and the like

I’ve come across this particularity in basic functions. I believe this has to do with the flat evaluation of +(a,b,c...) instead of recursion. I would assume that the julia compiler would inline a recursion for a+b+...+z. Could someone please elaborate?

function do_plus{T<:Number}(a::T,b::T,c::T,d::T,e::T,f::T,g::T,h::T,i::T,j::T,k::T,l::T,m::T,n::T,o::T,p::T,q::T,r::T,s::T,t::T,u::T,v::T,w::T,x::T,y::T,z::T)
	return a+b+c+d+e+f+g+h+i+j+k+l+m+n+o+p+q+r+s+t+u+v+w+x+y+z
end

function do_plus2{T<:Number}(a::T,b::T,c::T,d::T,e::T,f::T,g::T,h::T,i::T,j::T,k::T,l::T,m::T,n::T,o::T,p::T,q::T,r::T,s::T,t::T,u::T,v::T,w::T,x::T,y::T,z::T)
	return (((((((((((((((((((((((((a+b)+c)+d)+e)+f)+g)+h)+i)+j)+k)+l)+m)+n)+o)+p)+q)+r)+s)+t)+u)+v)+w)+x)+y)+z)
end

function test_plus()
	for i = 1:1_000_000
		do_plus(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26)
	end
	return nothing
end

function test_plus2()
	for i = 1:1_000_000
		do_plus2(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26)
	end
	return nothing
end
versioninfo()
# Julia Version 0.5.1
# Commit 6445c82 (2017-03-05 13:25 UTC)
# Platform Info:
#   OS: macOS (x86_64-apple-darwin13.4.0)
#   CPU: Intel(R) Core(TM) i7-4850HQ CPU @ 2.30GHz
#   WORD_SIZE: 64
#   BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
#   LAPACK: libopenblas64_
#   LIBM: libopenlibm
#   LLVM: libLLVM-3.7.1 (ORCJIT, haswell)
test_plus()
test_plus2()

using BenchmarkTools

@benchmark test_plus()
# julia> @benchmark test_plus()
# BenchmarkTools.Trial:
#   memory estimate:  0 bytes
#   allocs estimate:  0
#   --------------
#   minimum time:     71.984 ms (0.00% GC)
#   median time:      73.310 ms (0.00% GC)
#   mean time:        74.043 ms (0.00% GC)
#   maximum time:     83.317 ms (0.00% GC)
#   --------------
#   samples:          68
#   evals/sample:     1


@benchmark test_plus2()
# julia> @benchmark test_plus2()
# BenchmarkTools.Trial:
#   memory estimate:  0 bytes
#   allocs estimate:  0
#   --------------
#   minimum time:     7.079 ms (0.00% GC)
#   median time:      7.252 ms (0.00% GC)
#   mean time:        7.395 ms (0.00% GC)
#   maximum time:     11.419 ms (0.00% GC)
#   --------------
#   samples:          676
#   evals/sample:     1



# Profile.clear()
# @profile test_plus()
# using ProfileView
# ProfileView.view(C=true)

do_plus

do_plus2

Unrelated to your question, but + is a variadic function, so you can do

julia> +(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26)
351

and is more efficient than your do_plus function.

Maybe you were already aware of this, maybe not, anyway it’s useful to point this out.

You are hitting a limit in the Julia compiler for dealing with a large number of arguments to functions. Try lower the number of arguments and you will see that the same code is generated.

3 Likes

See also https://github.com/JuliaLang/julia/pull/22545

1 Like

It is not true that this is more efficient. It’s just a less readable version of 1 + 2 + ... + 26.

Why +(...) is so much faster than do_plus(...)?

julia> @benchmark +(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26)
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     19.213 ns (0.00% GC)
  median time:      19.241 ns (0.00% GC)
  mean time:        19.326 ns (0.00% GC)
  maximum time:     46.339 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     997

julia> @benchmark do_plus(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26)
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     101.487 ns (0.00% GC)
  median time:      101.910 ns (0.00% GC)
  mean time:        102.614 ns (0.00% GC)
  maximum time:     164.708 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     943

@code_llvm shows that do_plus(...) performs much more operations than +(...)

Because, as mentioned in the first post, this is type unstable. You’ll find that if you defined do_plus as +(a, b, ..., n), it will have the same result.