It seems like the compiler cannot infer the type of the tuple b in the below MWE, while I think it should. This results in suboptimal performance. I can easily work around this, at the expense of code readability. Worth reporting?
js(n) = ntuple(i->i, n)
foo(n, s) = exp.(im.*js(n).*s)
function bar(s::Real)
n = 5
b = foo(n, s)
sum(b)
end
s = 1.0
@code_warntype bar(s)
Variables:
#self#::#bar
s::Float64
n::Int64
b::ANY
Body:
begin # line 11:
b::ANY = $(Expr(:invoke, MethodInstance for foo(::Int64, ::Float64), :(Main.foo), 5, :(s))) # line 12:
return (Main.sum)(b::ANY)::ANY
end::ANY
Replacing n = 5 with n = Val{5} (or Val(5) on nightly or with Compat after https://github.com/JuliaLang/Compat.jl/pull/399) fixes the issue. I guess constant propagation isn’t good enough yet to infer the return type of ntuple(i -> i, n) with n = 5.
Maybe constant propagation isn’t the right term to use (I’m certainly not an expert on this), but I could imagine it being possible in the future that js, foo, and ntuple get inlined, and that after dead code elimination the compiler can figure out that the only possibility is that js(n) is an NTuple{5,Int64}, making everything type stable.
In some different cases, the compiler can already figure it out:
julia> f(x) = isbits(x) ? zero(x) : rand(Int, x[1])
f (generic function with 1 method)
julia> @code_warntype f(1)
Variables:
#self#::#f
x::Int64
Body:
begin
return 0
end::Int64
julia> @code_warntype f([4])
Variables:
#self#::#f
x::Array{Int64,1}
Body:
begin
goto 2
2:
SSAValue(0) = (Base.arrayref)(x::Array{Int64,1}, 1)::Int64
return $(Expr(:invoke, MethodInstance for rand!(::MersenneTwister, ::Array{Int64,1}), :(Base.Random.rand!), :(Base.Random.GLOBAL_RNG), :($(Expr(:foreigncall, :(:jl_alloc_array_1d), Array{Int64,1}, svec(Any, Int64), Array{Int64,1}, 0, SSAValue(0), 0)))))
end::Array{Int64,1}
Sure, and nothing like the original example is currently inferable, but is the dead code elimination scenario I was talking about such a stretch for some time in the future?
Yes, but it looks to me like dead code elimination currently comes after type inference is already done. This is totally reasonable of course, but there are cases, like the one in the OP, where (re-)running type inference after dead code elimination (at the Julia AST level, I suppose) could be beneficial, in the sense of potentially producing more performant code without the user or Julia Base developer having to jump through additional hoops (user having to use Val, addition of the Val methods for ntuple in base, in the OP case).
Here’s a simpler example:
function baz1()
n = 2
if n > 1
1
else
1.
end
end
baz2() = 1
Even though baz1 and baz2 always return the same underlying value, 1, baz1 returns it boxed, and the code_typed, code_llvm, and code_native are all different for baz1 and baz2. This is despite the fact that dead code elimination happens (I believe) for baz1 between the code_typed and the code_llvm stage.
That’s what I called it initially, but I wasn’t sure that it was the right term, so I tried to explain it in other terms. Thanks for the pointer to the issue.