I have noticed that inside function closures, using a captured variable with a type value as a constructor is much slower than using a variable with a function value. Consider the following (silly) example:
function g(f, n)
    s = zero(f(n))
    foreach(1:n) do i
        s += f(i)
    end
    s
end
Then
julia> using BenchmarkTools
julia> @btime g(Int, 4);
  408.735 ns (1 allocation: 16 bytes)
julia> @btime g(x -> Int(x), 4);
  68.649 ns (1 allocation: 16 bytes)
This surprised me because constructors usually behave like functions.
Is this considered a bug, or is it just the way it is? In the latter case, would it make sense to mention it in the documentation section on the performance of captured variables?
Here's the output of JET.@report_opt. The slower version has a more serious type instability. It seems that the lambda function x -> Int(x) is treated as if it has a more concrete type than Int.
julia> using JET
julia> include("testclosure.jl"); # defines `g(f, n)` as in the above post
julia> @report_opt g(Int, 4)
═════ 3 possible errors found ═════
┌ @ testclosure.jl:3 foreach(#3, 1 : n)
│┌ @ abstractarray.jl:3073 f(x)
││┌ @ testclosure.jl:4 %7(i)
│││ runtime dispatch detected: %7::DataType(i::Int64)::Any
││└──────────────────────────────────────────
││┌ @ testclosure.jl:4 %6 + %8
│││ runtime dispatch detected: (%6::Any + %8::Any)::Any
││└──────────────────────────────────────────
┌ @ testclosure.jl:2 s = Core.Box()
│ captured variable `s` detected
└──────────────────────────────────────────
julia> @report_opt g(x -> Int(x), 4)
═════ 2 possible errors found ═════
┌ @ testclosure.jl:3 foreach(#3, 1 : n)
│┌ @ abstractarray.jl:3073 f(x)
││┌ @ testclosure.jl:4 %6 + i
│││ runtime dispatch detected: (%6::Any + i::Int64)::Any
││└──────────────────────────────────────────
┌ @ testclosure.jl:2 s = Core.Box()
│ captured variable `s` detected
└──────────────────────────────────────────
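One way to read the `%7::DataType` line in the first report: a value known only as a `DataType` does not tell inference which constructor will be called, whereas every anonymous function has a type of its own. The snippet below is a small sketch of that asymmetry; the names `h1` and `h2` are made up for illustration, and this is an interpretation of the report rather than a statement about compiler internals.

```julia
# Hypothetical helper lambdas, introduced only for this illustration.
h1 = x -> Int(x)
h2 = x -> Float64(x)

# All ordinary concrete types share the same type, `DataType`,
# so knowing only `f::DataType` says nothing about which type `f` is.
typeof(Int) === typeof(Float64)   # true

# Each anonymous function gets its own type, so the type alone
# identifies which function will be called.
typeof(h1) === typeof(h2)         # false

# A more specific type for `Int` does exist, namely `Type{Int}`.
Int isa Type{Int}                 # true
```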
Anytime you see something like this, get used to doing this:
julia> using Cthulhu
julia> @descend g(Int, 4)
As pointed out by @greatpet, you'll quickly see that your use of foreach results in a Core.Box due to this issue.
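For reference, here are two common ways to avoid the box, as a rough sketch. The names `g_ref` and `g_loop` are hypothetical variants of the original `g`, and I have not benchmarked them here, so treat any speedup as something to confirm with `@btime` or `@report_opt`.

```julia
# Hypothetical variant 1: keep the closure, but accumulate in a `Ref`.
# The closure mutates the contents of `s` and never rebinds `s` itself,
# so `s` does not have to be boxed.
function g_ref(f, n)
    s = Ref(zero(f(n)))
    foreach(1:n) do i
        s[] += f(i)
    end
    s[]
end

# Hypothetical variant 2: drop the closure and use a plain loop,
# so there is no captured variable at all.
function g_loop(f, n)
    s = zero(f(n))
    for i in 1:n
        s += f(i)
    end
    s
end
```

Both return the same value as the original, e.g. `g_ref(Int, 4) == g_loop(Int, 4) == 10`.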
EDIT: maybe this is what you meant by "would it make sense to mention it in the documentation section on the performance of captured variables?" I don't think it's that much of a special case, but if you think you can come up with something that adds to that discussion, try submitting a PR?
Once you fix that, you may (?) also need to force-specialize on f, i.e., g(f::F, n) where F, to circumvent Julia's heuristics for not specializing on types-as-arguments.
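Spelled out, the force-specialized signature could look like the sketch below. The name `g_spec` is hypothetical, and the body uses a `Ref` accumulator as one assumed way of avoiding the box; whether the type parameter actually changes anything for this function is exactly the "may (?)" above, so check with `@descend` or `@report_opt` rather than taking it on faith.

```julia
# Hypothetical variant of `g` with an explicit type parameter for `f`.
# Writing `f::F ... where {F}` asks Julia to specialize the method on
# the type of `f` instead of leaving that to its heuristics.
function g_spec(f::F, n) where {F}
    # Assumed fix for the boxed variable: mutate a `Ref`
    # instead of rebinding `s` inside the closure.
    s = Ref(zero(f(n)))
    foreach(1:n) do i
        s[] += f(i)
    end
    s[]
end
```

Calling it looks the same as before: `g_spec(Int, 4)` and `g_spec(x -> Int(x), 4)` both return 10.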