Thank you very much @Benny for your insights.
Yet there is something here that goes against what I am used to seeing and being advised. The examples above seem to imply that the closure correctly infers the types of the arrays, even though they are not parameters of the closure.
This does not happen, for example, in this simple case, where the closure is written in global scope:
julia> solver(f, x) = f(x)
solver (generic function with 1 method)
julia> f(x,a,b) = (a' * b) * x
f (generic function with 1 method)
julia> g(x) = f(x,a,b) # the closure
g (generic function with 1 method)
julia> a = rand(3); b = rand(3); # non-constant global parameters
julia> @btime solver($g, 1.0) # this allocates
31.289 ns (2 allocations: 32 bytes)
0.3231943848334305
julia> function solver_barrier(f, x, a, b)
           solver(x -> f(x,a,b), x)
       end
solver_barrier (generic function with 1 method)
julia> @btime solver_barrier($f, 1.0, $a, $b) # this does not allocate
10.356 ns (0 allocations: 0 bytes)
0.3231943848334305
julia> @code_warntype solver(g, 1.0)
MethodInstance for solver(::typeof(g), ::Float64)
from solver(f, x) in Main at REPL[1]:1
Arguments
#self#::Core.Const(solver)
f::Core.Const(g)
x::Float64
Body::Any
1 ─ %1 = (f)(x)::Any
└── return %1
In our examples, arr1 and arr2 were intrinsically type-unstable inside the scope of the function where the closure was defined. So what is the difference here? Why does the compiler not do the same for this globally scoped closure with non-constant parameters?
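One way to probe this is to ask inference directly for the return type of the global-scope closure, with and without constant globals. This is only a minimal sketch: a_const, b_const and g_const are hypothetical names that are not part of the session above, and Base.return_types is an unexported reflection helper, so treat it as a diagnostic rather than a stable API.

```julia
using LinearAlgebra  # stdlib; loaded explicitly so that a' * b is available

f(x, a, b) = (a' * b) * x

# Non-constant globals: their bindings (and types) may change at any time,
# so inference cannot assume anything about a and b inside g.
a = rand(3); b = rand(3)
g(x) = f(x, a, b)

# Constant globals (hypothetical names): the bindings are fixed,
# so inference knows both are Vector{Float64}.
const a_const = rand(3)
const b_const = rand(3)
g_const(x) = f(x, a_const, b_const)

# Inferred return type of each method for a Float64 argument.
rt_g = only(Base.return_types(g, (Float64,)))
rt_g_const = only(Base.return_types(g_const, (Float64,)))

println("g:       ", rt_g)
println("g_const: ", rt_g_const)
```

The first result should match the Body::Any reported by @code_warntype above, while the const version should infer a concrete Float64.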
I was under the impression that closures were implemented somewhat like this:
julia> struct Closure{T1,T2}
           a::T1
           b::T2
       end
julia> (c::Closure)(x) = (c.a' * c.b) * x
julia> const c = Closure(a,b)
Closure{Vector{Float64}, Vector{Float64}}([0.010136292594841279, 0.3915207261659518, 0.2766994600429509], [0.42788944246939575, 0.646503420278015, 0.2375779253782898])
julia> @btime solver($c, 1.0)
10.061 ns (0 allocations: 0 bytes)
0.3231943848334305
AFAIK, this can’t be the case, since it would imply a semantic difference relative to g(x) = f(x,a,b): in that definition a and b can change, whereas in the callable struct they can’t. So how a closure is handled actually depends on the scope and context in which it is defined.
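The semantic difference can be made concrete with a small script. This is a sketch under my own assumptions: FrozenClosure is a hypothetical name for the callable struct, and the arrays hold fixed values (instead of rand) so that the results are easy to check by hand.

```julia
using LinearAlgebra  # stdlib; loaded explicitly so that a' * b is available

f(x, a, b) = (a' * b) * x

# Non-constant globals with fixed values.
a = [1.0, 2.0, 3.0]
b = [1.0, 1.0, 1.0]

# Global-scope "closure": a and b are looked up every time g is called.
g(x) = f(x, a, b)

# Callable struct (hypothetical name): the fields are set once, at construction.
struct FrozenClosure{T1,T2}
    a::T1
    b::T2
end
(c::FrozenClosure)(x) = (c.a' * c.b) * x

c = FrozenClosure(a, b)

r_g_before = g(1.0)   # (1 + 2 + 3) * 1.0
r_c_before = c(1.0)   # same data, same result

# Rebind the global a to a new array, here even with a different element type.
a = [10, 20, 30]

r_g_after = g(1.0)    # g sees the new binding: (10 + 20 + 30) * 1.0
r_c_after = c(1.0)    # c still holds the original array

println((r_g_before, r_c_before, r_g_after, r_c_after))
```

Note that this only concerns rebinding: mutating the original array in place (for example a[1] = 0.0 before the rebinding) would be visible through c as well, since the struct stores a reference to the same array.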
Actually, I think I now understand what was going on in the previous examples: even if arr1 and arr2 are type-unstable, the compiler can guarantee that their types do not change while a particular instance of the closure is being executed, and thus it can construct the closure in the form above for each set of types of arr1 and arr2. The compiler cannot have the same guarantee if the closure is defined in global scope.
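This picture can be checked by inspecting a closure created inside a function. The following is a minimal sketch: make_closure is a hypothetical helper of mine, and the reflection calls only show how the captured variables are stored for this simple case, where they are never reassigned.

```julia
using LinearAlgebra  # stdlib; loaded explicitly so that a' * b is available

# Hypothetical helper: returns a closure that captures a and b.
make_closure(a, b) = x -> (a' * b) * x

cl_float = make_closure([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
cl_int   = make_closure([1, 2, 3], [1, 1, 1])

# The captured variables show up as fields of the closure object...
captured = Set(fieldnames(typeof(cl_float)))

# ...and the field types follow the types of the captured values, so each
# combination of argument types gets its own concrete closure type.
ta_float = fieldtype(typeof(cl_float), :a)
ta_int   = fieldtype(typeof(cl_int), :a)

println(typeof(cl_float))
println(typeof(cl_int))
println(captured, " ", ta_float, " ", ta_int)
```

Both closure objects come from the same definition but have different concrete types, which is what allows the compiler to specialize solver on them much as it does for the Closure struct above.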
At the same time, the fact that it does not infer s seems to me to be just an inference bug, or failure, since its type is clearly deducible from the types of arr1 and arr2.
In summary, I still find closures quite tricky when performance is critical, and I restrict my use of them to cases like solver_barrier, where we are sure that everything is strictly type stable in the scope where the closure is defined.