I think that comment was about type inference, since one would need to know, e.g., what type (+)(::Int, ::Int) returns. If the variable’s type has been annotated anyway, then inference shouldn’t be needed, I would think.
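As a small illustration of the kind of question inference answers, here is a sketch using `Base.return_types`, a reflection helper that reports the inferred return type(s) of a call signature. It only shows what the compiler can determine after lowering; it is not a claim about how lowering itself works.

```julia
# Ask the compiler what (+)(::Int, ::Int) is inferred to return.
# Base.return_types gives one entry per matching method.
inferred = Base.return_types(+, (Int, Int))

# For two Ints there is a single matching method, inferred to return Int.
println(inferred)
```

With an explicit annotation such as `a::Int`, this answer is already written in the source, which is why inference seems unnecessary for that case.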
Interestingly, the fact that a variable is captured causes it to be boxed not only for the function that captures it, but also for the scope where it’s defined. Comparison of a) untyped capture, b) type-annotated capture, and c) typed Ref:
julia> using BenchmarkTools
julia> gena() = let a=0; for i=1:1000; a+=i end; ()->a+=1 end
genb() = let a::Int=0; for i=1:1000; a+=i end; ()->a+=1 end
genc() = let a=Ref(0); for i=1:1000; a[]+=i end; ()->a[]+=1 end;
julia> @btime gena(); @btime genb(); @btime genc();
22.900 μs (1459 allocations: 22.80 KiB)
5.940 μs (970 allocations: 15.16 KiB)
6.700 ns (1 allocation: 16 bytes)
julia> a, b, c = gena(), genb(), genc()
a() == b() == c()
true
julia> @btime $a(); @btime $b(); @btime $c();
27.614 ns (1 allocation: 16 bytes)
11.735 ns (1 allocation: 16 bytes)
4.805 ns (0 allocations: 0 bytes)
Note the ns vs μs timings for the gen functions.
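One way to see the boxing directly is to inspect the field types of the returned closures. This is a sketch that assumes current Julia 1.x closure lowering, where a captured variable that is reassigned is stored in a `Core.Box` regardless of annotation; a future release could change this.

```julia
# Same three generators as in the benchmark above.
gena() = let a=0; for i=1:1000; a+=i end; ()->a+=1 end
genb() = let a::Int=0; for i=1:1000; a+=i end; ()->a+=1 end
genc() = let a=Ref(0); for i=1:1000; a[]+=i end; ()->a[]+=1 end

# A closure is a struct whose fields are its captured variables.
boxa = fieldtypes(typeof(gena()))  # untyped capture
boxb = fieldtypes(typeof(genb()))  # type-annotated capture
boxc = fieldtypes(typeof(genc()))  # typed Ref, bound only once

println(boxa, " ", boxb, " ", boxc)
```

If the assumption holds, the first two report a `Core.Box` field even though `b` was annotated, while the third reports a concretely typed `Base.RefValue{Int}`.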
We can confirm our understanding by using Ref{Any} to imitate a Box and inserting type annotations to mimic the behavior of a type-annotated variable. Comparison of d) mimicking untyped capture, e) mimicking type-annotated capture, and f) typed Ref:
julia> gend() = let a=Ref{Any}(0); for i=1:1000; a[]+=i end; ()->a[]+=1 end
gene() = let a=Ref{Any}(0); for i=1:1000; a[]=a[]::Int+i end; ()->a[]=a[]::Int+1 end
genf() = let a=Ref{Int}(0); for i=1:1000; a[]=a[]::Int+i end; ()->a[]=a[]::Int+1 end;
julia> @btime gend(); @btime gene(); @btime genf();
23.200 μs (1459 allocations: 22.80 KiB)
6.180 μs (970 allocations: 15.16 KiB)
6.700 ns (1 allocation: 16 bytes)
julia> d, e, f = gend(), gene(), genf()
d() == e() == f()
true
julia> @btime $d(); @btime $e(); @btime $f();
28.313 ns (1 allocation: 16 bytes)
11.211 ns (1 allocation: 16 bytes)
4.600 ns (0 allocations: 0 bytes)
From this demonstration, it seems reasonably likely that making a type-parameterized Core.Box will allow for immediate performance improvements, at least where type annotation is used.
Of course it’d be better for type inference to work in the lowering stage so that we wouldn’t need to make type annotations, but I don’t know what the schedule for that is. Even then, it seems we’d want a type-parameterized Box anyway.
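To make the idea concrete, here is a minimal sketch of what a type-parameterized box could look like. `TypedBox` and `geng` are hypothetical names for illustration only; neither exists in Julia, and this is not how lowering would necessarily emit it.

```julia
# Hypothetical stand-in for a type-parameterized Core.Box.
# Core.Box stores `contents::Any`; here the field type is a parameter.
mutable struct TypedBox{T}
    contents::T
end

# What lowering might conceptually produce for `let a::Int=0; ...; ()->a+=1 end`
# if the annotation were used to pick the box's type parameter.
geng() = let a = TypedBox{Int}(0)
    for i = 1:1000
        a.contents += i      # concretely typed load and store, no dynamic dispatch
    end
    () -> a.contents += 1
end

g = geng()
result = g()   # sum(1:1000) + 1 == 500501
```

Since `TypedBox{Int}` is essentially a `Ref{Int}`, I’d expect it to behave like case f) above, though I haven’t benchmarked this exact form.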