Writing a static function

I think that comment was about type inference: lowering would need to know, for example, what type (+)(::Int, ::Int) returns. If the variable’s type has already been annotated, though, I would think inference shouldn’t be needed.

Interestingly, when a captured variable is reassigned, it gets boxed not only inside the function that captures it, but also in the scope where it’s defined. Comparison of a) untyped capture, b) type-annotated capture, and c) typed Ref:

julia> using BenchmarkTools

julia> gena() = let a=0;      for i=1:1000; a+=i   end; ()->a+=1   end
       genb() = let a::Int=0; for i=1:1000; a+=i   end; ()->a+=1   end
       genc() = let a=Ref(0); for i=1:1000; a[]+=i end; ()->a[]+=1 end;

julia> @btime gena(); @btime genb(); @btime genc();
  22.900 μs (1459 allocations: 22.80 KiB)
  5.940 μs (970 allocations: 15.16 KiB)
  6.700 ns (1 allocation: 16 bytes)

julia> a, b, c = gena(), genb(), genc()
       a() == b() == c()
true

julia> @btime $a(); @btime $b(); @btime $c();
  27.614 ns (1 allocation: 16 bytes)
  11.735 ns (1 allocation: 16 bytes)
  4.805 ns (0 allocations: 0 bytes)

Note the units for the gen functions: genc() runs in nanoseconds, while gena() and genb() take microseconds.
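To see the boxing directly rather than infer it from timings, one option is to inspect the field types of the returned closures. This is only a sketch: it assumes a Julia 1.x release where lowering still uses the untyped Core.Box, and it reuses the gena/genb/genc definitions from the transcript above. On such a version I would expect the first two closures to hold a Core.Box (the annotation in genb does not change the box type, only how its contents are read) and the third to hold a Base.RefValue{Int}.

```julia
# Same definitions as in the transcript above.
gena() = let a=0;      for i=1:1000; a+=i   end; ()->a+=1   end
genb() = let a::Int=0; for i=1:1000; a+=i   end; ()->a+=1   end
genc() = let a=Ref(0); for i=1:1000; a[]+=i end; ()->a[]+=1 end

# A closure is a struct with one field per captured variable,
# so its field types show how each capture is stored.
@show fieldtypes(typeof(gena()))
@show fieldtypes(typeof(genb()))
@show fieldtypes(typeof(genc()))

# Core.Box has a single untyped field, which is why reads from it
# cannot be inferred without an annotation.
@show fieldnames(Core.Box)
@show fieldtype(Core.Box, :contents)
```

Since Core.Box is an internal detail of lowering, the exact output may differ on other Julia versions.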

We can confirm our understanding by using Ref{Any} to imitate a Box and inserting type annotations to mimic the behavior of a type-annotated variable. Comparison of d) mimicking untyped capture, e) mimicking type-annotated capture, and f) typed Ref:

julia> gend() = let a=Ref{Any}(0); for i=1:1000; a[]+=i         end; ()->a[]+=1         end
       gene() = let a=Ref{Any}(0); for i=1:1000; a[]=a[]::Int+i end; ()->a[]=a[]::Int+1 end
       genf() = let a=Ref{Int}(0); for i=1:1000; a[]=a[]::Int+i end; ()->a[]=a[]::Int+1 end;

julia> @btime gend(); @btime gene(); @btime genf();
  23.200 μs (1459 allocations: 22.80 KiB)
  6.180 μs (970 allocations: 15.16 KiB)
  6.700 ns (1 allocation: 16 bytes)

julia> d, e, f = gend(), gene(), genf()
       d() == e() == f()
true

julia> @btime $d(); @btime $e(); @btime $f();
  28.313 ns (1 allocation: 16 bytes)
  11.211 ns (1 allocation: 16 bytes)
  4.600 ns (0 allocations: 0 bytes)

From this demonstration, it seems reasonably likely that making Core.Box type-parameterized would allow for immediate performance improvements, at least where type annotation is used.

Of course it’d be better for type inference to work in the lowering stage so that we wouldn’t need to make type annotations, but I don’t know what the schedule for that is. Even then, it seems we’d want a type-parameterized Box anyway.
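As a rough illustration of what a type-parameterized box might look like from the user's side, here is a hand-written stand-in. TypedBox and gen_typedbox are hypothetical names made up for this sketch; nothing like them exists in Core, and real lowering would have to generate the equivalent automatically. The point is only that once the box carries its element type, the closure's field type is concrete and no annotation is needed at each read.

```julia
# Hypothetical stand-in for a type-parameterized Core.Box (not part of Julia).
mutable struct TypedBox{T}
    contents::T
end

# Mirrors genb from earlier in the post, but stores the variable in TypedBox{Int}.
function gen_typedbox()
    a = TypedBox{Int}(0)          # `a` itself is never rebound, so it is not boxed
    for i in 1:1000
        a.contents += i           # typed field access, no dynamic dispatch
    end
    return () -> (a.contents += 1)
end

g = gen_typedbox()

# The closure now stores a concretely typed field instead of Core.Box.
@show fieldtypes(typeof(g))
@show g()
```

This is essentially what genc does with Ref(0), so I would expect similar timings, but I have not benchmarked this variant.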
