How to pass varargs... to a function without allocation?

I would like to unpack a tuple argument of a function and pass its elements to another function. The naive implementation below causes allocations, which become a bottleneck in my application.
Is there any way to remove such allocations?

julia> t=(1.0, 2.0)
(1.0, 2.0)

julia> function f(x, y)
           return x+y
       end
f (generic function with 1 method)

julia> function g(t)
           return f(t...)
       end
g (generic function with 1 method)

julia> g(t)
3.0

julia> @time g(t)
  0.000003 seconds (1 allocation: 16 bytes)
3.0

I am thinking about a macro, but I don’t know how to handle user-supplied tuples with different element types and lengths.
Thank you for your help!

I know this is not as elegant, but what happens with:

g(t) = f(t[1], t[2])

If you actually have a variable number of elements you can try and use an NTuple and dispatch on the number of elements (I think).
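A minimal sketch of the NTuple idea, assuming a hypothetical three-argument method of f that is not in the original post. Each g method dispatches on the tuple length and indexes the elements explicitly:

```julia
# f from the question, plus a hypothetical 3-argument method for illustration
f(x, y) = x + y
f(x, y, z) = x + y + z

# NTuple{N,Any} matches any tuple of length N, whatever the element types
g(t::NTuple{2,Any}) = f(t[1], t[2])
g(t::NTuple{3,Any}) = f(t[1], t[2], t[3])

g((1.0, 2.0))     # dispatches to the length-2 method
g((1, 2.0, 3))    # dispatches to the length-3 method
```

Whether this is any faster than plain splatting is something to measure rather than assume.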

You can experiment with @inbounds and @inline to see if that makes any performance difference.

What you are doing already is perfectly fine, and should not allocate. It is probably an artefact of benchmarking, since t is a non-const global.

BenchmarkTools.jl should give more reliable results.

But, as I said, your code is otherwise already how it should be. If you are seeing a bottleneck outside of the benchmarking itself, I suspect it is related to some part of the code you haven’t shared.
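One way to check this without the global-variable overhead is to measure inside a function, where the tuple is a local. A rough sketch, assuming a hypothetical helper named allocs_in_g and a hypothetical three-argument method of f added only to show tuples of different lengths and element types:

```julia
f(x, y) = x + y
f(x, y, z) = x + y + z   # hypothetical extra method, not in the original post
g(t) = f(t...)           # splatting a tuple: length and types are known from its type

# Hypothetical helper: t is a function argument here, not a non-const global
function allocs_in_g(t)
    g(t)                    # warm-up call so compilation is not measured
    return @allocated g(t)  # bytes allocated by the second call
end

allocs_in_g((1.0, 2.0))     # expected to report no allocation
allocs_in_g((1, 2.0f0, 3))  # mixed element types, different length
```

Because the tuple type encodes both the length and the element types, no macro should be needed for the splat to compile to a direct call.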

^exactly that. This is what happens when t is a local variable:

julia> let
         t = (1.0, 2.0)
         @time g(t)
       end
  0.000000 seconds
3.0

To be clear, the function g is still global but it is also implicitly const, which the compiler can also optimize. It’s non-const globals that require a bit of overhead (16 byte allocation is a good tell).
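The const versus non-const distinction can be sketched as follows. The names t_const, t_plain, use_const, use_plain and measure are hypothetical, and the exact byte count for the non-const case may vary between Julia versions:

```julia
f(x, y) = x + y
g(t) = f(t...)

const t_const = (1.0, 2.0)  # const global: its type is fixed and known to the compiler
t_plain = (1.0, 2.0)        # non-const global: its type could change at any time

use_const() = g(t_const)
use_plain() = g(t_plain)

# Hypothetical helper: warm up once, then report bytes allocated by one call
function measure(fn)
    fn()
    return @allocated fn()
end

measure(use_const)  # expected to report no allocation
measure(use_plain)  # typically reports a small allocation from the dynamic lookup
```

Declaring the global as const, or passing it as a function argument, is usually enough to remove the overhead in real code.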

I think this is mainly a benchmark artefact. It is safer to use BenchmarkTools.jl.

The right syntax to perform the benchmark is:

julia> using BenchmarkTools

julia> @benchmark g($t)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  3.177 ns … 212.467 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     3.195 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   3.221 ns ±   2.131 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

     █                    ▁▁                                   
  ▄▄▃█▁▁▄▄▂▄▁▁▂▂▁▂▁▁▂▁▂▁█▁██▁▂▆▁▄▄▁▁▃▂▂▁▂▁▂▁▁▂▁▁▂▁▁▁▁▁▂▁▁▂▂▁▂ ▃
  3.18 ns         Histogram: frequency by time        3.22 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

→ you can check that there is no allocation.

It is important to notice the ‘$’ in ‘@benchmark g($t)’. As the BenchmarkTools.jl documentation says:

If the expression you want to benchmark depends on external variables, you should use $ to “interpolate” them into the benchmark expression to avoid the problems of benchmarking with globals. Essentially, any interpolated variable $x or expression $(...) is “pre-computed” before benchmarking begins:

julia> A = rand(3,3);

julia> @btime inv($A);            # we interpolate the global variable A with $A
  1.191 μs (10 allocations: 2.31 KiB)

julia> @btime inv($(rand(3,3)));  # interpolation: the rand(3,3) call occurs before benchmarking
  1.192 μs (10 allocations: 2.31 KiB)

julia> @btime inv(rand(3,3));     # the rand(3,3) call is included in the benchmark time
  1.295 μs (11 allocations: 2.47 KiB)