I would like to unpack a tuple parameter of a function and pass its elements to another function. The naive implementation below causes allocations, which become a bottleneck for my application.
Is there any way to remove such allocations?
julia> t=(1.0, 2.0)
julia> function f(x, y)
f (generic function with 1 method)
julia> function g(t)
g (generic function with 1 method)
julia> @time g(t)
0.000003 seconds (1 allocation: 16 bytes)
I am thinking about a macro, but I don’t know how to handle user-supplied tuples of varying element types and lengths.
Thank you for your help!
I know this is not as elegant, but what happens with:
g(t) = f(t[1], t[2])
If you actually have a variable number of elements, you can try using an NTuple and dispatching on the number of elements (I think).
You can experiment with @inline to see if that makes any performance difference.
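As a rough sketch of the NTuple suggestion above: the two `f` methods below are hypothetical stand-ins (the original body of `f` is not shown in this thread), and `g_splat` is an illustrative name. The tuple length is part of the tuple's type, so it can be used for dispatch.

```julia
# Hypothetical stand-ins; the real body of `f` is not shown in the thread.
f(x, y) = x + y
f(x, y, z) = x + y + z

# One method per tuple length. Tuple types are covariant, so NTuple{2,Any}
# also matches tuples with mixed element types such as Tuple{Float64,Int}.
g(t::NTuple{2,Any}) = f(t[1], t[2])
g(t::NTuple{3,Any}) = f(t[1], t[2], t[3])

# Generic alternative: splat a tuple whose length N is known from its type.
@inline g_splat(t::NTuple{N,Any}) where {N} = f(t...)

g((1.0, 2.0))        # 3.0
g((1.0, 2, 3))       # 6.0
g_splat((1.0, 2.0))  # 3.0
```

Whether `@inline` changes anything here is something to measure rather than assume; small functions like these are usually inlined by the compiler anyway.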
What you are doing already is perfectly fine, and should not allocate. It is probably an artefact of benchmarking, since t is a non-const global. BenchmarkTools.jl should give more reliable results.
But, as I said, your code is otherwise already how it should be. If you are seeing a bottleneck outside of the benchmarking itself, I suspect it is related to some part of the code you haven’t shared.
^exactly that. This is what happens when t is a local variable:
t = (1.0, 2.0)
To be clear, the function g is still global, but it is also implicitly const, which the compiler can optimize. It’s non-const globals that require a bit of overhead (a 16-byte allocation is a good tell).
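To make the distinction concrete, here is a minimal sketch of the three ways the tuple can reach `g`. The body of `f` and the names `t_global`, `t_const`, `call_with_local`, and `call_with_arg` are assumptions made for illustration, not code from the original post.

```julia
f(x, y) = x + y   # hypothetical stand-in for the thread's `f`
g(t) = f(t...)    # splat the tuple into `f`

t_global = (1.0, 2.0)        # non-const global: its type could change at any time
const t_const = (1.0, 2.0)   # const global: the compiler can rely on its type

# Wrapping the work in a function makes `t` a local variable.
function call_with_local()
    t = (1.0, 2.0)
    return g(t)
end

# Passing a global as an argument also helps: inside the function,
# `t` is an ordinary local with a concrete type.
call_with_arg(t) = g(t)

call_with_local()        # 3.0
call_with_arg(t_global)  # 3.0
g(t_const)               # 3.0
```

Only a top-level call such as `g(t_global)` has to look up an untyped global and dispatch dynamically, which is where the small allocation reported by `@time` would be expected to come from.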
I think this is mainly a benchmark artefact. It is safer to use BenchmarkTools.jl.
The right syntax to perform the benchmark is:
julia> using BenchmarkTools
julia> @benchmark g($t)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
Range (min … max): 3.177 ns … 212.467 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 3.195 ns ┊ GC (median): 0.00%
Time (mean ± σ): 3.221 ns ± 2.131 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
3.18 ns Histogram: frequency by time 3.22 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
→ As you can see, there are no allocations.
It is important to notice the ‘$’ in ‘@benchmark g($t)’. As the BenchmarkTools.jl documentation says:
If the expression you want to benchmark depends on external variables, you should use $ to “interpolate” them into the benchmark expression to avoid the problems of benchmarking with globals. Essentially, any interpolated variable $x or expression $(...) is “pre-computed” before benchmarking begins:
julia> A = rand(3,3);
julia> @btime inv($A); # we interpolate the global variable A with $A
  1.191 μs (10 allocations: 2.31 KiB)
julia> @btime inv($(rand(3,3))); # interpolation: the rand(3,3) call occurs before benchmarking
  1.192 μs (10 allocations: 2.31 KiB)
julia> @btime inv(rand(3,3)); # the rand(3,3) call is included in the benchmark time
  1.295 μs (11 allocations: 2.47 KiB)
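If BenchmarkTools.jl is not at hand, a similar check can be sketched with Base alone by moving the measurement inside a function, so the tuple is a local rather than a global. The body of `f` and the helper name `allocs_in_g` are assumptions made for illustration.

```julia
f(x, y) = x + y   # hypothetical stand-in for the thread's `f`
g(t) = f(t...)

# Measure bytes allocated by `g(t)` where `t` is a typed local argument.
function allocs_in_g(t)
    g(t)                     # warm-up call so compilation is not measured
    return @allocated g(t)   # bytes allocated by the second call
end

allocs_in_g((1.0, 2.0))
```

This is cruder than `@benchmark g($t)` since it gives no timing statistics, but it separates allocations caused by `g` itself from those caused by accessing a non-const global.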