I’m broadcasting functions that return tuples, and have been converting the resulting arrays of tuples into tuples of arrays using an “unzip” function. As an alternative, I tried for loops over each individual entry, re-assembling the output as a tuple.
using BenchmarkTools
function foo(a,b,c,d)
    ab = exp(a*b)
    return ab+c, ab-d
end
N = 10000
a,b,c,d = [randn(N) for i = 1:4]
X = (a,b)
Y = (c,d)
function foo_loop(N,a,b,c,d)
    for i = 1:N
        foo(a[i],b[i],c[i],d[i])
    end
end
@btime foo_loop($N,$a,$b,$c,$d)
function foo_loop_splat!(N,X,Y,X_entry,Y_entry)
    for i = 1:N
        for fld in eachindex(X)
            X_entry[fld] = X[fld][i]
            Y_entry[fld] = Y[fld][i]
        end
        foo(X_entry...,Y_entry...)
    end
end
X_entry, Y_entry = [zeros(length(X)) for i = 1:2]
@btime foo_loop_splat!($N,$X,$Y,$X_entry,$Y_entry)
I noticed that the splatted for loop is much slower than the plain loop, and also much slower than the broadcasted “unzip” approach shown below:
102.436 μs (0 allocations: 0 bytes)
1.046 ms (50000 allocations: 937.50 KiB)
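That works out to 5 allocations per loop iteration, which I assume come from splatting the Vector buffers X_entry and Y_entry, whose lengths the compiler cannot know. To isolate a single call (a sanity check on my end, separate from the loop benchmarks above):

# one splatted call with the Vector buffers
@btime foo($X_entry..., $Y_entry...)
# the same call with the four scalars passed directly, for comparison
@btime foo($(X_entry[1]), $(X_entry[2]), $(Y_entry[1]), $(Y_entry[2]))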
By comparison, using broadcasting and unzip:
unzip(a) = map(x->getfield.(a, x), fieldnames(eltype(a)))
@btime unzip(foo.($X...,$Y...))
@btime unzip(foo.($a,$b,$c,$d))
both runtimes are about the same
158.962 μs (8 allocations: 312.78 KiB)
156.331 μs (8 allocations: 312.78 KiB)
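For reference, unzip here turns the Vector of 2-tuples produced by broadcasting foo into a 2-tuple of Vectors. A quick shape check (assuming the setup above):

out = unzip(foo.(a, b, c, d))
typeof(out)    # Tuple{Vector{Float64}, Vector{Float64}}
length.(out)   # (10000, 10000)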
Why does this occur? I’m not clear on why splatting adds so much extra time. Is it runtime dispatch?
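In case it helps narrow this down, this is the check I would run next (a diagnostic sketch on my end):

using InteractiveUtils  # for @code_warntype (already loaded in the REPL)

# X_entry and Y_entry are only known to the compiler as Vector{Float64},
# not their lengths, so I would expect the splatted foo call in the loop
# body to be inferred as ::Any here
@code_warntype foo_loop_splat!(N, X, Y, X_entry, Y_entry)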