I’m broadcasting functions which return tuples, and have been converting the resulting arrays of tuples to tuples of arrays using an `unzip` helper. As an alternative, I tried for loops over each individual entry, re-assembling the output as a tuple.
```julia
using BenchmarkTools

function foo(a, b, c, d)
    ab = exp(a * b)
    return ab + c, ab - d
end

N = 10000
a, b, c, d = [randn(N) for i = 1:4]
X = (a, b)
Y = (c, d)

function foo_loop(N, a, b, c, d)
    for i = 1:N
        foo(a[i], b[i], c[i], d[i])
    end
end
@btime foo_loop($N, $a, $b, $c, $d)

function foo_loop_splat!(N, X, Y, X_entry, Y_entry)
    for i = 1:N
        for fld in eachindex(X)
            X_entry[fld] = X[fld][i]
            Y_entry[fld] = Y[fld][i]
        end
        foo(X_entry..., Y_entry...)
    end
end
X_entry, Y_entry = [zeros(length(X)) for i = 1:2]
@btime foo_loop_splat!($N, $X, $Y, $X_entry, $Y_entry)
```
I noticed that the splatted for loop is much slower than the broadcasted `unzip` approach:
```
102.436 μs (0 allocations: 0 bytes)
1.046 ms (50000 allocations: 937.50 KiB)
```
In comparison, using broadcasting and `unzip`:
```julia
unzip(a) = map(x -> getfield.(a, x), fieldnames(eltype(a)))
@btime unzip(foo.($X..., $Y...))
@btime unzip(foo.($a, $b, $c, $d))
```
both runtimes are about the same:
```
158.962 μs (8 allocations: 312.78 KiB)
156.331 μs (8 allocations: 312.78 KiB)
```
Why does this occur? I’m not clear on why splatting adds so much extra time and allocation. Is it runtime dispatch?
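For reference, here is my own minimal sketch (not from the benchmarks above) of what I suspect is the relevant difference: splatting a `Vector`, whose length is not part of its type, means the argument count is only known at runtime, whereas splatting a `Tuple` (length and element types are in the type) can compile down to an ordinary fixed-arity call.

```julia
g(a, b) = a + b

xv = [1.0, 2.0]   # Vector: length unknown to the compiler
xt = (1.0, 2.0)   # Tuple: length and element types are part of the type

# g(xv...) has to discover the number of arguments at runtime,
# while g(xt...) lowers to the plain two-argument call g(1.0, 2.0).
g(xv...) == g(xt...)  # both return 3.0
```

If that's the right explanation, it would suggest `X_entry`/`Y_entry` should be tuples (e.g. built with `ntuple`) rather than `zeros(...)` vectors, but I'd appreciate confirmation.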