I tried writing a @threads-ed map function for array arguments. Very roughly imitating the general idea of how I see Base.map handling inference of the result element type, I came up with:
```julia
using Base.Threads

function tmap(f, args::AbstractArray...)
    cargs = collect(zip(args...))
    n = length(cargs)
    # infer the element type of the result from the types in the first argument tuple
    T = Core.Inference.return_type(f, Tuple{map(typeof, cargs[1])...})
    ans = Vector{T}(n)
    @threads for i = 1:n
        ans[i] = f(cargs[i]...)
    end
    ans
end
```
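For what it's worth, in the benchmark below I'd expect the inference call to give a concrete element type (this is my expectation, not verified output), so the inferred version should be allocating a Vector{Matrix{Float64}} rather than a Vector{Any}:

```julia
# Sanity check (sketch; my expectation is that * on two Float64 matrices
# infers to the concrete type Matrix{Float64}, i.e. Array{Float64,2}):
Core.Inference.return_type(*, Tuple{Matrix{Float64},Matrix{Float64}})
```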
It works, but I was surprised to see that if I replace the type inference line above with just T = Any, I get a ~20% speed improvement. Here’s what I benchmarked, just multiplying a bunch of matrices together:
```julia
using BenchmarkTools

m = [randn(128,128) for i=1:10]
@benchmark tmap(*, $m, $m)
```
I get 100ms for the inferred version, but 80ms without it.
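To be explicit about what I'm comparing against, the faster version is identical except for that one line; I've given it a separate name here (tmap_any, just for this post) so both can be benchmarked side by side:

```julia
# Identical to tmap above, except the element type is not inferred.
# (tmap_any is just a name for this post, not anything from Base.)
function tmap_any(f, args::AbstractArray...)
    cargs = collect(zip(args...))
    n = length(cargs)
    ans = Vector{Any}(n)          # T = Any instead of the inferred type
    @threads for i = 1:n
        ans[i] = f(cargs[i]...)
    end
    ans
end

@benchmark tmap_any(*, $m, $m)
```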
The overhead of the call to Inference.return_type appears negligible. Any ideas what’s going on?
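(For reference, a rough way to check that in isolation is just to time the inference call by itself, something along these lines:)

```julia
# Sketch: timing only the inference call, with the same argument types as above.
# Interpolating argtypes so the call isn't constant-folded away.
using BenchmarkTools
argtypes = Tuple{Matrix{Float64},Matrix{Float64}}
@benchmark Core.Inference.return_type(*, $argtypes)
```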