Inspired by the topic "map vs loops vs broadcasts", I wanted to run my own speed tests on these functions. As it turns out, the optimised for loop, map, and vectorisation are all roughly in the same ballpark.
Now I wanted to play the same game with a predefined result array and found significant differences:
```julia
using BenchmarkTools

function forloop(e, v)
    @simd for i in eachindex(v)
        @inbounds e[i] = 2*v[i]^2 + v[i] + 5
    end
end

fmap(e, v) = map!(x -> 2x^2 + x + 5, e, v)

fbcs(e, v) = @. e = 2*v^2 + v + 5

v = rand(10000)
e = similar(v)

@btime for i in 1:100 forloop(e, v) end
@btime for i in 1:100 fmap(e, v) end
@btime for i in 1:100 fbcs(e, v) end
```
```
336.145 μs (0 allocations: 0 bytes)
944.702 μs (0 allocations: 0 bytes)
340.421 μs (0 allocations: 0 bytes)
```
Am I using `map!` correctly, or is there a reason why it is so much slower than the other two implementations?
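For what it's worth, here is a minimal sanity-check sketch, separate from the timings above, that confirms the three versions fill the result array with the same values. The `return e` in the loop version and the names `e_loop`, `e_map`, `e_bcs`, and `same_results` are additions for illustration only. The commented-out lines at the end show BenchmarkTools' `$` interpolation of global variables, which may be worth trying to rule out global-variable overhead in the measurement. I have not re-run the timings that way, so no numbers are given.

```julia
# Same three implementations as in the post, repeated so this snippet runs on its own.
function forloop(e, v)
    @simd for i in eachindex(v)
        @inbounds e[i] = 2*v[i]^2 + v[i] + 5
    end
    return e  # added for illustration: lets us capture the filled array
end

fmap(e, v) = map!(x -> 2x^2 + x + 5, e, v)

fbcs(e, v) = @. e = 2*v^2 + v + 5

v = rand(10_000)

# Give each version its own preallocated output so the results can be compared.
e_loop = forloop(similar(v), v)
e_map  = fmap(similar(v), v)
e_bcs  = fbcs(similar(v), v)

# `≈` (isapprox) tolerates tiny floating-point differences between the versions.
same_results = (e_loop ≈ e_map) && (e_map ≈ e_bcs)
println("all three versions agree: ", same_results)

# Possible benchmark variant (requires BenchmarkTools; not run here):
# using BenchmarkTools
# e = similar(v)
# @btime forloop($e, $v)
# @btime fmap($e, $v)
# @btime fbcs($e, $v)
```

Each call here gets a fresh `similar(v)` so that no version overwrites another's output before the comparison.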