Don’t do timing measurements in global scope. I would also strongly recommend the BenchmarkTools package, and using @benchmark myfunc(...) rather than @time myfunc(...), since @benchmark will run a timing loop for you and gather statistics.
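As a rough sketch of that advice, the snippet below benchmarks a small function with BenchmarkTools. It assumes the package is already installed, and mysum is a hypothetical stand-in for myfunc, not something from the question. The $ interpolation passes the global array into the benchmark as a value, so the measurement is not distorted by access to an untyped global.

```julia
using BenchmarkTools  # assumed installed, e.g. via: import Pkg; Pkg.add("BenchmarkTools")

# Hypothetical function to benchmark; stands in for myfunc above.
mysum(v) = sum(abs2, v)

v = rand(1000)

# Interpolate the global with $ so the benchmark measures the call itself
# rather than the overhead of looking up an untyped global variable.
trial = @benchmark mysum($v)

println(minimum(trial))  # estimate from the fastest sample
println(median(trial))   # estimate from the median sample
```

The printed estimates depend on the machine, so no expected output is shown here.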
In Julia 0.5 and older versions, the .* operator is defined, but it is just a function call and is not fusing. In particular, it does not fuse with the assignment in .=, so z .= x .* y is equivalent to:
tmp = x .* y # allocate a new temporary array for elementwise x * y
z .= tmp # write the tmp into z, equivalent to copy!(z, tmp)
If you want the fusing version of this operation in 0.5, you need z .= (*).(x,y). With bar!(x, y, z) = z .= (*).(x, y), I get:
julia> @benchmark foo!($x, $y, $z)
BenchmarkTools.Trial:
samples: 10000
evals/sample: 77
time tolerance: 5.00%
memory tolerance: 1.00%
memory estimate: 7.98 kb
allocs estimate: 3
minimum time: 795.00 ns (0.00% GC)
median time: 1.23 μs (0.00% GC)
mean time: 1.95 μs (25.90% GC)
maximum time: 36.09 μs (67.13% GC)
julia> @benchmark bar!($x, $y, $z)
BenchmarkTools.Trial:
samples: 10000
evals/sample: 413
time tolerance: 5.00%
memory tolerance: 1.00%
memory estimate: 32.00 bytes
allocs estimate: 1
minimum time: 241.00 ns (0.00% GC)
median time: 248.00 ns (0.00% GC)
mean time: 289.79 ns (0.58% GC)
maximum time: 4.35 μs (92.72% GC)
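To make the difference between the two results concrete, here is a hand-written sketch of what the unfused and fused forms amount to. The names unfused! and fused! are hypothetical and used only for illustration; the code is written for current Julia (copyto! was spelled copy! in 0.5), and it only approximates what the compiler generates for the dot syntax.

```julia
# Unfused: roughly what z .= x .* y did in Julia 0.5.
function unfused!(z, x, y)
    tmp = map(*, x, y)   # allocates a new temporary array for elementwise x * y
    copyto!(z, tmp)      # second pass: copy the temporary into z
    return z
end

# Fused: one loop that writes each product straight into z, with no temporary.
function fused!(z, x, y)
    for i in eachindex(x, y, z)   # throws if the arrays' indices do not match
        z[i] = x[i] * y[i]
    end
    return z
end

x = rand(1000)
y = rand(1000)
z1 = similar(x)
z2 = similar(x)

unfused!(z1, x, y)
fused!(z2, x, y)

# Both versions compute the same values; they differ only in allocation and passes over memory.
println(z1 == z2)
```

The unfused version pays for the temporary array and for a second pass over the data, which is consistent with the higher memory estimate and longer times reported for foo! above.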