Element-wise vector multiplication and fusing dot

Don’t do timing measurements in global scope. I would also strongly recommend the BenchmarkTools package, and using @benchmark myfunc(...) rather than @time myfunc(...), since @benchmark will run a timing loop for you and gather statistics.
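As a minimal sketch of that advice (the function name `elmul!` and the array length 1000 are made up for illustration, and BenchmarkTools is assumed to be installed), the work goes in a function and the global arrays are interpolated with `$`:

```julia
using BenchmarkTools  # external package, assumed to be installed

# Hypothetical function under test; substitute your own.
elmul!(z, x, y) = (z .= x .* y)

# Illustrative inputs; the length is an arbitrary choice.
x = rand(1000); y = rand(1000); z = similar(x)

# `$` interpolates the global variables into the benchmark expression,
# so the cost of accessing untyped globals is not part of the measurement.
trial = @benchmark elmul!($z, $x, $y)

# Print a summary estimate (fastest sample) from the collected trial.
println(minimum(trial))
```

The returned trial object also carries the median, mean, and allocation estimates, which is where output like the listings further down comes from.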

In Julia 0.5 and older versions, the .* operator is defined, but it is just a function call and is not fusing. In particular, it does not fuse with the assignment in .=, so z .= x .* y is equivalent to:

tmp = x .* y  # allocate a new temporary array for elementwise x * y
z .= tmp      # write the tmp into z, equivalent to copy!(z, tmp)

If you want the fusing version of this operation in 0.5, you need to write the multiplication with dot-call syntax: z .= (*).(x, y)
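To make the difference concrete, here is a rough sketch that writes the two forms out as separate functions. The names `unfused!` and `fused!` and the array length are invented for this example, and it uses `copyto!`, the name newer Julia versions use for the `copy!` call mentioned above. Note that on Julia 0.6 and later `.*` itself fuses, so there the temporary has to be spelled out by hand to reproduce the 0.5 behaviour:

```julia
# Unfused form, written out explicitly: allocate a temporary, then copy it.
function unfused!(z, x, y)
    tmp = x .* y       # allocates a new array for the elementwise product
    copyto!(z, tmp)    # copies tmp into z (`copy!(z, tmp)` in Julia 0.5)
    return z
end

# Fused form from the text: one loop that writes directly into z.
fused!(z, x, y) = (z .= (*).(x, y))

# Illustrative inputs; the length is an arbitrary choice.
x = rand(1000); y = rand(1000)
z1 = similar(x); z2 = similar(x)

# Call each function once first so compilation is not counted below.
unfused!(z1, x, y)
fused!(z2, x, y)

# Both forms compute the same result.
@assert z1 == z2

# Report the bytes allocated by one call of each form.
println("unfused: ", @allocated(unfused!(z1, x, y)), " bytes")
println("fused:   ", @allocated(fused!(z2, x, y)), " bytes")
```

The unfused version should report roughly the size of one temporary array, while the fused one should report little or nothing. The exact numbers depend on the Julia version.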

With bar!(x, y, z) = z .= (*).(x, y), benchmarked next to foo! for comparison, I get:

julia> @benchmark foo!($x, $y, $z)
BenchmarkTools.Trial: 
  samples:          10000
  evals/sample:     77
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  7.98 kb
  allocs estimate:  3
  minimum time:     795.00 ns (0.00% GC)
  median time:      1.23 μs (0.00% GC)
  mean time:        1.95 μs (25.90% GC)
  maximum time:     36.09 μs (67.13% GC)

julia> @benchmark bar!($x, $y, $z)
BenchmarkTools.Trial: 
  samples:          10000
  evals/sample:     413
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  32.00 bytes
  allocs estimate:  1
  minimum time:     241.00 ns (0.00% GC)
  median time:      248.00 ns (0.00% GC)
  mean time:        289.79 ns (0.58% GC)
  maximum time:     4.35 μs (92.72% GC)