Why is dot plus faster?

Is it?

julia> @benchmark no_dot(A,B)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):   61.900 μs …   6.048 ms  ┊ GC (min … max):  0.00% … 93.50%
 Time  (median):      78.600 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   128.704 μs ± 181.175 μs  ┊ GC (mean ± σ):  11.31% ±  9.80%

  ██▄▄▂▁               ▃▃▂▂▁                                    ▂
  ████████▇▇▇▇▆▆▅▆▆▆▆▅▇███████▇▇▆▆▅▅▅▃▄▄▃▁▃▁▁▁▁▃▃▃▁▃▅▅▄▆▅▆▆▅▆▆▅ █
  61.9 μs       Histogram: log(frequency) by time          1 ms <

 Memory estimate: 781.84 KiB, allocs estimate: 21.

julia> @benchmark with_dot(A,B)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):   62.500 μs …   5.696 ms  ┊ GC (min … max):  0.00% … 93.94%
 Time  (median):      78.600 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   108.850 μs ± 146.224 μs  ┊ GC (mean ± σ):  11.28% ±  9.42%

  ▇█▄▃▂▁                  ▁▁                                    ▂
  ███████▇█▆▇▇▆▇▅▆▅▅▆▅▅▅▆▇██▇▇▅▆▅▅▅▁▄▅▅▅▄▄▄▁▄▁▄▃▄▃▃▄▅▅▆▅▅▅▅▆▅▅▅ █
  62.5 μs       Histogram: log(frequency) by time        900 μs <

 Memory estimate: 781.84 KiB, allocs estimate: 21.

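There shouldn’t be any significant difference, because for arrays A+B broadcasts internally. The two benchmarks above agree: same median (78.600 μs), same memory estimate, same allocation count.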
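To make the comparison concrete, here is a minimal sketch. The definitions of no_dot and with_dot are assumptions, since the thread does not show them; the simplest possible versions are used here. For same-shaped arrays they should give identical results. The practical difference is in shape checking rather than speed.

```julia
# Hypothetical stand-ins: the post does not show how no_dot and with_dot are defined.
no_dot(A, B) = A + B      # array addition; checks that the shapes match first
with_dot(A, B) = A .+ B   # explicit elementwise broadcast

A = rand(100, 100)
B = rand(100, 100)

# Same result for same-shaped arrays.
same_result = no_dot(A, B) == with_dot(A, B)

# `+` rejects mismatched shapes, while `.+` expands singleton dimensions.
plus_throws = try
    no_dot(ones(3), ones(1, 3))
    false
catch err
    err isa DimensionMismatch
end

broadcast_size = size(with_dot(ones(3), ones(1, 3)))  # (3, 3)
```

If you want to see which method `A + B` dispatches to on your Julia version, `@which A + B` in the REPL will point at the source.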
