Why is dot plus faster?

I’m experimenting with different expressions and I can’t explain why the dotted `.+` is faster than plain `+` in the simple case of matrix addition.

using BenchmarkTools 
A = rand(100, 100) 
B = rand(100, 100) 

function no_dot(A, B)
    return [A+B for i in 1:10]
end

function with_dot(A, B)
    return [A.+B for i in 1:10]
end

# Interpolate the globals with $ so that global-variable access is not timed.
@btime no_dot($A, $B)
@btime with_dot($A, $B)

The second is 10% faster.

For me, I get the same performance. I also checked with @code_native: both functions generate the same code!

Is it?

julia> @benchmark no_dot(A,B)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):   61.900 μs …   6.048 ms  ┊ GC (min … max):  0.00% … 93.50%
 Time  (median):      78.600 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   128.704 μs ± 181.175 μs  ┊ GC (mean ± σ):  11.31% ±  9.80%

  ██▄▄▂▁               ▃▃▂▂▁                                    ▂
  ████████▇▇▇▇▆▆▅▆▆▆▆▅▇███████▇▇▆▆▅▅▅▃▄▄▃▁▃▁▁▁▁▃▃▃▁▃▅▅▄▆▅▆▆▅▆▆▅ █
  61.9 μs       Histogram: log(frequency) by time          1 ms <

 Memory estimate: 781.84 KiB, allocs estimate: 21.

julia> @benchmark with_dot(A,B)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):   62.500 μs …   5.696 ms  ┊ GC (min … max):  0.00% … 93.94%
 Time  (median):      78.600 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   108.850 μs ± 146.224 μs  ┊ GC (mean ± σ):  11.28% ±  9.42%

  ▇█▄▃▂▁                  ▁▁                                    ▂
  ███████▇█▆▇▇▆▇▅▆▅▅▆▅▅▅▆▇██▇▇▅▆▅▅▅▁▄▅▅▅▄▄▄▁▄▁▄▃▄▃▃▄▅▅▆▅▅▅▅▆▅▅▅ █
  62.5 μs       Histogram: log(frequency) by time        900 μs <

 Memory estimate: 781.84 KiB, allocs estimate: 21.

There shouldn’t be any significant difference, because A+B broadcasts internally.
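As a rough sketch of what that means in practice (the variable names here are only for illustration): for same-shaped arrays the two forms give identical results, and they differ only in how they treat mismatched shapes, where `+` throws and `.+` expands singleton dimensions.

```julia
A = rand(100, 100)
B = rand(100, 100)

# For arrays of the same shape, both forms compute the same elementwise sum.
same_result = (A + B) == (A .+ B)

# The forms differ only when shapes differ: `+` insists on matching
# dimensions, while `.+` expands singleton dimensions.
row = rand(1, 100)   # 1×100 matrix

plus_throws = try
    A + row
    false
catch err
    err isa DimensionMismatch
end

# `row` is expanded along the first dimension, so the result is 100×100.
broadcast_size = size(A .+ row)

println((same_result, plus_throws, broadcast_size))
```

So any timing gap between `no_dot` and `with_dot` in the original example is more likely measurement noise than a real difference in the work being done.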

Thanks for the check. Perhaps I should remove the post, because it was confusing even for me.

Just for you, I suggested this (because it’s a tricky situation: 10% can easily be noise, e.g. from a loaded computer):

OK, I didn’t do it for you specifically. Looking up issues under my name, I didn’t find it at first, so I was actually going to file it for you. I could have sworn I had suggested it in the past, which I actually had. Not sure why the search failed the first time.

I did notice the interesting “Feature suggestion: @ballocs”; for some reason I read that as “bollocks” (would it read the same in English?).

https://juliaci.github.io/BenchmarkTools.jl/dev/manual/#TrialRatio-and-TrialJudgement

OK, I think you’re saying it’s already possible (just complicated). See the link; for implementing something better and more user-friendly, see the open issue. Let’s not discuss a new implementation here, so as not to split the discussion. [It would be OK to post here how to do this exactly with the tools as they are.]
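For reference, here is a minimal sketch of the as-is approach from the linked manual section, using `ratio` and `judge` on the median estimates. The function names are taken from the original post; the other variable names are made up here, and comparing medians rather than minimums is just one reasonable choice.

```julia
using BenchmarkTools
using Statistics: median

A = rand(100, 100)
B = rand(100, 100)

no_dot(A, B) = [A + B for i in 1:10]
with_dot(A, B) = [A .+ B for i in 1:10]

# Interpolate globals with `$` so that global-variable access is not timed.
t_no_dot = @benchmark no_dot($A, $B)
t_with_dot = @benchmark with_dot($A, $B)

# Compare medians, which are less sensitive to GC pauses than the mean.
m_no_dot = median(t_no_dot)
m_with_dot = median(t_with_dot)

# TrialRatio: a time ratio below 1 means with_dot was faster in this run.
r = ratio(m_with_dot, m_no_dot)

# TrialJudgement: classifies the difference as an improvement, a regression,
# or invariant, using the package's default tolerance.
j = judge(m_with_dot, m_no_dot)

println(r)
println(j)
```

If the judgement comes back as invariant, a 10% gap between two single `@btime` runs is within what the package itself treats as noise. The manual also shows passing a `time_tolerance` keyword to `judge` if you want a stricter or looser threshold.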