Is it?
julia> @benchmark no_dot(A,B)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max): 61.900 μs … 6.048 ms ┊ GC (min … max): 0.00% … 93.50%
Time (median): 78.600 μs ┊ GC (median): 0.00%
Time (mean ± σ): 128.704 μs ± 181.175 μs ┊ GC (mean ± σ): 11.31% ± 9.80%
██▄▄▂▁ ▃▃▂▂▁ ▂
████████▇▇▇▇▆▆▅▆▆▆▆▅▇███████▇▇▆▆▅▅▅▃▄▄▃▁▃▁▁▁▁▃▃▃▁▃▅▅▄▆▅▆▆▅▆▆▅ █
61.9 μs Histogram: log(frequency) by time 1 ms <
Memory estimate: 781.84 KiB, allocs estimate: 21.
julia> @benchmark with_dot(A,B)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max): 62.500 μs … 5.696 ms ┊ GC (min … max): 0.00% … 93.94%
Time (median): 78.600 μs ┊ GC (median): 0.00%
Time (mean ± σ): 108.850 μs ± 146.224 μs ┊ GC (mean ± σ): 11.28% ± 9.42%
▇█▄▃▂▁ ▁▁ ▂
███████▇█▆▇▇▆▇▅▆▅▅▆▅▅▅▆▇██▇▇▅▆▅▅▅▁▄▅▅▅▄▄▄▁▄▁▄▃▄▃▃▄▅▅▆▅▅▅▅▆▅▅▅ █
62.5 μs Histogram: log(frequency) by time 900 μs <
Memory estimate: 781.84 KiB, allocs estimate: 21.
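There shouldn’t be any significant difference, because A + B broadcasts internally. The two runs agree on that: the medians are identical (78.600 μs) and both report 781.84 KiB in 21 allocations.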
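
For anyone who wants to check this themselves, here is a minimal sketch. The definitions of no_dot and with_dot are not shown in this thread, so the ones below are hypothetical stand-ins (a single addition each), and the 100×100 size is likewise just an assumption. It uses only Base, so it runs without BenchmarkTools.

```julia
# Hypothetical stand-ins: the original no_dot / with_dot are not shown in the thread.
no_dot(A, B)   = A + B    # plain array addition
with_dot(A, B) = A .+ B   # explicit broadcast

# Assumed sizes, chosen only for illustration.
A = rand(100, 100)
B = rand(100, 100)

# Call each once first so compilation is not counted in the measurements.
no_dot(A, B)
with_dot(A, B)

# Both forms produce the same result for same-shaped arrays.
same_result = no_dot(A, B) == with_dot(A, B)

# Compare bytes allocated by a single call of each.
bytes_no_dot   = @allocated no_dot(A, B)
bytes_with_dot = @allocated with_dot(A, B)

println("same result: ", same_result)
println("bytes allocated (no_dot, with_dot): ", (bytes_no_dot, bytes_with_dot))
```

One place the two spellings do differ is shape checking: A + B requires compatible shapes and throws a DimensionMismatch for, say, a 3×3 matrix plus a length-3 vector, while A .+ B will expand the vector across columns. For same-shaped arrays, as in the benchmark, that difference does not come into play.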