Underperformant mapreduce

Conceptually, shouldn’t the mapreduce version be faster than the reduce version, since it would not need to materialize the intermediate array xor.(b1, b2)?

Granted, the difference is small, but mapreduce is consistently a few nanoseconds slower.

julia> N = 5115;

julia> a1 = rand(Bool, N);

julia> a2 = rand(Bool, N);

julia> b1 = BitArray(a1);

julia> b2 = BitArray(a2);

julia> @btime reduce(xor, xor.($b1, $b2))
  2.644 μs (2 allocations: 768 bytes)
true

julia> @btime mapreduce(xor, xor, $b1, $b2)
  2.656 μs (2 allocations: 768 bytes)
true
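
To make the expectation concrete, here is a minimal sketch of a reduction that avoids the temporary array by walking both inputs together with zip. The name fused_xor is hypothetical (it is not part of Base), and this is not benchmarked here, so it may or may not turn out faster than the two calls above.

```julia
# Hypothetical helper: xor-reduce two equally sized Bool collections
# pairwise, without first building xor.(x, y) as a temporary array.
fused_xor(x, y) = mapreduce(((p, q),) -> xor(p, q), xor, zip(x, y))

N = 5115
a1 = rand(Bool, N); a2 = rand(Bool, N)
b1 = BitArray(a1);  b2 = BitArray(a2)

# All three formulations should agree on the result.
@assert fused_xor(b1, b2) == reduce(xor, xor.(b1, b2)) == mapreduce(xor, xor, b1, b2)
@assert fused_xor(a1, a2) == fused_xor(b1, b2)
```

Timing it with @btime fused_xor($b1, $b2) next to the two calls above would show whether skipping the temporary actually pays off for BitArray inputs.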

Bonus

Can anyone tell me why the Vector{Bool} version is:

  • with mapreduce, the fastest of all four variants
  • with reduce, the slowest of all four variants

julia> @btime mapreduce(xor, xor, $a1, $a2)
  1.922 μs (4 allocations: 5.22 KiB)
true

julia> @btime reduce(xor, xor.($a1, $a2))
  3.587 μs (3 allocations: 4.94 KiB)
true
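
One way to start digging into this is to ask Julia which mapreduce method each input type dispatches to. A small sketch, assuming the same inputs as above are re-created; outside the REPL, @which needs using InteractiveUtils.

```julia
using InteractiveUtils  # provides @which outside the REPL

a1 = rand(Bool, 5115); a2 = rand(Bool, 5115)
b1 = BitArray(a1);     b2 = BitArray(a2)

# Print the method (with its source location) selected for each input type.
println(@which mapreduce(xor, xor, a1, a2))
println(@which mapreduce(xor, xor, b1, b2))
```

If both calls land on the same generic method, the timing difference would have to come from what that method does with Vector{Bool} versus BitArray rather than from dispatch.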