Allow 3-argument sum in Julia?

Julia’s sum accepts a function as a first argument, like sum(abs, arr). This feature becomes very useful and efficient in many situations when we want to apply the function on the array before sum.
However, if the function has 2 arguments, sum doesn’t work because now it has 3 arguments. For example,

a = rand(Bool, 24)
b = rand(Bool, 24)
sum(xor, a, b)

fails with ERROR: MethodError: no method matching sum(::typeof(xor), ::Vector{Int64}, ::Vector{Int64}). In this case, we have to do something like:

sum(a[i] ⊻ b[i] for i in eachindex(a,b))
# or
mapreduce(t->xor(t...), +, zip(a,b))

How about allowing sum(xor, a, b) to mean sum(a[i] ⊻ b[i] for i in eachindex(a,b))? This seems very elegant and clean while efficient as well.

I acknowledge I am not addressing the question about sum, but why not:

julia> a = rand(Bool, 24)
julia> b = rand(Bool, 24)
julia> mapreduce(xor,+,a,b)
12

?

5 Likes

Yes, this is better but still more verbose than just sum(xor, a, b). I knew about this alternative but didn’t scan all possible variants in my post. It is still a good feature to allow it for sum, IMO.

Just define

sum(f, a, b)=mapreduce(f, +, a, b)
3 Likes

All of these alternatives are fine, I’m discussing this at the language level. If other people read my code (or even myself in the next month), they will get confused immediately.

?

sum(xor.(a,b))
1 Like

This is less efficient as it has to allocate memory for the result of xor before it can sum.

sum(Base.splat(xor), zip(a,b))

2 Likes

Just to add a couple more options to the mix:

sum(x -> xor(x[1], x[2]), zip(a, b))
sum(x -> xor(x...), zip(a, b))
sum(((x, y),) -> xor(x, y), zip(a, b))
#or 
sum(x ⊻ y for (x,y) in zip(a,b))

The disadvantage here is that it allocates an intermediate array

All in all I think sum is pretty flexible already, and I find mapreduce to be quite readable after surprisingly little getting used to… That said, this definition

is pretty much exactly what it would need to be, since sum(f, a) already forwards to mapreduce. Probably

sum(f, iters...; kw...) = mapreduce(f, add_sum, iters...; kw...)

The primary reason not to allow this is that the multi-argument version doesn’t mean the same thing at all as the single argument. sum(a) is the sum of the elements in a. Defining this method would naturally create sum(a, b) (since it would forward through sum(identity, a , b)), which is sum(a) + sum(b).
It’s weird to call that sum(a, b) I think…

1 Like

Exactly, I mean what prevents us from adding that one line to the definition of sum? For sum(a, b), it shouldn’t work because for 2+ arguments the first must be a function. Also, mapreduce(identity, +, a, b) doesn’t work, so it should be sum(+, a, b) by the modified definition, because mapreduce(+, +, a, b) works.

I think the definition of 2 argument sum is here julia/reducedim.jl at 74a6589a32f0917c02817847b42dee9d4569bfca · JuliaLang/julia · GitHub.

Yes, that’s all what it takes to allow the 3 arguments in the definition of sum. What prevents us from adding this line? I did some small experiments and it worked fine.

The reason against adding a method like that is not that it wouldn’t work, but it would make the language less readable for new users. When someone that doesn’t know Julia sees sum(a, b, c) they would expect the result to be a+b+c as mentioned before, since that is what the English word sum means.

Personally, I think that the method sum(f, x) = sum(f.(x)) is already a bit of a stretch of the meaning of the word sum. If you had been suggesting to change that into sum(x...)=sum(sum.(x)) such that sum(a, b) would mean sum(a)+sum(b), I would probably agree with you, although this would be hugely breaking and will never be implemented for that reason. In the hypothetical scenario that it would be, a new convenience function summap could be defined that does summap(f, x...) = sum(f.(x...)), which does what you want.

2 Likes

Did some quick benchmarking and got a bit surprised. I would have assumed that mapreduce was faster than sum + zip, but that seems to not be the case. Also surprised that the splat version is faster than the 2-arg version for the sum, the @code_native for them (the sum versions) was identical except for the line movabsq $140681441048008, %rax # imm = 0x7FF2F351F5C8 which differed a little in the numbers, though I assume that is just memory adresses differing since the functions are different?

julia> f1(f, a, b) = mapreduce(f, +, a, b)
f1 (generic function with 1 method)

julia> f2(f, a, b) = sum(x -> f(x[1], x[2]), zip(a, b))
f2 (generic function with 1 method)

julia> f3(f, args...) = sum(Base.splat(f), zip(args...))
f3 (generic function with 1 method)

julia> @btime f1(xor, a, b) setup=(a=rand(Int, 1000); b=rand(Int, 1000))
  1.658 μs (5 allocations: 8.05 KiB)
4300509624842529229

julia> @btime f2(xor, a, b) setup=(a=rand(Int, 1000); b=rand(Int, 1000))
  599.045 ns (0 allocations: 0 bytes)
3772453359478653442

julia> @btime f3(xor, a, b) setup=(a=rand(Int, 1000); b=rand(Int, 1000))
  537.342 ns (0 allocations: 0 bytes)
4852733446304964671
1 Like

You used rand(Int,1000) which sometimes overflows (negative output), instead of rand(Bool, 1000). The following timings are more representative:

julia> @btime f1(xor, a, b) setup=(a=rand(Bool, 1000); b=rand(Bool, 1000))
  163.472 ns (1 allocation: 1.06 KiB)
519

julia> @btime f2(xor, a, b) setup=(a=rand(Bool, 1000); b=rand(Bool, 1000))
  551.337 ns (0 allocations: 0 bytes)
524

julia> @btime f3(xor, a, b) setup=(a=rand(Bool, 1000); b=rand(Bool, 1000))
  552.406 ns (0 allocations: 0 bytes)
475
1 Like

One concern is that sum of vectors does pairwise summation. That’s apparently not the case for sum over a zip. But from a couple of quick tests (didn’t check the source code), it does work for mapreduce:

julia> a=rand(10000); b = rand(10000);

julia> sum(max.(a, b)) - sum(Base.splat(max), zip(a, b))
4.547473508864641e-12

julia> sum(max.(a, b)) - mapreduce(max, +, a, b)
0.0

mapreduce is known to be slow, I think there are several issues open on it.

Regarding three-arg sum, see Add function + two-argument method to reducers by baggepinnen · Pull Request #35017 · JuliaLang/julia · GitHub

Interesting that they behave that differently. Why would overflow affect them that disproportionally?

Why do you reckon that more representative? If you want to add the ability to sum over general functions and types, hopefully both cases should behave well for a good implementation of it?

It is visible for integers even when there is no overflow (still might be overflow checks?), e.g.

julia> @btime f1(xor, a, b) setup=(a=rand(1:10, 1000); b=rand(1:10, 1000))
  1.721 μs (5 allocations: 8.05 KiB)
6634

julia> @btime f2(xor, a, b) setup=(a=rand(1:10, 1000); b=rand(1:10, 1000))
  536.875 ns (0 allocations: 0 bytes)
6797

julia> @btime f3(xor, a, b) setup=(a=rand(1:10, 1000); b=rand(1:10, 1000))
  536.042 ns (0 allocations: 0 bytes)
6548

or when using max instead of xor it also looks the same to me.

I guess for some special cases mapreduce manages to be fast, but as @baggepinnen mentions, it looks like it can be slow in several other cases.