Reduce: init for non-empty collections

aplavin · June 11, 2020, 4:20am

I noticed that the docs for reduce say:

reduce(op, itr; [init])

...
  If provided, the initial value
  init must be a neutral element for op that will be returned for empty collections. It is unspecified
  whether init is used for non-empty collections.
...

  Examples
  ≡≡≡≡≡≡≡≡≡≡

  julia> reduce(*, [2; 3; 4])
  24
  
  julia> reduce(*, [2; 3; 4]; init=-1)
  -24

First it says that init must be a neutral element for op (i.e. 1 for integer multiplication), and it may or may not be used for non-empty collections. Then there is an example with init = -1, which is clearly not a neutral element, and this example relies on init being used for non-empty collections.
So, what is the right contract for reduce? Sometimes in my code I pass init that is not a neutral element, and expect it to always be used. Is this a wrong assumption?

ettersi · June 11, 2020, 6:56am

This is a bad example in the documentation.

I guess the situation is that currently and for integers, reduce boils down to something like

function reduce(op, itr, init)
    r = init
    for x in itr
       r = op(r,x)
    end
    return r
end

The point of the statement in the documentation is that you should not assume that this holds for other types or across different versions of Julia. So basically, don’t do what the example demonstrates or your code may break in unexpected ways.

I for one would vote in favour of replacing the example with

julia> reduce(*, [2; 3; 4])
24

julia> reduce(*, [])  # Invalid: empty collection and no init
ERROR: MethodError: no method matching one(::Type{Any})

julia> reduce(*, []; init = 1)
1

The slightly tricky thing with this example is that reduce is actually very good at inferring the neutral element. For example, both of the following calls work even though they violate the “contract” specified in the documentation.

julia> reduce(*, Int[])
1

julia> reduce(+, Int[])
0

Tamas_Papp · June 11, 2020, 6:56am

~~When init is provided, it will be used.~~ Correction: no, that is only for foldl/foldr.

If the collection is empty, it must be provided.

aplavin · June 11, 2020, 7:16am

Good to hear, now I know this behaviour can be relied upon!

ettersi · June 11, 2020, 8:02am

Is this guaranteed? I guess doing so would make a lot of sense, but then the documentation should be updated.

Tamas_Papp · June 11, 2020, 8:13am

I think I misspoke, the docs of reduce says that

It is unspecified whether init is used for non-empty collections.

I would recommend foldr / foldl instead for code that is not meant to be parallelized — it guarantees that init will be used.

aplavin · June 11, 2020, 5:10pm

Ok then, I will change reduce to foldl when using non-neutral init. But is there a reason why reduce may not use init for non-empty collections? It makes sense to use reduce for associative operations instead of having to remember which of foldl/foldr is faster; also usages of foldl in the code look like the author explicitly needs left associativity, while it may not be the case.

Tamas_Papp · June 12, 2020, 9:16am

My understanding is that reduce (which is really a thin wrapper for mapreduce) leaves these things deliberately unspecified, so that the implementation is not constrained. Eg

reduce(+, [a, b, c, d]; init = z)

may be evaluated as one of

(a + b) + (c + d)
(z + a + b) + (c + d)
((a + b) + c) + d)
(((a + b) + c) + z) + d)

by a conforming implementation. The first one allows eg parallelization. For this, it is required that init is the neutral element, ie

op(a, z) == op(z, a) == a

and that op is associative.

All of this is mentioned in the docstring (except, of course, for the design rationale).

aplavin · June 12, 2020, 2:58pm

I understand that reduce has this flexibility of operation reordering - clearly it gives more optimization flexibility. But it is less clear why init may not be used. For associative operations it is thus required to use operation(init, reduce(operation, array, init=unity)) instead of reduce(operation, array, init=init) when init is not neutral, or switch to foldl with potentially worse performance. Not a big issue, of course, but still.

Topic		Replies	Views
How is reduce function initialized? General Usage question	5	1262	December 18, 2017
Init for mutating op: reduce(op!, xs; oninit) Internals & Design	1	104	December 5, 2024
Reduce not type stable New to Julia	2	169	July 15, 2024
Error: Reducing over an empty collection is not allowed New to Julia	7	5402	April 13, 2021
Purpose of reduce(Array, Array) General Usage question , function	5	3696	May 12, 2020

Reduce: init for non-empty collections

Related topics