Should reduce throw an error on infinite iterators?

Currently, reduce (and by extension mapreduce, sum and other functions) operate on iterators without checking whether they are finite or not. For example

julia> sum(Iterators.cycle([1,2,3]))

would run forever. I wanted to check if this intended behavior before submitting an issue on Github.

I was thinking that Base.IteratorSize can be used to check if an iterator is infinite before handing it over to mapreduce/reduce, e.g.,

julia> mysum(f,a::Iterators.Cycle;kw...) = Base.IteratorSize(a)==Base.IsInfinite() ? error( "Can not sum over an infinite iterator") : mapreduce(f,add_sum,a;kw...)
julia> mysum(a;kw...)=mysum(identity,a;kw...)

This seems to handle this issue, but I discovered that Base.IteratorSize(Iterators.cycle(Int64[])==Base.IsInfinite(), so another test is needed to maintain compatibility with the current

julia> sum(Iterators.cycle(Int64[])) # 0

At this point, it seems like I missing something because isempty(cycle([]))==true and IteratorSize(cycle([])==IsInfinite, which seems contradictory.

Edit: I found a closely related open issue about the case of empty iterator. However, I did not see a mention of the issue of reducing over infinite iterators.

1 Like

on principle it sounds fair to make reduce or sum reject inputs known to be IsInfinite. however I think there are two issues off the bat:

  • not all infinite iterators are IsInfinite; many are SizeUnknown. while I can see the argument that catching some bad cases is superior to catching zero bad cases, we can’t expect any kind of clean invariant like “reduce terminates” either way.
  • short-circuiting reductions any and all may terminate on infinite iterators if their condition is met. and in theory (though in practice this is not implemented) even something like sum could terminate if the reduction value becomes NaN or missing. so since these functions are generally implemented in terms of reduce it would introduce breaking behavior in the form of new errors, which would be a bit clunky to work around.
3 Likes

This sounds nice for reduce and making IsInfinite more accurate is good, but something in the rationale that makes me pause is sum isn’t always implemented by iterative reduce. Summing ranges only needs the first element, the step size, and length for a much more efficient formula, which is implemented in Julia. Infinite sums do exist and can be computed in finite steps, even though they’re obviously not suited for numbers with limited precision and are not implemented by any package I know of. But it wouldn’t be great for someone who eventually implements some infinite series’ natural indexing and iteration to run into the sum API demanding an error be thrown for any infinite series, even though the whole point was to give Inf for divergent series and compute known formulas for convergent ones.

2 Likes

I’m a little confused by some of the comments here. Yes, some reduce-like functions (e.g., any/all, some variants of sum) can short-circuit. But these functions are not built on actual reduce.

For a method that has means of termination beyond exhausting the iterator, we’ll have to let it play. But any reduce-like function with no means of early termination (such as generic reduce) should be able to throw on IsInfinite (possibly only if !isempty – see below).

EDIT It’s pointed out below that exceptions can be used for control flow which, combined with mutation, means there are cases where a program could be (poorly?) designed to reduce with an IsInfinite iterator until an exception. So it might be heavy-handed to prematurely terminate.


For cases like Iterators.cycle([]) there’s a larger issue: Does IsInfinite mean “inexhaustible always” or “inexhaustible if not empty”? We need to decide an answer to determine the correct behavior.

If “inexhaustible always,” then Iterators.cycle needs to either throw with an empty iterator, can’t be IsInfinite (can a SizeUnknown iterator be infinite? It seems to imply finite), or needs to have variable IteratorSize depending on its emptyness (seems bad for dispatch). If “inexhaustible if not empty,” then we need to check the entire codebase for IsInfinite handling that needs to cover the empty case.

There are existing bugs hinging on this decision:

julia> first(Iterators.cycle(Int[]), 5) # returns uninitialized memory
5-element Vector{Int64}:
                  49
 9223372036854775807
                  50
                   0
                  -1
3 Likes

This actually made me consider that we may not always want to. Sure, when we want an actual result, infinite iterators would never end, and the only way to “terminate” reduce is for the input function to throw an error and thus the intermediate value away. But as weird as it is, it’s possible to use reduce as an infinite loop that updates a global or captured variable or a mutable field, and throwing then catching the error can serve as a break to access the result (see below). reduce-based methods that fix the input function, like sum using the internal Base.add_sum which calls +, aren’t exactly exempt because those functions can have new methods that update globals. It’s also possible to not need the result within Julia. I’d be surprised if anybody wrote a program like this, which would be broken by making reduce throw on infinite iterators, but that’s not impossible.

julia> mutable struct PlusLimit{T}
         const limit::Int
         counter::Int
         result::T
       end

julia> function (p::PlusLimit{T})(x::T, y::T) where T
         if p.counter <= p.limit
           p.counter += 1
           p.result = x+y
           return p.result
         else
           error("Done!")
         end
       end

julia> p = PlusLimit(10, 1, 0) # 10 additions, so 11 values
PlusLimit{Int64}(10, 1, 0)

julia> try
         reduce(p, Iterators.cycle([1,2,3]))
       catch
         p.result
       end
21
3 Likes