Can I type assert the eltype of an iterator?

question

#1

I have a complicated piece of code, with a fragment like:

prod(i for i in 1:10 if i > 10)

The product may or may not be empty, depending on circumstances. When it is empty, Julia gives an error:

ArgumentError: reducing over an empty collection is not allowed

Checking for emptiness myself would complicate my code significantly. I would like prod to return 1 instead. I think the problem is that prod cannot infer the eltype of the iterator when it is empty.

Is there a way to deal with this?

Oh, this also does not work:

prod(i::Int for i in 1:10 if i > 10)

#3

Here is a workaround (that I’m not happy with). Define an intermediate function:

type_empty(itr, T) = isempty(itr) ? T[] : itr 

Then you can do:

prod(type_empty((i::Int for i in 1:10 if i > 10), Int))

Or,

prodt(itr, ::Type{T}) where {T} = isempty(itr) ? one(T) : prod(itr)

Is there a better way?


#4

If you don’t mind collecting first, you can do:

prod([i for i in 1:10 if i > 10])


#5

My iterator is big, so collecting is not an option.
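
A lazy alternative that needs neither collect nor an explicit emptiness check is to pass the identity element as a starting value. This is only a minimal sketch, assuming Julia 1.0 or later, where reduce accepts an init keyword (the stack traces in this thread suggest an older release, where the call would look different). The name lazy_prod is hypothetical.

```julia
# Hypothetical helper: multiply all elements of `itr`, starting from one(T).
# Assumes Julia >= 1.0, where `reduce` takes the `init` keyword.
lazy_prod(itr, ::Type{T}) where {T} = reduce(*, itr; init = one(T))

empty_gen = (i for i in 1:10 if i > 10)   # yields no elements
full_gen  = (i for i in 1:5)

println(lazy_prod(empty_gen, Int))  # empty case falls back to one(Int)
println(lazy_prod(full_gen, Int))   # one(Int) * 1 * 2 * 3 * 4 * 5
```

As far as I know, Julia 1.6 and later also accept prod(itr; init = 1) directly.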


#6

Now I am getting a different error when I combine iterators:

prodt((i for i in 1:10 if i > 10 for j in 1:10), Int)

ArgumentError: argument to Flatten must contain at least one iterator

Interestingly, reversing the order of the iterators does not give an error:

prodt((i for j in 1:10 for i in 1:10 if i > 10), Int)
# runs without problems

#7
julia> prodt((i for j in 1:10 for i in 1:10 if i > 10), Int)
1

julia> prodt((i for i in 1:10 if i > 10 for j in 1:10), Int)
ERROR: ArgumentError: argument to Flatten must contain at least one iterator
Stacktrace:
 [1] start(::Base.Iterators.Flatten{Base.Generator{Base.Iterators.Filter{##49#52,UnitRange{Int64}},##47#50}}) at ./iterators.jl:696
 [2] isempty(::Base.Iterators.Flatten{Base.Generator{Base.Iterators.Filter{##49#52,UnitRange{Int64}},##47#50}}) at ./essentials.jl:358
 [3] prodt(::Base.Iterators.Flatten{Base.Generator{Base.Iterators.Filter{##49#52,UnitRange{Int64}},##47#50}}, ::Type{Int64}) at ./REPL[1]:1

julia> prodt((i for i in 1:10 for j in 1:10 if i > 10), Int)
1

#8

Do you know why?


#9

It seems to me that since indeed the product series of an empty set is not defined, it shouldn’t bother you that you need to check whether the iterator is empty.

Why not simply:

isempty(iter) ? 1 : prod(iter)

where iter is your iterator?


#10

Same reason as this:

julia> collect(i for i = 0:-1 for j = 1:1)
ERROR: ArgumentError: argument to Flatten must contain at least one iterator
Stacktrace:
 [1] start(::Base.Iterators.Flatten{Base.Generator{UnitRange{Int64},##97#99}}) at ./iterators.jl:696
 [2] grow_to!(::Array{Int64,1}, ::Base.Iterators.Flatten{Base.Generator{UnitRange{Int64},##97#99}}) at ./array.jl:491
 [3] collect(::Base.Iterators.Flatten{Base.Generator{UnitRange{Int64},##97#99}}) at ./array.jl:397

julia> collect(i for i = 1:1 for j = 0:-1)
0-element Array{Int64,1}

Seems like you want your empty generator on the right, so the filter should be attached to the right-most generator if there’s a chance it will be empty. This is entirely based on experimentation though.
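
For context, a generator with two for clauses appears to be built as a Flatten over a generator of inner generators, which matches the types shown in the stack traces. The sketch below writes that construction out by hand with Base.Iterators.flatten. The variable names are hypothetical, and whether the empty-outer case still errors may depend on the Julia version, so only the filter-on-the-right form is collected here.

```julia
# Two `for` clauses in one generator produce a Flatten iterator.
nested = (i for j in 1:10 for i in 1:10 if i > 10)
println(typeof(nested) <: Base.Iterators.Flatten)

# Roughly the same thing written by hand: an outer generator over j,
# where each element is an inner (filtered) generator over i.
by_hand = Base.Iterators.flatten((i for i in 1:10 if i > 10) for j in 1:10)

# The outer iterator is non-empty (ten inner generators), so Flatten has
# something to start from even though every inner generator is empty.
println(isempty(collect(nested)))
println(isempty(collect(by_hand)))
```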


#11

I have raised an issue about this

The product of an empty set is usually taken as 1.


#12

I stand corrected.

While this convention is far from obvious to me, it seems well motivated (as far as I can tell, \Pi(A\cup B) = \Pi(A)\Pi(B) for disjoint A and B is the best justification), so I’d be in favor of changing Base to define prod this way. I suppose to be consistent we’d then have to define sum(T[]) == zero(T) as well. (I suppose for Any it could default to Int.)
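
One way to spell out that justification, assuming A and B are disjoint finite collections, A has a nonzero product, and \Pi(S) denotes the product of the elements of S:

```latex
\Pi(A \cup B) = \Pi(A)\,\Pi(B) \quad \text{whenever } A \cap B = \emptyset

\Pi(A) = \Pi(A \cup \emptyset) = \Pi(A)\,\Pi(\emptyset)
\;\Longrightarrow\; \Pi(\emptyset) = 1

\Sigma(A) = \Sigma(A \cup \emptyset) = \Sigma(A) + \Sigma(\emptyset)
\;\Longrightarrow\; \Sigma(\emptyset) = 0
```

The same argument with addition in place of multiplication gives the empty sum as 0, which is the last line.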


#13

It already is:

julia> sum(Int[])
0

julia> prod(Int[])
1

The problem is that with sum([]) you cannot infer the eltype, so you do not know whether to return 0 (Int), 0.0 (Float64), or something else. It is safer to error.

My point is that sum or prod could take an extra type argument that enforces an eltype (and errors if it finds an element that does not fit the type), so that an empty iterator could default to zero(T) (or one(T) for prod) of the specified T in cases where the eltype cannot be inferred by the compiler.
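
To make the proposal concrete, here is a rough sketch of what such a typed reduction could look like. The name prod_typed and its argument order are hypothetical; nothing like it exists in Base as far as this thread shows. It walks the iterator once, so it needs neither collect nor isempty.

```julia
# Hypothetical helper, not part of Base: product with an enforced eltype.
function prod_typed(::Type{T}, itr) where {T}
    acc = one(T)              # result for an empty iterator
    for x in itr
        acc *= x::T           # throws a TypeError if x is not a T
    end
    return acc
end

println(prod_typed(Int, (i for i in 1:10 if i > 10)))  # empty: one(Int)
println(prod_typed(Int, (i for i in 1:5)))             # 1 * 2 * 3 * 4 * 5

# An element of the wrong type is rejected instead of silently promoted.
try
    prod_typed(Int, (x for x in Any[1, 2.0]))
catch err
    println(typeof(err))
end
```

A sum_typed variant would be the same loop with zero(T) and +=.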


#14

Oh, sorry, I completely missed what was going on here (and when I tested it I tried prod([])).

Now that I see what’s wrong I can give a less idiotic response:

Yes, it seems to me that, since the return types of these iterators can indeed be inferred, Base should be changed so that summing or taking the product over iterators yields the same behavior as doing so for arrays.

In your case I would try to ensure that your generators are properly typed, and then do

prod_proper(iter::Base.Generator{UnitRange{T},U}) where {T,U} = isempty(iter) ? one(T) : prod(iter)

though of course whether this is practical depends on how you are creating the generators. In general it’s probably a good practice to have type-stable generators regardless. It’s unfortunate that it’s impossible to overload the method in Base without having to re-write the algorithm.