`prod` over empty collection with known eltype gives "reducing over empty collection ..."

I have the following code, which I expected to return 1.0, but instead I get a “reducing over an empty collection is not allowed” error:

julia> testfun(xs) = prod(a for (a, b) in xs)
testfun (generic function with 1 method)

julia> testfun(Tuple{Float64, Float64}[])
ERROR: ArgumentError: reducing over an empty collection is not allowed; consider supplying `init` to the reducer
Stacktrace:
  [1] _empty_reduce_error()
    @ Base ./reduce.jl:319
...

The output of @code_warntype implies that the compiler was able to correctly infer the output type:

julia> @code_warntype testfun(xs)
MethodInstance for testfun(::Vector{Tuple{Float64, Float64}})
  from testfun(xs) @ Main REPL[28]:1
Arguments
  #self#::Core.Const(Main.testfun)
  xs::Vector{Tuple{Float64, Float64}}
Locals
  #9::var"#9#10"
Body::Float64
1 ─ %1 = Main.prod::Core.Const(prod)
│   %2 = Main.:(var"#9#10")::Core.Const(var"#9#10")
│        (#9 = %new(%2))
│   %4 = #9::Core.Const(var"#9#10"())
│   %5 = Base.Generator(%4, xs)::Base.Generator{Vector{Tuple{Float64, Float64}}, var"#9#10"}
│   %6 = (%1)(%5)::Float64
└──      return %6

which made me curious why I am still getting the error. I thought that if the compiler can infer what types a generator will yield, then sum and prod would work even on empty collections.

It comes down to the fact that type inference, as an optimization, is not supposed to affect behavior. There are, admittedly, some exceptions to this rule in the standard library that can’t be fixed due to backwards compatibility: e.g., map uses type inference to determine the eltype when given an empty iterator. As an example of a package moving away from this pattern, see reconsider behavior of `collect_as` for empty input, regarding output element type and use of return type inference · Issue #72 · JuliaArrays/FixedSizeArrays.jl · GitHub.
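To see that inference-based eltype in action, here is a minimal sketch: the anonymous function below is never actually called, yet the result’s element type is the inferred return type:

julia> map(x -> x / 2, Int[])
Float64[]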

It’s highly desirable for code to behave the same when, e.g., inference is turned off. Otherwise the behavior may depend on compiler version, enabled optimizations, etc., which can cause difficult-to-debug issues, and it makes code difficult to maintain. And use of inference at run time will inhibit some compiler optimizations, e.g., it disables constant-folding.
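As the error message itself hints, the usual fix is to supply init explicitly, which defines the empty case without any reliance on inference. A minimal sketch (testfun2 is just a made-up name):

julia> testfun2(xs) = prod((a for (a, b) in xs); init = 1.0)
testfun2 (generic function with 1 method)

julia> testfun2(Tuple{Float64, Float64}[])
1.0

The same init keyword works for reduce, sum, and friends.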


In that case: Why is

julia> prod(a for a in Float64[])
1.0

okay? It seems that this again depends on type inference correctly inferring that a will always be of type Float64 (even though, of course, there is one inference step fewer in this case).

Also note that e.g. map seems to have less of a problem in this case:

julia> map(a -> sin(a[1]), Tuple{Float64,Float64}[])
Float64[]

julia> map(sin, a for (a, b) in Tuple{Float64,Float64}[])
Float64[]

(but reduce(*, a for (a, b) in Tuple{Float64, Float64}[]) does fail again.)

prod(a for a in Float64[]) works because eltype(::AbstractArray) is explicitly defined. This does not need to use inference.

Edit: Nevermind, I apparently can’t read: That’s a generator, not an array!

Compare:

julia> typeof(a for a in Float64[])
Base.Generator{Vector{Float64}, typeof(identity)}

julia> typeof(a for (a, b) in Tuple{Float64, Float64}[])
Base.Generator{Vector{Tuple{Float64, Float64}}, var"#5#6"}

As you can see, the simpler iterator is a generator parameterized by the named function identity, while the more complex one is parameterized by an anonymous function. Presumably identity is special-cased in the implementation of prod, or one of prod’s callees. So I don’t think there’s any runtime use of type inference here.

If you’re wondering why identity is even involved here, the answer is that (a for a in Float64[]) is transformed at an early compilation stage (“lowering”) into something like Iterators.map(identity, Float64[]). Check for yourself with Meta.@lower (a for a in Float64[]), although keep in mind the output mentions internals. So basically Julia recognizes a for a as the identity function and just uses identity instead of a new anonymous function.
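One way to check this special-casing without reading Base’s source is Base’s iterator-eltype trait (results shown from a recent Julia version; these traits are implementation details and may change):

julia> Base.IteratorEltype(a for a in Float64[])   # identity generator: eltype is known
Base.HasEltype()

julia> Base.IteratorEltype(a for (a, b) in Tuple{Float64, Float64}[])   # arbitrary closure: eltype unknown
Base.EltypeUnknown()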

Yeah, I already said as much in my previous message.


In general, errors are not incorporated into the inferred return type because the errors stop the method before returning. In fact, they can help improve the inference by narrowing down what types are allowed to make it to a return statement.
Consider this simple example of poor type inference:

julia> foo(x::Ref{Any}) = x[]
foo (generic function with 1 method)

julia> @code_warntype foo(Ref{Any}(1))
MethodInstance for foo(::Base.RefValue{Any})
  from foo(x::Ref{Any}) @ Main REPL[18]:1
Arguments
  #self#::Core.Const(Main.foo)
  x::Base.RefValue{Any}
Body::Any
1 ─ %1 = Base.getindex(x)::Any
└──      return %1

Makes sense: the element type is Any, so the compiler can’t narrow it down. But you can throw an error on all types except one via a typeassert:

julia> foo(x::Ref{Any}) = x[]::Int # convenient typeassert syntax
foo (generic function with 1 method)

julia> @code_warntype foo(Ref{Any}(1))
MethodInstance for foo(::Base.RefValue{Any})
  from foo(x::Ref{Any}) @ Main REPL[20]:1
Arguments
  #self#::Core.Const(Main.foo)
  x::Base.RefValue{Any}
Body::Int64
1 ─ %1 = Base.getindex(x)::Any
│   %2 = Main.Int::Core.Const(Int64)
│   %3 = Core.typeassert(%1, %2)::Int64
└──      return %3

Now the compiler can recognize that the element has to be a certain type if it makes it out of this method. This is an impractically minimal example, but typeasserts can be useful with runtime dispatches and type conversions. For example, type constructors, and functions in general, cannot promise they’ll return an instance of said type, even if every one of their methods does, so you’ll see redundant-looking patterns like T(#=insert inputs here=#)::T if the argument types aren’t known well enough at compile time.
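A slightly more practical sketch (the function name and the Dict are made up for illustration): asserting the value type of a Dict{String, Any} lookup, which would otherwise infer as Any:

julia> getfloat(d::Dict{String, Any}, k) = d[k]::Float64
getfloat (generic function with 1 method)

julia> getfloat(Dict{String, Any}("x" => 1.5), "x")
1.5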
