Collect and type inference

While the type-widening of collect for EltypeUnknown is extremely useful for generic code, my recent experience with a large codebase, where inference is time-consuming and occasionally fails, suggests that these two alternatives would be useful:

  1. a version of collect which assumes that all values from an iterator have the same type (in practice, it would take the first element and use its type from then on), and is free to throw an error otherwise,

  2. a version of collect which, when two values do not have the same concrete type, can just widen to Any and be done with it.
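To make the two proposed behaviors concrete, here is a minimal sketch of both in one function. The name `collect_firsttype` and the `widen` keyword are hypothetical and not part of Base. Returning `Any[]` for an empty iterator is also an assumption, and the sketch always returns a `Vector`, ignoring any shape the iterator may have.

```julia
# Hypothetical helper, not part of Base.
# widen = false: variant 1, error on the first element of a different type.
# widen = true:  variant 2, fall back to Vector{Any} on the first mismatch.
function collect_firsttype(itr; widen::Bool=false)
    y = iterate(itr)
    y === nothing && return Any[]        # assumption: empty input gives Any[]
    x, state = y
    out = typeof(x)[x]                   # element type fixed by the first value
    while true
        y = iterate(itr, state)
        y === nothing && break
        x, state = y
        if !(x isa eltype(out))
            widen || throw(ArgumentError(
                "expected $(eltype(out)), got $(typeof(x))"))
            out = Vector{Any}(out)       # widen once; every value is an Any
        end
        push!(out, x)
    end
    return out
end

strict = collect_firsttype(x^2 for x in 1:3)             # Vector{Int}
loose  = collect_firsttype(Any[1, 2.0, 3]; widen=true)   # Vector{Any}
```

Neither branch ever computes a `Union` or a `typejoin`, which is the point of the proposal: the only possible element types are the type of the first value and `Any`.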

Parts of the machinery for this already exist in or could be adapted from base/array.jl, but are not part of the exposed API. I thought I would ask before opening an issue. Suggestions for the API are also welcome.

One issue is that calls to collect can happen implicitly, e.g. in array comprehensions. Maybe a macro that changes the behavior of the expressions that follow it, as @views and @. do?
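For comprehensions specifically, the existing typed-comprehension syntax already fixes the element type up front, so no widening happens for that expression. A small illustration follows; `f` and `xs` are made-up names for the example only.

```julia
# f and xs are illustrative names only.
f(x) = x > 2 ? missing : x
xs = 1:4

a = [f(x) for x in xs]       # element type decided by collect
b = Any[f(x) for x in xs]    # element type fixed to Any up front
c = Int[x^2 for x in xs]     # element type fixed to Int; values are converted
```

This has to be written at every call site, which is where a macro that rewrites the comprehensions in a block could help.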

Are you sure that would help inference? If inference fails in the first place, it doesn’t even know the type of the first element… Likewise, I doubt that inference can be made faster by asserting that the result type will be concrete (if you don’t know it already).

Could you elaborate on the problems you encountered? In this PR I’ve taken steps towards making broadcast inferable even when the result has a Union eltype. It used to work in previous versions of Julia, but it has since been broken by changes in inference. There’s probably a way to finish this and extend it to map, though.

I would want to help the compiler by giving up early instead of figuring out the exact Union etc when it is not likely to help me much. This is not the right semantics for Union{T,Missing} and similar, but it has its applications.

I am encountering some inference problems on 1.6 in a large codebase with deeply nested types. With a result type Foo{T}, what happens is that [f(x) for x in X] ends up with a result type like Union{Vector{Foo},Vector{Foo{S}}} for a concrete S, even though the types of X and f are inferred fine.

I could not make a sensible MWE yet.

Would something like this work in your case?

julia> function _eltype(iter)
           T = eltype(iter)
           return Base.isconcretetype(T) ? T : Any
       end
_eltype (generic function with 1 method)

julia> _collect(iter) = collect(_eltype(iter), iter)
_collect (generic function with 1 method)

julia> _collect(rand([1, missing]) for _ in 1:10)
10-element Vector{Any}:
 1
 1
 1
 1
 1
  missing
 1
  missing
  missing
 1

Edit: Sorry, you still need to define this to make it work with concretely typed generators:

function _eltype(gen::Base.Generator)
    T = Base.promote_op(gen.f, eltype(gen.iter))
    return Base.isconcretetype(T) ? T : Any
end
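Putting the two definitions together, a quick check of both paths might look like the sketch below. It repeats the definitions so that it runs on its own, and it relies on `Base.promote_op`, which in turn depends on inference, so the concrete case only holds when inference succeeds for the generator's function.

```julia
# Fallback: use the declared eltype if it is concrete, otherwise Any.
function _eltype(iter)
    T = eltype(iter)
    return isconcretetype(T) ? T : Any
end

# Generators report eltype Any, so ask inference for the result type of gen.f.
function _eltype(gen::Base.Generator)
    T = Base.promote_op(gen.f, eltype(gen.iter))
    return isconcretetype(T) ? T : Any
end

_collect(iter) = collect(_eltype(iter), iter)

squares = _collect(x^2 for x in 1:3)                  # inferred Int, stays concrete
mixed   = _collect(x > 1 ? missing : x for x in 1:3)  # Union is not concrete, so Any
```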

Not quite what you want, but maybe a workaround:

  1. collect(typeof(first(itr)), itr)
  2. collect(Any, itr)
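A short sketch of both workarounds, with the caveats that seem worth keeping in mind: `first` throws on an empty iterator and consumes an element of a stateful one, and `collect(T, itr)` converts elements to `T` rather than checking that they already have that type.

```julia
itr = (x^2 for x in 1:3)

v1 = collect(typeof(first(itr)), itr)   # element type taken from the first value
v2 = collect(Any, itr)                  # no type computation at all

# collect(T, itr) converts instead of checking: 2.0 silently becomes 2,
# while a value that cannot be converted (e.g. missing) raises an error.
v3 = collect(Int, Any[1, 2.0])
```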