Should Generators finally be given eltype

Right; to some extent it’s ok to use the inferred type, as long as your code is fine with it being Any. It’s safe to use as a hint, like a size hint. For example I can imagine other implementations of Julia, or other execution environments, where return_type just always returns Any so that the type inference code doesn’t have to be included. But in my experience most people are not fine with Any; they have a more specific type in mind they’d like to get.

Inferred types do not have to get more specific over time, as the compiler improves, either. In fact at this very moment I am hoping to make many inferred types worse to speed up the compiler. Learn to love Any; it’s efficient to deduce! :slight_smile:

The OP has an interesting point: since we have IteratorEltype anyway, we could potentially return HasEltype for generators when, say, return_type is a concrete type. Of course, then people would have to be ok with IteratorEltype being unpredictable the same way return_type is.
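Sketched as a value-based helper (not how the trait would actually be wired up in Base, and generator_eltype_trait is a made-up name):

# Report HasEltype only when inference gives back a concrete type.
function generator_eltype_trait(g::Base.Generator)
    T = Base.return_type(g.f, Tuple{eltype(g.iter)})
    return isconcretetype(T) ? Base.HasEltype() : Base.EltypeUnknown()
end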


That just completely destroys any non-breaking guarantee though… Especially given what people have actually depended on in the past and how easy it is to depend on these behaviors implicitly…


On the whole I agree. We don’t want to deliberately expand unpredictable behaviors like that.


Would it be possible to have a function like return_type that would be much more computationally expensive but more predictable? Called maybe deduced_return_type.
Then add a non-breaking addition to the iteration interface: IteratorDeducedEltype, HasDeducedEltype, and deduced_eltype, all of which would by default fall back to the non-deduced versions when not explicitly defined, which they would be for Generators.

That way there would at least be a base way to do:

function _myfunction(::Type{T}, iter) where {T <: MyType}
end
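For concreteness, a rough sketch of the proposed fallbacks (all of these names are hypothetical, and Base.return_type is used the way it appears elsewhere in this thread):

deduced_eltype(itr) = eltype(itr)    # default: fall back to the non-deduced version
# Generators would opt in by consulting inference:
deduced_eltype(g::Base.Generator) = Base.return_type(g.f, Tuple{eltype(g.iter)})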

Computationally expensive isn’t even the issue. The issue is the “more predictable” part:

AFAICT, this is very non-trivial to define, especially if you want it to get anywhere near what the inference is currently capable of doing. Strictly speaking, it also means that any change to such a definition is a breaking change, and practically, at least any change that can cause a wider type to be returned is breaking.

This no-regression requirement might seem easier to maintain and reason about, but note that inference is coupled with optimization, and different transformations in optimization are known to interfere with (i.e. hide information from) each other, meaning improvements may in some cases cause regressions as well. Again, it certainly won’t happen for most users in most cases, but with enough users someone will definitely hit such a case…

I’m not sure if this is true since I’ve never learned much (any?) formal theory about it, so sorry if any part of this is inaccurate, but IMHO one of the main issues is that inference is currently defined/constrained by a set of valid rules and transformations. This is enough to guarantee correctness, and as long as we implement enough of these rules it’ll be good enough for performance too. However, getting a predictable result along this route seems to require not only defining a fixed set of rules that can be applied but also the order/manner in which they are applied (any change to either can lead to observable regressions, as mentioned above). In other words, following what we are doing, we can only define type inference based on an implementation, and I believe such a definition is usually not very useful (e.g. for the purpose of deciding whether we can make a change to it without being breaking).


I think the correct long-run idiomatic solution would be exposing the building blocks of what collect does, so that users can easily write code that assumes a certain type (as @jeff.bezanson said, like a size hint, possibly deduced from the first element obtained) yet seamlessly goes to a different branch if that assumption is violated. When the compiler knows a lot, it can prune these branches and emit very efficient code, but even the worst-case heuristic fallbacks (widen as necessary) are not that bad.

I have used one component of this in the example code here, and I am working on a strategy that does something similar for “pre-allocated” buffers for output types, but that needs quite a bit of polish.


But Base.return_type is the building block? It should be the main function needed for that. I don’t feel like the exact code that is written around it to implement collect and other functions is particularly reusable. The pattern/idea certainly is useful/reusable, of course.

Or in other words, I feel like if you can formulate your problem as collecting an iterator, you can just use collect. Otherwise, you’ll probably need to make some non-trivial tweaks to the code pattern in collect, and I’m not sure how easy it is to split out a more reusable part. Of course, this was my impression when I looked at it a while ago, and I can easily be wrong on this since I don’t have a proof.

Just yesterday I had code that worked fine for several months

v::Vector{Union{Float64, Missing}}
t = f.(v)
t .= ifelse.(ismissing.(t), NaN, t)

crash because v was all-missing on some new data, so t::Vector{Missing}, which cannot store NaN. Thankfully, that didn’t happen in production, but it’s a serious concern whether we’ve tested all cases.
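A minimal stand-in that reproduces the failure mode (f here is hypothetical; any function that propagates missing will do):

f(x) = 2x                                        # stand-in for the real f
v = Vector{Union{Float64, Missing}}(missing, 3)  # all-missing input
t = f.(v)                                        # t isa Vector{Missing}
t .= ifelse.(ismissing.(t), NaN, t)              # MethodError: cannot store a Float64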

I agree with @Tamas_Papp that we’re missing some building blocks for generic code. For instance, here’s a short version of the basic loop for generic Kalman filter code:

function kalman_filter(initial_state, observations, theta, N)
    filtered = []
    likelihoods = []
    state = initial_state
    for i in 1:N
        state, ll = kernel(state, observations[i], theta)
        push!(filtered, state)
        push!(likelihoods, ll)
    end
    return filtered, likelihoods
end

How should I pick the eltype of filtered? I could naively take the type of initial_state, but then if I use ForwardDiff to do gradient descent on theta, the state and likelihoods will contain Dual numbers, so that does not work. What works:

  • Use Base.return_type. Oh, but maybe we need to call it several times until the type of state converges :frowning:
  • Use accumulate. Have the accumulator function return the tuple (state, ll), then unzip the sequences afterward (a rough sketch follows this list).
  • Do one or two iterations of state = kernel(...) just to pick the eltype, and hope that observations doesn’t have any missings.
  • Do my own type widening.
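The accumulate option would look roughly like this (my own sketch, reusing kernel, theta, etc. from the loop above):

# Carry (state, ll) pairs through accumulate, then unzip.
pairs = accumulate(view(observations, 1:N); init = (initial_state, nothing)) do (state, _), obs
    kernel(state, obs, theta)
end
filtered = first.(pairs)
likelihoods = last.(pairs)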

I tried accumulate. The code was ugly, and performance was bad, but it’s a solution in theory. Type widening doesn’t sound like beautiful code either. As for Base.return_type, I had some bad experience using it in Julia 0.6, but if it’s officially The Julian Way, I could give it another try. Right now, we use option #3, and I’m not happy with it, but it generally works.


No, the building block is an interface for a collection that can

  1. either store a new value (e.g. one <: eltype),
  2. or allocate a new collection that can.

The caller determines if widening is needed, and calls itself accordingly. Just review “take 2” here, it is really short.
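To make the shape concrete, here is my own rough sketch of the caller-side pattern (this is not the linked “take 2” code; Base.promote_typejoin picks the widened eltype):

# Collect an iterator, restarting with a wider Vector whenever an element doesn’t fit.
function collect_widening(itr)
    y = iterate(itr)
    y === nothing && return Union{}[]
    elt, state = y
    return _collect_widening!(typeof(elt)[elt], itr, state)
end

function _collect_widening!(v::Vector{T}, itr, state) where {T}
    while true
        y = iterate(itr, state)
        y === nothing && return v
        elt, state = y
        if elt isa T
            push!(v, elt)               # fits: keep going in the same container
        else
            w = Vector{Base.promote_typejoin(T, typeof(elt))}(v)
            push!(w, elt)               # doesn’t fit: widen, copy, and recurse
            return _collect_widening!(w, itr, state)
        end
    end
end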


This. But we need to abstract it to nice building blocks.


As Jeff said, the issue is the type getting even wider. And no, unless we remove eltype completely, writing functions that depend on it isn’t bad code. There are countless reasons one might want to rely on eltype. The fact that these questions come up is proof enough…

Yes, I know. But I mean that’s really just using the building block already…

I think your code clearly shows that the majority of the code is already written. AFAICT most of the code in there is actually dealing with the iterator, i.e. the customized and non-reusable part, rather than dealing with widening. As you said, that part is really short. My point is basically that there’s not much point in simplifying 3 lines of code to 1 when it’ll add extra restrictions and is expected to be used alongside another 10 lines of code…

And I’m just saying that AFAICT the fact that you can write the code in so few lines proves that the building block is there. I don’t believe there’s a strong need to add any code as building blocks. Documenting the correct pattern or giving more examples is certainly always welcome.

OK, indeed that’s annoying.

However, in that particular case you’d probably be better off allocating a new vector, since you know it won’t contain any missing values. Then performance will be higher down the line. And replace(t, missing => NaN) or coalesce.(t, NaN) is quite compact to write.
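E.g., continuing the snippet from above:

t2 = coalesce.(t, NaN)             # elementwise: missing becomes NaN
t2 = replace(t, missing => NaN)    # same result; both give a Vector{Float64}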

I’ll note that this crash is in general impossible to avoid given the way the code is written. Your complaint is that the eltype is too narrow (instead of too wide, which is what almost every other complaint is) and you want to use inference because it gives you a wider type. However, there’s absolutely no way we can guarantee any particular wider type is returned so using inference here won’t actually fix your problem in all cases.

In other words, you are relying on inference not being able to figure out the actual runtime return type of f, which might be true now, but that’s not something we can guarantee. We can only guarantee that the inferred type is a supertype of the runtime type; if the runtime type is only Missing for a run, it’s entirely allowed for the inference/optimization of that particular run to return Missing as the inferred type.


I’m saying that the fewer branches to test, the better. f.(::Vector{Union{Missing, Float64}}) gives me four possible outcomes to worry about. When I don’t care about performance, I use purely functional, non-mutating code, and it’s a non-issue. But Julia is all about performance, so code like mine happens, and it’s a significant worry for real-time systems. @andyferris said the same:

As a programmer, this scares me somewhat because I might get subtly different program behavior (or program correctness) depending on my input data. My concern is that I might perform tests with mixtures of Missing and Float64 input values but be caught out in rare cases in production when all (or none) of the inputs are missing.

I understand the appeal of the current rules, from a certain perspective. I don’t really have a solution to offer. I’m just giving my report from real-world usage. Maybe I just need to assume that I can’t rely on the type of f.(v) at all, and code accordingly.


The interface I am thinking about is for collections, i.e.

container2 = store!_or_widen(container, elt)

defined for various types of container, complete with an empty_container(::Type{T}) and a finalize_container(container). When container2 !== container, branch to recursion.

E.g. the obvious one is Vector{T}, created by Base.Bottom[] (which I think is what should be returned for empty collections into vectors), and finalized as itself.

You are right that the application of this interface is fairly trivial; I was thinking of standardizing the container part. I am experimenting with this in FunctionalTables.jl, but it is heavily WIP.
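For illustration, the Vector instance of such an interface might look like this (my own sketch; only the three function names come from the description above):

empty_container(::Type{T}) where {T} = Vector{T}()   # Base.Bottom[] when T === Union{}
finalize_container(v::Vector) = v                    # a Vector finalizes as itself

function store!_or_widen(v::Vector{T}, elt) where {T}
    if elt isa T
        push!(v, elt)        # same container comes back: no widening needed
    else
        w = Vector{Base.promote_typejoin(T, typeof(elt))}(v)
        push!(w, elt)        # a widened copy comes back: caller recurses
    end
end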


Yes. Without any context, or without a well-defined inference rule, we just can’t give any guarantee on the return type of f.(v) other than the current behavior. I’ll even say that in some sense a well-understood dependency on the input is easier to account for or debug than something that only happens when inference decides to give you a bad day…

In this case, the right way to give it the context is to manually give it the type, since you know what you are expecting (you have code that wants to throw in a NaN of type Float64, after all).

Right. Having some wrappers around the basic building block is certainly fine, especially in a package. (I wouldn’t personally call those building blocks anymore, since they’ll make more assumptions than the “building blocks” in Base.) My warning would be that such an interface might encourage people to match that interface when they shouldn’t, e.g. for different array types or even container types, like Dict/Set. This warning would not apply as much anymore once you’ve got a more complete collection of utilities written for different situations/container types, and such a collection is certainly useful (at that point I’ll almost certainly not call it building blocks anymore, but that’s not important…)

I can only figure out what three of the four branches you are talking about are.

But for what it is worth I use the signature AbstractArray{Union{Float64,T}} where T<:Missing to match two cases but not the third:

Vector{Float64}                   <: AbstractArray{Union{Float64,T}} where T<:Missing   == true 
Vector{Union{Float64,Missing}}    <: AbstractArray{Union{Float64,T}} where T<:Missing   == true
Vector{Missing}                   <: AbstractArray{Union{Float64,T}} where T<:Missing   == false

I’ve been wondering how hard it would be to implement 0.4-style comprehensions and broadcasting as a package-provided macro. Something like @stable f.(x), and @stable (f(x) for x in v), that returns a StableGenerator{Base.return_type(...)}, so it has an eltype.
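A rough sketch of what such a wrapper could look like (StableGenerator and stable are hypothetical names, and Base.return_type is used the way it appears earlier in the thread):

# A generator wrapper that pins its eltype at construction time.
struct StableGenerator{T,G}
    g::G
end

Base.IteratorEltype(::Type{<:StableGenerator}) = Base.HasEltype()
Base.eltype(::Type{<:StableGenerator{T}}) where {T} = T
Base.iterate(s::StableGenerator, args...) = iterate(s.g, args...)
Base.length(s::StableGenerator) = length(s.g)

# What @stable (f(x) for x in v) could expand to:
stable(g::Base.Generator) =
    StableGenerator{Base.return_type(g.f, Tuple{eltype(g.iter)}), typeof(g)}(g)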

BroadcastArray does that


It would be nice if we could just write

function kalman_filter(initial_state, observations, theta, N)
    filtered = Union{}[]
    likelihoods = Union{}[]
    state = initial_state
    for i in 1:N
        state, ll = kernel(state, observations[i], theta)
        filtered = push!!(filtered, state)
        likelihoods = push!!(likelihoods, ll)
    end
    return filtered, likelihoods
end

where push!! is a hypothetical push!-or-widen API. Is it crazy to imagine that the compiler could do the “union-splitting” of the for loop and lower it to type-stable code (when kernel is well behaved) in the near future?
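Such a push!! would essentially be the store!_or_widen sketched earlier under another name, e.g. (again my own sketch):

function push!!(v::Vector{T}, x) where {T}
    if x isa T
        push!(v, x)      # fits: reuse the vector
    else
        # doesn’t fit: widen and copy, exactly like store!_or_widen above
        push!(Vector{Base.promote_typejoin(T, typeof(x))}(v), x)
    end
end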