New iteration protocol: could it be even simpler?

I’m implementing the new iteration protocol for some library code, and, first, I have to say that I like it. It seems to make my code clearer and shorter.

But I’m wondering whether further simplification is possible. First, why is IteratorEltype() even needed? It seems that Base already has Any as the backstop default for the eltype function.

Second, couldn’t Base simply backstop length via: length(::Any)=nothing, and then let iterator implementations override length as appropriate? Similarly for shape? This would eliminate the need for IteratorSize().

I’m not privy to all the details, but I think the basic idea here is to signal to other code whether it should bother with type inference. If your eltype is Any, that tells other code that containers should have that eltype, for example Vector{Any}. However, if the iterator has EltypeUnknown, this doesn’t necessarily mean that the eltype is Any; it just means that you don’t know what it is. Conceivably, an iterator with EltypeUnknown could, after some inference, end up in e.g. a Vector{Int} (or another container with a concrete eltype), whereas something with eltype(itr) == Any would definitely get collected into a Vector{Any}.
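To make the distinction concrete, here is a minimal sketch (the types SquaresAny and SquaresUnknown are hypothetical names made up for illustration, and the behavior shown is what I’d expect from collect on Julia 1.x):

```julia
# Two hypothetical iterators that both yield 1, 4, 9, ..., n^2.
struct SquaresAny
    n::Int
end

struct SquaresUnknown
    n::Int
end

const AnySquares = Union{SquaresAny, SquaresUnknown}

# Same iteration logic for both; the state is the next index.
Base.iterate(s::AnySquares, i::Int = 1) = i > s.n ? nothing : (i * i, i + 1)
Base.length(s::AnySquares) = s.n

# SquaresAny keeps the defaults: IteratorEltype is HasEltype() and eltype is Any.
# SquaresUnknown says "I don't know my eltype" instead.
Base.IteratorEltype(::Type{SquaresUnknown}) = Base.EltypeUnknown()

a = collect(SquaresAny(3))      # eltype Any is taken at face value: Vector{Any}
b = collect(SquaresUnknown(3))  # eltype is worked out from the values: Vector{Int}
```

Both results hold the same values; only the container eltype differs, which is the signal described above.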

Note that IteratorSize is not binary. If it were, the approach you’re suggesting might make sense, though personally I rather like that the default behavior is to throw a MethodError for length, which in my opinion is much clearer than whatever MethodError might arise from a nothing getting passed somewhere.
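For reference, a small sketch of the four IteratorSize answers, plus what happens when length is simply not defined (Countdown is a hypothetical type used only for this illustration):

```julia
# The four possible IteratorSize traits, shown on built-in iterators.
size_string   = Base.IteratorSize("abc")                            # HasLength()
size_range    = Base.IteratorSize(1:3)                              # HasShape{1}()
size_repeated = Base.IteratorSize(Iterators.repeated(1))            # IsInfinite()
size_filter   = Base.IteratorSize(Iterators.filter(isodd, 1:10))    # SizeUnknown()

# Hypothetical iterator that counts down from `start` to 1 and declares
# that its size is unknown, so it defines no length method.
struct Countdown
    start::Int
end

Base.iterate(c::Countdown, i::Int = c.start) = i < 1 ? nothing : (i, i - 1)
Base.IteratorSize(::Type{Countdown}) = Base.SizeUnknown()
Base.eltype(::Type{Countdown}) = Int

# collect still works: with SizeUnknown it grows the result as it goes.
collected = collect(Countdown(3))

# Asking for the length fails loudly at the call site with a MethodError.
length_threw_method_error = try
    length(Countdown(3))
    false
catch err
    err isa MethodError
end
```

A single length(::Any) = nothing fallback could not tell the IsInfinite case apart from the SizeUnknown case, nor HasLength from HasShape.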

Yeah, it kind of blows my mind that it is possible to get efficient iteration out of what is basically the iterate function alone through all of these type gymnastics. It has me very nervous about performance regressions…

I was at the blackboard at JuliaCon 2016 where this was fleshed out by a number of talented people such as Tim Holy, Keno, Andy Ferris, Jeff, and Stefan.
I do like it a lot, although it depends heavily on inference. I think that IteratorEltype() (which hadn’t been discussed back then) is there to make life a lot easier for the compiler. I’ve run into problems with the new iteration protocol a couple of times because inference didn’t get a good answer until I added some type assertions. Extending IteratorEltype will be a much cleaner and easier way to let the compiler know what’s going on.


I think this is a reasonable suggestion (though I haven’t thought it through in detail), but frankly I’m a bit burned out on iterator protocol changes, so I’m gonna say this has to stay in place for 1.0. Luckily it’s not nearly as disruptive as the current change so we can easily consider doing something here in a later version.

As for IteratorEltype, there’s a useful distinction between “This has eltype Any” and “I don’t know what the eltype is”. When the eltype is specified, it is sometimes more useful to preserve it (e.g. in collect).
