Getting the last element of an iterator

Yes. Having programmed in Ruby, where doing so is very easy, I dread the possibility of needing to do it in Julia and not having the same ease (but probably should be possible to make a macro to do so, I just not know if there is something robust already in stock). However, depending on the function that you will pass the wrapped datastructure to, the number of methods that you need to delegate may be small.

A singly-linked list has O(n) implementation for any list[n], so why not just implement getindex, and make it an <:AbstractVector?

I don’t think that anyone is saying that random access methods should not be implemented when they make sense, just that iterate is the wrong interface for this, and stretching it is not the right solution here.

2 Likes

I think that’s what @Henrique_Becker is saying. If you implement getindex and make an AbstractVector, then you automatically get a working last. It will be O(n) and that is supposedly forbidden.

From what @apo383 says however it seems that I (and Henrique) misunderstood the documentation: it just says that last is implemented for the O(1) case (not that it shouldn’t be implemented for O(n) or whatever). That seems right: if O(n) for last was forbidden, it should also be forbidden for lastindex and getindex, but the documentation for those says nothing about complexity.

I’ll file a PR to implement one of @apo383’s propositions, that should also clarify what is meant in the documentation.

I am not sure where that is coming from, I can’t find it in the docs. My reading is that

function Base.getindex(v::SomeType, i::Int)
    sleep(i)
    ... # do something
end

is a conforming implementation. O(1) access is nice when you can get it, but not a requirement.

2 Likes

Here, Tamas. It seems that Base.last should only be implemented if it has O(1) access, but if someone inherit AbstractArray and implement lastindex as O(n) (because it does not require to be O(1) in its documentation), then immediately Base.last will work for the type but have the wrong complexity.

However, inheriting AbstractArray for LinkedList (the specific case in question) and implementing a lastindex for it (considering that in a single-linked list this will take n operations and will give a index which cannot be iterated in any way) seems to just be a case in which it is very clear the concrete object does not adhere to the respective abstract concept, and it is misleading to do this inheritance in this case.

1 Like

I’m a bit perplexed by the emphasis put on complexity guarantees here. I see that this is in the docs, but why? There are any number of functions that have highly variable complexities for different types. Look at the LinearAlgebra library for example, you could argue that sum should only be implemented for Diagonal matrices, because that has complexity O(N), while ordinary dense matrices have O(N^2), but that would be absurd.

I don’t see why last should not simply be expected to ‘return the last element of an iterator’, complexity be damned.

4 Likes
  1. sum documentation does not give a complexity guarantee. It is a different thing to defend the documentation of Base.last to not have O(1) as a requirement, and defend that this requirement is to be ignored before Julia 2.0, in which it can be changed.
  2. The expected complexity of sum is O(n-1 Base.+ operations for the eltype) where n is the number of elements in the container. sum(diagonal_mat) and sum(ordinary_dense_matrice) both follow such complexity. What is N in your example? The length of one dimension of the container? This does not seem standard.

I don’t see why last should not simply be expected to ‘return the last element of an iterator’, complexity be damned.

Well, I already give why I think so in this thread, and you are not addressing it. So I believe this is rhetorical.

Yeah, you’re saying “because the docs say so”, and that’s fine, but I’m asking “why is this a good idea?” I’m not even arguing that it must be changed here and now, just saying it’s weird to have this complexity guarantee for last. What makes last so special?

It shouldn’t be difficult to guess from context that I am talking about an NxN matrix, in which case Diagonal has O(N) complexity, and dense matrices have O(N^2).

No, diagonal_mat has N^2 elements; that there is an optimized implementation that compresses this to N is just that, an optimization.

I would expect last to have complexity O(N). O(1) in some cases is just an optimization, just like in a zillion other functions.

1 Like

Sorry, but no. Nobody defines the complexity of sum considering only square matrices and on base of the square of their common length between dimensions. This is a false equivalence. The worst-case complexity of sum is defined in base of the number of elements inside the container (this is the default meaning for n in O(n) for sum), and may be, in practice, much better if the object is sparse, but we are talking about worse-case complexity so it changes nothing.

Only whoever wrote the documentation can say the exact reasoning. I think it comes from an approach in which Base.last is a primitive, in the sense it is something the structure creator should implement with inner knowledge of its workings, if this cannot be achieved in O(1) then it may be achieved by the external user by applying other methods over the data structure. So it is guaranteed that, if it exists, it is a fast method which you could not implement yourself, if it does not, it is because you could implement it yourself.

I also don’t understand this focus on the complexity guarantee… It seems very peculiar to last, and generally doesn’t feel Julian to forbid implementation of generally useful methods like this.

@Henrique_Becker please consider the possibility that the documentation was poorly worded and misunderstood. Maybe as @apo383 suggests it was never the intention to forbid last in the O(n) case.

That would also explain the inconsistency with lastindex which says nothing about O(1). Consider again the LinkedList case. It’s not even necessary to inherit AbstractArray:

import Base: getindex, lastindex
struct LinkedList end
getindex(::LinkedList, ::Int) = ...
lastindex(::LinkedList) = ...

Reasonable code, no inheritance, and it gives a “slow” last.

Also look at this excerpt from HISTORY.md:

New function ncodeunits(s::AbstractString) gives the number of code units in a string. The generic definition is constant time but calls lastindex(s) which may be inefficient.

This all suggests that this weird constraint in the documentation of last was never meant as a constraint.

1 Like

Yes, I explicitly called it ‘absurd’.

My point is that I think it’s analogous, if not exactly equivalent.

Yes, this discussion as I understand it is exactly about whether this was a good idea or not.

But it makes it awkward to write generic code.

I am not sure that all the ramifications were considered when that was written, or that providing an occasional non-O(1) implementation of last when that’s the only possibility would be breaking the API in the sense the word is usually used.

That said, while use cases may exist, I think that the performance model of asking for

  1. sequential access (iterate), and
  2. a cheap last (but not random access in general)

is a niche application not worth defining a separate API, or modifying existing ones.

I would just deal with this on an ad hoc basis, using the solutions suggested in this topic.

1 Like

I have to admit I do not understand @apo383 point. The documentation was written for the function in its generic form, not for a method for a specific datatype, so @apo383 reasoning does not resonate with me.

I do not know if I agree that it is reasonable to implement lastindex in this case.

The excerpt seem to me as a suggestion that, in practice, package authors may not respect the guarantee, as it agrees that “The generic definition is constant time […]”.

In this point, I agree with Tamas:

I prefer that people trying to deal with this awkward model in a generic function just write awkward code. It is to minor evil for me.

“If it isn’t of personal interest to me, it should be awkward”, is that it? :roll_eyes:

1 Like

It’s a matter of interpretation, is the documentation like a Commandment “Thou shalt not have a slow last” or is it a placeholder guideline that could be revisited later?

The Iterator interface is simple and incredibly powerful. But also a bit untamed, (edit: fixed typo) because it’s not within the traditional type hierarchy, and instead depends on a bunch of traits, for which it’s not straightforward to understand the Venn diagram. Or the intent. Informal interfaces can be like the Wild West, to be cleaned up once civilization moves in.

Here’s reason to doubt the Commandment interpretation. The documentation says last "is accomplished by calling lastindex ". But the documentation for lastindex has no O(1) rule, nor does length. So it’s perfectly legal to implement an O(n) lastindex and then use a[end]! Problem solved, but why not provide a bit of convenience with last(a), which is literally just a one-liner for a[end]? (Note that end is just syntactic sugar for lastindex.) Maybe the last documentation could clarify that it’s a Commandment and mention this loophole. (Commandments always need loopholes)

There are lots of ways to resolve this. As I mentioned earlier, an appropriate method could be provided for IO and stay out of the whole iterator question. But maybe iterators can be re-examined, because there could be a lot of benefits to a slower but finite last. Perhaps there needs to be a trait HasSlowLast() or GuaranteedFastLast() or such. Or a new function possiblyslowlast() that can drop down to last if fast, but return in O(n) if finite?

I truly do not know the founders’ intent, nor the true ramifications of relaxing O(1). Except to say that it doesn’t seem like a breaking change to anything now, with future ill effects hard to foresee. I would definitely be in favor of a lastline(io::IO) or tail(io::IO, n) which seem eminently useful. I also think lots of users would benefit from some sort of last-like function for finite iterables.

1 Like

It is perfectly legal to implement an O(n) implementation of lastindex, but that has nothing to do with iteration. That’s implementing the Indexing interface. Iterators are useful, as a concept, precisely because they have such a minimal interface. They’re an abstraction of the things you can expect a for loop to work for.

1 Like

Thanks for making that point @malacroi. That explains why last should be fast even if lastindex has no such restriction. It therefore might be sensible to have a separate interface, say for finite iterators where last can take more time but the expectations are different from standard, lightweight iterators.

How do you get to the conclusion that last should be fast? If O(n) is fine for lastindex my conclusion is that last can be slow…

To address @malacroi’s point: lastindex shows that we can legally have a slow last, therefore we can in principle define last for iterators (using the minimal interface, not lastindex).

There’s a practical problem though in adding a generic last for iterators: I don’t see how to decide efficiently between calling a[end] (whenever possible) and calling the slow fallback.

There is no default trait for getindex/lastindex (no generic definition of Base.IndexStyle for example) and using applicable(lastindex, a) would incur an unacceptable performance penalty for something as basic as last

FWIW, since Julia does not have multiple supertypes (which is the right choice), putting generic APIs in the type hierarchy is not the right solution. If we only had a trait for things being iterable, that would be fine. Cf

One might also imagine a trait for determining if an iterable has a fast last, but again, it is a very niche application.