How do you actually find out *how* to use an interface in Julia, like the concept of iterator "end"?

I am simply trying to find out how to tell that an iterator has come to the end of its content. The language’s documentation standard doesn’t seem to be designed to help people understand what the contract of an interface is.

I come from heavy use of Java, among a bunch of other languages, and I cannot stress enough how user-unfriendly the documentation is.

I expect the documentation to explain all the landmark elements of an interface. For example, in Java, if I look up the interface for an iterator, it is laid out very clearly: here’s the start, here’s the end, here’s how you query whether or not the iterator has reached the end of its content. These are all expressed as clear statements, and it takes a minute or two to understand what the interface does. This is not the case with the Julia documentation.

I’ve spent 20 minutes or half an hour just trying to answer a question that, in every other language I’ve run into, I can get answered in about two minutes.

If I am using an iterator on an array, how do I test that it’s reached the end?

Like if iterator.end() == true …?

If somebody can explain to me what I’m supposed to do with the documentation system of the language in order to answer these questions, you would be doing me a great favor.

If you define the iteration like for i in 1:10, then i == 10 is the end condition.

Note that iterate(iter, state) === nothing is the condition for the end of iteration.
Maybe this post can help: “Understanding iterate() documentation is tough”.

thanks!

For others, I will post the usage here, for when you can’t use the convenience of the built-in loop syntax:

it = iterate(x)

while it !== nothing
    i, state = it
    # some code
    it = iterate(x, state)
end

It is, as raman_kumar notes, in the docs, under Interfaces->iterators. But note that iterators in Julia are typically stateless. That is, the iterator itself does not “know” that it’s finished. There is usually no state which is updated in an iterator, so you can’t ask the iterator whether it has reached the end. The state is part of the iteration, not of the iterator.

The most common way to use an iterator is in a for loop. You don’t get to see the iteration state in a for loop, only the elements, one by one. And there is no general way to tell if you’re in the last iteration; it depends on the iterator. Some iterators have a known length, whereas others don’t, and some go on indefinitely.
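
A minimal sketch of the point above, using only functions from Base. The variable names and the helper count_items are made up for illustration, and the concrete value of the state is an implementation detail that should not be relied on.

```julia
xs = [10, 20, 30]

# The first call returns (item, state), or nothing for an empty collection.
step1 = iterate(xs)
item1, state1 = step1

# Calling iterate again with the same state gives the same answer,
# because the array holds no cursor of its own.
again_a = iterate(xs, state1)
again_b = iterate(xs, state1)
println(again_a == again_b)   # true

# Walk to the end: the only "end" signal is a return value of nothing.
function count_items(iter)
    n = 0
    next = iterate(iter)
    while next !== nothing
        _, state = next
        n += 1
        next = iterate(iter, state)
    end
    return n
end
println(count_items(xs))      # 3

# Whether a length is known up front is described by a trait.
println(Base.IteratorSize(xs) isa Base.HasShape)                              # true
println(Base.IteratorSize(Iterators.filter(isodd, xs)) isa Base.SizeUnknown)  # true
println(Base.IteratorSize(Iterators.repeated(1)) isa Base.IsInfinite)         # true
```

So the closest thing to an "is it finished?" query is checking whether the value returned by iterate is nothing.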

I want to point out that the structure of my complaint is that the documentation is there, but it is organized in a way that makes it nearly useless, or extremely frustrating, to somebody like myself.

For example, if I go to C++ or to Java or any other language documentation, and they describe an interface like iterators, you see a clear, concise definition of what they are on a single page and what the contract for that interface is.

My experience with the Julia documentation is that it is not clearly presented; it is so obscurely presented that it takes an order of magnitude more effort to learn about many things that are basic.

This thread is a testimony to how frustrating it is. The documentation assumes people don’t care about the constituent parts, and it assumes people will only use them in a for loop or a while loop where the mechanics are handled by the language.

I see your point. An inconvenience with Julia, reflected in the documentation, is that very little is “part of the language”. The interfaces are more or less just conventions. Iterators are sort of an exception, in that the compiler/parser/lowering-engine actually rewrites for loops (literally) into while loops, so this particular interface is in a sense part of the language. But the other “interfaces” are merely conventions. Even conversion of 1 + 1.0 to Float64 is sort of a convention. It can be changed at will.

Every function and operator can be overloaded, and it’s common to do so; it’s even sort of how the entire language operates. (Try e.g. methods(*) to see it.) And it’s a kind of silent agreement that you shouldn’t abuse things (e.g. so that a + b suddenly means “print a, save b to a file, and return b - a”).
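
As a rough illustration of how ordinary this kind of overloading is, here is a small sketch. Point2 is a hypothetical type invented for this example; only Base.:+, hasmethod, and methods are real Base names.

```julia
# A hypothetical 2D point type, defined only for this illustration.
struct Point2
    x::Float64
    y::Float64
end

# Add a new method to Base's + for this type; existing methods are untouched.
Base.:+(a::Point2, b::Point2) = Point2(a.x + b.x, a.y + b.y)

p = Point2(1.0, 2.0) + Point2(0.5, 0.5)
println(p.x, " ", p.y)                             # 1.5 2.5

# The new method is now one of the many methods of +.
println(hasmethod(+, Tuple{Point2, Point2}))       # true
println(length(methods(+)))                        # count varies by Julia version
```

Nothing in the language stops this method from doing something unrelated to addition; that part is the convention.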

Of course, such things are also possible in other languages, but overloading/dispatch is typically not done as massively at the user level as in julia. I suspect it’s simply too much functionality to document properly.

In a sense, every abstract type, there are many of them, is a sort of interface, but little is documented about what an abstract type promises in terms of functionality:

help?> IO
search: IO

  No documentation found for public binding Core.IO.
...
help?> Number
search: Number Timer outer

  Number

  Abstract supertype for all number types.

Partly because it’s not necessarily well defined. It’s been introduced to collect some subtypes, without too much thought about what it really means in terms of functionality.

Does it not? To me this from the manual is quite clear:

Required method         Brief description
iterate(iter)           Returns either a tuple of the first item and initial state or nothing if empty
iterate(iter, state)    Returns either a tuple of the next item and next state or nothing if no items remain

So if the iterator is exhausted then calling iterate(iter, state) returns nothing.
Could you perhaps elaborate why you find this unclear and perhaps we can try to improve/clear up the wording?
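
To make the two required methods concrete, here is a minimal sketch of a custom iterator. Countdown is a hypothetical type made up for this example, and using the last yielded value as the state is just one possible choice.

```julia
# Hypothetical iterator that counts down from `start` to 1.
struct Countdown
    start::Int
end

# First call: no state yet. Return nothing right away if there is nothing to yield.
Base.iterate(c::Countdown) = c.start < 1 ? nothing : (c.start, c.start)

# Later calls: `state` is the last value yielded; stop after yielding 1.
Base.iterate(c::Countdown, state::Int) = state <= 1 ? nothing : (state - 1, state - 1)

# Optional methods so that collect and friends know the size and element type.
Base.length(c::Countdown) = max(c.start, 0)
Base.eltype(::Type{Countdown}) = Int

for n in Countdown(3)
    println(n)                   # prints 3, then 2, then 1
end

println(collect(Countdown(3)))   # [3, 2, 1]
```

The for loop never asks Countdown whether it is done; it stops when iterate returns nothing.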

unfortunately, it is not just an issue of documentation. interfaces in Julia are generally underspecified. Contracts that these interfaces satisfy are generally defined more by convention, trial-and-error, and word of mouth, than they are defined by an explicit and exhaustive design doc like you would find in C++ or Java.

in the case of iteration specifically, I think really all that can be said is

  • iterate(x) must return a value and the next state (or nothing)
  • iterate(x, state) must return a value and the next state (or nothing)
  • when nothing is returned the iterator is done

but otherwise there are basically no universal rules about things like

  • is iterate(x, state) idempotent
    • answer: no e.g. for stateful iterators
  • does the number of iterate calls before nothing have to match length(x) ? and similarly does eltype(x) have to actually match the types of the values you get from iterate(x)
    • answer: it really should if you don’t want to break things (collect, map) in horrible ways, but the interface doesn’t explicitly demand this.
  • if x == y, is some state_x obtained from iterating x a valid iteration state for y ?
    • in practice, usually yes, but the interface doesn’t demand it so it theoretically might not be. but using an invalid state for an iterator can be UB
  • what the shape of a value and the next state should be. I was intentionally ambiguous with that wording since there is no actual requirement that it’s a Tuple. it just has to be destructurable into two parts like x, state = iterate(x)
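
The idempotency question can be seen in a short sketch, assuming the current behavior of arrays and of Iterators.Stateful in Base. Variable names are illustrative.

```julia
plain = [10, 20, 30]
first_item, st = iterate(plain)

# For a plain array, repeating a call with the same state repeats the answer.
println(iterate(plain, st) == iterate(plain, st))   # true

# Iterators.Stateful wraps an iterator and remembers its position,
# so each call consumes an element even when the arguments are identical.
s = Iterators.Stateful([10, 20, 30])
a = iterate(s)
b = iterate(s)
println(a[1], " ", b[1])     # 10 20

# Only the elements not yet consumed are left.
println(collect(s))          # [30]
```

Generic code that calls iterate twice with the same arguments, expecting the same element, works for the array and silently skips elements for the stateful wrapper.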

that being said, I still think in practice this all ends up working out pretty well. I have found the iteration primitives & utilities to be powerful & basically sufficient for implementing any kind of iterator needed. so I am not trying to be negative — mostly just highlight that your frustration is very reasonable, but it is not caused so much by people “not caring about constituent parts” as it is by those parts being pretty amorphous in the first place.

Is there a contradiction about loop conversion between the posts above and below?

Not really. A for loop results in exactly the same lowered code as the equivalent while loop. I.e. the intermediate representations are identical except for some variable names.

Example LLVM output:
julia> forloop(it) = for i in it; println(i); end
forloop (generic function with 1 method)

julia> whloop(it) = begin
           next = iterate(it)
           while next !== nothing
               (item, state) = next
               println(item)
               next = iterate(it, state)
           end
       end
whloop (generic function with 1 method)

julia> @code_llvm debuginfo=:none whloop(1:10)
; Function Signature: whloop(Base.UnitRange{Int64})
define void @julia_whloop_3129(ptr nocapture noundef nonnull readonly align 8 dereferenceable(16) %"it::UnitRange") #0 {
top:
  %"it::UnitRange.stop_ptr" = getelementptr inbounds i8, ptr %"it::UnitRange", i64 8
  %"it::UnitRange.stop_ptr.unbox" = load i64, ptr %"it::UnitRange.stop_ptr", align 8
  %"it::UnitRange.unbox" = load i64, ptr %"it::UnitRange", align 8
  %.not = icmp slt i64 %"it::UnitRange.stop_ptr.unbox", %"it::UnitRange.unbox"
  br i1 %.not, label %L28, label %L17

L17:                                              ; preds = %L17, %top
  %value_phi521 = phi i64 [ %1, %L17 ], [ %"it::UnitRange.unbox", %top ]
  call void @j_println_3131(i64 signext %value_phi521)
  %0 = icmp eq i64 %value_phi521, %"it::UnitRange.stop_ptr.unbox"
  %1 = add i64 %value_phi521, 1
  br i1 %0, label %L28, label %L17

L28:                                              ; preds = %L17, %top
  ret void
}

julia> @code_llvm debuginfo=:none forloop(1:10)
; Function Signature: forloop(Base.UnitRange{Int64})
define void @julia_forloop_3136(ptr nocapture noundef nonnull readonly align 8 dereferenceable(16) %"it::UnitRange") #0 {
top:
  %"it::UnitRange.stop_ptr" = getelementptr inbounds i8, ptr %"it::UnitRange", i64 8
  %"it::UnitRange.stop_ptr.unbox" = load i64, ptr %"it::UnitRange.stop_ptr", align 8
  %"it::UnitRange.unbox" = load i64, ptr %"it::UnitRange", align 8
  %.not.not = icmp slt i64 %"it::UnitRange.stop_ptr.unbox", %"it::UnitRange.unbox"
  br i1 %.not.not, label %L29, label %L14

L14:                                              ; preds = %L14, %top
  %value_phi4 = phi i64 [ %0, %L14 ], [ %"it::UnitRange.unbox", %top ]
  call void @j_println_3138(i64 signext %value_phi4)
  %.not.not20 = icmp eq i64 %value_phi4, %"it::UnitRange.stop_ptr.unbox"
  %0 = add i64 %value_phi4, 1
  br i1 %.not.not20, label %L29, label %L14

L29:                                              ; preds = %L14, %top
  ret void
}

If they are equivalent, then why is the for loop converted to a while loop?

It’s just how a for loop is defined, as fully equivalent to the while loop in the docs about iterators (which you linked to above). Even though no textual while is inserted during the translation, it’s lowered to the same intermediate representation. So, in this sense, a for loop is just syntactic sugar for a slightly more verbose while loop.

In the documentation, on line 45, “is translated into:” should be replaced with “is equivalent to:” for clarity; otherwise it creates confusion. Should I make a PR?

I suppose that’s an implementation detail. Whether it’s translated into a textual while loop, or translated into a functionally equivalent intermediate representation, can’t really confuse anyone?

Absolutely agree on the documentation point. The culture of ad hoc, half-organized documentation in Julia is a major impediment to working with it. Look up the documentation for a function in Python or one of its major libraries like NumPy and you’ll find a rigorous, consistent format. In Julia it’s much more hit-or-miss. I do find that the latest LLMs can partially compensate.

I am still not clear on what concrete actions to take in order to improve the documentation.

If I were looking at the documentation for the first time, I would look at the Table of Contents on the left side of the page (on mobile I have to click on the top left hamburger menu icon) and look for sections that mention iteration or interfaces:

There are two sections in the documentation’s table of contents:

The first page then discusses Indexing and Abstract Array interfaces.

The second page contains a similar example to your while loop:

next = iterate(iter)
while next !== nothing
    (i, state) = next
    # body
    next = iterate(iter, state)
end

If I look at the Java documentation, I start at this page and see nothing about data structures, collections, or iteration:

I might start reading the Language Specification and end up here:

There is some information here about interfaces in the abstract and for loop iteration, but still nothing about an iteration interface.

After getting frustrated about this abstract definition of the language, I might start searching the API and find the Iterator interface.

Eventually I might get confused that there is not an Array class and that if I want to iterate an array, I have to use an ArrayList.

I guess I’m confused as to what exactly I should compare the Julia documentation to.

I actually learned Java as one of my first programming languages, so I remember a time when the Iterator interface did not exist and there was no ArrayList.

The main advantage that Java has here is a clear way to define interfaces. Currently, in base Julia, interfaces are just documentation. There are, however, proposals to create interface definitions that can be verified.
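
Separately from those proposals (which are not reproduced here), a very rough check is already possible with plain Base functions. The helper supports_iteration and the type Opaque below are hypothetical names invented for this sketch; it only tests that a one-argument iterate method exists, not that the method honors the contract.

```julia
# Hypothetical helper: true if a one-argument iterate method exists for T.
# This checks that a method is defined, not that it behaves correctly.
supports_iteration(::Type{T}) where {T} = hasmethod(iterate, Tuple{T})

struct Opaque end            # illustrative type with no iterate method

println(supports_iteration(Vector{Int}))   # true
println(supports_iteration(String))        # true
println(supports_iteration(Opaque))        # false
```

This is much weaker than a Java interface declaration, since nothing is checked at definition time and the return values are not inspected.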

I appreciate the feedback. It would be helpful if you can provide some specific constructive feedback now that you know where the documentation is.

  • Should the iterate interface be defined on its own page?
  • Where in the documentation and table of contents would you expect to find this information?

I would not say that the iteration interface is “underspecified”; it is perfectly well specified for iteration as performed by the language. It just leaves some questions up to the implementer, and that is fine.

Specifically,

is a universal rule, you can have stateful iterators.

Yes, the manual says that length is

The number of items, if known

so the interface does explicitly demand it in any sane reading.

No, since that is not required, you cannot generally assume that. One can design a perfectly valid (if weird) iterable where x == y does not imply x === y and this affects the iteration.

Yes, it has to be a tuple, please read the manual:

iterate(iter, state) Returns either a tuple of the next item and next state or nothing if no items remain

(emphasis mine).

I disagree. Some of your questions are clearly answered in the manual, while some are just not explicitly specified or constrained by the interface. That is normal, it is an interface, you can only assume what it requires, and not more. It does not have to explicitly allow or rule out anything else.

won’t respond to every point (because I agree with you on some) but just two things:

I have read the manual. But the manual does not match reality. see e.g. this discussion. I would submit a PR to fix the documentation, except that I do not even know what the “correct” docstring should be (bc I find the interface to be ambiguous).

and on stateful iterators, yes I know that Stateful iterators exist, but my point in mentioning them is that the iterator interface as described in the documentation is so vague and nonprescriptive as to how stateful iterators work, such that an unfortunately high fraction of algorithms that attempt to be generic over all iterators do not work as expected for stateful iterators. Just see how length(::Stateful) had to be removed from the public API as late as 1.11, so these are not “old” problems.

Yeah, a core issue here is that there are three or four wildly different levels of documentation:

  • Prescriptive demands on how things must be implemented in order to function as expected
  • Descriptive recommendations on how things should be implemented
  • Simplified guidance for new users on how to understand it

But then there’s also:

  • Prescriptive/theoretical ideas of how things should work
  • Descriptive and concrete details on how the implementation does work

All of the above exist in the manual. And are frequently mixed up within the same docstring. But it’s rarely clear which is what.

That implementation is, strictly speaking, just nonconforming; the manual is clear that it has to be a Tuple. But in practice, of course, it matters very little as long as people just use the standard destructuring that can cope with anything.

Julia’s interfaces are not (yet) formally specified, so users are free to introduce nonconforming implementations. This does not mean that the specs are not clear, though.

FWIW, I think that introducing stateful iterators in an interface that strives to separate the state was just overdesign. The docstring is in the top-10 most difficult to understand parts of the manual, and the wrapper gets very little usage.
