Iteration over OrdinalRange

Playing around with iterators I noticed that calling

iterate(1:5, 1)

yields in

(2, 2)

while

iterate(collect(1:5), 1)

yields in

(1, 2)

as I would expect it to.
To the best of my knowledge, iterate is supposed to return a tuple containing the next element and the new iteration state. However, its definition for elements of type OrdinalRange returns a tuple (new state, new state). Does anyone know what’s the reasoning behind this?

Thank you very much in advance!

EDIT:
Further, it seems strange that

iterate(1:5, 0)

and

iterate(1:5)

are equivalent, while for “normal” arrays

iterate(collect(1:5), 1)

and

iterate(collect(1:5))

yield the same result.

The answer here is simply that the state is an implementation detail. You can’t expect that the first next state of an object is going to be the number 1. Instead, you need to ask the object what its next state is with the one-argument iterate:

julia> iterate(1:5)
(1, 1)

julia> iterate(collect(1:5))
(1, 2)

julia> iterate(Dict(1=>2))
(1 => 2, 17)

julia> iterate(Dict(1=>2, 3=>4))
(3 => 4, 8)

Manually choosing a state to pass to iterate can be dangerous as you might be giving it something that’s completely invalid and it didn’t expect. In general, you don’t want to touch iteration states — you just pass them along.

3 Likes

This is because UnitRange (such as 1:5) and Vector (such as collect(1:5)) do not define their “iterator state” in the same way.

The documentation for the iterators interface explains that

for i in iter
    # body
end

gets translated into

next = iterate(iter)
while next !== nothing
    (i, state) = next
    # body
    next = iterate(iter, state)
end

In other words, the first call to iterate is performed with only one argument and returns an (element, state) tuple. The state from each call is passed as the second argument in the next call to iterate. But the iterate method for each type can decide exactly how the state is represented.

Let’s look at what happens in both cases you mention:

function test_iter(collection)
    @show next = iterate(collection)
    while next !== nothing
        (i, state) = next
        @show next = iterate(collection, state)
    end
end
julia> test_iter(1:5)
next = iterate(collection) = (1, 1)
next = iterate(collection, state) = (2, 2)
next = iterate(collection, state) = (3, 3)
next = iterate(collection, state) = (4, 4)
next = iterate(collection, state) = (5, 5)
next = iterate(collection, state) = nothing
julia> test_iter(collect(1:5))
next = iterate(collection) = (1, 2)
next = iterate(collection, state) = (2, 3)
next = iterate(collection, state) = (3, 4)
next = iterate(collection, state) = (4, 5)
next = iterate(collection, state) = (5, 6)
next = iterate(collection, state) = nothing

In both cases, the states are consecutive indices. But states in the Vector case are shifted by 1 w.r.t. states in the UnitRange case. This explains both your observations: the state associated to the first element (i.e. the implicit state that is omitted in the first call to iterate) in a UnitRange is 0, whereas the state associated to the first element in a vector is 1.

Does this make sense?


EDIT: it looks like I posted almost simultaneously to mbauman. Keeping this answer anyway in case it would add something to his excellent answer…

2 Likes

Thank you very much for your replies! Now, I do understand what’s happening in both cases.