Does length(itr) need to be invariant? (stateful iterator)

I have a calculation that tracks a large state for a cohort of economic agents that have a finite lifetime. A certain summary statistic is implemented with the iteration protocol.

If you imagine c below to be something large, black boxy, and mutable, the following MWE should convey the idea:

mutable struct ExampleItr
    c::Int
end

function Base.iterate(itr::ExampleItr, _::Nothing = nothing)
    iszero(itr.c) && return nothing
    c = itr.c
    itr.c -= 1
    c, nothing
end

Base.length(itr::ExampleItr) = itr.c

Base.eltype(::Type{ExampleItr}) = Int

The use case is the following: the functions that use the iterator may copy the whole thing at a certain point, and restart from there with different parameters (not included above). Eg
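A hypothetical sketch of such a copy-and-restart, since the original example is not shown here. The helpers `consume_and_fork!` and `drain!` are made-up names for illustration, and `deepcopy` stands in for whatever copying the real code does. Neither helper calls `length`.

```julia
# Same MWE as above, repeated so the snippet runs on its own.
mutable struct ExampleItr
    c::Int
end

function Base.iterate(itr::ExampleItr, _::Nothing = nothing)
    iszero(itr.c) && return nothing
    c = itr.c
    itr.c -= 1
    c, nothing
end

# Hypothetical helper: consume up to `n` elements, then snapshot the
# iterator so that another run can restart from this point.
function consume_and_fork!(itr::ExampleItr, n::Int)
    seen = Int[]
    for _ in 1:n
        y = iterate(itr)
        y === nothing && break
        push!(seen, first(y))
    end
    return seen, deepcopy(itr)
end

# Hypothetical helper: exhaust an iterator with a plain loop.
function drain!(itr)
    out = Int[]
    for x in itr
        push!(out, x)
    end
    return out
end

itr = ExampleItr(5)
seen, fork = consume_and_fork!(itr, 2)   # seen == [5, 4]
from_fork = drain!(fork)                 # [3, 2, 1]; `itr` itself is untouched
from_original = drain!(itr)              # [3, 2, 1] again, from the original
```

The copy and the original advance independently, which is the property the use case relies on.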

What I find problematic is that length depends on the state. That is fine if it is understood that users of the interface may call length only at the very beginning, and not after. It is not fine if length can be called throughout the iteration and is expected to return the same value.

The iterator protocol does not say when length may be called, so in theory it could be called anytime.
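To make the concern concrete, here is a small sketch using the MWE from the first post (repeated so that it runs on its own). It only shows that the reported length shrinks as elements are consumed. Whether any particular consumer of the interface calls length mid-iteration is not something this demonstrates.

```julia
mutable struct ExampleItr
    c::Int
end

function Base.iterate(itr::ExampleItr, _::Nothing = nothing)
    iszero(itr.c) && return nothing
    c = itr.c
    itr.c -= 1
    c, nothing
end

Base.length(itr::ExampleItr) = itr.c
Base.eltype(::Type{ExampleItr}) = Int

itr = ExampleItr(3)
n_before = length(itr)        # length before any element is consumed
first_item, _ = iterate(itr)  # consume one element
n_during = length(itr)        # length now reflects the remaining elements
collected = collect(itr)      # collect queries length when it is called
n_after = length(itr)         # nothing left
```

Here `collect` happens to work because it asks for the length at the moment it starts, but the three length calls return three different values for the same object.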

The alternative is

Base.IteratorSize(::Type{ExampleItr}) = Base.SizeUnknown()

which is always correct, it just precludes certain optimizations.

Set SizeUnknown() and leave length undefined:
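A minimal sketch of what that might look like for the MWE from the first post, assuming nothing beyond the definitions already shown in the thread. No `Base.length` method is defined for the type in this snippet.

```julia
mutable struct ExampleItr
    c::Int
end

function Base.iterate(itr::ExampleItr, _::Nothing = nothing)
    iszero(itr.c) && return nothing
    c = itr.c
    itr.c -= 1
    c, nothing
end

# Declare that the number of elements is not known in advance.
Base.IteratorSize(::Type{ExampleItr}) = Base.SizeUnknown()
Base.eltype(::Type{ExampleItr}) = Int

itr = ExampleItr(4)
iterate(itr)           # consume one element first
rest = collect(itr)    # works without length; the result grows as elements arrive
```

With `SizeUnknown()`, `collect` cannot preallocate the result, which is the kind of optimization given up in exchange for always being correct.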

This makes a lot of sense. I am wondering if this point should be mentioned in the manual (very briefly); I can make a trivial PR.

Very little is documented about stateful iterators. The specific implementation Iterators.Stateful is one of the most mysterious parts of Julia for me. I am not sure it gets a lot of usage.
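For reference, a short sketch of how Iterators.Stateful behaves, using only `popfirst!`, `peek`, `isempty`, and `Iterators.take`. It deliberately avoids calling `length` on the wrapper, and it assumes a Julia version recent enough to have `peek` for Stateful.

```julia
s = Iterators.Stateful(1:5)

head = popfirst!(s)                     # takes the first element off the front
next_up = peek(s)                       # looks at the next element without consuming it
taken = collect(Iterators.take(s, 2))   # consumes the next two elements
remaining = collect(s)                  # consumes whatever is left
done = isempty(s)                       # true once everything has been consumed
```

Each call picks up where the previous one stopped, because the wrapper itself carries the iteration state.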

The change is now in the release notes for v1.11:
