# New Iterator from Nested Iterators

Is there any shortcut to create a new iterator from existing nested iterators? I know I can manually define a new type, `length`, `iterate`, and `eltype`, but I’m wondering if there’s a more direct way.

Below is a MWE of a generator that I’d like to turn into a new type to iterate on. Notice that the second iterator is dependent on the value of the first so I don’t think I can simply use `product`. I’m wondering if there’s a use for `flatten` here?

``````testitr(N::Integer) = ((i,j) for i in 1:N for j in i:N)
``````

collect(testitr(5))
15-element Vector{Tuple{Int64, Int64}}:
(1, 1)
(1, 2)
(1, 3)
(1, 4)
(1, 5)
(2, 2)
(2, 3)
(2, 4)
(2, 5)
(3, 3)
(3, 4)
(3, 5)
(4, 4)
(4, 5)
(5, 5)

This can be coded with `flatten`

``````julia> inner(i, N) = ((i,j) for j in i:N)
inner (generic function with 1 method)

julia> outer(N) = Iterators.flatten(inner(i, N) for i in 1:N)
outer (generic function with 1 method)

julia> collect(outer(5))
15-element Vector{Tuple{Int64, Int64}}:
(1, 1)
(1, 2)
(1, 3)
(1, 4)
(1, 5)
(2, 2)
(2, 3)
(2, 4)
(2, 5)
(3, 3)
(3, 4)
(3, 5)
(4, 4)
(4, 5)
(5, 5)
``````

but in this case I would not call it a shortcut.

Cool - I see how `flatten` works, but you’re right that it’s not super helpful. I was hoping it would give me `length` and `eltype` so that iteration was more efficient, but that doesn’t look to be the case.

I’ll play with this a bit more to see if I can leverage the `flatten` methods to save myself having to write a custom `iterate`.

Are you sure it doesn’t have `eltype`? I’m not at my computer to check, but I’d be surprised to learn that `flatten` can’t get an inferred eltype here!

julia> eltype(outer(5))
Any

1 Like
1 Like

The problem is actually with the inner generator itself I think. Is there some way to get this to return the correct type?

julia> eltype(inner(3,4))
Any

After a bit more reading it seems that `Generators` are simply are not type stable in Julia. Given that, it seems very difficult to make `flatten` work for something like this. I think we’re stuck writing custom `Iterators` for anything performance critical.

Yeah, I think iterators are one of the areas of `Base` where the design is lacking. On the other hand, it seems like it would be pretty easy to write a “`Base.Generator` but with `eltype`” package.

3 Likes
``````function Base.eltype(g::Base.Generator{I,F}) where {I,F}
iter_eltype = eltype(I)
f = g.f::F
return_types = Base.return_types(f, (iter_eltype,))
return Union{return_types...}
end

eltype(inner(3,4))
# Tuple{Int64, Int64}
``````

`@code_typewarn` says that accessing the `:f` property is not type stable (which seems weird since the type is in the parameters of the Generator type.

I went back and forth about whether to `Union` all return types or just return `Any` if more than one is returned.

Is there a reason a function like this couldn’t be in Base?

edit: `f` is even parametricly typed in Generator definition. What’s up with the instability?

Probably some kind of bug. Doesn’t happen on nightly, at least.

This sadly returns a `Vector{Any}`, right? Meaning that neither it’s length nor element type is known at compile time. So the destructuring in the next line causes run time dispatch, and `eltype` itself can’t be type stable.

Also, `@allocated` reports that your `eltype` allocates a staggering 5632 bytes EDIT: after everything is compiled as much as possible, it’s merely 704 bytes. That’s still a lot though, considering that `eltype` is usually assumed to be cost-free.

In conclusion, I don’t think using `Base.return_types` is a viable approach here.

So what I didn’t realize is that iterating over Generators is still type stable even though you get `eltype`; I thought that defining `eltype` would unlock `iterate`’s ability to specialize, but I checked and I was wrong.

For what it’s worth though, I went into some internals and got the allocations down another 100 bytes (to 608). I’m sure someone who knows what they’re doing could shave off a lot more by bypassing some of the higher-level functions I’m calling:

``````# Basically just stolen from Base's version
function return_types(@nospecialize(f), @nospecialize(types);
world::UInt=Base.get_world_counter(),
interp::Core.Compiler.AbstractInterpreter=Core.Compiler.NativeInterpreter(world))

(ccall(:jl_is_in_pure_context, Bool, ()) || world == typemax(UInt)) &&
error("code reflection cannot be used from generated functions")

if isa(f, Core.OpaqueClosure)
_, rt = only(Base.code_typed_opaque_closure(f))
return rt::Type
end

if isa(f, Core.Builtin)
argtypes = Any[Base.to_tuple_type(types).parameters...]
rt = Core.Compiler.builtin_tfunction(interp, f, argtypes, nothing)
return Core.Compiler.widenconst(rt)::Type
end

tt = Base.signature_type(f, types)
matches = Base._methods_by_ftype(tt, #=lim=#-1, world)::Vector
if length(matches) == 1
match = only(matches)::Core.MethodMatch
meth = Base.func_for_method_checked(match.method, types, match.sparams)
ty = Core.Compiler.typeinf_type(interp, meth, match.spec_types, match.sparams)
return something(ty, Any)::Type
end
return Any
end

function Base.eltype(g::Base.Generator{I,<:Any}) where I
iter_eltype = eltype(I)
return_type = return_types(g.f, (iter_eltype,))
return return_type
end
``````

I’m not saying this should get added to Base because you make a good point about the allocations vs the promise of `eltype` being zero cost. But it was a fun experiment!

1 Like

If we can’t fix `Generator` `eltype` in `Base`, is it possible to tell the compiler what the `eltype` and `length` are for a specific `Generator` that you create?

Let me post a longer MWE to see if we can eliminate the allocations here. The short version is that I’m creating a performance critical iterator that iterates over index combinations of various lengths. If it seems like a strange way to do this, it’s because it’s still a MWE. I’m trying to nest iterators a few layers deep this way, but this shows the issue I’m seeing with a single nesting. Here’s the inner `Iterator` I’m using for reference.

``````# Based on: https://github.com/Quantum-Many-Body/QuantumLattices.jl/blob/master/src/Toolkit.jl
# improved so there is no allocation on object creation (state is now a tuple)
struct Combinations{M,C}
contents::C
N::Int
Combinations{M}(contents::C) where {M,C} = new{M,C}(contents, length(contents))
end

@inline Base.eltype(::Type{Combinations{M, C}}) where {M, C} = NTuple{M, eltype(C)}
@inline Base.length(c::Combinations{M}) where M = binomial(c.N, M)
Base.iterate(c::Combinations{M}) where M = (M > c.N) ? nothing : (ntuple(i->c.contents[i], Val(M)), nextmstate(ntuple(identity, M), c.N))
Base.iterate(c::Combinations{M}, state) where M = (isempty(state) || (state > c.N-M+1)) ? nothing : (ntuple(i->c.contents[state[i]], Val(M)), nextmstate(state, c.N))

function nextmstate(state::NTuple{M}, N::Int) where {M}
for i in M:-1:1
state = Base.setindex(state, state[i] + 1, i)
(state[i] > N - (M - i)) && continue
for j in (i + 1):M
state = Base.setindex(state, state[j - 1] + 1, j)
end
break
end
return state
end
``````

Then I iterate using the following

``````# create a tuple of 0s and 1s of length N with 1s at `nzindices`
function sptuple(::Val{N}, nzindices::NTuple{M,Int})::NTuple{N,Int} where {N,M}
res = ntuple(i -> zero(Int), N)
for i in eachindex(nzindices)
res = Base.setindex(res, one(Int), nzindices[i])
end
return res
end

function sptuple(::Val{N}, ::Tuple{})::NTuple{N,Int} where {N}
return ntuple(i->zero(Int), N)
end

# create an inner and outer Iterator using flatten.
inner(::Val{N}, ::Val{M}) where {N,M} = (sptuple(Val(N), j) for j in Combinations{M}(ntuple(identity, N)))
outer(::Val{N}) where {N} = Iterators.flatten(inner(Val(N), Val(i)) for i in 0:N)
``````

Benchmarking, I see the following:

``````using BenchmarkTools
# single allocation on collect as expected on inner iterator
@btime collect(inner(Val(5), Val(3)))  # 110.381 ns (1 allocation: 496 bytes)
# I need this to also a be a single allocation. expect around ~300ns based on inner benchmark
@btime collect(outer(Val(5)))  # 5.826 μs (166 allocations: 13.27 KiB)
``````

My hunch is that if I could somehow define `eltype` and `length` methods for `inner` and `outer` that this would work. Something like:

``````Base.eltype(inner(...)) where {N,M} = NTuple{N,Int}
Base.eltype(outer(...)) where {N} = NTuple{N,Int}
Base.length(inner(...)) where {N,M} = length(Combinations{M}(ntuple(identity, N)))
Base.length(outer(...)) where {N} = prod(i->length(inner(Val(N), Val(i))), 0:N)
``````

Without looking at your code in depth yet:

1. I think it would be simple to create a replacement for `Base.Generator` that would have the element type as a type parameter, and use it to implement `eltype`. Maybe I do that later. We could even call `Base.return_types` when constructing such a generator (presumably the construction itself is not performance-sensitive).
2. However, IMO Julia’s standard iterator protocol is itself not very good. It’s fine for simple cases, but I don’t think it enables good performance for complicated compositions of iterators. One of the problems is that the iteration protocol features the iterator state, but a user of the iterator type has no way of getting the type of the state (unlike `eltype`, there’s no `iterstatetype`). However none of this is necessarily a problem if you’re the author of all iterator types in question.
2 Likes
1. my concern with calling `Base.return_types` is that I think it will get called every time `inner` is called - once per outer iteration. Won’t this create similar performance issues to what I’m seeing now? In other words construction of the inner iterators are performance sensitive which is why I had to write my own `Combinations`.

2. You’re absolutely right. I own the `Iterators` here and could absolutely write my custom type and `iterate` function. I’m trying to avoid this for two reasons. First, it seems like a lot of effort if all I want to do it create a new iterator by nesting two existing ones. Second, it makes the code really hard to follow. The fact that I’m creating an iterator to essentially do

``````for i in 1:N
for j in F{i}()
for k in G{i}(j)
# do stuff
end
end
end
``````

is quickly lost.

2 Likes

In ExpandNestedData.jl, I have a lazy implementation of repeat, cycle, and vcat using callable structs to route an index back to a source iterator using some division and modulo arithmetic. I found that nesting dozens of types from `Iterators` really hurt both type stability and execution time. I never got into the details to figure out why though.

PS if you look at the code, I’m using invoke latest right now which I know sucks. I have a solution in the works but haven’t had time to clean it up.

1 Like

Regarding your specific problem, the collection represented by `outer(Val(5))` is (I now realize) merely the 2^5 integers from 0 to 2^5 - 1, represented as strings of bits. You could have the iterator state literally be just a single `UInt` and `iterate` would just increment the state and extract the bits.

So this should be a faster implementation of your `outer` iterator (EDIT: the order is actually not the same, not sure if that matters):

``````to_bitstring(m::Val, n::Integer) = ntuple(
let n = n
i -> (n >>> (i - 1)) % Bool
end,
m,
)

to_bitstring(::Val{m}) where {m} =
(n::Integer) -> to_bitstring(Val(m), n)

# Use a static range (I think the packages `Static` and
# `StaticNumbers` support those) instead of `0:(2^m - 1)` for
# more performance?
int_to_bitstring_iterator(::Val{m}) where {m} =
Iterators.map(to_bitstring(Val(m)), 0:(2^m - 1))
``````

EDIT: yes, it’s quite a bit faster:

``````julia> using BenchmarkTools

julia> c(it::F, n::Val) where {F} = collect(it(n))
c (generic function with 1 method)

julia> @btime c(int_to_bitstring_iterator, Val(10));
2.767 μs (1 allocation: 10.12 KiB)

julia> @btime c(outer, Val(10));
134.662 μs (4172 allocations: 725.50 KiB)
``````

Repo on Gitlab.com, PR to the General registry.

3 Likes

Thank you for the help! I am aware that they are simpler implementations for my MWE, but I’m trying to represent a more complicated problem that cannot be solved so easily.

I’ve been playing with your implementation of `Generator` in `LazyMapWithElType` a bit, trying to get it to work for my example above. I haven’t been able to figure out how to use it to nest two `Generators` either with `Iterators.flatten` or otherwise. Is there a specific way you were imaging this getting used to solve that issue?