Nested Comprehensions with Stateful

A Stateful iterator “empties” itself as one iterates through it. This leads to the following fun behaviour (if one thinks about it for a minute, it becomes obvious why this happens):

using Base.Iterators: Stateful
x = [1,2,3,4]

# INTENDED BEHAVIOUR
A = [ a * b for a in x, b in x ]
# 4×4 Matrix{Int64}:
#  1  2   3   4
#  2  4   6   8
#  3  6   9  12
#  4  8  12  16

# UNINTENDED (I THINK?!)
B = [ a * b for a in Stateful(x), b in Stateful(x) ]
# 3×3 Matrix{Int64}:
#  1  4  0
#  2  0  0
#  3  0  0

I understand why this happens, and it is easy enough to avoid, but I was simply curious: is this really intended? Or is it a bug in Stateful, or a bug in nested comprehensions?

P.S.: More generally this will of course happen to every iterator that cannot re-initialize after looping through it.
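The “emptying” can be observed directly by collecting the same Stateful twice. This is only a minimal sketch; the variable names are illustrative:

```julia
using Base.Iterators: Stateful

s = Stateful([1, 2, 3, 4])

first_pass  = collect(s)   # walks through s and consumes every element
second_pass = collect(s)   # s is already exhausted, so nothing is left

println(first_pass)
println(isempty(second_pass))
```

Any iterator that behaves like `s` on the second pass will be affected in the same way when a construct iterates over it more than once.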

I’d argue it is a bug in nested comprehensions that [i for i in I, j in 1:2] does not throw an error if the length of I shrinks from j = 1 to j = 2. Everything else is as intended, I guess?

Really? Would one not expect that [f(i,j) for i in I, j in J] gives the same result as [f(i,j) for i in collect(I), j in collect(J)]?
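For reference, here is a small sketch of the `collect` variant mentioned above. Materializing each Stateful into a plain Vector before the comprehension sees it gives the full table. The function `f` and the name `C` are placeholders chosen for this example:

```julia
using Base.Iterators: Stateful

x = [1, 2, 3, 4]
f(i, j) = i * j   # placeholder for whatever the comprehension computes

# collect turns each Stateful into an ordinary Vector, which can be
# iterated as many times as the comprehension needs.
C = [f(i, j) for i in collect(Stateful(x)), j in collect(Stateful(x))]

println(size(C))
```

The price is an intermediate array per iterator, which may or may not matter depending on what the Stateful wraps.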

Also note that if I use nested for loops

for a in Stateful(x), b in Stateful(x)
   # ...
end

the behaviour is exactly as I would have expected.
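A filled-in version of that loop, as a sketch, shows that all 16 combinations are visited. The vector `pairs_seen` is only there to record what the loop does:

```julia
using Base.Iterators: Stateful

x = [1, 2, 3, 4]
pairs_seen = Tuple{Int,Int}[]

# The second iterable expression is evaluated again for every value of `a`,
# so each pass of the inner loop gets a fresh Stateful.
for a in Stateful(x), b in Stateful(x)
    push!(pairs_seen, (a, b))
end

println(length(pairs_seen))
```

Note that the first variable is the outer loop here, so `b` varies fastest.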

Just to throw more confusion into the mix :)

julia> [a * b for a in Iterators.Stateful(1:4), b in Iterators.Stateful(1:4)] # as in the OP
3×3 Matrix{Int64}:
 1  4  0
 2  0  0
 3  0  0

julia> [a * b for a in Iterators.Stateful(1:4), b in Iterators.Stateful(1:4) if true]
4-element Vector{Int64}:
 1
 2
 3
 4

julia> [a * b for a in Iterators.Stateful(1:4) for b in Iterators.Stateful(1:4)]
16-element Vector{Int64}:
  1
  2
  3
  4
  2
  4
  6
  8
  3
  6
  9
 12
  4
  8
 12
 16

If you know the implementation, you can see that these are all leaky abstractions. I am not sure what the intention of the design was, though.
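To make the “implementation” remark a bit more concrete: as far as I understand it, the three comprehension forms correspond roughly to the iterator pipelines below. This is a sketch of the idea, not the exact lowered code, and it uses a plain range `r` so that the comparisons hold regardless of how Stateful behaves:

```julia
r = 1:4

# `for a in r, b in r` iterates over a single product iterator, which
# reuses the same two iterable objects throughout.
m1 = [a * b for a in r, b in r]
m2 = collect(a * b for (a, b) in Iterators.product(r, r))

# Adding `if` wraps the product in a filter, so the shape is lost
# and the result is a flat Vector.
v1 = [a * b for a in r, b in r if true]
v2 = collect(a * b for (a, b) in Iterators.filter(_ -> true, Iterators.product(r, r)))

# `for a in r for b in r` flattens one inner generator per `a`; the inner
# iterable expression is evaluated anew for each `a`.
w1 = [a * b for a in r for b in r]
w2 = collect(Iterators.flatten((a * b for b in r) for a in r))

println(m1 == m2, " ", v1 == v2, " ", w1 == w2)
```

Only the last form constructs the inner iterable once per outer element, which is consistent with it being the only one that produces all 16 values with Stateful above.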

If I understand correctly, the issue is whether

for i in Is(), j in Js()
    [loop body]
end

should be lowered to

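# Approach A: both iterables are constructed once, up front
cIs = Is()
cJs = Js()
for i in cIs
    for j in cJs
        [loop body]
    end
end

or

# Approach B: the inner iterable is reconstructed for every outer iteration
# (in a for loop, the first variable is the outer loop)
cIs = Is()
for i in cIs
    cJs = Js()
    for j in cJs
        [loop body]
    end
end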

It seems that for loops currently follow Approach B, while list comprehensions do something even more complicated.
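The practical difference between constructing the inner iterable once (Approach A) and reconstructing it on every outer iteration (Approach B) can be checked with two hand-written nestings. This is a sketch; `hoisted`, `recreated` and `mk` are hypothetical helper names, not anything from Base:

```julia
using Base.Iterators: Stateful

# Approach A style: the inner iterable is built once, before the loops.
function hoisted(make_outer, make_inner)
    out = Tuple{Int,Int}[]
    inner = make_inner()
    for o in make_outer()
        for i in inner
            push!(out, (o, i))
        end
    end
    return out
end

# Approach B style: the inner iterable is rebuilt for every outer element.
function recreated(make_outer, make_inner)
    out = Tuple{Int,Int}[]
    for o in make_outer()
        for i in make_inner()
            push!(out, (o, i))
        end
    end
    return out
end

mk() = Stateful(1:4)

println(length(hoisted(mk, mk)))     # inner is exhausted after the first outer element
println(length(recreated(mk, mk)))   # every outer element sees a fresh inner iterator
```

With a stateless iterable such as a plain range the two functions return the same pairs, which is why the distinction normally goes unnoticed.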

If you want to make your code work with the current definition of list comprehensions, then this should work:

struct Stateless{GI}
    generate_iterable::GI
end

Base.IteratorSize(::Type{Stateless{GI}}) where {GI<:Function} = Base.IteratorSize(GI.instance())
Base.IteratorSize(::Type{Stateless{GI}}) where {GI} = Base.SizeUnknown()
Base.IteratorEltype(::Type{Stateless{GI}}) where {GI<:Function} = Base.IteratorEltype(GI.instance())
Base.IteratorEltype(::Type{Stateless{GI}}) where {GI} = Base.EltypeUnknown()

Base.eltype(::Type{Stateless{GI}}) where {GI<:Function} = Base.eltype(GI.instance())
Base.length(s::Stateless) = length(s.generate_iterable())
Base.size(s::Stateless, dim...) = size(s.generate_iterable(), dim...)

function Base.iterate(s::Stateless)
    iter = s.generate_iterable()
    tmp = iterate(iter)
    isnothing(tmp) && return nothing
    next,state = tmp
    return next, (iter,state)
end

function Base.iterate(s::Stateless, (iter,state))
    tmp = iterate(iter,state)
    isnothing(tmp) && return nothing
    next,state = tmp
    return next, (iter,state)
end

Example:

julia> [(a,b) for a in Stateless(()->Stateful(1:4)), b in Stateless(()->Stateful(1:4))]
4×4 Array{Tuple{Int64,Int64},2}:
 (1, 1)  (1, 2)  (1, 3)  (1, 4)
 (2, 1)  (2, 2)  (2, 3)  (2, 4)
 (3, 1)  (3, 2)  (3, 3)  (3, 4)
 (4, 1)  (4, 2)  (4, 3)  (4, 4)

Thanks for that suggestion.