Why is getindex not implemented for Iterators.ProductIterator?

I find this behavior strange:

s = Iterators.product(1:9, 10:19)
c = collect(s)
c[5] # works: (5, 10)

s[5] # doesn't work
ERROR: MethodError: no method matching getindex(::Base.Iterators.ProductIterator{Tuple{UnitRange{Int64}, UnitRange{Int64}}}, ::Int64)

Also, first(s) is defined, but last(s) is not. Isn’t that unexpected?

No, because iterators are written to work with iterables of unknown (potentially infinite) length, where you don’t know what the last element is. They are also meant for iteration, not indexing, because some iterators or generators can only be accessed one element after another.
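As a rough illustration of that point, Julia's iteration interface lets each iterator declare what it knows about its own size through the `Base.IteratorSize` trait. The three example iterators below are just picked for demonstration:

```julia
finite   = Iterators.product(1:9, 10:19)   # size is known up front
infinite = Iterators.countfrom(1)          # never ends
one_pass = eachline(IOBuffer("a\nb\n"))    # length unknown until consumed

Base.IteratorSize(finite)    # Base.HasShape{2}()
Base.IteratorSize(infinite)  # Base.IsInfinite()
Base.IteratorSize(one_pass)  # Base.SizeUnknown()

length(finite)               # 90
size(finite)                 # (9, 10)
```

So the product in the question does know its shape; it just does not offer indexing on top of it.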


So for Cartesian products created with Iterators.product I always have to collect them before accessing a particular element?

You can also iterate until you get to the element you’re interested in, e.g. with

julia> first(Iterators.drop(s, 4))
(5, 10)

And maybe there are other lazy product objects that you can index, as long as the underlying objects are also indexable. That would actually not be hard to implement.


Right now, the iterators in the Iterators module support only iteration (I think). In this case, indexing could be implemented if all of the arguments have a length and are indexable. Similarly, last could be implemented if all arguments implement last. That seems reasonable to me.
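A minimal sketch of what that could look like, assuming every argument is a 1-based indexable collection such as a range or `Vector`. The names `product_getindex` and `product_last` are made up for this example (they are not part of Base), and the code reads the internal `iterators` field of the product:

```julia
# Hypothetical helpers; kept separate from Base.getindex / Base.last
# so that no methods are added to types owned by Base.
function product_getindex(p::Iterators.ProductIterator, i::Integer)
    its = p.iterators                                  # internal field
    ci = CartesianIndices(map(eachindex, its))[i]      # linear -> Cartesian
    return map(getindex, its, Tuple(ci))               # pick one element per argument
end

product_last(p::Iterators.ProductIterator) = map(last, p.iterators)

s = Iterators.product(1:9, 10:19)
product_getindex(s, 5)   # (5, 10), same as collect(s)[5]
product_last(s)          # (9, 19)
```

The linear index is translated into one index per argument, so nothing is materialized.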

You can find the n-th element of an iterator by doing n iterations. But with a stateful iterator (one that is consumed as it is traversed), the initial state is lost at that point, so you can’t do it twice; you have to reconstruct the iterator to be able to iterate to the same index again.

Also, you should take into account that iteration usually involves some additional calculations, so if you need to access the same index more than once, or you need random access, it is more efficient to materialize the collection. If you need to access each element only once, then iterating over the iterator is enough and getindex is not needed.
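Whether the position is actually lost depends on the iterator. A small sketch, using the product from the question and `Iterators.Stateful` as an example of an iterator that is consumed:

```julia
s = Iterators.product(1:9, 10:19)

# A product of ranges restarts from the beginning on every traversal,
# but each lookup still walks over the skipped elements.
a = first(Iterators.drop(s, 4))   # (5, 10)
b = first(Iterators.drop(s, 4))   # (5, 10) again

# Iterators.Stateful remembers its position, so elements are used up.
st = Iterators.Stateful(s)
x = popfirst!(st)                 # (1, 10)
y = popfirst!(st)                 # (2, 10)
```

Either way, reaching element n this way costs about n iteration steps, which is why materializing pays off for repeated or random access.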

But what may be of interest is not an iterator but a custom structure that stores the vectors and provides product-like access. Then you would not need to materialize the big matrix. Such a package probably already exists.
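One way such a structure could look, as a sketch only: a read-only `AbstractArray` that stores the factors and computes each tuple on demand. `LazyProduct` is a hypothetical name invented here, not an existing package, and the sketch assumes all factors are 1-based vectors or ranges:

```julia
# Hypothetical lazy Cartesian product with array-style access.
struct LazyProduct{T,N,I<:Tuple} <: AbstractArray{T,N}
    factors::I
end

function LazyProduct(factors::AbstractVector...)
    T = Tuple{map(eltype, factors)...}      # element type, e.g. Tuple{Int, Int}
    return LazyProduct{T,length(factors),typeof(factors)}(factors)
end

# The two methods a read-only AbstractArray needs.
Base.size(p::LazyProduct) = map(length, p.factors)
Base.getindex(p::LazyProduct{T,N}, I::Vararg{Int,N}) where {T,N} =
    map(getindex, p.factors, I)

lp = LazyProduct(1:9, 10:19)
lp[5]        # (5, 10), linear indexing comes from the AbstractArray fallbacks
lp[5, 1]     # (5, 10)
last(lp)     # (9, 19)
rand(lp)     # a random element, without materializing the product
```

Because it subtypes `AbstractArray`, generic functions such as `last`, `rand` and `collect` should work on it without extra methods.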


My use case is drawing randomly with replacement from a Cartesian product of 6 dimensions, around 12k elements. I thought I could just pass the Iterator to rand. It turns out that rand doesn’t have a method for ProductIterator, so I started playing around and found that behavior. I guess I can just generate and store the Cartesian product once and then draw from that.
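For reference, the store-once approach could look roughly like this. The six ranges are placeholders chosen only so that the product has 12,000 elements; the real dimensions are not given in the thread:

```julia
# Hypothetical dimensions: 5 * 5 * 5 * 4 * 4 * 6 == 12_000 combinations.
dims = (1:5, 1:5, 1:5, 1:4, 1:4, 1:6)

pool = collect(Iterators.product(dims...))  # 6-dimensional Array of tuples
draws = rand(pool, 100)                     # 100 draws with replacement

length(pool)   # 12000
```

`rand` on an `Array` samples elements uniformly, so each call to `rand(pool, n)` draws with replacement.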

You can make your own pretty easily, because in this case you know the “things” that make up the iterator are individually sampleable:

julia> const BIP = Base.Iterators.ProductIterator
Base.Iterators.ProductIterator

julia> myrand(itr::BIP{NTuple{N, T}}) where {N, T<:AbstractArray} = rand.(itr.iterators)

julia> s = Iterators.product(1:9, 10:19)
Base.Iterators.ProductIterator{Tuple{UnitRange{Int64}, UnitRange{Int64}}}((1:9, 10:19))

julia> myrand(s)
2-element Vector{Int64}:
  4
 15

Thank you! I also added a new method for drawing n elements:

function myrand(itr::BIP{NTuple{N, T}}, n) where {N, T<:AbstractArray}
    arr = Array{Any}(undef, n)
    for i in 1:n
        arr[i] = myrand(itr)
    end
    return arr
end

I’m worried about the type instability of the Any array. I tried creating the array with element type NTuple{N, T} instead, but that route gives an error: ERROR: MethodError: Cannot convert an object of type Int64 to an object of type UnitRange{Int64}.
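The error makes sense once the two types are put side by side: in the method signature, `T` is the type of each range, so `NTuple{N, T}` describes a tuple of ranges rather than a tuple of sampled values. A small sketch (the variable names are only for illustration, and `s.iterators` is an internal field):

```julia
s = Iterators.product(1:9, 10:19)

container_type = typeof(s.iterators)                # a tuple of two ranges
sample_type = Tuple{map(eltype, s.iterators)...}    # a tuple of two integers

# The product also reports the type of the tuples it yields:
eltype(s) == sample_type                            # true

# So a concretely typed result vector can be allocated like this:
res = Vector{eltype(s)}(undef, 3)
```

Storing `(5, 18)` in a `Vector{container_type}` would require converting `5` to a `UnitRange{Int64}`, which is exactly the conversion the error message complains about.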

julia> function myrand(itr::BIP{NTuple{N, T}}, n::Int) where {N, T<:AbstractArray}
           elts = eltype.(itr.iterators)
           res = Vector{Tuple{elts...}}(undef, n)
           for i in eachindex(res)
               res[i] = Tuple(myrand(itr))
           end
           res
       end

julia> myrand(s,2)
2-element Vector{Tuple{Int64, Int64}}:
 (2, 19)
 (7, 14)


Nice. Is it necessary to wrap myrand(itr) in Tuple when assigning to res[i]?

It’s because I defined the result to be a Vector of Tuple (this resembles a vector of coordinates). But our previous myrand(s) returns a vector.

I think you are calling another method, before your edits. The way it’s written now myrand(s) returns a Tuple.

This returns a vector, right? What do you mean by “before”?

What I meant was that I noticed you made some quick edits, so I thought you had those methods already loaded into Julia and perhaps one of those returns a vector. But now it returns a Tuple:

julia> myrand(s)
(5, 18)

Oops, you’re right. So yes, we don’t need Tuple() anymore in the n-samples case.

For completeness, let me post everything:

julia> const BIP = Base.Iterators.ProductIterator
Base.Iterators.ProductIterator

julia> myrand(itr::BIP{NTuple{N, T}}) where {N, T<:AbstractArray} = rand.(itr.iterators)
myrand (generic function with 1 method)

julia> s = Iterators.product(1:9, 10:19)
Base.Iterators.ProductIterator{Tuple{UnitRange{Int64}, UnitRange{Int64}}}((1:9, 10:19))

julia> myrand(s)
(7, 14)

julia> function myrand(itr::BIP{NTuple{N, T}}, n::Int) where {N, T<:AbstractArray}
           elts = eltype.(itr.iterators)
           res = Vector{Tuple{elts...}}(undef, n)
           for i in eachindex(res)
               res[i] = myrand(itr)
           end
           res
       end
myrand (generic function with 2 methods)

julia> myrand(s, 2)
2-element Vector{Tuple{Int64, Int64}}:
 (6, 11)
 (6, 16)

Note: this has the caveat that the AbstractArrays that make up the Cartesian product all have to have the same type (NTuple{N, T} only matches a tuple whose N entries share one type T), which is probably fine if they represent different dimensions of something.
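If mixed argument types are needed, one possible relaxation is to drop the NTuple constraint and map over the components directly. This is a sketch under the assumption that every component is a collection `rand` can sample from; `sample_product` is a name made up for this example:

```julia
# Hypothetical variant without the same-type restriction.
# `iterators` is an internal field of the product iterator.
sample_product(p::Iterators.ProductIterator) = map(rand, p.iterators)

sample_product(p::Iterators.ProductIterator, n::Integer) =
    [sample_product(p) for _ in 1:n]

mixed = Iterators.product(1:3, ['a', 'b'], [0.5, 1.5])

one = sample_product(mixed)       # e.g. (2, 'a', 1.5)
many = sample_product(mixed, 4)   # 4-element Vector of such tuples
```

Mapping over a tuple returns a tuple, so each sample keeps a concrete per-position element type even when the components differ.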
