Iterator tool similar to zip (mesh?)

Sometimes I would find it useful to ‘zip’ or ‘mesh’ two or more iterators in a way that each iterator runs in turn, and once one is exhausted it continues to use the remaining ones. For example, when I want to search through integers (both positive and negative). For non-exhausting iterators, this can be achieved with

using Base.Iterators
for n in flatten(zip(countfrom(0), countfrom(-1,-1)))
   ...
   # break if solution found
end

However, if iterators have different (finite) lengths then this will not work (once one of the iterators is exhausted it stops). Is there a clever trick how this can be achieved with stuff in Base.Iterators and IterTools? Or is it worth discussing adding this to IterTools (would mesh be a good name for this)?

So mesh(1:5, 13:18, 22:23) would yield 1 13 22 2 14 23 3 15 4 16 5 17 18 in turn. If the given iterators are finite, the length of the ‘mesh’ iterator is the sum of the individual lengths. The type of the elements it generates is the union of the individual types.

1 Like

Unfortunately, I think it is not so simple. If you did not want the iterators to ever exhaust it would be only necessary to wrap every one of them with a cycle.
However, if I understood correctly you do want the already finished iterators to be skipped and for the outer iterator to stop running after all of the inner iterators are exhausted. You probably need to write your own iterator type for this to work seamlessly.

Yes, you understood correctly, and yes, I also didn’t see a way to cleverly combine existing stuff to achieve this.

So, a follow-up, question: has anyone here ever seen the need for such an operator? It seems very natural to me, but maybe I am mistaken. I am considering adding one to IterTools, if there is interest.

Well, at first blush it’s not that hard to write, is it? Either rewrite zip from scratch, or add on the end the extra values from the longer list?:

Yes? Sorry, not sure what is your point is. Are you saying it is not worth having such a function ‘mesh’ in Base.Iterators or in IterTools?

What would be the most suitable (or a) way to define a function that meets the required case?
I propose a rough idea, just to start the discussion on the merits (and to use the copyto! (…) function which I recently learned about :grinning:)

function padarr(arr,n)
    t=Union{eltype(arr),Missing}
    l=length(arr)
    if n>l 
    res=Array{t}(missing,n)
    copyto!(res,1,arr,1,length(arr))
    else 
        res=arr
    end
    res
end


function zipext(itrs...)
    n=maximum(length,itrs)
    zip((padarr(arr,n) for arr in itrs)...)
end
1 Like

It sounds to me like an intuitive concept but, to be honest, I am having a hard time imagining an exact algorithm that needs it. I think that maybe some heuristics or things that do not lead to an exactly correct result may benefit from it (maybe it is a better option than a random shuffle or a single ranking), but rarely you have an algorithm with differently sized lists that is able to keep some invariant just by interleaving elements while there are elements in some list.

I was thinking in giving it a go just because it seems like a fun iterator to write.

1 Like

Here’s a quick attempt:

interleave(iters...) = Interleaved(iters)
struct Interleaved{T<:Tuple}
    iters::T
end
Base.length(x::Interleaved) = sum(length, x.iters)
function Base.IteratorSize(x::Interleaved)
    traits = map(Base.IteratorSize, x.iters)
    if all(t -> t isa Union{Base.HasLength, Base.HasShape}, traits)
        Base.HasLength()
    elseif any(t -> t isa Base.IsInfinite, traits)
        Base.IsInfinite()
    else
        Base.SizeUnknown()
    end
end

struct _Done end
struct _Init end
function Base.iterate(x::Interleaved, (id, states) = (1, map(_ -> _Init(), x.iters)))
    nextid = mod1(id+1, length(x.iters))

    all(s -> s isa _Done, states) && return nothing
    states[id] isa _Done && return iterate(x, (nextid, states))

    it = if states[id] isa _Init
        iterate(x.iters[id])
    else
        iterate(x.iters[id], states[id])
    end
    if it isa Nothing
        donestate = ntuple(i -> i==id ? _Done() : states[i], length(x.iters))
        return iterate(x, (nextid, donestate))
    else
        y, s = it
        newstate = ntuple(i -> i==id ? s : states[i], length(x.iters))
        return y, (nextid, newstate)
    end
end

interleave(1:5, 'a':'c', (), "AB") |> collect

first(interleave(1:5, 'a':'c', Iterators.cycle([10.0, 100])), 20)

Edit: a much shorter version, with the same struct, by cycling them around:

function Base.iterate(x::Interleaved, (this, rest...) = map(tuple, x.iters))
    it = iterate(this...)
    isnothing(it) && return iterate(x, rest)
    val, st = it
    return val, (rest..., (this[1], st))
end
Base.iterate(x::Interleaved, state::Tuple{}) = nothing

(No idea if these perform well, not carefully checked, etc.)

Edit’: added IteratorSize method to allow infinite iterators.

3 Likes

It is not so much that I have a current use case where I urgently need it, it is more that I had a occational situation where I would have found it convenient (when solving Project Euler or IBM Ponder This problems), and I thought it might be useful for others as well. Very much in the idea of abstracting things out, so you don’t have to think about the details once you need it, and in the spirit of functional programming.

@mcabbott very very nice!

Small edition to produce desired result:

function zipext(itrs...)
    n = maximum(length,itrs)
    Iterators.flatten(skipmissing.(zip((padarr(arr,n) for arr in itrs)...)))
end

it1 = 1:5
it2 = 13:18
it3 = 22:23

collect(zipext(it1, it2, it3))

13-element Vector{Any}:
  1
 13
 22
  2
 14
 23
  3
 15
  4
 16
  5
 17
 18

I believe the Base devs would probably refuse this suggestion and suggest you make the suggestion to IterTools.jl instead. Often, only after a function is in a package for a long time, and there has some advantage in making it a part of the language, the Base devs will consider adding it.

I don’t know how to render the idea in the form of an iterator, but I submit it as an via alternative to get the same final vector


enum(itr)=tuple.(eachindex(itr),itr)
itrs=('a':'d',11:13,1:6)
eitrs=mapreduce(enum, vcat,itrs)
last.(sort(eitrs,by=first))

####

enum(itr)=hcat(eachindex(itr),itr)

eitrs=mapreduce(enum, vcat,itrs)

eitrs[sortperm(eitrs[:, 1]), :][:,2]

Thanks @rocco_sprmnt21, but problem here is that this doesn’t seem to work for infinite interators, such as countfrom(1) from Base.Iterators, unless I am missing something here.

The situations where I found it useful were typically search algorithms which either had a stochastic element, or very large (or infinite) sets to search in, where more likely candidates were earlier in the sets. It would haven’t made sense to exhaust one set before searching the next. This came up in IBM Research | Ponder this problems a few times.

Admittedly, these are probably slightly obscure use cases. However, intellectually, it seems extremely natural to me that once you have the concept of iterators, you want to run through a collection of them, independent of whether they are finite or infinite. So I was a bit puzzled that I couldn’t find an existing concept for this.

wasn’t your question:

?

I do believe he wants a solution that works for both finite and infinite sequences. One comment was commenting on the weakness of one solution, and the other the opposite weakness of another solution.

I think you is probably being affected by “the specialist’s overestimation of the average familiarity of their field”, the problem is that probably I am also affected so it is hard to arrive at a conclusion, XD.

it seems extremely natural to me that once you have the concept of iterators, you want to run through a collection of them, independent of whether they are finite or infinite.

Yes, but this is more of a purely theoretical endeavor, most algorithms do not mix finite and infinite streams, and when they do is mostly by convenience to the programmer and you want a zip like behavior (finishing together with the first finite collection). Often functions are more implemented by their expected use than because of their theoretical soundness, especially in the Base library of a language. Do other languages that you are aware define this concept in the standard library or any mainstream package of them?

I find the most common cases for this kind of traversal are either: all iterators are infinite, or all iterators are finite and have the same size. Both are a little simpler to define.

1 Like

Hi @rafael.guerra , yes, it is as @Henrique_Becker is saying, I was curious for a ‘universal’ solution, independent of whether the iterators are finite or infinite or mixed. I was opening this conversation with the question in mind whether such a ‘mesh’ or ‘interleave’ operator would be something generally useful to be added to ‘IterTools.jl’, say. Sorry if I didn’t express this well enough.

In fact, the code from @mcabbott is exactly what I think would be a great addition to IterTools.jl.

Yeah, I agree with everything you are saying. And indeed, I was trying to find whether Python has something like this, and I couldn’t find it. However, Rust has, see interleave. And Haskell has, of course, a very simple way of doing this (Interleave List of Lists in Haskell - Stack Overflow).