Collecting zip


#1

What is the status of collecting a zip of iterators of different lengths?
The docs state that zip will “Run multiple iterators at the same time, until any of them is exhausted.” Simply iterating truncates to the shortest iterator:

julia> a = 1:5; b = ["a", "b", "c"];

julia> for (x,y) in zip(a, b)
          @show x,y
       end
(x, y) = (1, "a")
(x, y) = (2, "b")
(x, y) = (3, "c")

Collecting, however, throws an error:

julia> collect(zip(a,b))
ERROR: DimensionMismatch("dimensions must match")
Stacktrace:
 [1] promote_shape at ./indices.jl:154 [inlined]
 [2] axes(::Base.Iterators.Zip2{UnitRange{Int64},Array{String,1}}) at ./iterators.jl:291
 [3] _similar_for at ./array.jl:533 [inlined]
 [4] _collect at ./array.jl:563 [inlined]
 [5] collect(::Base.Iterators.Zip2{UnitRange{Int64},Array{String,1}}) at ./array.jl:557
 [6] top-level scope at none:0

Related issue: https://github.com/JuliaLang/julia/issues/20499

Even stranger: if we use a string, b2 = "abc", instead of the array, collecting works:

julia> b2 = "abc"; collect(zip(a,b2))
3-element Array{Tuple{Int64,Char},1}:
 (1, 'a')
 (2, 'b')
 (3, 'c')

#2

I think this is the relevant issue, and it still seems to be unresolved.

The difference between the array and the string is that they return different values for IteratorSize:

julia> Base.IteratorSize(["a", "b", "c"])
Base.HasShape{1}()

julia> Base.IteratorSize("abc")
Base.HasLength()
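
As far as I understand it, zip combines the traits of its arguments, so the two zipped iterators should differ in the same way. A small check along these lines (a sketch; the variable names are just for illustration, and the trait values are what I would expect rather than something taken from the session above):

```julia
# Trait of each zip, derived from the traits of the iterators it wraps.
size_with_array  = Base.IteratorSize(zip(1:5, ["a", "b", "c"]))
size_with_string = Base.IteratorSize(zip(1:5, "abc"))

# Two shaped iterators: the zip claims a shape, so collect asks for axes,
# which is where the DimensionMismatch in the stack trace comes from.
println(size_with_array)

# One shaped iterator and one that only has a length: the zip only claims
# a length, so collect never compares axes.
println(size_with_string)
```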

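This suggests the following non-intrusive workaround for your case. Note that it relies on the internal function Base._collect, so it may break between Julia versions: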

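function zip_collect(itr)
    itrsize = Base.IteratorSize(itr)
    # Downgrade HasShape to HasLength so the result is sized from length(itr), not axes(itr)
    itrsize isa Base.HasShape && (itrsize = Base.HasLength())
    # Base._collect is internal; 1:1 only serves as the container template
    Base._collect(1:1, itr, Base.IteratorEltype(itr), itrsize)
end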

With result:

julia> zip_collect(zip(1:5, ["a", "b", "c"]))
3-element Array{Tuple{Int64,String},1}:
 (1, "a")
 (2, "b")
 (3, "c")

julia> zip_collect(zip(1:5, "abc"))
3-element Array{Tuple{Int64,Char},1}:
 (1, 'a')
 (2, 'b')
 (3, 'c')
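
If you would rather not depend on Base internals, a plain loop gives the same truncating behaviour using only exported functions. This is a minimal sketch; collect_truncated is a made-up name, and it assumes the iterator has a concrete eltype, as zip of typed collections does:

```julia
# Collect any iterator by pushing elements one at a time. The loop only
# relies on iteration, so a zip stops at its shortest argument and no
# axes or length are ever queried.
function collect_truncated(itr)
    out = Vector{eltype(itr)}()
    for t in itr
        push!(out, t)
    end
    return out
end

a = 1:5
pairs_ab  = collect_truncated(zip(a, ["a", "b", "c"]))  # array argument
pairs_str = collect_truncated(zip(a, "abc"))            # string argument
```

The trade-off is that the output vector grows incrementally instead of being preallocated, which should not matter for small inputs.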