Creating an array of tuples (why doesn't filter work on a zip?)

I’m pretty sure the answer is that a zip is not an Array, BitArray, or AbstractArray, and that’s what my problem is.

I also read the definition of zip: it “runs multiple iterators”, and the resulting type is not an array, which explains why filter doesn’t work directly on a zipped object.

Which of course leads to my question: how do I create an array of tuples from two arrays?

then the obvious came to me

[ (x,y) for (x,y) in zip(a,b) ]

and I’m thinking, Shirley there’s a better way… I picked a bad week to give up nootropics.

julia> a=collect(1:5)
5-element Array{Int64,1}:
 1
 2
 3
 4
 5

julia> b=randn(5)
5-element Array{Float64,1}:
  1.622863416635525 
 -0.2613170739500653
 -0.2399681094935552
 -0.9209180059024933
 -0.9645905168069914

julia> xy=zip(a,b)
Base.Iterators.Zip{Tuple{Array{Int64,1},Array{Float64,1}}}(([1, 2, 3, 4, 5], [1.62286, -0.261317, -0.239968, -0.920918, -0.964591]))

julia> filter((x,y)->x>5, xy)
ERROR: MethodError: no method matching filter(::getfield(Main, Symbol("##5#6")), ::Base.Iterators.Zip{Tuple{Array{Int64,1},Array{Float64,1}}})
Closest candidates are:
  filter(::Any, ::Array{T,1} where T) at array.jl:2351
  filter(::Any, ::BitArray) at bitarray.jl:1710
  filter(::Any, ::AbstractArray) at array.jl:2312
  ...
Stacktrace:
 [1] top-level scope at none:0

I cannot answer your question about why filter doesn’t have a method for iterators, but a few simpler alternatives to your list comprehension are

[ x for x in zip(a,b) ]

and

collect(zip(a,b))
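
As a quick check, here is a minimal sketch showing that the three forms build the same vector of tuples. The values in `b` are fixed numbers chosen for illustration (the original used `randn(5)`), and the names `v1`, `v2`, `v3` are just placeholders.

```julia
a = collect(1:5)
b = [0.5, -1.0, 2.0, -0.25, 1.5]   # fixed values instead of randn(5), for reproducibility

v1 = [(x, y) for (x, y) in zip(a, b)]   # the original comprehension
v2 = [t for t in zip(a, b)]             # no need to destructure and rebuild each tuple
v3 = collect(zip(a, b))                 # materialize the iterator directly

println(v1 == v2 == v3)
println(typeof(v3))
```

All three allocate a new array; `collect` is probably the most direct way to say so.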

that’s much better :slight_smile:

It turns out that my original error had two causes: I was trying to filter a zip directly, AND a two-argument anonymous function like (x,y)->… can’t be used as the predicate, because it gets called with a single tuple :frowning:

julia> filter((x,y)->x >4, xy)
ERROR: MethodError: no method matching (::getfield(Main, Symbol("##3#4")))(::Tuple{Int64,Float64})
Closest candidates are:
  #3(::Any, ::Any) at REPL[3]:1
Stacktrace:
 [1] mapfilter(::getfield(Main, Symbol("##3#4")), ::typeof(push!), ::Array{Tuple{Int64,Float64},1}, ::Array{Tuple{Int64,Float64},1}) at .\abstractset.jl:340
 [2] filter(::Function, ::Array{Tuple{Int64,Float64},1}) at .\array.jl:2351
 [3] top-level scope at none:0

julia> filter(x->x[1] >4, xy)
6-element Array{Tuple{Int64,Float64},1}:
 (5, -1.4070836855785198) 
 (6, -1.1832075841742866) 
 (7, -1.7970550232653926) 
 (8, -1.0736587226498235) 
 (9, -0.42599362336170415)
 (10, -1.7261573057529163)

There are a couple of things going on. Firstly, it appears that filter doesn’t support Zip iterators. That’s a bit surprising, maybe? You’ll have to collect the Zip object into a vector of tuples.

Secondly, iterating over a collection of tuples gives you a tuple at a time, and you cannot destructure them like that.

So,

xy = zip(1:5, rand(5))  # don't collect the range
filter(x->x[1]>3, collect(xy))  # x is a tuple
# or you can do this
filter(((x, y),)->x>3, collect(xy))  # notice the extra parens
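
To make the difference between the two anonymous-function forms concrete, here is a small sketch. The names `tuples`, `two_args`, and `one_tuple` are made up for illustration, and the data are fixed values rather than `rand(5)`.

```julia
tuples = collect(zip(1:5, [0.1, 0.2, 0.3, 0.4, 0.5]))

two_args  = (x, y) -> x > 3       # expects two separate arguments
one_tuple = ((x, y),) -> x > 3    # expects ONE argument and destructures it

# filter calls the predicate with a single element, which here is a tuple.
# applicable reports whether a method exists for the given arguments.
println(applicable(two_args, tuples[1]))    # no method for a single tuple
println(applicable(one_tuple, tuples[1]))   # a method exists

kept = filter(one_tuple, tuples)
println(kept)
```

The extra parentheses and trailing comma are what turn the argument list into "one argument that is a tuple".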

You could also use a comprehension:

[tup for tup in xy if tup[1]>3]  # this works with the `Zip` object

If you want to filter an iterator to produce another iterator (not an array), you can use Iterators.filter, e.g.

Iterators.filter(((x,y),) -> x > 0, zip(a,b))

You can collect the result to get an array, but you can also loop over it (e.g. for (x,y) in Iterators.filter(...)) without actually allocating the filtered collection.
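
A minimal sketch of the lazy approach, assuming fixed values for `b` and a hypothetical helper name `sum_x_where_y_positive`. The loop consumes the filtered pairs one at a time, and `collect` is only called when an array is actually wanted.

```julia
a = collect(1:5)
b = [0.5, -1.0, 2.0, -0.25, 1.5]   # fixed values for illustration

# Hypothetical helper: sums x over the pairs whose y is positive,
# without building an intermediate array of tuples.
function sum_x_where_y_positive(a, b)
    total = 0
    for (x, y) in Iterators.filter(((x, y),) -> y > 0, zip(a, b))
        total += x
    end
    return total
end

total = sum_x_where_y_positive(a, b)

# Collect only if an array is needed.
positive_pairs = collect(Iterators.filter(((x, y),) -> y > 0, zip(a, b)))
```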


Just for completeness, there is ((x, y) for (x, y) in zip(a, b) if x > 0), whose filtering part is lowered to an Iterators.filter object. Having to write (x, y) twice is somewhat ugly though.
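
For comparison, a short sketch checking that the generator form and the explicit `Iterators.filter` call yield the same elements. The data and the threshold `x > 2` are chosen only for illustration.

```julia
a = collect(1:5)
b = [0.5, -1.0, 2.0, -0.25, 1.5]   # fixed values for illustration

gen      = ((x, y) for (x, y) in zip(a, b) if x > 2)          # lazy generator with a filter clause
explicit = Iterators.filter(((x, y),) -> x > 2, zip(a, b))    # explicit lazy filter

same = collect(gen) == collect(explicit)
println(same)
```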
