Filtering rows of a 2-d array


#1

Hi folks,
I hope you can help me with this (rather novice) question. I have a 2-d array with several thousand rows and 5 columns, and I need to keep only the rows whose first and second columns equal given values (e.g. column 1 == 2 and column 2 == 0).
I know how to do that in two lines (first filtering on the first column, then filtering the resulting array on the second column). So I know how to do it, but the question is: can I do the filtering in a single, simple line?

Best regards and thanks,

Ferran.


#2

Perhaps a comprehension with two conditions is easy to write in a single line:

x = rand(1:5, 10_000, 5)
x_filtered = [x[i,:] for i=1:10_000 if x[i,1]==1 && x[i,2]==2]
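
Note that this comprehension returns a vector of row vectors rather than a 2-d array. If a matrix is wanted back, one possible sketch (the name `rows` is only illustrative) is to collect the matching row indices first and then index once:

```julia
x = rand(1:5, 10_000, 5)

# Indices of the rows whose first column is 1 and second column is 2
rows = [i for i in axes(x, 1) if x[i, 1] == 1 && x[i, 2] == 2]

# Index once with the collected rows; the result is still a Matrix
x_filtered = x[rows, :]
```

This is two statements instead of one, but it keeps the 5-column shape of the original array.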

#3

Your data layout is not very good for this task: Julia stores arrays column-major, so the elements of one row are not contiguous in memory.

If you can live with the alternative storage format (one record per column, as below), are using Julia 0.6, and don’t overly care about speed, then I’d go with:

using BenchmarkTools
@btime x=rand(1:2, 5, 100_000);
@btime y=reinterpret(NTuple{size(x,1),Int}, x,(size(x,2),));
ff=t->(t[1]==1 && t[2]==2);
@btime z=filter(ff, y);
@btime res=reshape(reinterpret(Int,z), size(x,1),:);

10.662 ms (2 allocations: 3.81 MiB)
1.055 μs (7 allocations: 304 bytes)
1.552 ms (15 allocations: 2.50 MiB)
691.373 ns (5 allocations: 192 bytes)
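
The `reinterpret(T, A, dims)` form used above is specific to Julia 0.6. On Julia 1.6 or later, a rough equivalent can be sketched with `reinterpret(reshape, T, A)`. This is only a sketch under the same assumption as the post (one record per column, 5 rows of `Int`), with the timings left out:

```julia
x = rand(1:2, 5, 100_000)          # one record per column

# View each column as a 5-tuple without copying (assumes Julia 1.6+)
y = reinterpret(reshape, NTuple{5,Int}, x)

# Keep the records whose first two fields match
z = filter(t -> t[1] == 1 && t[2] == 2, y)

# View the surviving tuples as a 5×M matrix again
res = reinterpret(reshape, Int, z)
```

Here `res` has the matching records as columns, so `res[1, :]` is all ones and `res[2, :]` is all twos.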

#4

A single line isn’t always better than two lines!

x = rand(1:5, 10_000, 5)
filtered = x[(x[:, 1] .== 1) .& (x[:, 2] .== 2), :]
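
In that spirit, giving the mask its own line can make the intent easier to read, and the same mask can be reused. A small sketch (the names `mask` and `lazy` are just for illustration):

```julia
x = rand(1:5, 10_000, 5)

# Row mask: true where column 1 is 1 and column 2 is 2
mask = (x[:, 1] .== 1) .& (x[:, 2] .== 2)

filtered = x[mask, :]      # copies the matching rows into a new Matrix
lazy = view(x, mask, :)    # same rows as a view into x, without copying them
```

Whether the view is worth it depends on how the result is used afterwards; for a one-off filter the plain indexing version is probably fine.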

#5

Hi again,
thanks to you all :) The version I prefer is the last one, since it preserves the structure of the original array and is short and simple.
Thanks folks, this forum is great.
Best,
Ferran