Filtering Array Rows By Matching Element Value

I’m a Julia newbie working with some inherited Fortran 90 code for research purposes. One of the things the Fortran code does is write a list of parameters to a file for future use, e.g.:

0 0
0 1
1 0
1 1
2 0

Say the columns represent successive values of the parameters l1 and l2. I would normally read these values into an array, then slice the columns into a 1-D array for each parameter:

using DelimitedFiles

# Write the sample parameter file
infile = open("pars.txt", "w")

println(infile, "0 0")
println(infile, "0 1")
println(infile, "1 0")
println(infile, "1 1")
println(infile, "2 0")

# Close (and so flush) the file before reading it back
close(infile)

params = readdlm("pars.txt")

l1 = params[:, 1]
l2 = params[:, 2]

What I would like to do is filter the array before slicing so that only the rows where l1 == 0 or l1 == 2 are kept. I attempted the solution linked here, but it only works for one filter condition. That is, something like

filter = params[:, 1] .== 0 | 2

new_params = params[filter,:]

only returns the last row (2 0), whereas it should return the first 2 rows and the last row.

Alternatively, if there’s some much more idiomatic way to store, filter, and work with parameter tables like this, please let me know! I only have experience with Fortran’s file handling and data structures, but I’m having enough fun with Julia to want to learn its way of doing things.

Not sure what you attempted here but:

filter = params[:, 1] .== 0 .| params[:, 2] .== 0

Should work.
If all of the dots are annoying you, you can also use @.

filter = @. params[:, 1] == 0 | params[:, 2] == 0

Also a comment: loops in Julia are fast, so when rewriting a Fortran-style program you can translate the loops to Julia just fine (from a performance perspective). Of course, you can often write the same code in more idiomatic Julia, making it much nicer to read and just as efficient, but that’s more difficult for sure :)

Thanks for the response! Unfortunately, it doesn’t seem to work. As I said in my initial post, I’m trying to filter on multiple conditions. In this case I want to keep all rows where the first element in the row equals 0 or 2. The solution you gave seems to do an AND operation, returning only the rows where both the first and second element = 0, which is just the first row. The ideal output of params[filter,:] is:

0 0
0 1
2 0

Ah, sorry, I somehow completely misread which condition you want. Still easy to do:

filter = (params[:, 1] .== 0) .| (params[:, 1] .== 2)

Also note the parentheses, which I also missed. They are required because == has lower precedence than |, so without them the | would be applied first.

Or, to use fewer parentheses and to emphasise the boolean character here (the dotted .|| needs Julia 1.7 or later):

filter = params[:, 1] .== 0 .|| params[:, 1] .== 2
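To see why the parentheses matter for .| but not for .||, here is a small check. It is only a sketch: the matrix is typed inline instead of being read from pars.txt, the names col, unparenthesized, with_parens and with_oror are made up for illustration, and the last form assumes Julia 1.7 or later.

```julia
# Same five rows as pars.txt, written inline so the snippet is self-contained.
params = [0 0; 0 1; 1 0; 1 1; 2 0]
col = params[:, 1]

# `|` binds tighter than `==`, so `0 | 2` is evaluated first (bitwise OR gives 2).
# The comparison is therefore just `col .== 2`, which matches only the last row.
unparenthesized = col .== 0 | 2

# Parenthesising each comparison gives the intended elementwise OR.
with_parens = (col .== 0) .| (col .== 2)

# `||` binds more loosely than `==`, so no parentheses are needed here.
with_oror = col .== 0 .|| col .== 2

new_params = params[with_parens, :]   # keeps rows (0 0), (0 1), (2 0)
```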

It works! Thank you so much. I had tried something like the above, but missed the parentheses and the dot prefix on the OR operator.

I have a generalization for you:

vals = Set([0,2])
filter = in.(params[:, 1], Ref(vals))

A couple of explanations:

  • You can broadcast any function with the dot (.), which roughly means that the function will be mapped over its inputs in a sensible manner.
  • Infix notation a | b is just syntactic sugar for |(a, b), so it naturally works with . as well. (|| is short-circuiting control flow rather than an ordinary function, so its dotted form .|| is special syntax.)
  • If you have an argument that you do not want to broadcast over, then wrapping it in Ref is the canonical way to do it. This works because Ref is essentially a 0-dim container and thus eats the broadcast.
  • If you have a complicated broadcast expression with various Ref etc., then you should consider whether an explicit loop is perhaps easier to understand (again, there is no inherent performance penalty for for loops in Julia).
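For comparison, here is a rough sketch of the same membership filter written three ways: with Ref, with the one-argument form in(vals), and as an explicit loop. The matrix is typed inline rather than read from pars.txt, and the names mask_ref, mask_fix and keep are only illustrative.

```julia
params = [0 0; 0 1; 1 0; 1 1; 2 0]
vals = Set([0, 2])

# Broadcast `in` over the first column only; Ref(vals) is treated as a scalar.
mask_ref = in.(params[:, 1], Ref(vals))

# `in(vals)` returns a one-argument function, so nothing needs wrapping in Ref.
mask_fix = in(vals).(params[:, 1])

# Explicit loop: collect the indices of the rows whose first entry is in vals.
keep = Int[]
for i in axes(params, 1)
    if params[i, 1] in vals
        push!(keep, i)
    end
end

# Indexing with the collected row numbers selects the same rows as the masks.
new_params = params[keep, :]
```

All three select the same rows, so the choice mostly comes down to which version is easiest to read.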

Thank you! I’ll remember your point about loops as well, as I certainly got used to writing them in Fortran.