Subsetting dataframe with multiple conditions

Why do I have to wrap every condition in a pair of parentheses when subsetting data? An example dataframe is below:

d = DataFrame(a = [1, 2, 3], b = ["x", "y", "z"])
d2 = d[(d.a .> 1) .| (d.b .== "z"), :]

If I remove the parentheses, an error is thrown:

d2 = d[d.a .> 1 .| d.b .== "z", :]
# ERROR: LoadError: MethodError: no method matching |(::Int64, ::String)

But there is no error if I write scalar conditions this way:

1 > 2 | 2 > 3
# false

This was answered on SO here:

https://stackoverflow.com/questions/70845261/julia-subsetting-dataframe-with-multiple-conditions

Just to add, because your example here is a bit different from the one on SO: when you say

no error if I write conditions this way
1 > 2 | 2 > 3

you’re not getting an error, but this probably doesn’t do what you think it does:

julia> 1 > 2 | 3 > 2
false

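so you still need either the lower-precedence || or parentheses. The bitwise | binds more tightly than the comparison operators, so without parentheses it is applied to the two middle operands first.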
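To make the grouping visible, here is a small sketch using only Base Julia (the variable names are just for illustration). `|` has the same precedence as `+`, so `2 | 3` is evaluated first and what remains is a chained comparison:

```julia
# How Julia reads the unparenthesized expression: 1 > (2 | 3) > 2
middle          = 2 | 3                  # bitwise OR of two integers, gives 3
unparenthesized = 1 > 2 | 3 > 2          # false
chained         = 1 > (2 | 3) > 2        # false, the same expression written out

# The parser reports a single chained comparison, not an OR of two comparisons
parsed_head = Meta.parse("1 > 2 | 3 > 2").head   # :comparison

# Two ways to get "first condition OR second condition"
with_parens   = (1 > 2) | (3 > 2)        # true
short_circuit = 1 > 2 || 3 > 2           # true, || binds more loosely than >
```

Note that `||` only works on scalar `Bool` values, so for elementwise conditions on columns the parenthesized `.|` form is the one to reach for.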

My personal preference is to always disambiguate with parentheses. This is especially beneficial in a language like Julia, where the large number of Unicode infix operators makes it basically impossible to remember the precedence hierarchy.
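The same thing happens in the broadcast case from the original question. The sketch below uses two plain vectors `a` and `b` as stand-ins for the columns `d.a` and `d.b` (an assumption made so that it runs without DataFrames.jl); the resulting mask could then be used as `d[mask, :]`.

```julia
a = [1, 2, 3]
b = ["x", "y", "z"]

# Parenthesized: each comparison is evaluated first, then combined elementwise
mask = (a .> 1) .| (b .== "z")           # Bool[0, 1, 1]
kept = a[mask]                           # [2, 3]

# Unparenthesized: `1 .| b` is evaluated first and fails, because `|` is not
# defined for an Int64 and a String
err = try
    a .> 1 .| b .== "z"
    nothing
catch e
    e
end
failed_with_method_error = err isa MethodError   # true
```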
