Compound dataframe filtering with negated boolean expression

notpeerreviewed · June 3, 2019, 8:33am

I’m trying to reproduce a compound filter that involves negating a two-part condition.

I’d normally do something like this in R, using dplyr:

filter(!(Fuel == "Gas" & Field == "Maari"))

In simple terms, I want to filter my dataframe to exclude all rows where Fuel == Gas and Field == Maari at the same time. I have been able to get a single negated boolean to work, but I can’t get the negated compound expression to evaluate correctly.

I’ve tried this expression, and a few variations, but I keep getting an error saying there is no method matching !(::BitArray{1}). If I remove the ! then the expression returns the expected filtered result.

test[!((test.Fuel .== "Gas") .& (test.Field .== Symbol("Maari"))), :]

Can someone suggest how I might correctly negate the expression above?

Cheers

Jeff

nilshg · June 3, 2019, 8:37am

I think you’re looking for .!, the broadcast (elementwise) form of !. I asked a similar question a while ago; search for “negating boolean array”.
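For anyone landing here later, a minimal sketch of the .! suggestion. The data below is made up to stand in for the original `test` dataframe, and both columns are assumed to hold plain strings (the original post compared Field against a Symbol, so adjust to your column type).

```julia
using DataFrames

# Hypothetical data standing in for the original `test` dataframe
test = DataFrame(Fuel  = ["Gas", "Gas", "Oil", "Oil"],
                 Field = ["Maari", "Kupe", "Maari", "Kupe"])

# Elementwise AND of the two conditions gives a Bool mask, one entry per row
mask = (test.Fuel .== "Gas") .& (test.Field .== "Maari")

# `.!` negates every element of the mask; plain `!` has no method for arrays
kept = test[.!mask, :]

# Row-wise alternative: inside `filter` each row is scalar, so plain `!` and `&&` work
kept2 = filter(row -> !(row.Fuel == "Gas" && row.Field == "Maari"), test)
```

Both forms should drop only the Gas/Maari row. The `filter` version reads closest to the dplyr original, since the negation there is applied to one row at a time.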

notpeerreviewed · June 3, 2019, 8:40am

You’re a champion! I can’t believe I didn’t think to try that.

It is quite a change moving from R to Julia.

Thanks so much.

Jeff

nilshg · June 3, 2019, 8:59am

Glad that worked! It is and it isn’t, to me. I’d say coming from Python is easier, but most often the issues you’ll run into are method errors that come down to using an inappropriate type. In R I seem to suffer from the same problems, although they are harder to diagnose there because the type information is not great and, for example, different things have different indexing patterns.
I guess error messages are always cryptic to some extent when you come to a new language, but having used both R and Julia in parallel over the past few months, to me there’s no comparison in which is easier to debug.

Iulian.Cioarca · July 29, 2022, 12:25pm

I have a similar use case. Let’s take:

df=DataFrame(x=[1,2,3])

I used subset(df, "x"=>x -> x.==1) to make a simple filter.
How can I modify the logic to get a subset containing the rows where x is 1 or 2?
I tried subset(df, "x"=>x -> x.==[1,2]) but it gives an error. I should also mention that the number of values I want to filter on is arbitrary, not necessarily 2 as in this example.

rocco_sprmnt21 · July 29, 2022, 12:50pm

Try this:

using DataFrames

df=DataFrame(x=rand(1:10,10))

this=Set([1,2,3])

# either form keeps the rows whose x is a member of `this`
subset(df, :x=>ByRow(x->x∈this))
subset(df, :x=>x->x .∈ Ref(this))
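To connect this back to the question: as far as I can tell, `x .== [1,2]` fails because broadcasting tries to pair the 3-element column with a 2-element vector, whereas a membership test compares each value against the whole collection. A small sketch on the original three-row dataframe, where `wanted` is just a name I made up for the values to keep:

```julia
using DataFrames

df = DataFrame(x = [1, 2, 3])
wanted = [1, 2]   # hypothetical list of values to keep; any length works

# `df.x .== wanted` would throw a DimensionMismatch (3 elements vs 2),
# so test membership of each value instead.
a = subset(df, :x => ByRow(in(wanted)))       # in(wanted) is shorthand for x -> x in wanted
b = subset(df, :x => x -> x .∈ Ref(wanted))   # Ref stops `wanted` itself from being broadcast over
c = df[in.(df.x, Ref(wanted)), :]             # same mask, used with plain row indexing
```

All three should return the rows with x equal to 1 and 2. For a long list of values, wrapping it in a `Set` as in the post above is probably the better choice, since lookups in a Set do not scan the whole collection.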
