How would I define a generalized condition that can be passed in to filter any dataframe, without hard-coding the dataframe's name?

Here is what I have now:

df = df[df[Symbol("PURCHASE")] .== "SUCCESS", :]

What I’d like to do is set a success condition variable and have something that works like the code below:

success_condition = Symbol("PURCHASE") .== "SUCCESS"

df = df[df[success_condition]]

The key is that it should be able to be passed into any dataframe, not just one named df. I need it generalized if possible.

Thank you for your time and help!

First, please remember to quote your code with backticks.

Strictly speaking, I’m not sure you can do exactly what you want. `Symbol("PURCHASE")` only refers to a column once it is used to index a DataFrame, as in `df[!, Symbol("PURCHASE")]`. On its own it is just `:PURCHASE`, a value of type `Symbol` with no connection to any data frame.

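What you could do is define a function like so:

findsuccess(v) = findall(x -> x == "SUCCESS", v)

and then pass it the DataFrame column: `findsuccess(df[!, Symbol("PURCHASE")])`. This returns the indices of the matching rows, which you can then use to index the data frame, e.g. `df[findsuccess(df[!, Symbol("PURCHASE")]), :]`.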
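Here is a minimal sketch of that approach from start to finish. The data frame `orders` and its `ID` column are made up for illustration; only the `PURCHASE` column name and the `"SUCCESS"` value come from the question.

```julia
using DataFrames

# Hypothetical sample data; only the PURCHASE column comes from the question.
orders = DataFrame(ID = 1:4, PURCHASE = ["SUCCESS", "FAILED", "SUCCESS", "PENDING"])

# Returns the indices of the entries of v that equal "SUCCESS".
findsuccess(v) = findall(x -> x == "SUCCESS", v)

rows = findsuccess(orders[!, :PURCHASE])  # indices of the matching rows
successes = orders[rows, :]               # new DataFrame holding only those rows
```

Because `findsuccess` only sees a column vector, it works with a column from any data frame, but the column still has to be named at the call site.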


An easy way is to use Query.jl:

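using DataFrames, Query
func = x -> x.PURCHASE == "SUCCESS"  # reusable predicate applied to each row
df |> @filter(func(_)) |> DataFrame  # collect the filtered rows back into a DataFrame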

Or just use `filter(x -> x.PURCHASE == "SUCCESS", df)`. You can store the predicate `x -> x.PURCHASE == "SUCCESS"` in a variable and apply it to any data frame.
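As a sketch of storing the condition once and reusing it, assuming a reasonably recent DataFrames.jl: the data frames `orders` and `refunds` and their `ID` column are hypothetical, and the variable name `success_condition` is borrowed from the question.

```julia
using DataFrames

# The condition is defined once and does not mention any particular data frame.
success_condition = row -> row.PURCHASE == "SUCCESS"

# Hypothetical data frames that both happen to have a PURCHASE column.
orders  = DataFrame(ID = 1:3, PURCHASE = ["SUCCESS", "FAILED", "SUCCESS"])
refunds = DataFrame(ID = 4:5, PURCHASE = ["FAILED", "SUCCESS"])

# filter calls the predicate on each row and keeps the rows where it returns true.
ok_orders  = filter(success_condition, orders)
ok_refunds = filter(success_condition, refunds)

# Alternative form: a `column => predicate` pair, which only reads the one column.
ok_orders2 = filter(:PURCHASE => ==("SUCCESS"), orders)
```

The row-wise predicate is the closest match to the `success_condition` variable asked for. The pair form is usually faster on large tables, and it can also be stored in a variable and reused.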