JuliaDB select/filter multiple string values in one column


#1

Hello Julia Community,

I have a question regarding usage of JuliaDB.

My table consists of 5 columns, and I need to select/filter rows by one column that contains string values.

I am able to do it with one string value:

table = filter(x -> x == "LUX - All-in fee", df, select = :FEE_TYPE)

I have 3 additional string values and would like to match all of them in one line, as in the following Python/pandas line:

df = df[df.FEE_TYPE.str.contains('LUX - All-in fee|LUX - IM fee|LUX - ManCo fee|LUX - Perf fee')]

I would appreciate any help from experienced Julia users.

Kind regards

Mac


#2

There’s probably a function for that in JuliaDB, but using only Base functions:

filtervals = ["LUX - All-in fee", "LUX - IM fee", "..."]
table = filter(x -> any(==(x), filtervals), df, select = :FEE_TYPE)

#3

@y4lu Thank you very much, it worked perfectly!


#4

Ah you’re welcome
filter() is the correct function too, ref


#5

If anyone is interested, I also got it to work in one line with the boolean operator ||:

table = filter(x -> x == "LUX - All-in fee" || x == "LUX - IM fee" || x == "LUX - ManCo fee" || x == "LUX - Perf fee", df, select = :FEE_TYPE)

Cheers


#6

Does it also work with the in operator?


#7

To check that x is one of several possible strings, you could put the strings into an Array, for example:

filter(x -> x in ["LUX - All-in fee", "LUX - IM fee", "LUX - ManCo fee", "LUX - Perf fee"], df, select = :FEE_TYPE)
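
To sanity-check a predicate like this without loading JuliaDB, you can try it on a plain Vector with Base's filter. This is only a sketch: the sample values and the names fee_types and wanted are made up for illustration, and it assumes the table version applies the same predicate to each value of the selected column.

```julia
# Hypothetical sample values standing in for the FEE_TYPE column.
fee_types = ["LUX - All-in fee", "LUX - IM fee", "UK - IM fee",
             "LUX - Perf fee", "LUX - Other"]

# The values to keep; a Set makes each membership check cheap
# when the list of allowed values grows.
wanted = Set(["LUX - All-in fee", "LUX - IM fee",
              "LUX - ManCo fee", "LUX - Perf fee"])

# Keep only the entries that are exactly equal to one of the wanted values.
kept = filter(x -> x in wanted, fee_types)

println(kept)
```

Note that `in` tests for exact equality, so "LUX - Other" and "UK - IM fee" are dropped here.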

If instead you want to check that x contains one of the strings, it may be worth looking into regular expressions. For this example, that would be:

filter(r"LUX - (All-in|IM|ManCo|Perf) fee", df, select = :FEE_TYPE)
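
The difference between the two checks can be seen on a plain Vector, using occursin from Base for the "contains" case. Again a sketch with made-up sample values (fee_types, exact_values and the "(quarterly)" suffix are hypothetical). As far as I know, pandas' str.contains does this kind of regex substring search by default, so the regex version is the closer match to the original Python line.

```julia
# Hypothetical sample values; one entry has extra text after the fee name.
fee_types = ["LUX - All-in fee", "LUX - IM fee (quarterly)",
             "UK - IM fee", "LUX - Perf fee"]

pattern = r"LUX - (All-in|IM|ManCo|Perf) fee"
exact_values = ["LUX - All-in fee", "LUX - IM fee",
                "LUX - ManCo fee", "LUX - Perf fee"]

# Substring/regex match: keeps any entry that contains the pattern.
matched = filter(x -> occursin(pattern, x), fee_types)

# Exact match: keeps only entries equal to one of the listed strings.
exact = filter(x -> x in exact_values, fee_types)

println(matched)
println(exact)
```

Here the regex version keeps "LUX - IM fee (quarterly)" while the `in` version does not, so which one you want depends on whether the column values are always exactly one of the listed strings.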


#8

The solution with the ‘in’ operator is very nice and concise 😀👍

Thank you!