Hello Julia Community,
I have a question regarding usage of JuliaDB.
My table consists of 5 columns and I need to select/filter by one column containing string values.
I am able to do it with one string value:
table = filter(x -> x == "LUX - All-in fee", df, select = :FEE_TYPE)
I have 3 additional string values and I would like to match all of them in one line, as in the following Python/Pandas line:
df = df[df.FEE_TYPE.str.contains('LUX - All-in fee|LUX - IM fee|LUX - ManCo fee|LUX - Perf fee')]
I would appreciate any help from experienced Julia users.
Kind regards
Mac
y4lu
March 23, 2018, 10:30am
2
There’s probably a function for that in JuliaDB, but using only the standard library:
filtervals = ["LUX - All-in fee"; "LUX - IM fee"; "..."]
table = filter(x-> contains(==, filtervals, x), df, select = :FEE_TYPE)
@y4lu Thank you very much, it worked perfectly!
y4lu
March 23, 2018, 1:01pm
4
Ah, you’re welcome. filter() is the correct function too, ref
If anyone is interested, I also got it to work in one line with the boolean operator ||
table = filter(x -> x == "LUX - All-in fee" || x == "LUX - IM fee" || x == "LUX - ManCo fee" || x == "LUX - Perf fee", df, select = :FEE_TYPE)
Cheers
tk3369
March 23, 2018, 3:21pm
6
Does it also work with the in operator?
piever
March 23, 2018, 3:28pm
7
To check that x is one of several possible strings, you could put the strings into an Array, for example:
filter(x -> x in ["LUX - All-in fee", "LUX - IM fee", "LUX - ManCo fee", "LUX - Perf fee"], df, select = :FEE_TYPE)
If instead you want to check that x contains one of those strings, it may be worth looking into regular expressions. For example this would be:
filter(r"LUX - (All-in|IM|ManCo|Perf) fee", df, select = :FEE_TYPE)
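To illustrate the difference between the two checks, here is a small sketch on a plain Vector of strings rather than a JuliaDB table. The sample values in fee_types are made up for the example, and occursin assumes Julia 0.7 or later.

```julia
# Hypothetical sample data standing in for the FEE_TYPE column.
fee_types = ["LUX - All-in fee", "LUX - IM fee", "UK - Admin fee", "LUX - Perf fee (estimated)"]

# Exact match: keep values that are equal to one of the wanted strings.
wanted = ["LUX - All-in fee", "LUX - IM fee", "LUX - ManCo fee", "LUX - Perf fee"]
exact = filter(x -> x in wanted, fee_types)

# Substring match: keep values that contain the pattern anywhere.
pattern = r"LUX - (All-in|IM|ManCo|Perf) fee"
partial = filter(x -> occursin(pattern, x), fee_types)

println(exact)
println(partial)
```

The exact check drops "LUX - Perf fee (estimated)" because it is not equal to any wanted string, while the regular expression keeps it because the pattern occurs inside it.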
The solution with ‘in’ operator is very nice and concise
Thank you!
Hi everyone,
I also tried this solution but it didn’t work for me.
using DataFrames, Pkg, CSV, Gadfly, HypothesisTests, Statistics
data = CSV.read("/Users/home/Documents/MP blog 2021/Data/UEFA champions league/data_2022_AH.csv", DataFrame, normalizenames=true)
first(data,5)
df_PS = select(data, :Equipo, :Score, :Remate, :Remate_arco, :Posesion, :Pases, :Precision_pases, :Faltas, :Corners)
I was able to run my code properly.
But when I try this
filter_vals =["Paris Saint-Germain"; "Sevilla"; "Manchester City"; "Ajax"]
table = filter(x-> contains(==, filter_vals, x), df_PS, select = :Equipo)
An error appears saying that select isn’t found.
Any help would be highly appreciated.
nilshg
February 11, 2022, 8:34am
10
You aren’t working with JuliaDB, so you should probably start a new thread when asking about unrelated packages (in any case it’s advisable to start a new thread rather than resurrect a four-year-old one).
That said, it sounds like you’re just looking for something like
df_PS[in(filter_vals).(df_PS.Equipo), :]
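As a rough sketch of what that line does, the same masking idea can be tried on a plain Vector without DataFrames. The equipo values below are made up for illustration; in(filter_vals) builds a one-argument predicate, and the dot broadcasts it over every element to produce a Boolean mask.

```julia
# Hypothetical stand-in for the df_PS.Equipo column.
equipo = ["Ajax", "Chelsea", "Sevilla", "Benfica"]
filter_vals = ["Paris Saint-Germain", "Sevilla", "Manchester City", "Ajax"]

# in(filter_vals) is a function x -> x in filter_vals;
# broadcasting it gives one true/false entry per element.
mask = in(filter_vals).(equipo)

# Indexing with the mask keeps only the matching entries,
# just as df_PS[mask, :] keeps only the matching rows.
selected = equipo[mask]

println(mask)
println(selected)
```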
Thank you very much! Sorry for the confusion. If I have other questions I will start a new thread.
It worked!