Hi there,
I have a question regarding best practice.
I have a dataframe that I need to filter on several columns based on the value in each column.
However, depending on the situation, I may be filtering on an array of elements or on a single element. I’ve tried to show the problem below.
MWE:
using DataFrames, DataFramesMeta
df = DataFrame(a = ["el1", "el2", "el3", "el4"], b = [1,2,3,4])
subset1 = ["el1", "el2", "el3"]
subset2 = "el1"
I know how to filter each of these cases individually:
@subset(df, in.(:a, Ref(subset1)))
@subset(df, :a .== subset2)
However, I can’t find a way to filter regardless of whether the input is an array or a single element without resorting to if statements, something like this:
function df_filter(df, f)
    # Branch on whether the filter value is a collection or a single element
    if f isa AbstractArray
        dff = @subset(df, in.(:a, Ref(f)))
    else
        dff = @subset(df, :a .== f)
    end
    return dff
end
I also thought about possibly using multiple dispatch. The issue is that in either case it seems to get very messy as the number of columns that I need to filter on increases. Each column's filter can be either an array or a single element, so for 3 columns I’d need 2^3 = 8 different methods (or 8 branches with if statements).
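To make the dispatch idea a bit more concrete, here is a minimal sketch of one direction that might avoid the blow-up: dispatching on each individual filter value instead of on the whole combination of arguments, so two small methods could cover any number of columns. The helper name `matches` and the `column => value` pair signature of `df_filter` are hypothetical, made up for illustration only. It uses plain `subset` with `ByRow` from DataFrames rather than the `@subset` macro, and I haven't checked how it behaves with missing values.

```julia
using DataFrames

# Hypothetical helper (name is an assumption): one method per "kind" of filter.
matches(x, f::AbstractArray) = x in f   # array filter: keep x if it is an element of f
matches(x, f) = x == f                  # single-element filter: keep x if it equals f

# Hypothetical signature: each filter is passed as a `column => value_or_array` pair.
function df_filter(df::AbstractDataFrame, filters::Pair...)
    # Build one row-wise condition per column; subset keeps rows where all hold.
    conditions = [col => ByRow(x -> matches(x, f)) for (col, f) in filters]
    return subset(df, conditions...)
end

df = DataFrame(a = ["el1", "el2", "el3", "el4"], b = [1, 2, 3, 4])

res_array  = df_filter(df, :a => ["el1", "el2", "el3"])   # array filter on :a
res_single = df_filter(df, :a => "el1")                    # single-element filter on :a
res_mixed  = df_filter(df, :a => "el1", :b => [1, 2])      # mixed filters on two columns
```

With this shape, adding a third or fourth column would only mean passing another pair, not writing more methods. Note that a `String` is not an `AbstractArray`, so `"el1"` falls through to the single-element method.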
Any help on how to better write this kind of functionality would be appreciated. Thanks!