Is there a function that would allow selecting rows/columns with isin?
df = DataFrame(Column1 = ["a", "b", "c", "d", "e", "f"],
               Column2 = [1, 2, 3, 1, 2, 3],
)
keep = ["a", "c", "e"]
df[isin(df.Column1, keep), :]
There is a function to do this. You would just need to do
in.(df.Column1, Ref(keep))
but with questions like these, usually what people really need is an innerjoin.
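For completeness, here is a minimal sketch of the broadcasted `in` call applied to the data frame from the question. It assumes DataFrames.jl is installed; `mask` and `selected` are just illustrative names.

```julia
using DataFrames

df = DataFrame(Column1 = ["a", "b", "c", "d", "e", "f"],
               Column2 = [1, 2, 3, 1, 2, 3])
keep = ["a", "c", "e"]

# Ref(keep) makes broadcasting treat `keep` as a single value,
# so every element of df.Column1 is tested against the whole vector.
mask = in.(df.Column1, Ref(keep))

# Index rows with the Boolean mask, keeping all columns.
selected = df[mask, :]
```

Without the `Ref`, broadcasting would try to pair the elements of `df.Column1` and `keep` one by one, which is not what is wanted here.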
Or use in.(df.Column1, Ref(Set(keep))) if keep and :Column1 are long and performance matters.
However, as @pdeffebach commented, innerjoin is optimized to do such operations efficiently (e.g. if your data is sorted, it takes advantage of this fact, whereas in ignores it).
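A rough sketch of the innerjoin route, assuming DataFrames.jl and that the values to keep are first wrapped in a one-column data frame (`keep_df` is a hypothetical name, not something from the posts above):

```julia
using DataFrames

df = DataFrame(Column1 = ["a", "b", "c", "d", "e", "f"],
               Column2 = [1, 2, 3, 1, 2, 3])
keep = ["a", "c", "e"]

# Put the keys into their own table so they can be joined on.
keep_df = DataFrame(Column1 = keep)

# Only rows whose Column1 value appears in both tables survive.
result = innerjoin(df, keep_df, on = :Column1)
```

Two caveats: if `keep` contains duplicates, the matching rows of `df` are repeated in the result, and I would not rely on the row order of the result matching `df` unless you sort afterwards.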
thank you both!
I won’t let one of these threads go by without mentioning my favourite use of the Fix2 syntax:
in(keep).(df.Column1)
(or Set(keep) as Bogumil said)
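To spell that out, here is a small sketch combining the one-argument `in` with a `Set`, again assuming DataFrames.jl; `keep_set` and `is_kept` are illustrative names only.

```julia
using DataFrames

df = DataFrame(Column1 = ["a", "b", "c", "d", "e", "f"],
               Column2 = [1, 2, 3, 1, 2, 3])
keep = ["a", "c", "e"]

# A Set gives hash-based membership tests instead of a linear scan.
keep_set = Set(keep)

# in(collection) returns a function equivalent to x -> x in collection
# (a Base.Fix2), so no Ref is needed when broadcasting it.
is_kept = in(keep_set)
mask = is_kept.(df.Column1)
selected = df[mask, :]

# The same predicate can be passed to filter directly.
filtered = filter(:Column1 => in(keep_set), df)
```

Both `selected` and `filtered` should contain the same rows; which spelling to use is mostly a matter of taste.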