DataFrame isin operation

Is there a function that would allow selecting rows/columns with isin?

df = DataFrame(Column1 = ["a", "b", "c", "d", "e", "f"],
               Column2 = [1, 2, 3, 1, 2, 3])
keep = ["a", "c", "e"]
df[isin(df.Column1, keep), :]

There is a function to do this. You would just need to do

in.(df.Column1, Ref(keep))
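For completeness, here is a minimal sketch of how that mask could be used to index the data frame from the question. The variable names `mask` and `selected` are just illustrative choices, not anything from DataFrames.jl.

```julia
using DataFrames

df = DataFrame(Column1 = ["a", "b", "c", "d", "e", "f"],
               Column2 = [1, 2, 3, 1, 2, 3])
keep = ["a", "c", "e"]

# Ref(keep) makes broadcasting treat `keep` as a single value,
# so each element of Column1 is tested against the whole vector.
mask = in.(df.Column1, Ref(keep))

# Keep the rows where the mask is true, with all columns.
selected = df[mask, :]
```

Without the `Ref`, broadcasting would try to pair the elements of `df.Column1` and `keep` one by one, which is not what is wanted here.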

but with questions like these, usually what people really need is an innerjoin.

or in.(df.Column1, Ref(Set(keep))) if keep and :Column1 are long and performance matters.
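The reason the `Set` helps is that `in` on a `Vector` scans it element by element, while `in` on a `Set` is a hash lookup. A small sketch showing that both forms give the same mask (here `col` is a plain vector standing in for `df.Column1`, and the names are illustrative only):

```julia
col = ["a", "b", "c", "d", "e", "f"]   # stands in for df.Column1
keep = ["a", "c", "e"]

# Build the Set once, outside the broadcast, so it is not rebuilt per element.
keep_set = Set(keep)

mask_vec = in.(col, Ref(keep))       # linear scan of `keep` for every element
mask_set = in.(col, Ref(keep_set))   # hash lookup for every element
```

For six rows the difference will not be measurable; it should only start to matter when both collections are long.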

However, as @pdeffebach commented, innerjoin is optimized to do such operations efficiently (e.g. if your data is sorted, it takes advantage of this fact, whereas in ignores it).
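A rough sketch of what the innerjoin version might look like for the example in the question. The one-column table `keys_df` is a hypothetical helper introduced here, and the result is sorted only so that the example does not rely on the row order the join happens to return.

```julia
using DataFrames

df = DataFrame(Column1 = ["a", "b", "c", "d", "e", "f"],
               Column2 = [1, 2, 3, 1, 2, 3])
keep = ["a", "c", "e"]

# Hypothetical helper table holding the keys we want to keep.
keys_df = DataFrame(Column1 = keep)

# Only rows of `df` whose Column1 also appears in `keys_df` survive the join.
joined = innerjoin(df, keys_df, on = :Column1)

# Sort so the result does not depend on the order the join returns rows in.
sort!(joined, :Column1)
```

One difference from the `in` approach: a join produces one row per matching pair, so if `keep` could contain repeated values you would probably want `unique(keep)` when building the key table.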

thank you both!

I won’t let one of these threads go by without mentioning my favourite use of the Fix2 syntax:

in(keep).(df.Column1)

(or Set(keep) as Bogumil said)
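In case the one-argument form is unfamiliar: `in(keep)` returns a `Base.Fix2` object, i.e. a function of one argument `x` that computes `in(x, keep)`. Because the collection is captured inside that function, no `Ref` is needed when broadcasting. A small sketch on a plain vector standing in for `df.Column1` (the names `is_kept` and `kept_values` are illustrative):

```julia
col = ["a", "b", "c", "d", "e", "f"]   # stands in for df.Column1
keep = ["a", "c", "e"]

# One-argument `in` fixes the second argument: is_kept(x) == in(x, keep).
is_kept = in(keep)

# Broadcasting the predicate over the column gives the same Boolean mask.
mask = is_kept.(col)

# The same predicate works anywhere a function is expected, e.g. filter.
kept_values = filter(in(Set(keep)), col)
```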
