Using nonunique() with multiple dataframe columns

Hi,

I’d like to identify rows that repeat the same combination of values across multiple columns (i.e. colA && colC && colD) with something like

findall( nonunique( df, CONDITION) )

For a single column, CONDITION is easy, e.g. :colA.

How can I do this for multiple, non-contiguous columns?

Thanks for any help,

Pass a vector of column names as the second argument:

julia> df = DataFrame(a = [1, 2, 1], b = [4, 5, 4])
3×2 DataFrame
 Row │ a      b
     │ Int64  Int64
─────┼──────────────
   1 │     1      4
   2 │     2      5
   3 │     1      4

julia> nonunique(df, :a)
3-element Vector{Bool}:
 0
 0
 1

julia> nonunique(df, [:a, :b])
3-element Vector{Bool}:
 0
 0
 1
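To tie this back to the original question, here is a minimal sketch combining `nonunique` with `findall` on non-contiguous columns. The data and the column names (`colA`, `colB`, `colC`, `colD`) are made up for illustration; only the column-vector form of `nonunique` comes from the example above.

```julia
using DataFrames

# Hypothetical data; colB sits between the columns we care about
# and is deliberately left out of the comparison.
df = DataFrame(colA = [1, 2, 1, 1],
               colB = [10, 20, 30, 40],
               colC = ["x", "y", "x", "x"],
               colD = [0.5, 0.5, 0.5, 0.7])

# true for every row whose (colA, colC, colD) combination
# already appeared in an earlier row
dup_mask = nonunique(df, [:colA, :colC, :colD])

# indices of those repeated rows
dup_rows = findall(dup_mask)

# the repeated rows themselves, with all columns
dups = df[dup_rows, :]
```

As in the output above, only the later occurrences are flagged, so the first row of each duplicated group is not included in `dup_rows`.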

Thanks, simple!