kbot
Hi,
I’d like to identify rows that are duplicated across several columns at once (e.g. colA, colC and colD taken together), using something like
findall( nonunique( df, CONDITION) )
For a single column, CONDITION is easy, e.g. :colA.
How can I do this for multiple, non-contiguous columns?
Thanks for any help,
Pass a vector of column names as the second argument:
julia> df = DataFrame(a = [1, 2, 1], b = [4, 5, 4])
3×2 DataFrame
 Row │ a      b
     │ Int64  Int64
─────┼──────────────
   1 │     1      4
   2 │     2      5
   3 │     1      4
julia> nonunique(df, :a)
3-element Vector{Bool}:
0
0
1
julia> nonunique(df, [:a, :b])
3-element Vector{Bool}:
0
0
1
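Applied to the original question, the same call can be wrapped in findall to get row indices. This is only a minimal sketch: the data frame below is made up for illustration, with the column names colA, colC and colD taken from the question and a hypothetical colB added to show that columns left out of the vector are ignored.

```julia
using DataFrames

# Hypothetical data: colB differs on every row and is not part of the comparison.
df = DataFrame(colA = [1, 2, 1, 2],
               colB = [10, 20, 30, 40],
               colC = ["x", "y", "x", "z"],
               colD = [true, false, true, false])

# Non-contiguous columns are fine; only these three are compared.
cols = [:colA, :colC, :colD]

# Indices of rows that repeat an earlier row on all of `cols`.
dup_rows = findall(nonunique(df, cols))
```

Here dup_rows is [3]. As in the output above, nonunique flags only the second and later occurrences, so row 1 (the first occurrence) is not included.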