Say I have this:
julia> df1 = DataFrame(id=[1,2,3,4], a=[1.,2.,3.,4.])
4×2 DataFrame
│ Row │ id    │ a       │
│     │ Int64 │ Float64 │
├─────┼───────┼─────────┤
│ 1   │ 1     │ 1.0     │
│ 2   │ 2     │ 2.0     │
│ 3   │ 3     │ 3.0     │
│ 4   │ 4     │ 4.0     │
julia> df2 = DataFrame(id=[1,3,4], b=["one", "thre","four"])
3×2 DataFrame
│ Row │ id    │ b      │
│     │ Int64 │ String │
├─────┼───────┼────────┤
│ 1   │ 1     │ one    │
│ 2   │ 3     │ thre   │
│ 3   │ 4     │ four   │
I can join them with Query.jl as follows:
julia> df1 |> @join(df2, _.id, _.id, {_.id, _.a, __.b}) |> DataFrame
3×3 DataFrame
│ Row │ id    │ a       │ b      │
│     │ Int64 │ Float64 │ String │
├─────┼───────┼─────────┼────────┤
│ 1   │ 1     │ 1.0     │ one    │
│ 2   │ 3     │ 3.0     │ thre   │
│ 3   │ 4     │ 4.0     │ four   │
But if the last column of df2 has the same name as a column of df1 (i.e. a), this does not work because the names are repeated. DataFrames.join handles this with the keyword argument makeunique, so that with that new df2 I get:
julia> join(df1, df2, on=:id, makeunique=true)
3×3 DataFrame
│ Row │ id    │ a       │ a_1    │
│     │ Int64 │ Float64 │ String │
├─────┼───────┼─────────┼────────┤
│ 1   │ 1     │ 1.0     │ one    │
│ 2   │ 3     │ 3.0     │ thre   │
│ 3   │ 4     │ 4.0     │ four   │
However, I like the ability to select columns that Query.@join provides. Is there something similar in Query to handle repeated column names?
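To make the situation concrete, here is a minimal sketch of the clashing case together with one approach that might work: giving explicit names to the fields inside the {} projection. The variable names df2_dup and result, and the target column name a_1 (chosen to mirror what makeunique produces), are my own assumptions, and I have not checked this against every Query.jl version.

```julia
using DataFrames, Query

df1 = DataFrame(id=[1, 2, 3, 4], a=[1., 2., 3., 4.])

# Hypothetical variant of df2 whose last column is also called `a`
df2_dup = DataFrame(id=[1, 3, 4], a=["one", "thre", "four"])

# Inside {}, `name = expr` assigns an explicit field name, so the two
# `a` columns no longer collide in the resulting named tuple.
# `_` refers to rows of df1 and `__` to rows of df2_dup.
result = df1 |>
    @join(df2_dup, _.id, _.id, {_.id, a = _.a, a_1 = __.a}) |>
    DataFrame
```

If this works as I expect, result should have the columns id, a and a_1. Unlike makeunique, though, the renaming is done by hand rather than automatically.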