Query.@join with repeated names

heliosdrm · April 24, 2019, 2:05pm

Say I have this:

julia> df1 = DataFrame(id=[1,2,3,4], a=[1.,2.,3.,4.])
4×2 DataFrame
│ Row │ id    │ a       │
│     │ Int64 │ Float64 │
├─────┼───────┼─────────┤
│ 1   │ 1     │ 1.0     │
│ 2   │ 2     │ 2.0     │
│ 3   │ 3     │ 3.0     │
│ 4   │ 4     │ 4.0     │

julia> df2 = DataFrame(id=[1,3,4], b=["one", "thre","four"])
3×2 DataFrame
│ Row │ id    │ b      │
│     │ Int64 │ String │
├─────┼───────┼────────┤
│ 1   │ 1     │ one    │
│ 2   │ 3     │ thre   │
│ 3   │ 4     │ four   │

I can join them with Query.jl as follows:

julia> df1 |> @join(df2, _.id, _.id, {_.id, _.a, __.b}) |> DataFrame
3×3 DataFrame
│ Row │ id    │ a       │ b      │
│     │ Int64 │ Float64 │ String │
├─────┼───────┼─────────┼────────┤
│ 1   │ 1     │ 1.0     │ one    │
│ 2   │ 3     │ 3.0     │ thre   │
│ 3   │ 4     │ 4.0     │ four   │

But if the last column of df2 has the same name as a column of df1 (i.e. a), this does not work because the names are repeated. DataFrames.join handles this case with the keyword argument makeunique, so that with that new df2 I get:

julia> join(df1, df2, on=:id, makeunique=true)
3×3 DataFrame
│ Row │ id    │ a       │ a_1    │
│     │ Int64 │ Float64 │ String │
├─────┼───────┼─────────┼────────┤
│ 1   │ 1     │ 1.0     │ one    │
│ 2   │ 3     │ 3.0     │ thre   │
│ 3   │ 4     │ 4.0     │ four   │

However, I like being able to select columns the way Query.@join allows. Is there something similar in Query to handle repeated column names?

mcabbott · April 24, 2019, 5:39pm

This would be nice. I made an issue for it: https://github.com/queryverse/Query.jl/issues/250
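Until something like makeunique exists in Query, one possible workaround is to name the clashing column by hand inside the {} projection, which accepts name = value entries alongside the plain _.field form. The following is only a sketch under the assumption that df2 is the variant from the question whose second column is also called a; the name a_1 is an arbitrary choice made here to mimic what makeunique produces, not something Query generates by itself.

```julia
using DataFrames, Query

df1 = DataFrame(id=[1, 2, 3, 4], a=[1.0, 2.0, 3.0, 4.0])
# Assumed variant of df2 from the question: its second column is also named `a`
df2 = DataFrame(id=[1, 3, 4], a=["one", "thre", "four"])

# `_` refers to a row of df1 and `__` to the matching row of df2.
# `a_1 = __.a` gives the df2 column an explicit, non-clashing name.
result = df1 |>
    @join(df2, _.id, _.id, {_.id, _.a, a_1 = __.a}) |>
    DataFrame
```

This keeps the column selection of Query.@join, at the cost of writing the new name out for every repeated column instead of having it generated automatically.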
