JuliaDB: select columns from 2 tables and join based on 2 keys

Suppose that we have 2 tables, each contains multiple columns. The first table has (id, date, x1, x2, x3) and the second has (id, date, y1, y2, y3). Is it possible to create a new table by selecting columns (id, date, x1, y2, y3) from the 2 original tables and inner joining them on (id, date)?

The reason I am asking this is that this procedure is very convenient in SQL. I wish there is a similar thing in JuliaDB.

More generally, does JuliaDB support arbitrary join conditions as does SQL with the JOIN <table> ON <condition> clause?

1 Like

https://juliadata.github.io/JuliaDB.jl/latest/api/#Base.join-Tuple{Any,Union{IndexedTable,%20NDSparse},Union{IndexedTable,%20NDSparse}}

I think you want something like:

join(t1, t2, lkey=(:id, :date), rkey=(:id, :date), lselect=:x1, rselect=(:y2, :y3))

You can leave out lkey and rkey if you already have primary keys set.

2 Likes

This looks good!
In addition, is it possible to apply “where” and “group by” clauses in the join? This would allow us to use filtered tables and calculate group-level statistics. For example, I am looking for a Julia counterpart for the following SQL command:

create table want as
select a.x1, b.y2, b.y3, mean(b.y2) as y2_mean
from t1 (where = (x2 > 0)) as a
left join t2 as b
on a.id = b.id and a.date = b.date
group by b.id
order by id, date