JuliaDB filter: select one of multiple rows

MaximilianJHuber · March 9, 2018, 7:10pm

I have a table with three keys: NR=Int64, DT1=Date, DT2=Date, and DT2 is always before DT1.

I want to filter the table with two conditions:

DT1=Date(2017, 3, 31)
choose the newest by DT2 (i.e. closest to DT1) of possibly multiple entries that have the same NR and DT1.

Very similar to a question on StackExchange, just that I want NR to be the group-by variable.

While this is easily formulated as a SQL query, my implementation in JuliaDB is complicated and not exaclty fast.
I filter the correct date, then figure out the correct DT2 per NR, cast it into a Dictionary, and finally filter the entries by the correct DT2:

select_date = filter(d -> d == Date(2017, 3, 31), data_table, select = :DT1)
correct_DT2 = collect(groupby(@NT(DT2=d -> maximum(d)), select_date , :NR, select=:DT2))

dict = Dict{Int64, Date}()
for i in 1:length(correct_DT2)
   dict[correct_DT2[i].NR] = correct_DT2[i].DT2
end

filter(row -> row.DT2 == dict[row.NR], select_date, select = (:NR, :DT2))

It seems to me that the filter function has less functionality than SQL, for example the GROUP BY qualifier.

Any thought on how to do it better?

Topic		Replies	Views
JuliaDB groupby many columns General Usage juliadb	0	361	April 9, 2021
JuliaDB: select columns from 2 tables and join based on 2 keys General Usage	3	655	August 17, 2020
Query newest entry with Query.jl Data question	4	437	July 22, 2020
JuliaDB delete row from table New to Julia juliadb	4	910	April 15, 2019
JuliaDB select/filter multiple string values in one column Data question , package	10	2510	February 11, 2022

JuliaDB filter: select one of multiple rows

Related topics