[ANN] SQLdf - SQL for Julia DataFrames

Ok, here’s my proposed code:

module SQLiteDF
using SQLite
export @sqldf

macro sqldf(a...)
    db = gensym()
    return esc(quote
        $db = SQLite.DB();
        for sym in $(a[1:end-1])
            SQLite.load!(eval(sym),$db,String(sym))
        end;
        SQLite.DBInterface.execute($db,$(a[end]))
    end)
end

end

I haven’t actually turned it into a package yet that I can load by “using” but if I test it as follows, it seems to work:

using Pkg
Pkg.activate(".")
using DataFrames

include("SQLiteDF.jl")

a = DataFrame(foo=[1,2,3])
b = DataFrame(foo=[1,2,3],bar=[4,5,6])

print(DataFrame(SQLiteDF.@sqldf(a,b,"select a.*,b.bar from a join b on a.foo=b.foo")))

3×2 DataFrame
 Row │ foo    bar   
     │ Int64  Int64 
─────┼──────────────
   1 │     1      4
   2 │     2      5
   3 │     3      6


1 Like

Cool! I’ll try to integrate SQLite with the parser I have, thanks! :slightly_smiling_face:

1 Like

You could offer two versions, one where the user provides explicit table name symbols. One where you just provide a SQL query and it parses, extracts table names then calls the explicit version

With all due respects but I just don’t think passing tables names is a good idea, I understand you may believe parsing is a “design mistake” however I just happen to believe it is a good design.

Nonetheless I accept that I might be wrong, that’s why perhaps it is better at this point if you develop a package passing table names for you and for those that like you think that this is a good design, and on my side I’ll keep developing the parsing version in SQLdf for me and for those that like me prefer this approach. I hope this sounds good for everybody.

Good luck!

1 Like

No worries I’ll make my own package but I guess my point was if you want to use that code within your own implementation it would be fine.

My biggest concern about the SQL parser version is that it’s add a lot of technical debt and maintenance issues. I’m comfortable maintaining those few lines of macro but wouldn’t want the burden of maintaining a SQL parser myself.

It’s all right, I’ll check the SQLite.jl documentation to see what’s new and what’s the best way to integrate it in SQLdf. Last year when I tried SQLite.jl I think I had some issues and I just moved to RCall/sqldf, now it seems it’s working better and I should have had a look at it before integrating SQLdf with RCall/sqldf.

All is good though, I will maintain the parser in my version, good luck again!

Hi @viraltux , I’m learning DataFrames basics, and a SQL-ish translation layer immediately came to mind. I do see that you haven’t updated the code in awhile. Has there been much uptake or any efforts to standardize this or something similar? Cheers!