What package to use for simple SQL query in different types of database

I need to execute simple SQL (e.g. SELECT * FROM table) and fetch the returned data, but the database could be 1 of 4 types: PostgreSQL, MySQL, MariaDB and MS SQL Server.

Ideally, I would like to use a package which deal with such database types using unified interface.

I saw an example of Python code using SQLAlchemy Python toolkit for roughly the same task.

There used to be Julia wrapper for that, but it seems to me that it is not maintained and doesn’t work any more. I saw also stackoverflow topic about PyCall, but I would like to avoid that.

What simple package I could use for multi-database connection & simple SQL?

Thank you.

There is an Octo.jl which can be useful.

Also, if all you need is to execute simple queries which are known beforehand and they do not differ between databases, you can use DBInterface (connections strings are fake of course, you should refer to corresponding manuals).

using ODBC
using DBInterface

pg_conn = ODBC.Connection("Driver=postgresw.so;User=abc;Password=123")
mysql_conn = ODBC.Connection("Driver=mysql.so;User=abc;Password=123")

pg_res = DBInterface.execute(pg_conn, "SELECT * FROM tbl1")
mysql_res = DBInterface.execute(mysql_conn, "SELECT * FROM tbl1")

If you know the resulting data structure beforehand, you can use Strapping.jl to convert the result to a needed structure with Strapping.construct.

There is no package that I am aware of, but you can easily utilize multiple dispatch to do something like this

abstract type AbstractDB end
struct Postgres{T} <: AbstractDB 
  conn::T
end
struct MySQL{T} <: AbstarctDB 
  conn::T
end

execute(db::AbstractDB, query) = DBInterface.execute(db.conn, query)

pg_db = Postgres(ODBC.Connection("Driver=postgres.so"))
my_db = MySQL(ODBC.Connection("Driver=mysql.so"))

function get_data(db)
  execute(db, "SELECT * FROM tbl1")
end

function get_data(db::Postgres)
  execute(db, "SELECT * FROM myschema.tbl2")
end

Of course, it’s not exactly `SQLAlchemy" because it do not generate sql queries. It’s a totally different thing, I am afraid.

4 Likes

I think @cce is working on a package that might work for what you are thinking of - was it DataKnots.jl Clark? I cannot recall…

I have found GitHub - MechanicalRabbit/DataKnots.jl: an extensible, practical and coherent algebra of query combinators and GitHub - MechanicalRabbit/DataKnots4FHIR.jl: A DataKnots adapter for FHIR Data Sources, but it looks like they do not have database functionality yet.

Maybe GitHub - GenieFramework/SearchLight.jl: ORM layer for Genie.jl, the highly productive Julia web framework, but I have not tried it myself yet.

In their docs they say “At the moment SearchLight does not work outside Genie.”

Yeah, using DBInterface.jl is the closest thing we have right now, but it doesn’t do any translation of queries between SQL dialects. If all your queries will be SQL 92 compliant, then you should be fine; but if it’s some kind of more general query executor, I’m afraid Julia doesn’t have fancy sql dialect translation yet.

We’re working on this currently, we’re going to call it FunSQL.

Here is a prototype cohort query against an OHDSI database…

This uses a prototype implementation

We’ve got the semantics the way we want. The next steps are to get a v.1 release out with bare minimal functionality (this is now master). Then we’ll get SQLite/PgSQL to work as alternative dialects (v.2). Then we’ll start to add SQL functions and their translations. Documentation will follow shortly, as will comprehensive examples using OHDSI. We’ll be spending next 6+ months making this smooth. We’ll be submitting a Julia Con talk for the SQL construction technique. It’s rather novel – it makes SQL construction modular, which makes it “fun”.

7 Likes

When is it more reliable to use SQL vs DataFrames? Currently I have DataFrames nested in dictionaries, and it seems to work, but I’m not sure that this is the most efficient solution.

I also use an XLSX to save initial conditions, and to transfer those to another file.

IMHO: no silver rule.

My preference for SQL:

  • “handling data larger than fits into Client RAM”
  • I want to move the “business logic process” to the SQL server
  • Complex business logic: expected to heavy performance tunning on the Server side
  • Complex business logic: with some special data domain
    • PostGis “PostGIS is a spatial database extender for PostgreSQL object-relational database. It adds support for geographic objects allowing location queries to be run in SQL.”
    • pgRouting “pgRouting extends the PostGIS / PostgreSQL geospatial database to provide geospatial routing functionality.”
    • MobilityDB “MobilityDB is implemented as an extension to PostgreSQL and PostGIS. It implements persistent database types, and query operations for managing geospatial trajectories and their time-varying properties.”

If you want to use Julia on the SQL side …

Could you elaborate on complex business logic?

IMHO:

  • not a simple “Select” … AND hard to reimplement and frequently change in “all client” languages. ( Javascript,Julia, Python) ( + testing + version controll )

example:

this is a simple business logic ( random github search results )
not so hard to reimplement is julia - but hard to create “Synchronized Implementation” in multiple languages ( SQL + Julia + Javascipt ) … so the best practice … create a primary version in SQL … and call from Julia/Python/Javascript/…

so this type of code fragments is part of the business logic

  • v_cost_type_id in (3700,3702,3730,3732)

example 2

IF you have to create a complex CASE/WHEN or IF/THEN/ELSE codes …
THEN this is probably a business logic.

1 Like

business logic 2.

a little complex example:

a normal select but …

  • with multiple CASE WHEN
  • multiple input table ( LEFT JOIN , INNER JOIN )