What package to use for simple SQL query in different types of database

Laco_Kovac · March 24, 2021, 2:01pm

I need to execute simple SQL (e.g. SELECT * FROM table) and fetch the returned data, but the database could be 1 of 4 types: PostgreSQL, MySQL, MariaDB and MS SQL Server.

Ideally, I would like to use a package which deal with such database types using unified interface.

I saw an example of Python code using SQLAlchemy Python toolkit for roughly the same task.

There used to be Julia wrapper for that, but it seems to me that it is not maintained and doesn’t work any more. I saw also stackoverflow topic about PyCall, but I would like to avoid that.

What simple package I could use for multi-database connection & simple SQL?

Thank you.

Skoffer · March 24, 2021, 2:10pm

There is an Octo.jl which can be useful.

Also, if all you need is to execute simple queries which are known beforehand and they do not differ between databases, you can use DBInterface (connections strings are fake of course, you should refer to corresponding manuals).

using ODBC
using DBInterface

pg_conn = ODBC.Connection("Driver=postgresw.so;User=abc;Password=123")
mysql_conn = ODBC.Connection("Driver=mysql.so;User=abc;Password=123")

pg_res = DBInterface.execute(pg_conn, "SELECT * FROM tbl1")
mysql_res = DBInterface.execute(mysql_conn, "SELECT * FROM tbl1")

If you know the resulting data structure beforehand, you can use Strapping.jl to convert the result to a needed structure with Strapping.construct.

There is no package that I am aware of, but you can easily utilize multiple dispatch to do something like this

abstract type AbstractDB end
struct Postgres{T} <: AbstractDB 
  conn::T
end
struct MySQL{T} <: AbstarctDB 
  conn::T
end

execute(db::AbstractDB, query) = DBInterface.execute(db.conn, query)

pg_db = Postgres(ODBC.Connection("Driver=postgres.so"))
my_db = MySQL(ODBC.Connection("Driver=mysql.so"))

function get_data(db)
  execute(db, "SELECT * FROM tbl1")
end

function get_data(db::Postgres)
  execute(db, "SELECT * FROM myschema.tbl2")
end

Of course, it’s not exactly `SQLAlchemy" because it do not generate sql queries. It’s a totally different thing, I am afraid.

TheCedarPrince · March 24, 2021, 2:39pm

I think @cce is working on a package that might work for what you are thinking of - was it DataKnots.jl Clark? I cannot recall…

Skoffer · March 24, 2021, 2:58pm

I have found https://github.com/MechanicalRabbit/DataKnots.jl and https://github.com/MechanicalRabbit/DataKnots4FHIR.jl, but it looks like they do not have database functionality yet.

lungben · March 24, 2021, 3:23pm

Maybe https://github.com/GenieFramework/SearchLight.jl, but I have not tried it myself yet.

Skoffer · March 24, 2021, 3:30pm

In their docs they say “At the moment SearchLight does not work outside Genie.”

quinnj · March 25, 2021, 2:38pm

Yeah, using DBInterface.jl is the closest thing we have right now, but it doesn’t do any translation of queries between SQL dialects. If all your queries will be SQL 92 compliant, then you should be fine; but if it’s some kind of more general query executor, I’m afraid Julia doesn’t have fancy sql dialect translation yet.

cce · March 27, 2021, 1:23pm

We’re working on this currently, we’re going to call it FunSQL.

https://github.com/MechanicalRabbit/FunSQL.jl

Here is a prototype cohort query against an OHDSI database…

https://github.com/MechanicalRabbit/ohdsi-synpuf-demo/blob/master/cohorts/1770674.jl#L101

This uses a prototype implementation

https://github.com/MechanicalRabbit/FunSQL.jl/blob/prototype/src/FunSQL.jl

We’ve got the semantics the way we want. The next steps are to get a v.1 release out with bare minimal functionality (this is now master). Then we’ll get SQLite/PgSQL to work as alternative dialects (v.2). Then we’ll start to add SQL functions and their translations. Documentation will follow shortly, as will comprehensive examples using OHDSI. We’ll be spending next 6+ months making this smooth. We’ll be submitting a Julia Con talk for the SQL construction technique. It’s rather novel – it makes SQL construction modular, which makes it “fun”.

brett_knoss · September 27, 2021, 3:53am

When is it more reliable to use SQL vs DataFrames? Currently I have DataFrames nested in dictionaries, and it seems to work, but I’m not sure that this is the most efficient solution.

I also use an XLSX to save initial conditions, and to transfer those to another file.

ImreSamu · September 27, 2021, 4:47am

IMHO: no silver rule.

My preference for SQL:

“handling data larger than fits into Client RAM”
I want to move the “business logic process” to the SQL server
Complex business logic: expected to heavy performance tunning on the Server side
Complex business logic: with some special data domain
- PostGis “PostGIS is a spatial database extender for PostgreSQL object-relational database. It adds support for geographic objects allowing location queries to be run in SQL.”
- pgRouting “pgRouting extends the PostGIS / PostgreSQL geospatial database to provide geospatial routing functionality.”
- MobilityDB “MobilityDB is implemented as an extension to PostgreSQL and PostGIS. It implements persistent database types, and query operations for managing geospatial trajectories and their time-varying properties.”

If you want to use Julia on the SQL side …

https://gitlab.com/pljulia/pljulia “PL/Julia Procedural Language Handler for PostgreSQL”

brett_knoss · September 27, 2021, 4:49am

Could you elaborate on complex business logic?

ImreSamu · September 27, 2021, 5:37am

IMHO:

not a simple “Select” … AND hard to reimplement and frequently change in “all client” languages. ( Javascript,Julia, Python) ( + testing + version controll )

example:

this is a simple business logic ( random github search results )
not so hard to reimplement is julia - but hard to create “Synchronized Implementation” in multiple languages ( SQL + Julia + Javascipt ) … so the best practice … create a primary version in SQL … and call from Julia/Python/Javascript/…

github.com

iuri/polrn/blob/e27bbbded4b663d5188ad43c32b03f37c67b77c3/packages/intranet-cost/sql/postgresql/intranet-cost-create.sql#L647-L660


      
          	IF v_provider_country_code in (
          			'ad', 'at', 'be', 'bg', 'cy', 'cz', 'de', 'dk', 
          			'ee', 'es', 'fi', 'fr', 'gr', 'hr', 'hu', 'ie', 
          			'it', 'li', 'lu', 'mt', 'nl', 'no', 'pl', 'pt', 
          			'ro', 'se', 'si', 'sk', 'uk') 
          		THEN v_provider_eu_p := 1;
          		ELSE v_provider_eu_p := 0;
          	END IF;
          	RAISE NOTICE 'im_cost_vat_type_from_cost_id: v_provider_eu_p=%', v_provider_eu_p;
          
          	IF v_cost_type_id in (3700,3702,3730,3732)
          		THEN v_cost_is_invoice_or_quote_p := 1;
          		ELSE v_cost_is_invoice_or_quote_p := 0;
          	END IF;

so this type of code fragments is part of the business logic

v_cost_type_id in (3700,3702,3730,3732)

example 2

IF you have to create a complex CASE/WHEN or IF/THEN/ELSE codes …
THEN this is probably a business logic.

github.com

iuri/polrn/blob/e27bbbded4b663d5188ad43c32b03f37c67b77c3/packages/intranet-cost/sql/postgresql/intranet-cost-create.sql#L370-L392


      
          create or replace view im_cost_types as
          select	category_id as cost_type_id, 
          	category as cost_type,
          	CASE 
          	    WHEN category_id = 3700 THEN 'fi_read_invoices'
          	    WHEN category_id = 3702 THEN 'fi_read_quotes'
          	    WHEN category_id = 3704 THEN 'fi_read_bills'
          	    WHEN category_id = 3706 THEN 'fi_read_pos'
          	    WHEN category_id = 3716 THEN 'fi_read_repeatings'
          	    WHEN category_id = 3718 THEN 'fi_read_timesheets'
          	    WHEN category_id = 3720 THEN 'fi_read_expense_items'
          	    WHEN category_id = 3722 THEN 'fi_read_expense_bundles'
          	    WHEN category_id = 3724 THEN 'fi_read_delivery_notes'
          	    WHEN category_id = 3730 THEN 'fi_read_interco_invoices'
                      WHEN category_id = 3732 THEN 'fi_read_interco_quotes'
          	    ELSE 'fi_read_all'
          	END as read_privilege,
          	CASE 
          	    WHEN category_id = 3700 THEN 'fi_write_invoices'
          	    WHEN category_id = 3702 THEN 'fi_write_quotes'

This file has been truncated. show original

ImreSamu · September 27, 2021, 5:45am

business logic 2.

a little complex example:

a normal select but …

with multiple CASE WHEN
multiple input table ( LEFT JOIN , INNER JOIN )

https://github.com/paulalcabasa/Oracle-SQL-Files/blob/e6d3dd27ce55179ae6834c667eebfedf9caedc50/Finance%20System/VAT%20MONITORING%20QUERY.sql#L123-L183

drizk1 · June 14, 2024, 12:18pm

I know this is a few years late, but TidierDB.jl will give you one front end for the 4 databases you mentioned in addition to Oracle, DuckDB, Clickhouse, AWS Athena, Snowflake and Google Big Query. Your query is written with Tidier.jl syntax and then converted to the appropriate sql and executed on the selected backend.

Topic		Replies	Views
Easiest and most complete package for PostgreSQL right now (Feb 2022)? General Usage postgresql	48	7182	July 11, 2025
Lack of stable PostgreSQL-communication a major showstopper Data	33	4815	January 8, 2021
Database backed for DataFrames Data dataframes , sql	6	835	May 4, 2024
[ANN] DBInterface.jl, MySQL.jl 1.0, and SQLite.jl 1.0 Data	5	2461	February 28, 2020
Status of database packages and tools for, e.g. ORM tools General Usage	1	1194	November 8, 2022

What package to use for simple SQL query in different types of database

example:

example 2

Related topics