ODBC query is slow because of casting to a GUID type


#1

I’m reading in some moderately-sized tables (10^310^6 rows) from a PostgreSQL database. Because PostgreSQL.jl seems to be dead on v0.6, I’ve had to resort to installing the Postgres ODBC driver and using ODBC.jl. I’m pretty new to this database stuff, so getting this far is certainly an accomplishment!

The problem is that queries through ODBC.query(...) run slowly—somewhere around an order of magnitude slower than just running the same query through the psql command line. I tried profiling the query and it looks like the bottleneck is here (can’t figure out how to get it print the full function signature, annoyingly):

Count File                        Line Function
 2191 ...v0.6/ODBC/src/Source.jl   198 cast!(::Type{ODBC.API.SQLGUID}, ::O...

which appears to be where it’s converting things from a Postgres GUID type into something else.

Can I speed up this slow type conversion? I know it can be done quickly because I’m recreating an R interface to the database, and that interface casts them all to strings with very little overhead :slight_smile:


#2

Can’t you just change your SQL query? Something like:

ODBC.query(dsn, "select guid::varchar, .... from table where ....")

#3

After doing that, it looks like most of the time is still being spent in various cast! methods, including casting to Float64 and WeakRefString (which appears to be the Julia translation of a varchar). Perhaps I should take a look at PostgreSQL.jl and see what’s preventing that from working on v0.6.