[ANN] JDBC.jl Rejuvenated

announcement
dataframes

#1

Announcing JDBC.jl v0.3.0. This should give you a quick and easy way to access your databases through a Julia API almost immediately. It is very easy to get your data into a DataFrame, for example:

using JDBC, DataFrames
JDBC.usedriver("driver_file.jar") # add a JDBC driver
JDBC.init()  # initialize the JVM

# pull data into a DataFrame, or any other DataStreams Sink
df = JDBC.load(DataFrame, cursor("connection:string/here"), "select * limit 100")

I have to use a variety of different databases and query frameworks, including Presto, Postgres and MS SQL, and I found myself frustrated trying to get a solution up and running on multiple servers and local machines, as we are slightly lacking in database APIs in Julia right now. I stumbled across JDBC.jl, which was quite easy to use but slightly in need of maintenance, so I threw on a new coat of paint. In my experience it has taken very little effort to get going. No thorough memory or performance benchmarking has been done, but I have had no problems with datasets up to about 10 GB.

It’s still recommended that ODBC.jl or, if applicable, MySQL.jl or LibPQ.jl be your “ideal” choice for database interaction, as they possess properly wrapped direct C APIs; but if you are using an incompatible database, are struggling to get things set up properly, or just want to get some data quickly with little hassle, do check out JDBC.jl. Since this is a JDBC wrapper, you can bet it supports just about any kind of database or query framework imaginable (though you will have to download the driver, which is usually quite simple).

This is also Julia 0.7 ready, though we still need to update JavaCall.jl. I’ve already looked through it and I think this will be straightforward, so I hope to update it well before the proper 0.7 release.


#2

Is there a way to shut down the JVM after you’re done loading the data, so it doesn’t sit there idle using up memory (the JVM min heap value) that could otherwise be used?


#3

using JavaCall
JavaCall.destroy()

Note, though, that once you destroy the JVM you cannot restart it in the same process.


#4

On master, I have added a JDBC.destroy() function so that you don’t have to import JavaCall separately. I’ve also mentioned it in the documentation.
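
For illustration, here is a minimal sketch of how loading and then shutting down might fit together. It reuses the placeholder driver path, connection string and query from the first post (none of which are real values), and it assumes you are on master, where JDBC.destroy() is available:

```julia
using JDBC, DataFrames

JDBC.usedriver("driver_file.jar")  # placeholder path to your JDBC driver
JDBC.init()                        # initialize the JVM

df = try
    # placeholder connection string and query, as in the first post
    JDBC.load(DataFrame, cursor("connection:string/here"), "select * limit 100")
finally
    # shut down the JVM to free its memory, even if the load fails;
    # it cannot be restarted in the same Julia process afterwards
    JDBC.destroy()
end
```

Since the JVM cannot be brought back once destroyed, this only makes sense after the last query you intend to run in the session.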

I’ll wait a week or two, watch for issues and suggestions, and then tag v0.3.1.