Working with JDBC in Julia

GregorM · January 21, 2025, 5:06pm

In the last couple of years I have been using Julia for various data projects in different companies. What I’m always struggling with is getting all the needed data out of a database. In most cases databases can be accessed via JDBC, and I have been using JDBC.jl. However, I feel like I’m having issues with the package more often than not. Some examples:

In my previous job the main problem was that a number of column types found in e.g. Redshift or Trino aren’t converted into Julia types, even though matching Julia types exist (e.g. arrays in Trino).
In my current project I’m trying to get data from Redshift into a DataFrame, and as soon as the resulting table has more than 30 columns, the code in Julia runs forever (the same query outside of Julia takes only seconds).
If you don’t look out and the db returns two columns with the same name, JDBC.jl doesn’t rename them before creating the DataFrame but throws an error.
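
For the duplicate-name case, a minimal sketch of the renaming step in plain Julia. The helper `dedupe_names` is hypothetical and not part of JDBC.jl; it only assumes the column names are available as a vector of strings before the DataFrame is built.

```julia
# Hypothetical helper (not part of JDBC.jl): make column names unique
# by appending _1, _2, ... to every repeated name.
function dedupe_names(names::AbstractVector{<:AbstractString})
    seen = Dict{String,Int}()     # name => how often it has appeared so far
    taken = Set{String}(names)    # names a generated suffix must not collide with
    out = String[]
    for name in names
        n = get(seen, name, 0)
        seen[name] = n + 1
        if n == 0
            push!(out, name)      # first occurrence keeps its name
        else
            candidate = string(name, "_", n)
            while candidate in taken
                n += 1
                candidate = string(name, "_", n)
            end
            push!(taken, candidate)
            push!(out, candidate)
        end
    end
    return out
end

unique_names = dedupe_names(["id", "name", "id"])
```

Aliasing the columns in the query itself (`SELECT a.id AS a_id, b.id AS b_id ...`) avoids the problem at the source. The DataFrames.jl constructors also take a `makeunique=true` keyword that renames duplicates, which may be enough if the DataFrame is built by hand.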

I know that there is a workaround for all of this, but it makes life hard.

Maybe there is an alternative to JDBC.jl out there that I don’t know about. Maybe I’m just not smart enough to work with JDBC.jl properly. Maybe I’m the only one who needs a JDBC connection for their daily work.
Anyway, maybe someone out there can share some best practices, alternative packages, code snippets, …?
It would be really appreciated.

wildusk · January 22, 2025, 10:37am

I experience similar issues with JDBC.jl.

To avoid the “column limit” I defined the following function as a replacement for
load. It works for me, but I have no basic Java/JDBC knowledge.
I think something like this could be merged into JDBC.jl.

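using JDBC, DataFrames

# Load a query result column by column instead of going through load.
function load_wide(connection::String, stmt::String)
    csr = cursor(connection)
    execute!(csr, stmt)
    src = JDBC.Source(csr)            # used only for column count and names
    rs = [row for row in rows(csr)]   # collect all rows in memory first
    df = DataFrame()
    # Build each column from the collected rows.
    for i in 1:JDBC.ncols(src)
        df[!, JDBC.colname(src, i)] = [row[i] for row in rs]
    end

    return df
end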
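
The row-to-column step in the function above can be tried without a database or a JVM. This is only a sketch in plain Julia: `rows_data`, `colnames`, and `rows_to_columns` are made-up stand-ins for what the cursor and `JDBC.Source` would provide.

```julia
# Stand-in for the rows a cursor would yield: one tuple of values per row.
rows_data = [(1, "a", 0.5), (2, "b", 1.5), (3, "c", 2.5)]
colnames = ["id", "label", "score"]   # hypothetical column names

# Turn a vector of rows into an ordered vector of name => column pairs,
# the same pivot the loop in load_wide performs.
function rows_to_columns(rs, names)
    return [name => [row[i] for row in rs] for (i, name) in enumerate(names)]
end

cols = rows_to_columns(rows_data, colnames)
```

With DataFrames.jl loaded, `DataFrame(cols)` should accept this vector of pairs directly. Note that all rows are held in memory before any column is built, so this approach trades memory for speed on large results.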
