Accessing Postgresql via Julia

Yifan_Liu · November 12, 2017, 8:05pm

I am currently using R to access a remote Postgresql database, and my code looks like those as below:

library(RPostgres)
wrds <- dbConnect(Postgres(), 
                  host='wrds-pgdata.wharton.upenn.edu',
                  port=9737,
                  user='username',
                  password='password',
                  sslmode='require',
                  dbname='wrds')

I would like to know how to do it in Julia.

johann.spies · November 13, 2017, 6:32am

Have a look at this thread where people helped me a lot:
https://discourse.julialang.org/t/lack-of-stable-postgresql-communication-a-major-showstopper/

quinnj · November 13, 2017, 4:19pm

ODBC.jl provides access to any database that supports an ODBC driver; Postgres has excellent support for ODBC. For windows, I’d downloads the msi folder version for an easy installer, for other unix platforms, just grab the tarball from the src directory. You can setup an ODBC dsn file like the one used for testing, which can be registered w/ your system ODBC via the command odbcinst -i -s -h -f ./test/pgtest.odbc.ini.

As noted elsewhere, it takes a bit more setup, but once you’ve been thru it once or twice, I find it much easier to setup any other database.

ExpandingMan · November 13, 2017, 4:29pm

By the way, on ODBC master on 0.6.1 I’m getting

ERROR: LoadError: LoadError: LoadError: UndefVarError: @isdefined not defined
Stacktrace:
 [1] include_from_node1(::String) at ./loading.jl:576
 [2] include(::String) at ./sysimg.jl:14
 [3] include_from_node1(::String) at ./loading.jl:576
 [4] include(::String) at ./sysimg.jl:14
 [5] include_from_node1(::String) at ./loading.jl:576
 [6] eval(::Module, ::Any) at ./boot.jl:235
 [7] _require(::Symbol) at ./loading.jl:490
 [8] require(::Symbol) at ./loading.jl:405

Yifan_Liu · February 7, 2018, 2:14am

Just want to see if there is any update on this

Nosferican · November 29, 2019, 9:27am

using LibPQ: Connection
conn = Connection("""host = wrds-pgdata.wharton.upenn.edu port = port
                     user='username' password='password'
                     sslmode = 'require' dbname = wrds
                  """)

Yifan_Liu · November 30, 2019, 5:43am

when I tried to install LibPQ on Julia 1.2, I got the error message:

ERROR: Pkg.TOML.ParserError(275, 278, “redefinition of table 3.0”)

…and 1 more exception(s).

Nosferican · November 30, 2019, 5:56am

There seem to be an issue with your active environment Project.toml. Use a new environment for that project or alternatively, clean up that Project.toml to figure out the bas entries.

Yifan_Liu · November 30, 2019, 6:07am

how to clean up the Project.toml

Nosferican · November 30, 2019, 6:27am

You can just delete it (rename it to _Project.toml if you want to keep a copy) and create a new one by

using Pkg
Pkg.activate("path/to/prj/dir")
Pkg.add("MyFirstDep")

That should do it.

Yifan_Liu · November 30, 2019, 6:54am

Thank you. I got another problem

res = execute(wrds, “select column_name
from information_schema.columns
where table_schema=‘crsp’
and table_name=‘msf’
order by column_name”)

crsp_msf_var = DataFrame(columntable(res))

i need to specify the number of rows to be downloaded. sometimes the data could be too large, and i just need to see a part of it. for example, something like this:

crsp_msf_var = DataFrame(columntable(res, nrow = 10))

would be great

Nosferican · November 30, 2019, 7:24am

You can do so directly in SQL by adding

res = execute(wrds, “select column_name
from information_schema.columns
where table_schema=‘crsp’
and table_name=‘msf’
order by column_name
limit 10”)

Yifan_Liu · November 30, 2019, 7:31am

thank you so much! this may be unrelated to this post, just want to know if DataFrame is optimized for good performance or if there is any other fast data structure for tabular data in Julia.

Nosferican · November 30, 2019, 7:40am

DataFrames.jl is one of the oldest packages in the data ecosystem and is currently the flagship when it comes to in-memory tabular data. It is quite efficient and has tons of features. However, for efficiency the optimal course would be to run operations in the performant database when possible (e.g., a relational database such as Postgres with relevant indices). For example, run any select/filter/joins/sort operations in the database when possible and work with the results with DataFrames. Same advice applies to most tabular ecosystem no matter how efficient those are (e.g., same applies to R’s data.table).

Yifan_Liu · November 30, 2019, 7:57am

Is there a a package in julia that is similar to data.base?

Nosferican · November 30, 2019, 9:37am

What do you mean by that?

Yifan_Liu · November 30, 2019, 1:55pm

I meam data.table, sorry sbout the typo

Nosferican · November 30, 2019, 8:26pm

In general DataFrames.jl offers efficient in-place operations à la data.table. As for indices, IndexTables.jl does a bit of that.

Topic		Replies	Views
How to get ODBC to work with postgresql Data	1	1918	April 20, 2017
Lack of stable PostgreSQL-communication a major showstopper Data	33	4822	January 8, 2021
What is the currently recommended way to access a postgres database? Data data , postgresql , sql , database	3	1318	February 23, 2022
Easiest and most complete package for PostgreSQL right now (Feb 2022)? General Usage postgresql	48	7203	July 11, 2025
Suggestions for relational databases Offtopic question , package , data_structures	18	1417	June 22, 2021

Accessing Postgresql via Julia

Related topics