What is the most straightforward, easiest and most complete package to use for PostgreSQL?
This may pertain to documentation quality too.
More particularly, it should allow me to also parse SQL/PostgreSQL syntax embedded. For example,
@sql select * from somedatabase #etc
When I searched this forum, there were a number of threads, and online I found information here and there, but it’s not clear to me what I should choose, keeping those criteria in mind.
I think I saw a few main options: ODBC.jl, LibPQ.jl, PostgreSQL.jl. Maybe Octo.jl. Can you clarify, keeping those criteria in mind?
If you just need to talk to Postgres and are happy writing raw SQL code, LibPQ.jl is the library to use. This is fairly low-level and you’ll need to write SQL code as strings within your Julia source code. If you need to talk to many different databases, ODBC.jl is a good option to try.
For higher level user interfaces you’ve got several options. If you’re happy working directly with relational algebra and tabular data:
Octo.jl is a DSL for psuedo-SQL embedded within Julia code. I haven’t tried this, but the examples look nice.
FunSQL.jl is a composable way of writing queries as Julia code in pipelined form. I really like this approach and syntax, but again, haven’t tried it for a serious project yet.
If you just want to write SQL code but with safe string interpolations, my library SQLStrings.jl is a minimal approach to avoiding SQL string escaping problems. It can be combined with LibPQ or other SQL libraries quite easily.
If you want your database tables to be automatically mapped into Julia structs I think SearchLight.jl from the Genie framework can do this. There’s probably other options too, for example using combining StructTypes.jl together with some utility code.
Likely I’ve missed out on some tools here. Perhaps others can add extra suggestions.
I vote for FunSQL.jl + LibPQ.jl.
I use FunSQL to create queries in a generic way utilizing the package’s Julia-based DSL for writing generic and composable queries.
Then, I render the FunSQL query to correct PostgreSQL syntax.
Finally, I execute this rendered SQL using LibPQ.jl.
The documentation is top-notch, the examples there are great for learning, the project is continuing to be developed by @xitology and @cce (see them on the Julia Zulip here: Zulip), and the underlying API is getting close to stabilizing to a 1.0 release.
I am using it for serious projects and I am beyond impressed with its flexibility and generality.
It is a key component of my workflow.
Thanks, @c42f. Very much appreciated. I had the impression LibPQ.jl did not parse SQL syntax, according to its github page. Did I read this wrongly, under their heading “Non-Goals” or did I misunderstand something? To be honest, I don’t know much about LibPQ, and LibPQ.jl doesn’t seem to have any documentation. PostgreSQL’s docs about LibPQ are not very helpful to me right now, I think.
@TheCedarPrince, thanks again. FunSQL.jl caught my eye before, and it looks very good, but I am really looking for straightforward SQL parsing as strings without too much fuss of having to install all sorts of things. Tell you the truth, I’m using raw SQL to get used to its syntax, but psql’s command line program is not very quick for me, so I was thinking to use raw SQL embedded in Julia for now.
LibPQ itself doesn’t parse SQL syntax, but this is unnecessary because the postgres server does the parsing.
postgres gives reasonably good error messages, so this really isn’t a problem. For example:
julia> conn = LibPQ.Connection("postgres://postgres:$password@localhost:5432/postgres");
julia> LibPQ.execute(conn, "select blah where")
ERROR: SyntaxError: ERROR: syntax error at end of input
LINE 1: select blah where
^
When using SQL as strings embedded in your program, never use string interpolations. The README in SQLStrings.jl describes why not and how to fix this:
To expand the description there a bit, suppose you had the following query with unsanitized user input taken from some function get_data_from_user():
name, age, class = get_data_from_user()
query = "INSERT INTO Students VALUES ('$name', $age, '$class')"
LibPQ.execute(connection, query);
Now suppose the user enters their name as the string
name = "Robert'); DROP TABLE Students; --"
along with age = 15 and class = "A"
Now the query becomes (with newlines added for clarity)
INSERT INTO Students VALUES ('Robert');
DROP TABLE Students;
--', 15, 'A')
Hey presto! The Students table is now gone from the database.
SQLStrings exists to solve this problem by letting you write almost the exact same syntax but safely:
query = sql`INSERT INTO Students VALUES ($name, $age, $class)`
[Edit: Note the lack of quoting around $name in the sql`...` version. This is easier to use and more correct.]
I am also using some database tools for data discovery / testing connections / pre-testing SQL queries / checking table names , columns / visualisating geometries. My favorite:
Yes there’s many aspects to security. It’s hard to provide firm guidance without knowing @Ouji’s use case better.
But given the nature of the questions here I assume they’re not a DB administrator, rather using the DB as a client?
From a DB client perspective
If you’re writing an application, beware of data sanitization and SQL injection. Any of the above user-level tools should manage this for you (As appropriate for a low-level library, LibPQ expects you to manage this yourself).
If you’re using the DB directly as an analyst and not exposing your code to input from the outside, you’ve got nothing much to worry about. (Other than the obvious - protect your password etc.)
BTW, I should point out this isn’t just for security. It helps with correctness too by avoiding the need to quote the interpolations in your queries.
@c42f@ImreSamu Thank you. I am not a database administrator. My use for now is really general and mostly about just using raw SQL syntax in Julia as directly and as simply as possible, almost as though I’d be using psql directly.
The main reason for now to use Julia with SQL instead of some graphical interface or psql, is convenience. I like Julia’s terminal program much more than psql, and I intend to use Julia and SQL together at some point anyway.
@TheCedarPrince Heh. Are there any other jokes to be found like that?
I think I probably wouldn’t be using any string interpolations for the time being, but it’s good that you mentioned it.
I think it was you who mentioned that your SQLStrings.jl can be easily combined with LibPQ.jl. So in my case, seeing that I merely want to use raw SQL as directly and as simply as possible in Julia for now to get more used to SQL, what do you ultimately recommend? Should I just go with LibPQ.jl, and add your SQLStrings.jl later?
What about ODBC.jl? I went through their documentation, and it seemed simple enough to set up and use, but I fear it would be trouble. Overall, how does ODBC.jl compare to LibPQ.jl in terms of easiness, straightforwardness, and documentation?
@c42f I had some trouble with LibPQ.jl actually. Also, from what I’ve seen, there is hardly any documentation for it. I checked their docs, and I managed to finally connect. Would it be too much to ask for a simple example or examples showing how to initially connect, say, bob who would be a data analyst and only user?
Another thing is I don’t know how to execute the simple common commands using \ which is straightforward in raw Postgres. (For example, \d to describe something.) I thought about using an escape character, but I haven’t yet come to it. So far it’s been impractical for me.
Even another thing is that if you pass a query into a DataFrame, the data in rows and columns (especially a String) is not shown fully horizontally, if long. How can I force that long data isn’t cut off, or even that it’s wrapped? Are there any handy functions already available?
ASK the connection info from the Database Administrator
test the connection info with psql command line tool - because the SSL connection has a lot of complexity - or some info from the Administrator is obsolete or wrong …
Thank you for your long and informative post. Very much appreciated.
Yes, I’d seen that about dbname and user, and a password was asked, and I managed to connect. I think the issues left are more of a practical kind. I asked for the example earlier, because somehow his example didn’t work at all for me.
Those environment variables you mention are pretty nice. I didn’t notice them before in the Postgres docs.
I think what’s left at least for now is those two other issues, but they probably don’t belong in this thread.
Maybe my approach to choosing to use SQL and Julia together has been wrong. Maybe I should’ve thought more about it before I went for LibPQ. So I’ll probably make a new and more general thread about that, which would also include which editor/notebook software to use.
I’ve written a script that generates Julia structs that map to DB table record it’s pretty straightforward to do with just manipulating expressions once you have all the information, which types are nullable, etc. It’s for work so I can’t share it unfortunately.
FunSQL.jl looks super useful as well but I haven’t gotten to integrating it with the current setup.