JuliaDB as a web app backend


#1

Recently I’ve spent some time learning about JuliaDB and I’m very excited about the theoretical possibility of having a high-performance web stack written fully in Julia – that is, by replacing the various traditional RDBMS backends with JuliaDB.

I was wondering if anybody has experimented with JuliaDB in a high-traffic scenario (a large volume of small reads and writes), and whether there’s any prospect of using it as a viable CRUD (Create Read Update Delete) backend.

I’m thinking, for example, about SQLite, which is reported to be about 35% faster than the filesystem for reading and writing small blobs, thanks to fewer file open and close calls. Where does JuliaDB stand in this regard?

Or, on the read side: MySQL employs strategies like SQL query result caching, a useful approach given that web traffic produces many requests against the same data (web apps are basically lists of things).
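To make the caching idea concrete, here is a minimal sketch of a read-through query cache. It is written in Python against the standard-library `sqlite3` module only so that it runs anywhere; the `CachedQueries` class and the `posts` table are hypothetical names made up for illustration, and this is not how MySQL implements its cache. Any write simply clears the whole cache, which is the coarsest possible invalidation strategy.

```python
import sqlite3


class CachedQueries:
    """Hypothetical read-through cache keyed by (sql, params)."""

    def __init__(self, conn):
        self.conn = conn
        self.cache = {}
        self.hits = 0

    def read(self, sql, params=()):
        key = (sql, tuple(params))
        if key in self.cache:
            self.hits += 1            # served from memory, no DB round trip
            return self.cache[key]
        rows = self.conn.execute(sql, params).fetchall()
        self.cache[key] = rows
        return rows

    def write(self, sql, params=()):
        with self.conn:               # commit on success, roll back on error
            self.conn.execute(sql, params)
        self.cache.clear()            # coarse invalidation: drop everything


conn = sqlite3.connect(":memory:")
db = CachedQueries(conn)
db.write("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
db.write("INSERT INTO posts (title) VALUES (?)", ("Hello",))

query = "SELECT id, title FROM posts ORDER BY id"
first = db.read(query)                # goes to the database
second = db.read(query)               # served from the cache
db.write("INSERT INTO posts (title) VALUES (?)", ("World",))
third = db.read(query)                # cache was cleared, so this is fresh
```

The repeated list request is answered from memory, while the read after the second insert sees the new row because the write emptied the cache.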

Using JuliaDB for the web would provide another solution for the two-language problem, at the database level. The traditional approach is to use some SQL database (usually Postgres or MySQL) to power the web app, and then move the data through an ETL process into a data warehouse or some other system for analytics. As far as I can tell, only Cassandra has had some success as a real-time + analytics backend, but it hasn’t seen widespread adoption. A 100% Julia solution would have additional benefits, from setup and maintenance to using just one language across the board.


#2

I don’t think something like JuliaDB is the right tool for CRUD-type DBs. The whole column-oriented storage model is really targeted at analytics situations. I think for transaction-oriented/CRUD-style workloads you don’t want to go there. Plus, presumably once you start to think about writes, stuff like transactions, integrity etc. becomes important, none of which is supported by JuliaDB.
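A toy sketch of why the storage layout matters, in plain Python. It is only a model of the two layouts, not a description of how JuliaDB stores data internally, and all the names in it (`row_store`, `col_store`, the `visits` field) are made up for illustration. Inserting one record into the columnar layout touches every column, while an aggregate over one field touches a single array.

```python
# Row-oriented: each record is stored as one unit.
row_store = [
    {"id": 1, "name": "ann", "visits": 3},
    {"id": 2, "name": "bob", "visits": 7},
]

# Column-oriented: one array per field; a record is a position across arrays.
col_store = {
    "id": [1, 2],
    "name": ["ann", "bob"],
    "visits": [3, 7],
}


def insert_row(store, record):
    """Append one record; returns the number of separate writes."""
    store.append(record)
    return 1


def insert_col(store, record):
    """Append one record; every column array has to be touched."""
    for field, column in store.items():
        column.append(record[field])
    return len(store)


def total_visits_row(store):
    # Has to walk every full record to read one field.
    return sum(r["visits"] for r in store)


def total_visits_col(store):
    # Reads exactly one array and nothing else.
    return sum(store["visits"])


new_record = {"id": 3, "name": "cat", "visits": 5}
row_writes = insert_row(row_store, new_record)
col_writes = insert_col(col_store, new_record)
```

With many small single-record writes the per-column cost adds up, which is the CRUD case; with scans over a few fields of many records the columnar layout wins, which is the analytics case.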


#3

I tend to agree with David. JuliaDB is more an analytical database than a transactional database. If you want small, atomic updates you should definitely use Postgres, SQLite, or MySQL. However, if your updates are largish (e.g. merging a few MBs or more of new CSV data) and you want flexible analyses on largish datasets using Julia, JuliaDB is a good backend for that.
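For reference, this is roughly what a small, atomic update looks like with a transactional store. The sketch uses Python's standard-library `sqlite3`; the `accounts` table, the `transfer` helper and the amounts are hypothetical. Either both `UPDATE` statements are committed, or a constraint violation rolls both back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; returns True on success."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed; the credit above was rolled back too.
        return False


ok = transfer(conn, "a", "b", 30)       # fits within the balance
failed = transfer(conn, "a", "b", 500)  # would overdraw, so nothing changes
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

The second transfer credits `b` first and only then fails on the debit, so without the transaction the data would be left half-updated.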


#4

JuliaDB or not, the idea of stitching it all inside Julia is super cool!


#5

Thanks, that’s what I thought too. Columnar databases are definitely not fit for CRUD operations - I didn’t realize JuliaDB is column oriented. Still, having a JuliaDB implementation for the web would be absolutely amazing - it would massively simplify development and deployments. Maybe somebody will consider a fork at some point.


#6

Well, first of all I would say that relational databases are amazing pieces of technology, and for a simple CRUD web application you should not shun them but embrace them. That includes SQL, because if you want to access the full power of the DB you have to write the SQL yourself and not rely on an ORM. I know that most “modern” web frameworks now access the DB via an ORM and pretend you can talk to the DB as if it were just an object model in memory, but personally I think it creates more problems than it “solves”. But if you really want to pretend it’s all Julia, then the way to go is to write a Julia ORM so you can ignore SQL.

Having said that, for mixed workloads (OLTP + OLAP), where you do not simply load data from the DB but also run complex analytical operations (or just complex data wrangling that’s beyond current SQL) directly in the DB (i.e. not in the DB client’s memory, and possibly in parallel/multi-core), you would need the Julia analog of http://www.snappydata.io/ which is a fork of Spark. So indeed you would need a fork of JuliaDB that also has good data structures and operations for row-oriented data next to column-oriented data.


#7

For a real application, you’d likely need the data replicated, you’ll need transactions, etc.
There are also various good NoSQL database engines available with C APIs, for which it’s pretty easy to add a Julia wrapper, such as Aerospike (which I wrote a wrapper for at the place I was consulting for before).
Don’t discount Postgres; I think that’s one of the better choices (we also used Postgres with Julia), and you can run it in the cloud (important if you want to scale out your app).
Amazon is doing interesting things with MySQL and Postgres for AWS (AuroraDB).


#8

Thanks. I agree entirely. But that was not the point :slight_smile: I have been using MySQL and Postgres for many years (they’re awesome) and I have actually built an ORM for Julia (https://github.com/essenciary/SearchLight.jl).

My idea was to add JuliaDB as a backend to SearchLight, if it was a good fit. Setting up a local development environment or a production server, especially for a beginner, can be a daunting and time-consuming task. You need to install the correct version of your DB server, you need to set up proper access, you need to build the libraries to access it (which often ends in failure), et cetera.

Having a pure Julia solution, which would install painlessly (“magically” for a beginner) and would provide excellent performance, would be dreamy. I’m thinking now that an RDBMS might not even be needed: object storage might work better. So maybe something like JuliaDB for metadata storage, combined with JLD2 for object storage.