A serious data start-up structured around a Julia data manipulation framework for larger-than-RAM data

For a somewhat complementary direction, see also [ANN,RFC] DBCollections.jl – use Julia data manipulation functions for databases. DBCollections lets you apply regular Julia operations to SQL databases, which is nice for big, out-of-memory, or remote data.
It should be fully composable!
Some features, such as joins or arbitrary Julia UDFs, are possible but not implemented yet (it’s just ~200 LOC).

It would be nice to understand and list the specific limitations of using SQL databases compared to querying pure-Julia structures, as @xiaodai plans. Some of those limitations may well be realistic to overcome!
