ClickHouse database and Julia

Yifan_Liu · August 29, 2018, 11:24pm

For the last 3 years, I have been tortured by the slowness of PostgreSQL, and recently I came across an open-source columnar database called ClickHouse. It is unexpectedly fast, especially when I need to run a lot of queries.
I understand that Julians are at the moment still working on APIs for some traditional SQL databases, but ClickHouse really deserves some attention.

https://hackernoon.com/clickhouse-an-analytics-database-for-the-21st-century-82d3828f79cc

Some benchmarks:
https://clickhouse.yandex/benchmark.html

ExpandingMan · August 29, 2018, 11:39pm

Hm, interesting. I’ve recently started to have to use Postgres more, and indeed I am getting nervous that it is going to be painfully slow.

It has a JDBC driver as well as an HTTP interface, so you can already use it with Julia if you really want to (should be quite trivial to use JDBC.jl for it). I don’t see docs here for the C or C++ API.
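To make the HTTP route concrete, here is a minimal sketch in Python (not Julia) that only builds the request URL and sends nothing. It assumes a ClickHouse server on localhost listening on the default HTTP port 8123; the helper name `clickhouse_query_url` is hypothetical. The same request could be issued from Julia with any HTTP client.

```python
from urllib.parse import urlencode


def clickhouse_query_url(sql, host="localhost", port=8123):
    """Build a GET URL for ClickHouse's HTTP interface (nothing is sent).

    host and port are assumptions: a local server on the default HTTP port.
    """
    # The SQL text travels in the `query` URL parameter; urlencode escapes it.
    return f"http://{host}:{port}/?{urlencode({'query': sql})}"


url = clickhouse_query_url("SELECT 1")
print(url)
```

Against a running server, `urllib.request.urlopen(url).read()` would return the result as plain text. GET requests are treated as read-only, so statements that modify data have to be sent with POST instead.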

I was recently looking through some of the ArangoDB documentation, but so many people around me seem to have a phobia and/or hatred of non-tabular formats.

Yifan_Liu · August 29, 2018, 11:50pm

I haven’t used PostgreSQL at all since I started using MonetDB, which I thought was fast enough for me. Then I tried ClickHouse, and it felt lightning fast, though its SQL dialect seems a little bit different.

Both Python and R now have APIs for MonetDB and ClickHouse. I really think the tidyverse ecosystem in R makes a huge difference when it comes to using external databases. It lets me use dplyr syntax instead of having to learn a new SQL dialect every time I want to try a new SQL database. The dplyr code is translated into SQL, the translated SQL runs inside the database, and there is no performance loss. Hopefully, Julia will have a unified data science ecosystem like this.
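As a rough illustration of the translation idea, here is a toy sketch in Python. It is not how dplyr is implemented, and the `LazyTable` class, its methods, and the `trips` table are all hypothetical. Each chained call only records what was asked for, and SQL text is generated at the end, so the database would do all of the work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LazyTable:
    """Hypothetical lazy table: records operations, runs nothing itself."""
    name: str
    columns: tuple = ("*",)
    conditions: tuple = ()

    def select(self, *cols):
        # Return a new object; the original query is left unchanged.
        return LazyTable(self.name, cols, self.conditions)

    def filter(self, condition):
        # Conditions are raw SQL fragments here; no escaping is attempted.
        return LazyTable(self.name, self.columns, self.conditions + (condition,))

    def to_sql(self):
        sql = f"SELECT {', '.join(self.columns)} FROM {self.name}"
        if self.conditions:
            sql += " WHERE " + " AND ".join(f"({c})" for c in self.conditions)
        return sql


query = LazyTable("trips").filter("distance > 10").select("id", "distance")
sql = query.to_sql()
print(sql)
```

Only the final string would be sent to the database, so swapping in a different SQL dialect would mean changing `to_sql` rather than the code that builds the query.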

scelles · June 5, 2019, 2:49pm

@ExpandingMan If you are nervous about Postgres being painfully slow and want to deal with time series, maybe you should try TimescaleDB. See "TimescaleDB vs. PostgreSQL for time-series".

phelipe · June 5, 2019, 3:10pm

There is a package:
