Julia in streaming - best practices?

Oliwia_Wojtkowska · March 22, 2017, 8:38am

Hi everyone! My team is thinking about using Julia for streaming in production. We are curious what you think about it: are there any best practices, or examples of other companies using it for streaming? We want to put Julia between two Kafka topics and do some computation.

Thanks for any advice!

dfdx · March 22, 2017, 1:40pm

I haven’t run it in production, but it should be pretty straightforward, provided that you:

Run as many workers as you have partitions and assign unique partition IDs manually.
Save offsets after each processed batch and restart from these offsets in case of failures.
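
The two points above can be sketched roughly as follows. This is a minimal, library-free illustration written in Python for brevity; the same structure should carry over to Julia. Everything in it is an assumption made for the example: a plain list stands in for a Kafka partition, a small JSON file stands in for the offset store, and the names `run_worker`, `load_offset`, `save_offset` and `fail_at` are hypothetical, not part of any Kafka client.

```python
import json
import os
import tempfile


def load_offset(path):
    """Return the next offset to read, or 0 if no checkpoint exists yet."""
    if not os.path.exists(path):
        return 0
    with open(path) as f:
        return json.load(f)["next_offset"]


def save_offset(path, next_offset):
    """Write the checkpoint via a temp file and rename, so it is never half-written."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"next_offset": next_offset}, f)
    os.replace(tmp, path)


def run_worker(partition_id, log, checkpoint_dir, batch_size=3, fail_at=None):
    """Process one manually assigned partition in batches.

    `log` is a list standing in for the partition (assumption).
    `fail_at` simulates a crash while processing the batch containing that offset.
    """
    path = os.path.join(checkpoint_dir, f"partition-{partition_id}.json")
    offset = load_offset(path)  # restart from the last saved offset
    results = []
    while offset < len(log):
        batch = log[offset:offset + batch_size]
        if fail_at is not None and offset <= fail_at < offset + len(batch):
            raise RuntimeError(f"simulated crash at offset {fail_at}")
        results.extend(x * 2 for x in batch)  # placeholder computation
        offset += len(batch)
        save_offset(path, offset)  # save only after the whole batch succeeded
    return results


def demo():
    log = list(range(10))  # messages at offsets 0..9
    with tempfile.TemporaryDirectory() as d:
        try:
            run_worker(0, log, d, fail_at=4)  # crashes inside the second batch
        except RuntimeError:
            pass
        return run_worker(0, log, d)  # resumes from offset 3


print(demo())
```

Because the offset is saved only after a batch completes, the batch that was in flight during the crash is processed again on restart. That gives at-least-once behaviour, so the computation or the downstream topic has to tolerate duplicates.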

Best practices for distributed applications also apply: health checks, logging, monitoring, etc. are still needed. It’s also worth monitoring the lag between the latest offset in a partition and the current offset on the consumer side.
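
As a rough illustration of the lag metric, here is a small Python sketch. It assumes both values are "next offset" numbers (the end of the partition log and the consumer's saved position), so their difference is the number of messages not yet processed. The function names and the sample numbers are made up for the example; in practice the offsets would come from the Kafka client or broker metrics.

```python
def consumer_lag(latest_offsets, committed_offsets):
    """Per-partition lag: latest offset in the log minus the consumer's offset.

    A partition with no saved offset is treated as starting from 0 (assumption).
    """
    return {
        partition: latest - committed_offsets.get(partition, 0)
        for partition, latest in latest_offsets.items()
    }


def lagging_partitions(lag, threshold):
    """Partitions whose lag exceeds the alerting threshold, in sorted order."""
    return sorted(p for p, n in lag.items() if n > threshold)


latest = {0: 1200, 1: 950, 2: 400}  # hypothetical log-end offsets
committed = {0: 1195, 1: 300}       # partition 2 has no saved offset yet
lag = consumer_lag(latest, committed)
print(lag)
print(lagging_partitions(lag, threshold=100))
```

A lag that keeps growing on one partition usually means its worker is stuck or too slow, which is easy to miss if only process liveness is monitored.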

oatlzzvztd · March 22, 2017, 3:24pm

Is there a good general reference text of best practices for distributed applications?

dfdx · March 23, 2017, 8:18am

There are many different kinds of distributed systems, e.g.:

streaming applications, in which you are mostly concerned with 24x7 uptime and storing intermediate results;
batch applications that act rarely, but do a lot of work at once;
microservices, where you mostly think about good system decomposition and the reliability of each service;
MPI-like systems that prioritize speed over reliability;
distributed databases whose main goals are data persistence and consistency, etc.

I don’t think there’s a single good reference text for all of them. In my experience, it’s easier to start with one concrete distributed system or application and move from specific to more general tasks as needed.
