I’m very pleased to announce the new package Rembus.jl.
Rembus provides RPC and Pub/Sub communication styles to streamline the implementation of distributed applications.
Many features are already built into Rembus; what is still missing are user guides and API documentation, which are on the roadmap.
In the meantime, let me show how cumbersome it is to implement an RPC service with Rembus.
Start a broker:
julia -e "using Rembus; caronte()"
In a script or in a REPL, implement an RPC service that returns a DataFrame:
# rpc_service.jl:
using Rembus
using DataFrames
function build_dataframe(ncols=5)
    df = DataFrame(name=["point_$i" for i in 1:ncols], x=1:ncols, y=rand(ncols))
    return df
end
@expose build_dataframe
forever()
To invoke the build_dataframe RPC method in another REPL execute:
using Rembus
df = @rpc build_dataframe(10)
The Pub/Sub style follows the same pattern; see the Rembus README for an example.
I think that Rembus could be a great addition to the Julia ecosystem. I’m not aware of a framework that beats Rembus in simplicity and that offers both worlds: RPC and Pub/Sub communication styles.
But I’m obviously quite biased about that; the community response will be the true judge.
The goal of Rembus is to implement distributed applications using both the Pub/Sub and RPC communication patterns with a simple and thin API, and to provide built-in fault-tolerance features for connection management.
Rembus is a very young framework, and so far the main focus has been on designing the most natural and simple API possible. There are no benchmarks yet that compare Rembus with other well-known solutions, but I can say that it is designed to be used in an enterprise context.
About the zero-copy feature: CBOR is pretty smart at reducing memory usage during encoding and decoding, but memory allocation does happen. Let’s see whether this is a real concern; it could be an evolution worth considering.
At this stage it would be very presumptuous to compare Rembus with mature and battle-tested frameworks, but just to get a partial comparison with some well-known technologies, see the table below.
Please note that gRPC and Kafka are products/frameworks, whereas MQTT and JSON-RPC are protocol specifications.
| feature | gRPC | Kafka | JSON-RPC | MQTT | Rembus |
|---|---|---|---|---|---|
| publish/subscribe | no | yes | no | yes | yes |
| RPC: method call on remote machine | yes | no | yes | no | yes |
| binary data | yes | yes | no, Base64 encoding | yes | yes |
| built-in DataFrames support | no | no | no | no | yes |
Rembus.jl is the Julia version of the Rembus protocol but the roadmap is to have the API implemented for other languages.
The next API implementation will be a Python package for Rembus and then a Rembus Rust crate will follow.
Dagger.jl is a scheduler for distributing and coordinating parallel computations between processes and threads, using CPUs and GPUs.
I don’t know Dagger; from skimming the docs, the computations appear to be performed by workers written only in Julia.
Rembus.jl is a middleware for distributing structured messages and coordinating processes using two communication patterns: RPC and Pub/Sub.
Its goal is to enable a microservice architecture: with CBOR and WebSocket as protocols, it will be possible to implement a distributed system using any language that implements the Rembus specification.
RPC is a one-to-one Request/Response pattern, typically used when a process implements a service and the RPC clients expect a response after invoking it.
There can be variants of this basic feature. For example, Rembus may broadcast the request to all interested processes if the request returns a response and not an error. Such a feature may sound quite unconventional, but it is quite useful, for example, for semi-real-time synchronization of distributed applications when an RPC command has a side effect that needs propagation.
Also, the one-to-one RPC link between client and server is not fixed: because it is mediated by a broker, there can be many RPC servers exposing the same service, with each request routed using first-up, round-robin or less-busy logic.
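To make the three routing policies concrete, here is a minimal sketch in plain Julia. It is not Rembus code: the `Server` and `Router` types and the functions `first_up`, `round_robin!` and `less_busy` are hypothetical names invented for this illustration, and the way a real broker tracks connected servers and in-flight requests is an assumption.

```julia
# Hypothetical types for illustration only; none of these names come from Rembus.
mutable struct Server
    name::String
    up::Bool       # whether the server is currently connected to the broker
    pending::Int   # number of requests currently in flight
end

mutable struct Router
    servers::Vector{Server}
    last::Int      # index chosen by the previous round-robin pick (0 = none yet)
end

Router(servers) = Router(servers, 0)

# first-up: the first connected server in registration order
function first_up(r::Router)
    i = findfirst(s -> s.up, r.servers)
    return i === nothing ? nothing : r.servers[i]
end

# round-robin: the next connected server after the previously chosen one
function round_robin!(r::Router)
    n = length(r.servers)
    for step in 1:n
        i = mod1(r.last + step, n)
        if r.servers[i].up
            r.last = i
            return r.servers[i]
        end
    end
    return nothing
end

# less-busy: the connected server with the fewest in-flight requests
function less_busy(r::Router)
    candidates = filter(s -> s.up, r.servers)
    isempty(candidates) && return nothing
    return candidates[argmin([s.pending for s in candidates])]
end

router = Router([Server("a", false, 0), Server("b", true, 3), Server("c", true, 1)])
```

With this setup, `first_up(router)` skips the disconnected server "a", repeated calls to `round_robin!(router)` alternate between "b" and "c", and `less_busy(router)` prefers "c" because it has fewer pending requests.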
Pub/Sub is a one-to-many pattern where one process produces a message and zero, one or more interested processes receive that message at the same time. The Pub/Sub pattern may be further customized, for example by providing a “retroactive” feature for a process that wants to receive the messages published while it was offline, instead of receiving only the messages published after it connects to the broker.
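The difference between a plain and a “retroactive” subscriber can be sketched with a toy in-memory broker in plain Julia. This is only an illustration of the idea, not how Rembus implements it: the `Broker` and `Subscriber` types and the `connect!` and `publish!` functions are hypothetical, and for simplicity the sketch replays the whole message log on connection rather than only the messages missed since the last disconnection.

```julia
# Hypothetical broker model for illustration only; not Rembus code.
struct Subscriber
    name::String
    retroactive::Bool        # replay past messages when connecting?
    inbox::Vector{String}    # messages delivered to this subscriber
end

mutable struct Broker
    log::Vector{String}          # every message published so far
    online::Vector{Subscriber}   # currently connected subscribers
end

Broker() = Broker(String[], Subscriber[])

# On connection, a retroactive subscriber first receives the stored messages.
function connect!(b::Broker, s::Subscriber)
    s.retroactive && append!(s.inbox, b.log)
    push!(b.online, s)
    return s
end

# Publishing stores the message and delivers it to every connected subscriber.
function publish!(b::Broker, msg::String)
    push!(b.log, msg)
    for s in b.online
        push!(s.inbox, msg)
    end
    return nothing
end

broker = Broker()
publish!(broker, "m1")   # published before anyone is connected
late = connect!(broker, Subscriber("late", false, String[]))
retro = connect!(broker, Subscriber("retro", true, String[]))
publish!(broker, "m2")   # delivered to both subscribers
```

Here `late.inbox` contains only "m2", while `retro.inbox` contains both "m1" and "m2".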
Being slightly more serious, Visor.jl is interesting. When exascale clusters were first discussed, the thinking was that the individual servers would never stay up long enough to complete a simulation (i.e. the hardware was not reliable enough). The concept of ‘Flock of Birds’ computing was mooted, where you launch an excess number of compute processes and expect some to be shot down. I do not know what happened to that.
Looks like those fears have not been realised.
I was also recently reading up about fault tolerance in MPI.
BTW Function As A Service (FASS). You saw it here first.
I have never approached MPI and HPC; I think fault tolerance in this domain deserves a specific implementation.
Visor.jl is geared toward supervising long-running tasks, potentially running for many days or years without any interruption of service noticeable by the users.
The goal of Visor.jl in my case is to have a lot of free time to enjoy reading about the investigations of John Rebus instead of investigating and debugging fatal crashes of 24/7 enterprise systems written in Julia.
I tried to look for a Python implementation, which I was expecting since the protocol is quite old (but it’s seemingly unavailable; should I read something into that, that the protocol is not used much, or mostly/only with .NET?). Are you saying you plan one for Python, or do you mean something else in relation to Rembus.jl?
[EDIT: I keep one link, because of the reply, but this is an unrelated project…] I did not know of the protocol even if it’s from 2012, and updated to version 2 in 2015:
At the moment there is a landing page giving some high level context for the project and the code snippets that demonstrate how to exchange a DataFrame and a dictionary between components written in Julia and Python.
Firstly, this looks like a great addition to the Julia libraries. Kudos.
One question on scanning your comparison table: how have you implemented gRPC, and is this as both a client and a server? I was under the impression it required quite an uplift on HTTP/2 before being viable.
Also, how would you call an RPC from another language (given your goal of a microservice approach)? Or is this what you announced above as an initial Python integration?
Rembus implements the Remote Procedure Call (RPC) communication pattern, as the Google project gRPC does (don’t be confused by the similarity between the RPC acronym and the name of gRPC, the Google framework that implements the RPC communication style).
They are two incompatible middlewares, each implementing the Remote Procedure Call pattern. For example:
gRPC requires a schema compiler and uses Protocol Buffers as its serialization protocol.
Rembus is a schema-less RPC (in the sense that it does not require a schema language and a third-party compiler) and it uses CBOR as its serialization protocol.
At the moment the Rembus license is AGPL, but in the future the licensing model may be changed; let’s see how things evolve.
Thanks for clarifying this. I did not mean to imply that Rembus would be an implementation for a gRPC server, but given that we don’t have a gRPC server implementation in Julia, having an alternative protocol (provided there will be implementations in other languages as well) would also be great. It would definitely make it easier for Julia code to be incorporated into existing architectures.
Really intriguing work! Incidentally, Rembus fulfills a need in a project of mine that has, so far, been unresolved; I’d also like to put my two cents in regarding the license, I would be quick to integrate this if it were to have a more permissive license (but will be checking it out regardless).
Regarding Visor, however, I see that it’s MIT licensed and I’m very excited to integrate this into my work, thank you very much for the great package and documentation!
I have made attempts at utilizing Julia for a process-based distributed system, but haven’t had success up to this point. Actors.jl has some very good work and comes closest to the process philosophy I’ve looked for, but contributions have slowed, and its API is very thorough but also complex. And then there is Nats.jl, which integrates the NATS messaging system into a Julia package. While NATS is an impressive communication system, there is still a gap in Julia’s ecosystem: that of a robust task supervisor.
Visor appears to most closely adhere to the Erlang-esque process management philosophy, and recent history seems to suggest that Erlang has the philosophy to beat. I’m pleased with the simplicity of the integration with Julia and the simplicity of the API. Thank you again for the great work, I’m excited to try it out!
It looks really cool! I am currently working on a distributed system (as a hobby project) consisting of Redpanda, Kafka Connect, TimePlus and / or QuestDB/Arroyo, which may be further accompanied by RelationalAI and Oracle databases. This project is Julia-centric, with most of the analytics and decision processes taking place in Julia. I am wondering, would there be a place for Rembus.jl in such a project? In Julia, I plan to do most of my work using DataFrames. Additionally, I am curious about how Rembus compares to a framework like Fluvio.io. I have read that Fluvio has a total cost of ownership (TCO) that is about 50 times better than Kafka, which is prompting my question about Rembus wrt such a metric.
It seems that Rembus fits well with your project, so I encourage you to give it a try.
Your feedback would be valuable!
About the comparison, in a few words:
Rembus is written in Julia, Fluvio in Rust.
Rembus has 3 GitHub stars, Fluvio has 2.6k, so consider yourself an early adopter and please be kind to Rembus.
Rembus does not have in-line computation as Fluvio does; ETL tasks are to be performed by the application.
Fluvio is a data streaming solution; Rembus enables data streaming and also request-reply communication patterns.
I don’t know how to connect Julia applications with Fluvio (searching for examples, I’ve not found anything), but with Rembus there is a simple API that makes it possible to set up a distributed system in a short time.
Finally there is also a feature in Rembus that I think is pretty unique, correct me if I’m wrong:
The components may use different transport protocols:
WebSocket
TCP
ZeroMQ
and may talk to each other regardless of the transport protocol each one is using.
The reason for this complication is that each transport protocol has characteristics that make it suitable for different types of contexts.
Consider this last feature experimental; let’s see whether it proves useful enough to be consolidated.
Thank you for the kind words. I do not consider myself an expert in this field, so please consider this conversation more as a short stream of kind questions than as feedback. What I can’t quite understand currently wrt Rembus is how I can get my data into Redpanda. Are you implying that I should use RDkafka.jl? I am currently feeding the system via WebSocket connections handled by MigratoryData Kafka edition. There is no Julia API for MigratoryData, so I wrote a short Python script that a) makes a WebSocket connection and b) when JSON messages are transported as arrays, passes each element on as a separate JSON message while doing some very basic parsing. Would it be possible to do such a thing with Rembus? Would Visor.jl be able to maintain the WebSocket connection?