Please help with my first research paper (on distributed computing, implemented in Julia)

I am finishing my very first research paper at the age of 42 :slight_smile: , with the help of a friend.

It is about automatically restructuring complex, long-running multi-threaded/distributed computations on the fly, to minimize communication costs.

We have used the actor model to describe the optimization I developed, although it may be useful in other concurrency models too. The reason is twofold: First, the actor model is the only one I am familiar with. Second, I feel, maybe ignorantly, that it is superior to all the others. In the introduction of the paper, we try to explain this second reason.

It would be very helpful to get a review of the following excerpt:

1. Introduction and related work

Actor-based concurrency models [1] have been used for decades for scalable distributed applications [11]. Actors - the primitives of concurrency - encapsulate their state, communicate through asynchronous messaging and form arbitrary topological relations. Various frameworks and languages permit actor programming, including Akka [15], CAF [7] and Pony [10]. Applications include banking and telecom transaction processing, complex event stream processing and large-scale analytical pipelines. The concurrency model of microservice architectures [8] corresponds to the actor model, and actor frameworks can be applied directly in cloud environments (e.g. Orleans [4]).

Driven by the popularity of cloud and Internet of Things (IoT) solutions and the stagnating performance of single CPU cores, the last few years have seen an increased interest in actor systems.

We believe that actor systems also have great potential for artificial intelligence, by providing an efficient tool to incorporate sparsity into deep learning.

1.1. Why actors?

Programs built using other programming models - especially the synchronous ones - may be easier to reason about, but the actor model allows unlimited scaling and a variety of performance optimizations thanks to a few key properties:

    1. No shared state: An actor can access only its own state directly; everything else must be done through messaging. Shared state is an abstraction notorious for introducing hard-to-find bugs, called data races, in concurrent programs. Actor programming does not expose the programmer to the risks of shared memory, leaving shared memory to automatic performance optimizations.
    2. No global synchronization mechanism: Synchronization must be implemented at the actor level, using the fact that message processing of a single actor is serializable.
    3. Location transparency: The act of sending a message does not depend on the location of the target actor - sending messages within a machine is the same as between machines.
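
To make these three properties concrete, here is a minimal sketch in Python. It is for illustration only and is not the Julia implementation from the paper. The `CounterActor` class, its message tuples and the `run_demo` helper are hypothetical names, and a plain thread with a queue stands in for a real actor runtime.

```python
import queue
import threading


class CounterActor:
    """Hypothetical actor: private state, one mailbox, one message at a time."""

    def __init__(self):
        self._count = 0                # private state, touched only by the actor's thread
        self._mailbox = queue.Queue()  # asynchronous messages are queued here
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, message):
        """Asynchronous send: enqueue the message and return immediately."""
        self._mailbox.put(message)

    def _run(self):
        # Messages are handled strictly one after another, so no lock is needed
        # around self._count even with many concurrent senders.
        while True:
            kind, payload = self._mailbox.get()
            if kind == "add":
                self._count += payload
            elif kind == "get":
                payload.put(self._count)  # reply through the sender's own queue
            elif kind == "stop":
                return


def run_demo(senders=4, messages_per_sender=1000):
    """Several threads send increments concurrently; returns the final count."""
    actor = CounterActor()

    def producer():
        for _ in range(messages_per_sender):
            actor.send(("add", 1))

    threads = [threading.Thread(target=producer) for _ in range(senders)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    reply = queue.Queue()
    actor.send(("get", reply))  # queued after every "add", so it sees all of them
    total = reply.get(timeout=5)
    actor.send(("stop", None))
    return total
```

Senders only ever call `send`, so they never touch the counter directly and need no lock of their own. In this sketch the mailbox is an in-process queue; under the location transparency assumption, the same `send` call could equally be backed by a network connection without the senders changing.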

Global synchronization performance degrades as the physical diameter of the system grows, because information cannot travel faster than light. Similarly, providing the illusion of synchronous shared state - which does not exist in reality - is only possible by introducing a latency proportional to the diameter of the subsystem containing the state. Not having these features allows the actor model to scale arbitrarily without performance loss.

The third property, location transparency, allows the execution environment to optimize actor placement and message passing at run time without the actors noticing.

After reading this, do you think that the actor model may be the right way to go when unlimited scalability is the goal? And would you continue reading?

Thanks in advance!

(Any help with the full text would also be much appreciated, if someone is interested in the topic and feels like reading the 8+ pages)


I read it neither completely nor carefully, but I feel that a table comparing the amount of communication and the runtime before and after optimization would be good :slight_smile: Just so that you can show the dumb ones among us: “Look, we are 111% better than before!!!”. Also, the figure descriptions could be a bit more informative (axis=scheduler?).


Yeah, I felt that doing that would be a bit unfair, as I am the author of the software, and I may have - even unconsciously - tuned it in a way that exaggerates the results. I wanted to avoid that, especially because I can demonstrate a 5200x speedup. :slight_smile:

Now I also understand the main point of your suggestion: A table or diagram is something you can find without scanning through the text for numbers, so including one may “convert” you into reading the paper.

Those numbers are less distorted, so I will draw a plot.

Thanks again!
