[ANN] YAActL 0.2, Yet Another Actor Library (built in Julia)

I am happy to announce YAActL, a new actor library for concurrent programming in Julia.

An Actor is a computational entity that, in response to a message it receives, can concurrently:

  • send a finite number of messages to other actors;
  • create a finite number of new actors;
  • designate the behavior to be used for the next message it receives.

See: The Actor Model on Wikipedia.

YAActL

  • is available in the Julia registry,
  • is based on the Actor model,
  • builds on Julia’s multiple dispatch and
  • on Task, Channel and Threads.@spawn,
  • also has a little support for Distributed,
  • lets actors follow a messaging protocol for dispatching their behavior functions,
  • has an Erlang/OTP-inspired API and
  • has nice documentation.

Since it is quite new,

  • a supervisory tree and
  • a generic server behavior

are not yet implemented and will come in the next steps.

I think actors are a useful addition to Julia’s ecosystem. Please look at YAActL and tell me what you think. I would be happy if you try it out and find it useful.

I hope to get some feedback from you, help and/or collaborators sharing my excitement about actors.

10 Likes

This is great!

Have you seen the Swift concurrency manifesto? Might have some interesting ideas for this space. It also leans heavily on actor stuff.

1 Like

Thank you for the link. The paper is interesting and has a lot of references. I’m surprised to hear that Swift doesn’t have something like async/await as basic concurrency primitives (the paper is from 2017).

But I share very much the general direction of the paper:

We are focusing on task based concurrency, not data parallelism.

There should be more work done on concurrency, understood as “composition of independently executing processes” (see: Rob Pike’s presentation Concurrency is not parallelism)

Fortunately Julia has primitives for asynchronous programming: Task, @async, Channel, Threads.@spawn and the like. But working with those directly sometimes gets a little clunky.
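To illustrate what working with these primitives directly looks like, here is a minimal sketch using only Base: a Task serves requests from a Channel, and each request carries its own reply channel. The names `Request` and `adder_server` and the message layout are invented for this illustration and are not part of YAActL.

```julia
# A request carries its arguments plus a channel to send the answer back on.
# (Hypothetical message layout, chosen only for this sketch.)
struct Request
    a::Int
    b::Int
    reply::Channel{Int}
end

# Start a server task that adds numbers until its inbox is closed.
function adder_server()
    inbox = Channel{Request}(32)
    task = @async for req in inbox      # iteration ends when inbox is closed
        put!(req.reply, req.a + req.b)
    end
    return inbox, task
end

inbox, task = adder_server()
reply = Channel{Int}(1)
put!(inbox, Request(1, 2, reply))
result = take!(reply)                   # blocks until the server has answered
close(inbox)                            # lets the server loop finish
wait(task)
println(result)
```

Even in this tiny case the caller has to create the reply channel, close the inbox and wait for the task by hand, which is the kind of bookkeeping an actor API can hide.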

Therefore I think that an API could be helpful. I don’t know if YAActL is the best approach, but it builds on ideas from Erlang/OTP as something time-tested. And you can do the stuff Rob Pike talks about quite well with it.

There are other ideas around: from Akka/Scala, Go, Pony … some cited in the manifesto. Let’s play with them!

3 Likes

Btw, I created and registered Actors.jl (https://github.com/oschulz/Actors.jl) a long time ago, but it never got very far (it was before Julia had multi-threadable tasks).

If someone is interested, the package name is up for grabs (the last version was pre-Julia-v1.0), it could be replaced by a new actor package under the same name. Just let me know.

8 Likes

Do you plan to resume working on actors? What do you think of YAActL?

There is also Richard Palethorpe / Actors.jl · GitLab, but it doesn’t seem to be active now.

Other languages have several actor libraries and frameworks, and many of them seem to be based on community efforts (e.g. Rust’s Actix). I think that a community effort for a Julian actor library should take Actors.jl.

Maybe/hopefully YAActL.jl will evolve to it. After learning more about Erlang I plan to implement an actor supervision tree next. I welcome any help.

2 Likes

There’s also Luvvy: Use Actors liberally for robust, highly parallel Julia code (prototype).

I think that a community effort for a Julian actor library should take Actors.jl.

I fully agree. Personally, I like the actor model very much, but won’t be able to do anything in that direction in the foreseeable future. So if there’s a community push for a nice fully-fledged actors package I’ll be glad to let them have Actors.jl.

1 Like

Actors.jl used to be Luvvy, and Luvvy is now a web framework I only experimented with. I’m happy for the community to take Actors.jl if there is some momentum building around another project, of course. OTOH there are a bunch of very different paths you can take with an Actor model library, so perhaps it is more correct for projects to use names like Actix or Akka.

I do think Julia is a really great fit for the actor model, but there are some low-level issues to sort out around Channels and Tasks (well, at least there were; I haven’t had much time with Julia recently). Also I wonder whether Distributed could essentially have actor-like features added to it?

There are some design decisions to take, amongst them:

  • should it build on Julia’s primitives (Task and Channel)?
  • how is the message processing done (e.g. pattern matching, multiple dispatch …)?
  • should it be transparent between Threads and Distributed?
  • should it implement behaviors and how?
  • how should actors be isolated from one another?

I think many of those considerations are well described in Joe Armstrong’s dissertation: Making reliable distributed systems in the presence of software errors.

Armstrong gives a reason for doing it at all (p. 19):

In Concurrency Oriented Programming the concurrent structure of the program should follow the concurrent structure of the application. It is particularly suited to programming applications which model or interact with the real world. …

Therefore I think the design decisions in an actor library must be taken so as to facilitate concurrency and give it good support.

I agree!

1 Like

should it build on Julia’s primitives (Task and Channel)?

I can answer fairly certainly: yes with one exception.

I think it is acceptable, in fact necessary, to temporarily implement an alternative Channel to experiment with disruptive changes (I was considering using liburcu’s queue, which I did in another actor library I created, libactors: Actor model and message passing in C with Userland RCU). However, this should be upstreamed to Julia’s core, where it will benefit a lot more than just actors.

I guess for playing with alternate tasks/thread models you would have to fork Julia itself as there are some locks dotted around in the C code that need to be taken care of. However in any case you don’t want to be carrying those changes forever in a third party library if the core library has something like Task deeply wired into it.

how is the message processing done

Well my library abuses multiple dispatch, but also allows pattern matching. You can definitely have both here, it’s just a case of giving the user hooks lower into the stack.

https://palethorpe.gitlab.io/Actors.jl/reference/#Actors.listen!-Tuple{Scene}
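As a rough illustration of message processing via multiple dispatch (not the API of the library linked above, nor of YAActL), each message type can select its own handler method. The message types and the `handle` function below are hypothetical names made up for this sketch.

```julia
# Hypothetical message types; each one selects a method of `handle`.
struct Deposit
    amount::Int
end
struct Withdraw
    amount::Int
end
struct GetBalance end

# One method per message type: dispatch replaces a hand-written if/else chain.
handle(balance::Int, msg::Deposit)  = balance + msg.amount
handle(balance::Int, msg::Withdraw) = balance - msg.amount
handle(balance::Int, ::GetBalance)  = balance
# Fallback for unknown messages: keep the state unchanged.
handle(balance::Int, msg) = balance

# Fold a sequence of messages over the state, as an actor loop would.
messages = Any[Deposit(100), Withdraw(30), "noise", GetBalance()]
final_balance = foldl(handle, messages; init = 0)
println(final_balance)
```

Pattern matching on message contents could be layered on top, for example inside the fallback method, without giving up the dispatch on message types.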

Implementing Pony style behaviors should be possible, but I suppose it requires some fancy code generation which might make it a little magical.

should it be transparent between Threads and Distributed?
how should actors be isolated from one another?

Now these are really difficult ones! Ultimately, running on a different CPU core with shared memory and running on different computers are very different things. It might not be a good idea to create the illusion that they are the same thing.

I guess this list could be greatly extended. I still have some ideas about chaining messages to control nondeterminism, logging and such, which I haven’t had a chance to touch.

I think many of those considerations are well described in Joe Armstrong’s dissertation: Making reliable distributed systems in the presence of software errors.

Thanks, I will add that to my reading list.

1 Like

Location transparency is a key feature of the actor model. It allows the programmer to build the software out of small concurrent pieces (actors) without worrying about hardware. And it allows the runtime to optimize actor placement using knowledge about actual hardware.

Without location transparency, I do not see much gain with actors over async/threads/distributed.

2 Likes

Location transparency enables non-local error-handling. From Armstrong’s dissertation (p. 40):

When we make a fault-tolerant system we need at least two physically separated computers. Using a single computer will not work, if it crashes, all is lost. The simplest fault-tolerant system we can imagine has exactly two computers, if one computer crashes, then the other computer should take over what the first computer was doing. In this simple situation even the software for fault-recovery must be non-local; the error occurs on the first machine, but is corrected by software running on the second machine.

On the other hand local computations (multi-threading) are much more efficient for lightweight tasks. Therefore we must be able to decide if an actor should run locally or remotely but then communication between them must be transparent.

1 Like

I think there is more to the actor model than just location transparency, but maybe for practical usage it is what really matters. At any rate, you have to decide what happens when a user passes a pointer to mutable data in a message. If you want true transparency then you have to make all messages immutable or copy-on-write and prevent the user from passing references to local resources.

Frankly you can’t actually enforce that and there will be some crazy user (the cluster manager for a start) who wants to pass file descriptors (for example) in messages if they know the other actor is local to the machine or pointers/references if it is in the same thread. They may also want to create actors which are as local to the current one as possible or the opposite which is perfectly valid for performance or reliability respectively.

All I am saying here is that you can decide to have an API which allows the user to find out the locality of another actor and take advantage of it or you can not. On the happy path the locality of other actors doesn’t matter, so in that case you have full transparency, which I am not against.

Therefore we must be able to decide if an actor should run locally or remotely but then communication between them must be transparent.

Yes, but additionally you perhaps want a way to mark message types (or any type) as being local only or maybe requiring some processing before being transmitted remotely. For example if it contains local timestamps, these may need to be converted, which could be done transparently at a high level, but the user needs some way of implementing the conversion routines to make it transparent. I’m sure there is plenty of other stuff as well.
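One possible shape for such markers, sketched with a trait function and a conversion hook. Everything here (`Locality`, `locality`, `prepare_remote`, `outgoing`, the fixed UTC offset) is a hypothetical design for illustration, not an existing API.

```julia
using Dates

# Hypothetical trait: may a message of this type leave the local machine?
abstract type Locality end
struct LocalOnly <: Locality end
struct Anywhere  <: Locality end

locality(::Type) = Anywhere()           # default: messages may travel

# A message wrapping a file descriptor only makes sense on this machine.
struct FileHandleMsg
    fd::Int
end
locality(::Type{FileHandleMsg}) = LocalOnly()

# A message with a local timestamp needs conversion before it is sent away.
struct Stamped
    payload::String
    time::DateTime                      # local wall-clock time
end

# Hook the user can extend; the default passes the message through unchanged.
prepare_remote(msg) = msg
# Assumed conversion for this sketch: shift by a fixed UTC offset in hours.
prepare_remote(msg::Stamped; utc_offset_hours = 2) =
    Stamped(msg.payload, msg.time - Hour(utc_offset_hours))

# What a send routine could do before transmitting to a remote receiver.
function outgoing(msg)
    locality(typeof(msg)) isa LocalOnly &&
        throw(ArgumentError("$(typeof(msg)) must not be sent to a remote actor"))
    return prepare_remote(msg)
end

sent = outgoing(Stamped("hello", DateTime(2020, 11, 1, 12)))
println(sent.time)
```

With this layout the sender's code stays the same for local and remote receivers; only the types that need special treatment add a method.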

2 Likes

For sure, there are state-machines, servers, event-managers, agents … Actors have polymorphism and can represent all sorts of interacting concurrent entities. Location transparency is just one characteristic.

This is what Armstrong strongly recommends. If we don’t want it in a Julian library – mostly for performance reasons –, we have to tackle some complexity as you describe.

More examples:

  • if a local actor (with a local channel) has a request from a remote actor, it cannot send it its local channel (a mutable variable) for responding but has to create a remote channel and a forwarder actor to forward the response to the local channel.
  • how do we handle communication between (local) actors each on different nodes?
  • If we extend location transparency, actors from different concurrency oriented languages (COPLs) should be able to communicate. How do we handle that?
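The first case can be sketched with the Distributed standard library. For simplicity the RemoteChannel below lives on the current process, which is enough to show the wiring. The `:stop` marker and the variable names are assumptions of this sketch, not YAActL's mechanism.

```julia
using Distributed

# The local actor's mailbox: an ordinary Channel, which cannot be
# handed to another process.
local_inbox = Channel{Any}(32)

# A RemoteChannel can be referenced from other processes. Here it is
# created on the current process (myid()).
remote_inbox = RemoteChannel(() -> Channel{Any}(32), myid())

# The forwarder: a task that moves every message from the remote channel
# into the local one. It stops when it sees the (hypothetical) :stop marker.
forwarder = @async while true
    msg = take!(remote_inbox)
    msg === :stop && break
    put!(local_inbox, msg)
end

# A remote actor would be given `remote_inbox` as the reply address.
put!(remote_inbox, "response from far away")
received = take!(local_inbox)
put!(remote_inbox, :stop)
wait(forwarder)
println(received)
```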

At the moment I don’t know if an API can handle all such cases. They keep popping up. If not – maybe – we have to follow Armstrong’s recommendation.

1 Like

Here’s an update to the Swift concurrency manifesto, with the concrete plan. It leverages actors and structured concurrency:

Edit: Here’s the zoom in on actors https://github.com/DougGregor/swift-evolution/blob/actors/proposals/nnnn-actors.md

1 Like

Very interesting. When I first saw in their proposal …

actor class BankAccount {
  private let ownerName: String
  private var balance: Double
}

I felt that they are missing the point. After thinking some time about it I can explain this feeling and why I think we can do better.

Basically the question is what an Actor should be in a programming language among the other existing language constructs like Task, Function, Channel, Type, Class … or what it should represent. Gul Agha wrote:

An actor may be described by specifying:

  • its mail address, to which there corresponds a sufficiently large mail queue; and,
  • its behavior, which is a function of the communication accepted.

[Gul Agha: Actors, A Model of Concurrent Computation in Distributed Systems; MIT press, 1986, p 24]

And then he gave the following picture (p. 26):

An Actor is different from a Task since it has a mailbox (e.g. a Channel) and a behavior (a Function f(c) of the communication accepted). It can create Tasks or Actors. Therefore the main difference from a Task is that an Actor dispatches its behavior only after having received a message. (So far no talk about classes, data, inheritance … at all.) The next question is what happens when an Actor has processed a message. The replacement is deliberate: the actor can maintain or switch its behavior or hand over the communication to a newly created actor machine or simply stop.

The Swift proposal binds actors to objects with encapsulated functions. This may be consequential for an OO language. But …

I think that a Julian approach is to associate and start an Actor with a behavior Function. This returns a Channel representing the actor. The actor listens (as a Task) to its channel and dispatches its behavior with an incoming message. Then there must be functions like become for specifying the replacement (e.g. switching the behavior).
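A minimal sketch of that idea with Base primitives only. The control messages `Become` and `Stop` and the functions `spawn_actor` and `become!` are hypothetical names for this illustration and should not be read as YAActL's actual implementation.

```julia
# Control messages understood by every actor in this sketch (hypothetical).
struct Become
    behavior::Function
end
struct Stop end

# Start an actor with a behavior function; the returned Channel is its address.
function spawn_actor(behavior::Function)
    mailbox = Channel{Any}(32)
    @async begin
        bhv = behavior
        while true
            msg = take!(mailbox)
            if msg isa Become
                bhv = msg.behavior      # replacement: switch the behavior
            elseif msg isa Stop
                break                   # replacement: simply stop
            else
                bhv(msg)                # dispatch the behavior on the message
            end
        end
    end
    return mailbox
end

become!(actor::Channel, f::Function) = (put!(actor, Become(f)); nothing)

# Usage: results are collected in a channel so the caller can observe them.
out = Channel{Any}(32)
actor = spawn_actor(x -> put!(out, x + 1))
put!(actor, 1)                          # handled by the first behavior
become!(actor, x -> put!(out, x * 10))
put!(actor, 1)                          # handled by the new behavior
put!(actor, Stop())
results = [take!(out), take!(out)]
println(results)
```

Because the mailbox is processed one message at a time, the behavior switch takes effect exactly between the two `1` messages.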

We have all the needed ingredients to implement that and it is more straightforward and aligned with the Actor model. This makes Julia

Finally:

  • reading such proposals in a foreign programming language is interesting since it makes one think and go back to the drawing board,
  • I agree with most of what they write about basic actor isolation but I am sceptical about their “Full Actor Isolation”. It will be interesting to see what they come up with.

2 Likes

I felt that they are missing the point

Yes, I like that you are taking this back more to the formal Actor model specified by Gul and Co. I wasn’t quite sure how far to take that, so my library doesn’t have any concept of behaviors and state transitions. I’m not sure that Erlang really leverages the early research on the Actor model either.

Speaking of Erlang, I like the concept of protocols in Armstrong’s thesis. This sounds somewhat similar to an idea I had, which I call ‘dialogs’: essentially contracts specifying how one or more actors interact during a sequence of message exchanges. These could specify what state(s) an actor should be in after receiving a message, what messages it may send in response, and timeouts. These would be written separately from the actors’ internal logic, so the user can specify at a high level how actors should behave, separately from the implementation.

The number one issue I have had constructing actor systems is debugging the scenario where all messages stop, but I’m expecting the system to do something. Often I have a chain of actors, so it is not enough to simply expect one actor to respond to another. Then when other people come to read my code they have to piece the resulting system behavior together from the individual message handler code (unless I write a high level description).

An Actor is different from a Task since it has a mailbox (e.g. a Channel ) and a behavior (a Function f(c) of the communication accepted).

Something else which I find interesting about this is the possibility to process messages to a single actor in parallel. Essentially once an actor has decided whether or not a message should update its behavior (or state) it may start processing the next message in parallel. It is only when a behavior change is required that it should wait for all parallel messages to finish processing before continuing. This means that you don’t need N or more actors to take advantage of N processors.
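A rough sketch of this scheme, under the assumption that ordinary messages are independent of each other: each one is handed to its own task, and a behavior change first waits for all in-flight tasks. `Become`, `Stop` and `spawn_parallel_actor` are hypothetical names.

```julia
struct Become
    behavior::Function
end
struct Stop end

function spawn_parallel_actor(behavior::Function)
    mailbox = Channel{Any}(32)
    task = Threads.@spawn begin
        bhv = behavior
        pending = Task[]
        while true
            msg = take!(mailbox)
            if msg isa Become || msg isa Stop
                foreach(wait, pending)  # barrier: let in-flight messages finish
                empty!(pending)
                msg isa Stop && break
                bhv = msg.behavior
            else
                f = bhv                 # the behavior valid for this message
                push!(pending, Threads.@spawn f(msg))
            end
        end
    end
    return mailbox, task
end

# Usage: four slow messages overlap; the behavior change waits for them.
out = Channel{Int}(32)
actor, task = spawn_parallel_actor(x -> (sleep(0.1); put!(out, x)))
foreach(i -> put!(actor, i), 1:4)
put!(actor, Become(x -> put!(out, 100x)))
put!(actor, 1)
put!(actor, Stop())
wait(task)
close(out)
results = collect(out)
println(results)
```

The first four results may arrive in any order, but the result produced by the new behavior always comes after them.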

Glad you found it helpful.

You might also be interested in the ensuing discussion(s):

Also regarding actor isolation, there’s an interesting new proposal by Chris Lattner, the creator of Swift, to amend the actor proposal:

Here are some related Julia efforts for the future regarding immutable state that may be helpful to keep in mind:

And finally, a discussion on structured concurrency:

and the relevant Swift proposal:

3 Likes

I find it interesting that many of those “old” ideas have not been fully exploited. Erlang was created at about the same time, so it’s not surprising that the behavior concept is not leveraged in it. But now we can try to exploit it.

Yes, communication must be constrained so that actors can operate as state machines. YAActL actors follow a message protocol that can be extended by the user. The API simply is an interface to the protocol.

Yes, debugging them is tricky. Have you noticed that the Erlang/Elixir people have graphical tools for it, like Appmon, Pman, a process-oriented debugger …? The interesting thing is that debugging actor systems will become much easier than debugging concurrent programs built on concurrency primitives. I think with actor monitors, ties and supervision in YAActL 0.3 this will already improve and be demonstrable:

Yes, as shown in Gul’s picture there are two ways for that: an actor can spawn tasks or it can start further actors. A typical application is a parallel server: when it receives a request, it starts another actor to handle it. Then it immediately returns to its message queue to look for the next request. An actor can also start parallel computations.
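The parallel server pattern can be sketched with a Task per request. The `Job` type and `parallel_server` are hypothetical, and a plain spawned task stands in for the worker actor here.

```julia
struct Job
    seconds::Float64                    # simulated amount of work
    reply::Channel{Symbol}
    tag::Symbol
end

function parallel_server()
    inbox = Channel{Job}(32)
    server = @async for job in inbox
        # Hand the request to a fresh worker and go straight back to the queue.
        Threads.@spawn begin
            sleep(job.seconds)
            put!(job.reply, job.tag)
        end
    end
    return inbox, server
end

inbox, server = parallel_server()
reply = Channel{Symbol}(2)
put!(inbox, Job(0.5, reply, :slow))     # arrives first, takes long
put!(inbox, Job(0.0, reply, :fast))     # arrives second, finishes first
answers = [take!(reply), take!(reply)]
close(inbox)
wait(server)
println(answers)
```

Since the server never waits for a worker, the quick request is answered while the slow one is still running.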

Thank you for mentioning that. It links to Structured Concurrency. This suggests that a task must finish before its parent task finishes:

Structured concurrency means that lifetimes of concurrent functions are cleanly nested. If coroutine foo launches coroutine bar , then bar must finish before foo finishes.

This is not structured concurrency:

[image: unstructured concurrency]

This is structured concurrency:

[image: structured concurrency]
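In Julia terms, the difference can be sketched with `@sync` and `@async` from Base. The functions `foo`, `bar` and `foo_unstructured` are placeholders following the quoted definition.

```julia
events = String[]

function bar()
    sleep(0.1)
    push!(events, "bar done")
end

# Structured: @sync does not return until every task started with @async
# lexically inside it has finished, so bar ends before foo does.
function foo()
    @sync begin
        @async bar()
    end
    push!(events, "foo done")
end

# Unstructured: foo_unstructured returns while bar is still running.
function foo_unstructured()
    t = @async bar()
    push!(events, "foo done")
    return t
end

foo()
structured = copy(events)

empty!(events)
wait(foo_unstructured())
unstructured = copy(events)

println(structured)
println(unstructured)
```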

Even if I agree with the diagnosis (go statements are harmful!), I have some reservations regarding the cure:

With actors there is no such restriction of their lifetime. An actor A may simply start another actor B and then finish. Actor B then outlives A. Supervision of B can still be done by linking B deliberately to other actors C and D. When one of them exits abnormally, B is taken down as well or vice versa. One of them can be made a supervisor … What I just described is Erlang/Elixir-like error handling and I feel it to be much stronger than the libdill approach.

I would suggest that we don’t take the latter too seriously (e.g. by enforcing it). Still there is the need to have powerful methods for writing fault tolerant concurrent programs. I feel that for that one can build on the Erlang approach.
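A very small sketch of such linking on top of Tasks and Channels, assuming that "taking an actor down" simply means closing its mailbox. `Actor`, `start_actor`, `take_down_with` and `link!` are hypothetical names, and real Erlang-style links carry exit signals with more information than this.

```julia
struct Actor
    mailbox::Channel{Any}
    task::Task
end

function start_actor(behavior)
    mailbox = Channel{Any}(32)
    task = @async for msg in mailbox    # ends normally when mailbox is closed
        behavior(msg)
    end
    return Actor(mailbox, task)
end

# One direction of a link: if `a` terminates with an error, take `b` down too.
function take_down_with(a::Actor, b::Actor)
    @async begin
        try
            wait(a.task)                # throws if a's task has failed
        catch
            close(b.mailbox)
        end
    end
end

link!(a::Actor, b::Actor) = (take_down_with(a, b); take_down_with(b, a); nothing)

b = start_actor(msg -> nothing)
c = start_actor(msg -> error("C crashed on $msg"))
link!(b, c)
put!(c.mailbox, :boom)                  # C fails while handling the message
wait(b.task)                            # returns because B was taken down
println(istaskfailed(c.task), " ", isopen(b.mailbox))
```

Neither actor's lifetime is tied to the task that created it. A supervisor would be a linked actor that restarts its partner instead of stopping with it.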

Very stupid question from me: in the Actor model, is there fault tolerance?
I ask because in the era of exascale computers there might be failures during a model run.
Can the model continue to run and perhaps repeat the compute portion on an actor which has not failed?