I have an application where I want to launch two Julia functions as separate threads. Those threads have a shared matrix they each read from and add extra columns to. Any ideas as to how to go about this?
One major concern I had was something I read in the documentation (see the end of this post) saying that OS-level threading is not supported: all tasks are just switched between on the main thread. This is not what I want, particularly because my application is compute-bound.
An approach I’ve been looking at is using remotecall (or @spawnat or @spawn) and fetch, but the call to fetch is blocking, which is not what I want. Do I need to use @async with the fetch call? Also, once one thread is completed, I don’t need the results from the other thread, so I don’t want to have to wait for both threads to complete with @sync.
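For concreteness, here is a minimal sketch of the non-blocking pattern I have in mind. It assumes a recent Julia where these functions live in the Distributed standard library, and `slow_square` with its delays is a hypothetical stand-in for my real functions. I am not sure this is the idiomatic way to do it.

```julia
using Distributed

# Assumption: two local worker processes are enough for this illustration.
addprocs(2)

# Hypothetical stand-in for the compute-bound work, defined on every process.
@everywhere function slow_square(x, delay)
    sleep(delay)
    return x^2
end

w1, w2 = workers()[1:2]

# remotecall returns a Future immediately, so nothing blocks here.
f1 = remotecall(slow_square, w1, 3, 0.2)
f2 = remotecall(slow_square, w2, 4, 2.0)

# Each @async task blocks on its own fetch and forwards the result to a channel.
results = Channel{Int}(2)
@async put!(results, fetch(f1))
@async put!(results, fetch(f2))

# take! returns as soon as the first result arrives; the slower call is not awaited.
first_result = take!(results)
println("first result: ", first_result)
```

Note that this only stops waiting for the slower call; it does not cancel the work still running on the other worker.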
Another approach I saw would be the LocalCluster manager. This looks like it would be closer to what I want (if the condition variables talked about are analogous to those for p-threads), but I haven’t found any examples of how to use this.
I’ve also looked at using the Julia MPI package. Can the MPI package launch Julia code on separate threads if used locally? Or does it only work with a cluster of machines?
There also seems to be an experimental implementation of threading mentioned in the documentation, but I couldn’t find any examples of using @threadcall. It also seems to have the same problem: it does not actually launch new threads, it just switches between multiple tasks on the same thread.
Then there are Tasks (aka coroutines) which also seem to just be switching on the same thread.
Finally, I saw a SharedArray class which looked like it would be helpful, but I found no way to add extra rows or columns to it. Is there a way to add extra rows or columns to a SharedArray and have the changes be reflected on all the threads?
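The closest workaround I could come up with is to preallocate a SharedArray with an upper bound on the number of columns and track how many are in use. The sketch below assumes the SharedArrays standard library of a recent Julia; `maxcols`, `ncols`, and `append_column!` are names I made up, and it does nothing to protect against two writers racing on the counter.

```julia
using Distributed
addprocs(1)
@everywhere using SharedArrays

nrows, maxcols = 3, 8                 # maxcols is an assumed upper bound
data  = SharedArray{Float64}((nrows, maxcols))
ncols = SharedArray{Int}((1,))        # number of columns filled so far
ncols[1] = 0

# Hypothetical helper: "append" a column by writing into the next free slot.
# Not safe if several processes call it concurrently.
@everywhere function append_column!(data, ncols, col)
    k = ncols[1] + 1
    data[:, k] = col
    ncols[1] = k
    return k
end

w = first(workers())
remotecall_fetch(append_column!, w, data, ncols, [1.0, 2.0, 3.0])

# The write made on the worker is visible here through shared memory.
used = view(data, :, 1:ncols[1])
println(size(used))
```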
What are your suggestions to approaching this problem? And please provide a small example if you can. Thanks!
Relevant Documentation:
All I/O tasks, timers, REPL commands, etc. are multiplexed onto a single OS thread via an event loop. A patched version of libuv provides this functionality. Yield points provide for co-operatively scheduling multiple tasks onto the same OS thread. I/O tasks and timers yield implicitly while waiting for the event to occur. Calling yield() explicitly allows for other tasks to be scheduled.
@async is similar to @spawn, but only runs tasks on the local process. We use it to create a “feeder” task for each process. Each task picks the next index that needs to be computed, then waits for its process to finish, then repeats until we run out of indexes. Note that the feeder tasks do not begin to execute until the main task reaches the end of the @sync block, at which point it surrenders control and waits for all the local tasks to complete before returning from the function. The feeder tasks are able to share state via nextidx() because they all run on the same process. No locking is required, since the threads are scheduled cooperatively and not preemptively. This means context switches only occur at well-defined points: in this case, when remotecall_fetch() is called.
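As far as I can tell, the feeder pattern described in that passage looks roughly like the sketch below. It assumes the Distributed standard library of a recent Julia, and `simple_pmap` is a name made up for this illustration rather than the actual library implementation.

```julia
using Distributed
addprocs(2)

# Hypothetical simplified version of the pattern: one feeder task per worker.
function simple_pmap(f, lst)
    n = length(lst)
    results = Vector{Any}(undef, n)
    i = 1
    # Shared counter; no lock needed because the feeder tasks all run on
    # this process and only switch at remotecall_fetch.
    nextidx() = (idx = i; i += 1; idx)
    @sync begin
        for p in workers()
            @async begin
                while true
                    idx = nextidx()
                    idx > n && break
                    # Blocks this task only; other feeders keep running.
                    results[idx] = remotecall_fetch(f, p, lst[idx])
                end
            end
        end
    end
    return results
end

squares = simple_pmap(abs2, 1:5)
println(squares)
```

The feeder tasks run cooperatively on the calling process, while the calls to `f` themselves execute on the worker processes.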