Help implementing task parallelism in Julia

A Channel-based design is a good fit for something like this, but the reasoning behind that is a bit nuanced. A Channel (queue) can become a bottleneck under high contention, so I usually recommend against it when you can use the divide-and-conquer pattern instead. It’s also possible to introduce a deadlock if you are not careful (e.g., if there’s a feedback loop). However, Channel is great in terms of hackability, since you can isolate the parallelizable components and tweak each part without breaking the others. It’s also easy to tweak the scheduling specifics. There’s also RemoteChannel in Distributed.jl, so scaling out by adding more machines is conceptually straightforward. This can also help even on a single machine if GC becomes the bottleneck.
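To make the Channel-based layout concrete, here is a minimal sketch. It assumes the work can be split into independent items; `process`, `run_pipeline`, and the buffer size of 32 are hypothetical names and values chosen for illustration, not anything from your code.

```julia
using Base.Threads: @spawn

# Hypothetical stand-in for the real per-item work.
process(x) = x^2

function run_pipeline(inputs; nworkers = Threads.nthreads())
    jobs = Channel{Int}(32)      # bounded queue of pending work items
    results = Channel{Int}(32)   # bounded queue of finished results

    # Producer: feed the jobs, then close the channel to signal "no more work".
    @spawn begin
        for x in inputs
            put!(jobs, x)
        end
        close(jobs)
    end

    # Workers: each one takes jobs until `jobs` is closed and drained.
    workers = map(1:nworkers) do _
        @spawn for x in jobs
            put!(results, process(x))
        end
    end

    # Close `results` once every worker is done, so that `collect` terminates.
    @spawn begin
        foreach(wait, workers)
        close(results)
    end

    # Results arrive in completion order, so sort them for a stable answer.
    return sort!(collect(results))
end
```

Calling `run_pipeline(1:10)` returns the squares of 1 to 10 in sorted order. Each stage only talks to its neighbours through a channel, which is what makes it easy to swap out one part. Note that this sketch does no error handling: if a worker throws, `results` is never closed and the caller hangs, which is one example of the deadlock risk mentioned above.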

If all tasks are updating the matrix and dictionary all the time, it’s a bad idea. But if they are accessed only rarely, a lock is totally OK. I’m also working on concurrent data structures, such as a lock-free ConcurrentDict, which can be useful for sharing data efficiently. But concurrent data sharing is tricky to get right, and it is always useful to remember “Don’t communicate by sharing memory; share memory by communicating.”
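As a rough illustration of the "rarely accessed, so a lock is fine" case, here is a sketch in which each task accumulates into a private dictionary and takes the lock only once, to merge. `SharedState`, `tally`, the chunk size of 100, and the bucketing by `mod` are all hypothetical stand-ins for your actual dictionary updates.

```julia
using Base.Threads: @spawn

# Hypothetical container: a dictionary shared between tasks, plus its lock.
struct SharedState
    lock::ReentrantLock
    counts::Dict{Int,Int}
end
SharedState() = SharedState(ReentrantLock(), Dict{Int,Int}())

function tally(items; nbuckets = 4)
    state = SharedState()
    @sync for chunk in Iterators.partition(items, 100)
        @spawn begin
            # Do the bulk of the work on task-local data, with no locking.
            local_counts = Dict{Int,Int}()
            for x in chunk
                k = mod(x, nbuckets)
                local_counts[k] = get(local_counts, k, 0) + 1
            end
            # Touch the shared dictionary once per chunk, under the lock.
            lock(state.lock) do
                mergewith!(+, state.counts, local_counts)
            end
        end
    end
    return state.counts
end
```

With this layout the lock is held for one short merge per chunk rather than once per item, so contention stays low. If your tasks really need to read or write the shared structures on every step, this pattern no longer applies.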
