I'm trying to get a better understanding of Julia's parallelization possibilities. I heard about the awesome
@spawn macro, which is new in Julia 1.3.
Looking more into the available tools I am confused about the difference between
What is a typical use case for each? What would be anti-patterns for each?
Distributed is for running computations on workers, i.e., separate Julia processes running either locally or on remote machines. Threads run inside a single Julia process on one machine and share its memory. Threads have less overhead, but require you to think about thread safety. Distributed computing scales to many machines, clusters, the cloud, etc., but communication between the main process and the workers can limit performance for some tasks, and the code must be available on all participating machines.
I hope that gives you an idea.
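To make the thread-safety point concrete, here is a minimal sketch. The function name `count_even_atomic` is made up for illustration. Several threads update one shared counter, so the update goes through an `Atomic` instead of a plain `total += 1`, which would be a data race.

```julia
using Base.Threads

# Hypothetical example: count even numbers with a shared counter.
function count_even_atomic(xs)
    total = Atomic{Int}(0)            # shared between all threads
    @threads for x in xs
        if iseven(x)
            atomic_add!(total, 1)     # safe concurrent increment
        end
    end
    return total[]                    # read the final value
end

n_even = count_even_atomic(1:100_000)
```

Start Julia with more than one thread (e.g. by setting the `JULIA_NUM_THREADS` environment variable) to actually run this in parallel. With a single thread it still gives the same answer, just without a speedup.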
Threads are used for stuff like:
- Matrix multiplication
- Speeding up loops involving somewhat expensive computations.
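A rough sketch of the loop case, assuming a made-up `expensive` function as a stand-in for the real per-element work. Each iteration writes only to its own slot of the output, so no locking is needed here.

```julia
using Base.Threads

# Hypothetical stand-in for a "somewhat expensive" computation.
expensive(x) = sum(sin(x + k) for k in 1:10_000)

function threaded_map(f, xs)
    out = Vector{Float64}(undef, length(xs))
    # Iterations are split across the available threads;
    # each one touches a different element of `out`.
    @threads for i in eachindex(xs)
        out[i] = f(xs[i])
    end
    return out
end

xs = range(0.0, 1.0; length = 64)
ys = threaded_map(expensive, xs)
```

The per-iteration work should be large enough to outweigh the scheduling overhead; for very cheap loop bodies a plain serial loop is often just as fast.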
Distributed for stuff like:
- Many parallel MC simulations that each take a long time
- Very expensive computations that take a very long time relative to the time it takes to send data to the worker.
- Very large, distributed linear algebra or optimization
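The Monte Carlo case could look roughly like this. It is only a sketch: the function `estimate_pi`, the two local workers, and the sample sizes are assumptions picked for illustration. On a cluster you would hand `addprocs` machine specifications or use a cluster manager instead.

```julia
using Distributed

# Assumption: two worker processes on the local machine.
addprocs(2)

# The code has to exist on every worker, hence @everywhere.
@everywhere function estimate_pi(n)
    hits = 0
    for _ in 1:n
        x, y = rand(), rand()
        hits += (x^2 + y^2 <= 1)   # point fell inside the quarter circle
    end
    return 4 * hits / n
end

# Each call sends one integer and returns one float, while the
# computation itself is comparatively long.
estimates = pmap(estimate_pi, fill(1_000_000, 8))
println(sum(estimates) / length(estimates))

# Shut the workers down again.
rmprocs(workers())
```

Very little data crosses the process boundary per task, which is the situation where Distributed tends to pay off. If every task needed a large array shipped to the worker first, the communication cost could easily dominate.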
Please consider adding a paragraph to the very beginning of the docs https://docs.julialang.org/en/v1/manual/parallel-computing/ explaining this (pretty much what you wrote above).
I think it would help users unfamiliar with these concepts navigate this chapter of the docs better.