Multithreading for nested for loops

Yes, and quite significantly. This is something many users have had to learn the hard way. Hopefully, the issue will be less severe starting with Julia 1.10, but here’s a summary of my current understanding as a regular user, as of Julia 1.9:

Every time your code creates a new array or otherwise allocates new memory, it first asks the garbage collector (GC) whether it’s time to sweep up objects that are no longer in scope and release their memory. If the GC says yes, the sweep is performed before your code can resume. Such a check-in with the GC is called a safepoint (safepoints are inserted in a few other cases as well, and you can insert one manually by placing GC.safepoint() in your code). If only a single thread is running, that’s all there is to it.

However, if multiple threads are running and one of them hits a safepoint where the GC decides to do a sweep, execution will wait until all other threads also hit safepoints before the sweep is performed and execution can resume. In other words, the GC needs to “stop the world” in order to perform a sweep, and it can’t just stop the threads in their tracks, it must wait for each of them to check in at a safepoint. Thus, if the number of threads is large, they may end up spending most of their time waiting on each other.

(Moreover, if some threads allocate and others don’t, the wait can be indefinite and potentially lead to a deadlock, see Multithreaded program hangs without explict GC.gc() - #5 by danielwe. The solution is to insert safepoints by hand in the non-allocating tasks.)

If you have many long-running tasks that need to allocate memory but don’t need to access each other’s memory, and you have a large number of CPU cores to parallelize across, it’s often more efficient to use multiprocessing (e.g, Distributed.jl) rather than multithreading.

2 Likes