Local Parallel Processing Benchmark Advice

iwelch · April 24, 2018, 10:43pm

OK, I spent a few days benchmarking various versions of local multiprocessing under Julia 0.6.2 on 4-core and 8-core Intel processors. The results are at the end of http://julia.cookbook.tips/doku.php?id=parallel . Roughly speaking, threads work wonderfully as long as the function needs very little memory. They can deteriorate badly when each function call needs a good chunk of memory: threads then become slower than single-process execution, which is understandable. What is less understandable is that they also become slower than pmap. I surmise that the OS scheduler handles such situations better than Julia's internal one. So, if you need to remember one thing:

Use Threads if your function and its memory easily fit into the L1 cache.
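
To make the threads case concrete, here is a minimal sketch of the kind of loop this advice is about. `cheap_kernel` and `threaded_apply!` are hypothetical names, not the functions that were benchmarked; the point is only that each iteration touches a few bytes and allocates nothing. The thread count is set before Julia starts through the `JULIA_NUM_THREADS` environment variable.

```julia
# Hypothetical stand-in for a function with a very small memory footprint.
cheap_kernel(x::Float64) = sqrt(x) + sin(x)

function threaded_apply!(out::Vector{Float64}, xs::Vector{Float64})
    # Each iteration writes only to its own slot, so there is no data race.
    Threads.@threads for i in 1:length(xs)
        out[i] = cheap_kernel(xs[i])
    end
    return out
end

xs  = collect(1.0:10_000.0)
out = threaded_apply!(similar(xs), xs)
println("threads in use: ", Threads.nthreads())
```

If `cheap_kernel` instead allocated a large temporary array on every call, this is the pattern that, per the benchmarks above, can end up slower than a plain loop.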

The next important thing to remember is that pmap deteriorates badly (by 3-4 orders of magnitude) with short function calls.

Never use pmap if you have many short function calls.
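
A sketch of what "many short function calls" looks like in code. It assumes Julia 0.7 or later, where `pmap` lives in the `Distributed` standard library (in 0.6 it is in Base). `tiny` is a hypothetical trivial function, and the two workers and the batch size of 250 are arbitrary choices for illustration. The `batch_size` keyword of `pmap` groups elements so that one remote call carries many of them; whether that recovers enough speed to beat a plain `map` was not part of the benchmarks above, so treat it as something to measure, not a fix.

```julia
using Distributed            # Julia 0.7+; in 0.6 pmap is available without this
addprocs(2)                  # two local workers, an arbitrary choice for the sketch

@everywhere tiny(x) = x + 1  # hypothetical, deliberately trivial function

xs = 1:1_000

serial  = map(tiny, xs)                      # no inter-process communication
naive   = pmap(tiny, xs)                     # one remote call per element
batched = pmap(tiny, xs; batch_size = 250)   # one remote call per 250 elements

rmprocs(workers())           # shut the workers down again
```

All three return the same values; only the amount of communication per element differs.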

With many short calls, @parallel is useless too, but not murder: it is slower than sequential processing by a factor of about 2 rather than 1,000. Finally, @parallel seems good for middle-of-the-road tasks, with modest memory needs and a medium amount of overhead (function calls that are neither too numerous nor too brief).
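
For readers on newer Julia versions: `@parallel` from 0.6 was renamed `@distributed` in Julia 0.7 and moved to the `Distributed` standard library. The sketch below uses the newer name and shows the reducing form of the loop. `medium_task` is a hypothetical placeholder for a task with moderate cost and modest memory needs, and the worker count of 2 is arbitrary.

```julia
using Distributed
addprocs(2)                                   # arbitrary worker count for the sketch

# Hypothetical medium-cost task: sum of squares 1^2 + 2^2 + ... + i^2.
@everywhere medium_task(i) = sum(abs2, 1:i)

# The range is split into chunks, one per worker; (+) combines the partial results.
total = @distributed (+) for i in 1:2_000
    medium_task(i)
end

rmprocs(workers())
```

Because the range is split into one chunk per worker, the per-call overhead is much lower than one remote call per element.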

The number of CPUs or threads can be tweaked, but its importance pales in comparison to the advice above. Good rules of thumb (not perfect rules) are:

threads: use the maximum number (the lower of the number of CPU threads and the number of function invocations to be run).
workers: use the number of cores plus 2.
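
The two rules of thumb can be written down as a small sketch. `suggested_threads` and `suggested_workers` are hypothetical helper names. Note one assumption: `Sys.CPU_THREADS` (Julia 0.7+; it was called `Sys.CPU_CORES` in 0.6) reports logical CPUs, which on hyperthreaded machines is larger than the number of physical cores, so pass the core count explicitly if that is what you mean.

```julia
# Rule 1: threads = the lower of available CPU threads and the number of calls.
suggested_threads(ncalls::Integer) = min(Sys.CPU_THREADS, ncalls)

# Rule 2: workers = cores + 2. The default uses logical CPUs as a stand-in for cores.
suggested_workers(ncores::Integer = Sys.CPU_THREADS) = ncores + 2

println("threads for 3 calls: ", suggested_threads(3))
println("workers on 4 cores:  ", suggested_workers(4))
```

The worker count would then be passed to `addprocs`, and the thread count set through `JULIA_NUM_THREADS` before starting Julia.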

/iaw

anon94023334 · April 25, 2018, 3:21am

You might take a look at "Proposal: Make creation of CachingPool a default for pmap" (Issue #21946, JuliaLang/julia on GitHub), which describes the primary issue I think you're seeing.
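
A minimal sketch of using a `CachingPool` with `pmap`, modeled on the pattern in the Julia documentation. The data, the closure, and the worker count are made up for illustration. The idea is that a closure capturing a large object is sent to each worker once and then reused, instead of being serialized again for every call.

```julia
using Distributed
addprocs(2)                               # arbitrary worker count for the sketch

table = rand(1_000_000)                   # hypothetical large captured data
pool  = CachingPool(workers())

# `let` makes `table` a captured local, so it travels with the closure,
# and the CachingPool ships that closure to each worker only once.
result = let table = table
    pmap(i -> 2 * table[i], pool, 1:100)
end

clear!(pool)                              # drop the cached closures on the workers
rmprocs(workers())
```

Whether this closes the gap seen in the benchmarks depends on how much of the overhead came from closure serialization as opposed to per-call messaging.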

iwelch · April 26, 2018, 2:32pm

Indeed. But it is difficult and pointless to check further until this is changed. For now, it is what it is.
