Use of CachingPools for a pmap that will be run many times

tictaccat · October 9, 2021, 4:12am

Hi! I’m writing a custom linear map using LinearMaps.jl whose matrix-vector product I would like to parallelize in an embarrassingly parallel way. This linear map will be applied tens of thousands of times inside an iterative solver, so I would like to speed up each matrix-vector product.

I’ve read that CachingPool can be very useful for speeding up pmap, as using the default WorkerPool can lead to a lot of overhead. See, for example, this pull request: https://github.com/JuliaLang/julia/pull/33892. However, my question is: since the pmap used in my matrix-vector product will itself be run tens of thousands of times, would it be highly beneficial to use the same CachingPool across all of these pmap calls, rather than creating a new CachingPool each time I run pmap? I could maybe do this by including a CachingPool as part of the state of my linear operator. I’m a bit cautious about this because I don’t know whether it will lead to any performance improvement (I don’t really understand Julia’s Distributed library), and it also makes the code a decent amount more complicated, since I would have to manage the state of a pool. What are your thoughts?
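To make the "pool as operator state" option concrete, here is a minimal sketch, not a tested recommendation. The names `PooledOperator`, `rowblock_product`, and `apply` are hypothetical, the matrix is assumed to be split into dense row blocks, and two local workers are assumed; the LinearMaps.jl wrapper is left out, so a plain function stands in for the matrix-vector product. The sketch builds the closure over the blocks once and stores it next to the pool, on the assumption that a CachingPool can only reuse what it has already sent to a worker when it is handed the same function object again.

```julia
using Distributed

# Assumption: a single machine; start two local workers if none exist yet.
nprocs() == 1 && addprocs(2)

# Hypothetical operator type: a long-lived pool plus the function it caches.
struct PooledOperator{F}
    rowblock_product::F   # closure capturing the row blocks (the large data)
    nblocks::Int
    pool::CachingPool     # created once, reused by every product
end

function PooledOperator(blocks::Vector{Matrix{Float64}})
    # Built once, so every pmap call passes the identical function object.
    f = ((i, x),) -> blocks[i] * x
    return PooledOperator(f, length(blocks), CachingPool(workers()))
end

# y = A * x with one pmap task per row block; x is still sent with each task.
function apply(op::PooledOperator, x::AbstractVector)
    parts = pmap(op.rowblock_product, op.pool, [(i, x) for i in 1:op.nblocks])
    return reduce(vcat, parts)
end

A = rand(6, 4)
op = PooledOperator([A[1:3, :], A[4:6, :]])
x = rand(4)
y = apply(op, x)   # later calls reuse op.pool and the same closure
clear!(op.pool)    # release the worker-side cached data when finished
```

Whether this actually beats creating a fresh pool per call would need to be measured for the real block sizes; the sketch only shows that the extra state amounts to one field plus a `clear!` at the end.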
