Let’s say I’m trying to run the following function many times on multiple cores:
@everywhere function test()
    X = randn(800, 800)
    Y = randn(800, 800)
    Base.LinAlg.BLAS.axpy!(2.0, X, Y)
end
(The real function is vastly more complicated but also dominated by a BLAS call).
If I start up Julia with some number of worker processes and run
julia> pmap(x -> test(), 1:length(workers()))
it appears to me from the CPU usage that the pmap workers are contending with the threads BLAS is using to run axpy!.
Even if I start up Julia with a single worker process, my eight-threaded Intel Core i7 appears to show 4 threads being used. This is also true after running
How do I spawn worker processes that won’t be competing with BLAS for resources?
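For reference, one common approach is to cap BLAS at a single thread in every process, so that parallelism comes only from the workers. The sketch below is an assumption-laden illustration, not a confirmed fix for the setup above: it assumes a recent Julia 1.x release (1.6 or later), where BLAS lives in the `LinearAlgebra` standard library and `pmap` in `Distributed`, rather than the `Base.LinAlg.BLAS` path used in the question. The worker count of 2 is arbitrary.

```julia
using Distributed

# Assumption: start two local workers if none exist yet (the count is arbitrary).
if nprocs() == 1
    addprocs(2)
end

# Load LinearAlgebra on the master and on every worker.
@everywhere using LinearAlgebra

# Restrict BLAS to one thread in each process, so BLAS threads
# do not compete with the worker processes for cores.
@everywhere BLAS.set_num_threads(1)

@everywhere function test()
    X = randn(800, 800)
    Y = randn(800, 800)
    BLAS.axpy!(2.0, X, Y)   # Y = 2.0 * X + Y, computed in place
end

results = pmap(x -> test(), 1:length(workers()))

# Ask each worker how many BLAS threads it is configured to use.
thread_counts = [remotecall_fetch(BLAS.get_num_threads, w) for w in workers()]
```

If `thread_counts` reports 1 for every worker and the CPU monitor still shows more busy cores than workers, the extra load is probably coming from somewhere other than BLAS threading.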