Threads.@threads does not work properly

CaG21 · July 6, 2024, 4:21pm

Hi everyone,

I am trying to exploit multithreading in Julia for the first time. However, it seems I am making some rookie mistake, since it does not work. I am running this very simple code (not an MWE):

t = 0.0
ℓ = 1000
ψ = randn(6, ℓ)

f = zeros(6, ℓ)

Threads.@threads for j in 1:ℓ
      f[:, j] = computeFreeDynamics(ψ[:, j], p, t)
end

where p is a structure of parameters and computeFreeDynamics is a “slow” function computing the free dynamics of a system.
I benchmarked the code with and without Threads.@threads and there is basically no change. However, if I use the same computeFreeDynamics function as the RHS of an ODEProblem to run a Monte Carlo simulation, Threads.@threads works as expected.
Do you have any idea what could be the problem?

adienes · July 6, 2024, 4:26pm

make sure you start julia itself with multiple threads, e.g. like julia -t8 or julia --threads=auto

CaG21 · July 6, 2024, 4:28pm

Yes, Julia is started with julia -t8 and, in fact, Threads.nthreads() gives 8 as output

Oscar_Smith · July 6, 2024, 4:32pm

Is computeFreeDynamics doing lots of matrix multiplications? If so, you will likely want to call BLAS.set_num_threads(1), since otherwise BLAS already uses all your cores for each matrix multiplication, which leaves no room for the threaded loop to scale.
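
A minimal sketch of that setting, assuming computeFreeDynamics really does call into BLAS (nothing posted so far confirms this). BLAS.get_num_threads and BLAS.set_num_threads come from the LinearAlgebra standard library:

```julia
using LinearAlgebra

# Remember the current BLAS thread count so it can be restored later.
old_blas_threads = BLAS.get_num_threads()

# Use a single BLAS thread so that BLAS calls made inside a
# Threads.@threads loop do not compete with Julia's own threads.
BLAS.set_num_threads(1)

println("Julia threads: ", Threads.nthreads())
println("BLAS threads:  ", BLAS.get_num_threads())
```

Calling BLAS.set_num_threads(old_blas_threads) afterwards restores the previous setting.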

carstenbauer · July 6, 2024, 6:04pm

ψ[:, j] allocates a new array on every iteration and should be a @view. Moreover, you likely want computeFreeDynamics to write into f[:, j] in place.
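
To illustrate the difference, here is a small sketch using an array shaped like the one in the original post. The index j = 3 and the variable names col_copy and col_view are only for illustration:

```julia
ψ = randn(6, 1000)
j = 3

col_copy = ψ[:, j]        # slice indexing copies the column into a new Vector
col_view = @view ψ[:, j]  # a view refers to the same memory as ψ

col_copy[1] = 0.0         # changes only the copy, not ψ
col_view[2] = 42.0        # writes through to ψ[2, j]

println(typeof(col_copy))
println(col_view isa SubArray)
println(ψ[2, j])
```

The same applies on the left-hand side: passing @view(f[:, j]) to an in-place function lets it fill the column of f directly.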

gdalle · July 7, 2024, 10:31am

In general, there is no guarantee that multithreading will speed up your code by the number of threads, or even speed it up at all. The best way to improve performance is by focusing on the sequential implementation, ensuring type-stability and in particular reducing allocations. Can you share more details about computeFreeDynamics?
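
A rough sketch of how one could check both points on the sequential code. Since computeFreeDynamics has not been shared, standin_dynamics and the NamedTuple p below are hypothetical placeholders, not the actual model:

```julia
using Test  # standard library; provides @inferred

# Hypothetical stand-in for computeFreeDynamics (the real function was not posted).
standin_dynamics(ψ, p, t) = -p.k .* ψ .+ t

p = (k = 0.5,)   # hypothetical parameters
t = 0.0
ψ = randn(6, 1000)

# @inferred throws an error if the return type cannot be inferred,
# which is a sign of type instability.
@inferred standin_dynamics(ψ[:, 1], p, t)

# Put the loop in a function so that global variables do not distort the measurement.
function run_sequential!(f, ψ, p, t)
    for j in axes(ψ, 2)
        f[:, j] = standin_dynamics(ψ[:, j], p, t)
    end
    return f
end

f = zeros(size(ψ))
run_sequential!(f, ψ, p, t)                     # first call includes compilation
bytes = @allocated run_sequential!(f, ψ, p, t)  # second call measures the loop itself
println("bytes allocated per sweep: ", bytes)
```

With the slicing pattern from the original post, every iteration allocates a copy of the column plus a result vector, so the reported number is nonzero here.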

sgaure · July 7, 2024, 12:58pm

There’s little to go on in your example. As noted, if you’re doing some heavy lifting with linear algebra, it’s probably already parallelized.

There’s nothing wrong with the loop as such, except the allocations noted above. I.e. ψ[:, j] allocates a vector, and computeFreeDynamics also allocates a vector for its return value. Allocations in parallel programs can be a performance problem. But it’s possible to write Fortran in any language: make a preallocated version, computeFreeDynamics!, so that the loop becomes

@threads for j in 1:ℓ
    computeFreeDynamics!(@view(f[:, j]), @view(ψ[:, j]), p, t)
end

You can of course keep an allocating version around as something like

computeFreeDynamics(args...) = computeFreeDynamics!(zeros(6), args...)
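
Put together, a self-contained sketch could look like this. The body of computeFreeDynamics! and the NamedTuple p are hypothetical placeholders, since the real dynamics were not posted; the important assumption is that computeFreeDynamics! returns its output buffer, which the allocating wrapper relies on:

```julia
using Base.Threads: @threads

# Hypothetical in-place dynamics (the real computeFreeDynamics was not posted).
# It writes into `out` and returns it.
function computeFreeDynamics!(out, ψ, p, t)
    @. out = -p.k * ψ + t
    return out
end

# Allocating convenience wrapper around the in-place version.
computeFreeDynamics(args...) = computeFreeDynamics!(zeros(6), args...)

# The loop lives in a function so that it does not operate on untyped globals.
function threaded_sweep!(f, ψ, p, t)
    @threads for j in axes(ψ, 2)
        # Each iteration writes to its own column of f, so there is no data race.
        computeFreeDynamics!(@view(f[:, j]), @view(ψ[:, j]), p, t)
    end
    return f
end

p = (k = 0.5,)   # hypothetical parameters
t = 0.0
ℓ = 1000
ψ = randn(6, ℓ)
f = zeros(6, ℓ)

threaded_sweep!(f, ψ, p, t)

# Compare the threaded in-place result with the allocating version, column by column.
println(all(f[:, j] ≈ computeFreeDynamics(ψ[:, j], p, t) for j in 1:ℓ))
```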
