@threads for loop performance

ankit13 · December 11, 2020, 12:05am

Hi all,

Why does my code run slower when I use more threads in a @threads for loop?

I would like to know whether this is standard behavior for threaded parallelization.

Here is sample code that reproduces the behavior.

function check(n)
    y = [one(Array{ComplexF64}(undef, 2, 2)) for i = 1:n]
    t = [zero(Array{ComplexF64}(undef, 2, 2)) for i = 1:n]

    Threads.@threads for i = 1:n
        for j = 1:n
            for k = 1:n
                t[i] += y[k]
            end
        end
    end
end

@time check(400)

Thanks.

Oscar_Smith · December 11, 2020, 12:14am

Can you post a minimal working example of the code? Otherwise, it’s hard to help.

ankit13 · December 11, 2020, 12:32am

I did not post it because it is quite big. However, I can explain it here. Overall, what I am doing is the following: I have a 1D array (length 10000) of 2x2 complex matrices, and I apply some functions to these matrices using the threaded for loop, where each thread takes one element of the 1D array.

Elrod · December 11, 2020, 12:52am

How many physical cores does your computer have?

I’d also use SMatrix from StaticArrays.jl for 2x2 complex matrices.
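
As a rough sketch of what the SMatrix suggestion could look like for the sample code in the first post (the names `Mat2` and `check_static` are made up for illustration, and StaticArrays.jl has to be installed separately):

```julia
using StaticArrays   # external package, install with: ] add StaticArrays
using LinearAlgebra  # provides the identity `I`

# Hypothetical alias for a fixed-size 2x2 complex matrix.
const Mat2 = SMatrix{2,2,ComplexF64,4}

function check_static(n)
    y = [Mat2(I) for _ in 1:n]      # n identity matrices
    t = [zero(Mat2) for _ in 1:n]   # n zero matrices

    Threads.@threads for i in 1:n
        for j in 1:n, k in 1:n
            # SMatrix values are immutable and fixed-size, so this sum
            # does not need a heap-allocated temporary matrix.
            t[i] += y[k]
        end
    end
    return t
end
```

Timing it with `@time check_static(400)` after a warm-up call should show whether the allocations reported for the original version go away on your machine.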

ankit13 · December 11, 2020, 1:01am

I have 28 cores. I will try static arrays.
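
To compare the core count with what Julia is actually using, a quick check along these lines may help. It only uses `Threads.nthreads()` and `Sys.CPU_THREADS` from Base; note that the latter counts logical CPU threads, not physical cores.

```julia
# Number of threads this Julia session was started with
# (set via `julia -t N` or the JULIA_NUM_THREADS environment variable).
println("Julia threads:   ", Threads.nthreads())

# Number of logical CPU threads reported by the system; with
# hyper-threading this can be larger than the number of physical cores.
println("Logical threads: ", Sys.CPU_THREADS)
```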

Jan_Kybic1 · December 11, 2020, 8:57am

In your case, the problem seems to be related to your use of an Array of 2D matrices, which leads to a lot of allocations. Using .+= reduces allocations by doing the update in place. Using 3D matrices as follows

    y = ones(ComplexF64,(2,2,n))
    t = zeros(ComplexF64,(2,2,n))

is 10x faster on my computer and I get an additional speed-up from multithreading.
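
A minimal sketch of the two ideas above, adapted to the sample code from the first post. The function names `check_inplace` and `check3d` are made up for illustration. As in the snippet above, `ones` fills every entry with 1 rather than building identity matrices, which should not matter for timing purposes.

```julia
using LinearAlgebra  # provides the identity `I`

# Variant 1: keep the Vector of 2x2 matrices, but update in place with `.+=`.
function check_inplace(n)
    y = [Matrix{ComplexF64}(I, 2, 2) for _ in 1:n]
    t = [zeros(ComplexF64, 2, 2) for _ in 1:n]
    Threads.@threads for i in 1:n
        for j in 1:n, k in 1:n
            t[i] .+= y[k]   # writes into the existing t[i], no new matrix
        end
    end
    return t
end

# Variant 2: one contiguous 2x2xn array, using views so slices are not copied.
function check3d(n)
    y = ones(ComplexF64, 2, 2, n)
    t = zeros(ComplexF64, 2, 2, n)
    Threads.@threads for i in 1:n
        ti = @view t[:, :, i]
        for j in 1:n, k in 1:n
            ti .+= @view y[:, :, k]
        end
    end
    return t
end
```

Running `@time check_inplace(400)` and `@time check3d(400)` after one warm-up call each, with different thread counts, gives a direct comparison against the original `check`.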

Interestingly, I also found recently that the performance deteriorated when I added Threads.@threads to my main for loop. In my case, the solution was to switch off multithreading in BLAS by calling BLAS.set_num_threads(1). From that point on, I started to see a speed-up.
It would be useful if we could at least get a warning in such cases.
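
For illustration, the BLAS setting might be used roughly like this. My actual loop is not shown here, so `threaded_products` and the 64x64 matrix size are invented for the example; only `BLAS.set_num_threads` and `mul!` come from the LinearAlgebra standard library.

```julia
using LinearAlgebra

# Run BLAS single-threaded so its threads do not compete with Julia's own.
BLAS.set_num_threads(1)

# Hypothetical workload: one independent matrix product per loop iteration.
function threaded_products(n)
    A = [rand(ComplexF64, 64, 64) for _ in 1:n]
    C = [zeros(ComplexF64, 64, 64) for _ in 1:n]
    Threads.@threads for i in 1:n
        mul!(C[i], A[i], A[i])   # in-place product, handled by BLAS
    end
    return C
end
```

On Julia 1.6 or later, `BLAS.get_num_threads()` can be used to confirm the setting took effect.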

Yours,

Jan

ankit13 · December 11, 2020, 11:57am

Okay, I will try. Thanks for the help.
