Appending arrays in @threads

drhandwerk · May 31, 2020, 2:12am

What’s the correct way to end up with three arrays of length 10 from something like the following?

A = []; B = []; C = [];

Threads.@threads for i = 1:10
   append!(A,i)
   append!(B,i)
   append!(C,i)
end

The above either works, fails to append all 10 values to all three arrays, or leaves an #undef at some location. The order in which the values are appended doesn’t matter.

pixel27 · May 31, 2020, 2:22am

append! isn’t thread-safe, so the result is really undefined. I believe this will work: it uses 3 threads to populate A, B, and C at the same time.

Threads.@threads for a in [A, B, C]
    for i in 1:10
        append!(a, i)
    end
end
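
Since the order doesn’t matter and the number of iterations is known before the loop starts, another option is to skip append! entirely: preallocate the arrays and let each iteration write to its own index, so no two threads ever touch the same slot or resize an array. A minimal sketch, assuming Int elements (the function name fill_three is made up for illustration):

```julia
# Sketch: preallocate, then have iteration i write only to slot i.
# Nothing is resized inside the threaded loop, so there is no race.
function fill_three(n::Integer)
    A = Vector{Int}(undef, n)
    B = Vector{Int}(undef, n)
    C = Vector{Int}(undef, n)
    Threads.@threads for i in 1:n
        A[i] = i
        B[i] = i
        C[i] = i
    end
    return A, B, C
end

A, B, C = fill_three(10)
```

Unlike the append! version, the values also end up in index order, whichever thread ran each iteration.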

drhandwerk · May 31, 2020, 3:19am

Thanks for the response. My actual code isn’t as simple as appending the same number to 3 different arrays.

function buildAdjacency!(g::Graph)
    n = length(g.nodes)
    Is = []
    Js = []
    Vs = []
    Threads.@threads for e in g.edges
        # add edge to A
        append!(Is, findfirst(isequal(e.first), g.nodes))
        append!(Js, findfirst(isequal(e.second), g.nodes))
        append!(Vs, e.weight)
    end
    g.A = sparse(Is, Js, Vs, n, n)
end

Is there a thread-safe way to parallelize the loop over the edges?

jling · May 31, 2020, 6:01am

If you only have 3 accumulators (I, J, V), then the 3-thread approach still works.
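
Another possibility, since the number of edges is known before the loop starts, is to preallocate the three vectors and have each iteration fill in its own position. The sketch below is only illustrative: DemoEdge and adjacency are made-up names, and the field layout (first, second, weight) and the Int/Float64 types are assumptions modelled on the code in the question, not the actual Graph type.

```julia
using SparseArrays

# Hypothetical stand-in for whatever g.edges contains.
struct DemoEdge
    first::Int
    second::Int
    weight::Float64
end

function adjacency(nodes::Vector{Int}, edges::Vector{DemoEdge})
    n = length(nodes)
    m = length(edges)
    Is = Vector{Int}(undef, m)
    Js = Vector{Int}(undef, m)
    Vs = Vector{Float64}(undef, m)
    Threads.@threads for k in 1:m
        e = edges[k]
        # Each iteration writes only to index k, so no locking is needed.
        # Assumes every endpoint is present in nodes; findfirst returns
        # nothing otherwise and the assignment would throw.
        Is[k] = findfirst(isequal(e.first), nodes)
        Js[k] = findfirst(isequal(e.second), nodes)
        Vs[k] = e.weight
    end
    return sparse(Is, Js, Vs, n, n)
end

A = adjacency([10, 20, 30], [DemoEdge(10, 20, 1.0), DemoEdge(20, 30, 2.0)])
```

This keeps the work split across edges rather than across the three arrays, so it can use more than 3 threads.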

ppalmes · May 31, 2020, 7:59am

Use a ReentrantLock; see Multi-Threading · The Julia Language.

pixel27 · May 31, 2020, 11:21am

For this, what might be easiest is:

function buildAdjacency!(g::Graph)
    n = length(g.nodes)
    Is = []
    Js = []
    Vs = []
    # @sync needs a begin ... end block; it waits for the spawned tasks inside it
    @sync begin
        Threads.@spawn for e in g.edges
            append!(Is, findfirst(isequal(e.first), g.nodes))
        end
        Threads.@spawn for e in g.edges
            append!(Js, findfirst(isequal(e.second), g.nodes))
        end
        Threads.@spawn for e in g.edges
            append!(Vs, e.weight)
        end
    end
    g.A = sparse(Is, Js, Vs, n, n)
end

That way each thread is adding to a different array. This works best if the calculations each thread has to do are different (you can’t really reuse any intermediate calculations between the threads).

The ReentrantLock @ppalmes suggested would also work. It does impose some overhead, so its effectiveness will depend on how long findfirst takes to execute (in this example): if it’s short then there could be lots of blocking; if it’s long things will run fine.

Edit: I think the @sync will work here, but I’m not 100% sure. You might need to save the t1, t2, t3 tasks and then wait on them just before the sparse call, like:

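t1 = Threads.@spawn for e in g.edges; append!(Is, findfirst(isequal(e.first), g.nodes)); end
t2 = Threads.@spawn for e in g.edges; append!(Js, findfirst(isequal(e.second), g.nodes)); end
t3 = Threads.@spawn for e in g.edges; append!(Vs, e.weight); end
wait(t1); wait(t2); wait(t3)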

ppalmes · May 31, 2020, 11:27am

It is always a trade-off, so it’s best to benchmark to see which approach is faster for your data size and for how often different threads access the same memory. @distributed can be faster, so you may also want to compare @threads with it.

ppalmes · May 31, 2020, 11:31am

Don’t hold the lock during the find, only during the append, so that the lock can be released immediately.
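
A rough sketch of what that could look like, with the lookups done outside the lock and only the appends inside it. DemoEdge and edge_triplets are made-up names, and the field names and types are assumptions based on the code in the question:

```julia
# Hypothetical stand-in for whatever g.edges contains.
struct DemoEdge
    first::Int
    second::Int
    weight::Float64
end

function edge_triplets(nodes::Vector{Int}, edges::Vector{DemoEdge})
    Is = Int[]
    Js = Int[]
    Vs = Float64[]
    lk = ReentrantLock()
    Threads.@threads for e in edges
        # The (possibly slow) lookups run in parallel, outside the lock.
        # Assumes both endpoints exist in nodes.
        i = findfirst(isequal(e.first), nodes)
        j = findfirst(isequal(e.second), nodes)
        # All three pushes happen under one lock acquisition, so entry k
        # of Is, Js and Vs always belongs to the same edge.
        lock(lk) do
            push!(Is, i)
            push!(Js, j)
            push!(Vs, e.weight)
        end
    end
    return Is, Js, Vs
end

nodes = [10, 20, 30]
edges = [DemoEdge(10, 20, 1.0), DemoEdge(20, 30, 2.0), DemoEdge(30, 10, 3.0)]
Is, Js, Vs = edge_triplets(nodes, edges)
```

The triplets come out in whatever order the threads reach the lock, which is fine for building a sparse matrix.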
