Appending arrays in @threads

What’s the correct way to end up with three arrays of 10 values each from a loop like the following?

A = []; B = []; C = [];

Threads.@threads for i = 1:10
   append!(A,i)
   append!(B,i)
   append!(C,i)
end

The above sometimes works, sometimes fails to append all 10 values to all three arrays, and sometimes leaves a #undef at some location. The order in which the values are appended doesn’t matter.

append! isn’t thread safe, so the result is undefined. I believe this will work; it uses 3 threads to populate A, B, and C at the same time.

Threads.@threads for a in [A, B, C]
    for i in 1:10
        append!(a, i)
    end
end

Thanks for the response. My actual code isn’t as simple as appending the same number to 3 different arrays.

function buildAdjacency!(g::Graph)
    n = length(g.nodes)
    Is = []
    Js = []
    Vs = []
    Threads.@threads for e in g.edges
        # add edge to A
        append!(Is, findfirst(isequal(e.first), g.nodes))
        append!(Js, findfirst(isequal(e.second), g.nodes))
        append!(Vs, e.weight)
    end
    g.A = sparse(Is, Js, Vs, n, n)
end

Is there a thread-safe way to parallelize the loop over the edges?

If you only have three accumulators (Is, Js, Vs), then the three-threads approach still works.

Use a ReentrantLock: see Multi-Threading · The Julia Language (the multi-threading section of the manual).
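
For the original loop at the top of the thread, a minimal sketch of the lock approach could look like this; the lock serializes the three appends, so no two iterations mutate the arrays at the same time:

A = Int[]; B = Int[]; C = Int[]
lk = ReentrantLock()

Threads.@threads for i = 1:10
    # only one task at a time can hold the lock, so the appends never race
    lock(lk) do
        append!(A, i)
        append!(B, i)
        append!(C, i)
    end
end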

For this, what might be easiest is:

function buildAdjacency!(g::Graph)
    n = length(g.nodes)
    Is = []
    Js = []
    Vs = []
    @sync begin
        Threads.@spawn for e in g.edges
            append!(Is, findfirst(isequal(e.first), g.nodes))
        end
        Threads.@spawn for e in g.edges
            append!(Js, findfirst(isequal(e.second), g.nodes))
        end
        Threads.@spawn for e in g.edges
            append!(Vs, e.weight)
        end
    end
    g.A = sparse(Is, Js, Vs, n, n)
end

That way each thread is adding to a different array. This works best if the calculations each thread has to do are different (you can’t really reuse any intermediate calculations between the threads).

The ReentrantLock @ppalmes suggested would also work. It does impose some overhead, so its effectiveness will depend on how long findfirst takes to execute (in this example): if it’s short there could be a lot of blocking, if it’s long things will run fine.

Edit: I think the @sync will work here, but I’m not 100% sure. You might need to save the t1, t2, t3 tasks and then wait on them just before the sparse call, like:

t1 = Threads.@spawn for e in g.edges; append!(Is, ...); end
t2 = Threads.@spawn for e in g.edges; append!(Js, ...); end
t3 = Threads.@spawn for e in g.edges; append!(Vs, ...); end
wait(t1); wait(t2); wait(t3)

It is always a trade-off, so it’s better to benchmark and see which is faster depending on the data size and on how often different threads access the same memory. @distributed can be faster, so you may also want to compare @threads with it.
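
If you want to try the @distributed route, here is a toy sketch of its reduction form; the Edge struct and the random edges are placeholders standing in for whatever g.edges actually contains, and the findfirst lookups are left out so only the shape of the reduction is shown. Each iteration returns its own small vector of (i, j, v) triplets and vcat merges them, so no shared array is ever mutated from two processes:

using Distributed
addprocs(4)                        # worker count is arbitrary for this sketch

@everywhere struct Edge            # placeholder edge type
    first::Int
    second::Int
    weight::Float64
end

edges = [Edge(rand(1:100), rand(1:100), rand()) for _ in 1:10_000]

# reduction form: each iteration yields a 1-element vector, vcat concatenates them
triples = @distributed (vcat) for k in 1:length(edges)
    e = edges[k]
    [(e.first, e.second, e.weight)]
end

Is = [t[1] for t in triples]
Js = [t[2] for t in triples]
Vs = [t[3] for t in triples]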

Don’t hold the lock during the findfirst, only during the append!, so that the lock can be released immediately.
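
Concretely, that means computing the indices outside the lock and taking it only for the pushes. A sketch of that variant, assuming the same Graph fields as above and numeric edge weights (buildAdjacency_locked! is just a hypothetical name for it):

using SparseArrays

function buildAdjacency_locked!(g::Graph)
    n = length(g.nodes)
    Is = Int[]
    Js = Int[]
    Vs = Float64[]                 # assumes the weights are Float64
    lk = ReentrantLock()
    Threads.@threads for e in g.edges
        # the expensive lookups run in parallel, outside the lock
        i = findfirst(isequal(e.first), g.nodes)
        j = findfirst(isequal(e.second), g.nodes)
        # the lock is held only for the cheap pushes
        lock(lk) do
            push!(Is, i)
            push!(Js, j)
            push!(Vs, e.weight)
        end
    end
    g.A = sparse(Is, Js, Vs, n, n)
end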