Race condition when writing the same value in a parallel loop

james3 · August 3, 2023, 9:58pm

I have a parallel loop similar to the following:

Threads.@threads for (aₓ, bₓ) in collect(Iterators.product(1:N.A, 1:N.B))
	abₓ = joinAb(aₓ, bₓ)
	output1[abₓ], V = myfunc1(abₓ)
	output2[aₓ, bₓ] = myfunc2(V, aₓ, bₓ)	
end

So, in my parallel loop I am iterating over aₓ and bₓ. I combine (aₓ, bₓ) to form the index abₓ. However, multiple values of aₓ and bₓ may map to a single abₓ. Is it safe to assign a value to output1[abₓ] in my parallel loop? In a serial loop, it would just overwrite with the same value, but I am unsure what this would do if two or more threads were attempting to assign the same value to output1[abₓ] simultaneously. Thanks.

Ronis_BR · August 3, 2023, 11:40pm

It will lead to a race condition. However, since every thread is writing the same value (I suppose myfunc1 does not depend on a global state), you should be fine. IIUC, it can be classified as a non-harmful race condition. You only seem to be computing the same values more than once, which costs you performance.

Ronis_BR · August 3, 2023, 11:42pm

function test()
    a = [1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3]
    b = [0,0,0]

    Threads.@threads for i in a
        b[i] = i
    end

    return b
end

julia> test()
3-element Vector{Int64}:
 1
 2
 3

julia> test()
3-element Vector{Int64}:
 1
 2
 3

julia> test()
3-element Vector{Int64}:
 1
 2
 3

julia> test()
3-element Vector{Int64}:
 1
 2
 3

nsajko · August 4, 2023, 1:22am

Julia’s manual seems to suggest that there are no harmless data races in Julia (as in other languages):

You are entirely responsible for ensuring that your program is data-race free, and nothing promised here can be assumed if you do not observe that requirement. The observed results may be highly unintuitive.

This is a language design choice that most languages make because it is important for enabling competitive performance, even though it’s not very “user-friendly”.

EDIT: Julia doesn’t have a worked-out memory model yet, so there is no clear answer to this question yet. See here. So it’s currently best to assume there are no benign data races, just to be safe.
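If you want to keep the concurrent same-value writes but not rely on a plain data race being benign, one option is to make each store atomic. The following is only a sketch based on the test() function above: test_atomic is a made-up name, and wrapping every element in Threads.Atomic adds overhead that may not be acceptable for large arrays.

```julia
function test_atomic()
    # Same index pattern as in test(): ten 1s, ten 2s, ten 3s.
    a = repeat([1, 2, 3], inner = 10)

    # One atomic cell per output slot instead of a plain Vector{Int}.
    b = [Threads.Atomic{Int}(0) for _ in 1:3]

    Threads.@threads for i in a
        b[i][] = i   # atomic store; several threads may store the same value
    end

    # Unwrap the atomics into an ordinary vector.
    return [x[] for x in b]
end
```

The vector b itself is only read inside the loop; the writes go through the Atomic cells.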

algunion · August 4, 2023, 1:57am

In your particular scenario, assuming myfunc1 is a pure function, assigning the value in your parallel loop would be safe (e.g., from the point of view of correctness). However, this is an empirical/anecdotal conclusion; please see @nsajko's post for the more unpleasant picture.

Now, if myfunc1 is also computationally expensive, you perform duplicate work and waste resources/time.

One way to avoid doing duplicate work would be to hide the myfunc1 calls and output1 updates behind a Channel/Task that keeps track of whether myfunc1 was already called for a specific abₓ value. That way you only spawn tasks that do non-duplicate work, and at the same time you are guaranteed to avoid data races altogether.

The above approach assumes that your myfunc1 function is expensive enough to be worth the additional overhead of spawning new tasks.
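A simpler variant of the same idea, without a Channel, is to group the (aₓ, bₓ) pairs by their abₓ index before the loop and then thread over the distinct indices, so that each element of output1 and output2 is written by exactly one iteration. This is a rough sketch, not the Channel/Task design described above: the bodies of joinAb, myfunc1 and myfunc2 are placeholders I made up, and run_dedup, NA and NB are hypothetical names.

```julia
# Hypothetical stand-ins for the functions in the original post.
joinAb(a, b) = a * b                 # several (a, b) pairs can share one index
myfunc1(ab) = (2 * ab, ab + 1)       # assumed pure: returns (output1 value, V)
myfunc2(V, a, b) = V + a - b

function run_dedup(NA, NB)
    # Serial pre-pass: group the (a, b) pairs by the index they map to.
    groups = Dict{Int,Vector{Tuple{Int,Int}}}()
    for a in 1:NA, b in 1:NB
        push!(get!(groups, joinAb(a, b), Tuple{Int,Int}[]), (a, b))
    end

    output1 = zeros(Int, NA * NB)
    output2 = zeros(Int, NA, NB)

    # Each iteration owns one distinct ab, so no two iterations write
    # the same element, and myfunc1 runs once per ab.
    Threads.@threads for ab in collect(keys(groups))
        output1[ab], V = myfunc1(ab)
        for (a, b) in groups[ab]
            output2[a, b] = myfunc2(V, a, b)
        end
    end

    return output1, output2
end
```

The grouping pass is serial, so this only pays off when myfunc1 is expensive compared with building the dictionary.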
