Raising a flag for a specific condition with multithreading

Suppose that I have a bunch of threads that process an array of data (apply some function elementwise). Suppose also that for some entries in the array the processing may fail (for concreteness, we can assume the function returns nothing or missing in this case). After the array is processed, I would like to quickly determine whether any failures occurred.

Practically, I could have a flag variable: a boolean that is false by default. Should any thread encounter a failure during processing, it sets this flag to true. After processing is finished, one can inspect the flag to see whether any failures occurred.

Since several threads may try to set this flag simultaneously, one should prevent the race condition by protecting the flag with a lock. On the other hand, it does not matter which thread sets the flag first, and waiting for the lock to be released slows the threads down.
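
For concreteness, the lock-protected version I have in mind looks roughly like this (process_one and the failure condition are just placeholders):

process_one(x) = x < 0.001 ? nothing : sqrt(x)   # placeholder for the function that may fail

function process_all(data)
    results = Vector{Union{Float64,Nothing}}(undef, length(data))
    failed = Ref(false)             # the flag
    flag_lock = ReentrantLock()     # protects the flag

    Threads.@threads for i in eachindex(data)
        r = process_one(data[i])
        results[i] = r
        if r === nothing
            lock(flag_lock) do      # every failing thread waits here
                failed[] = true
            end
        end
    end
    return results, failed[]
end

results, any_failed = process_all(rand(1_000))
any_failed && @warn "some entries failed to process"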

Is it safe to use this kind of flag variable without a lock? Or maybe there are special constructs in base Julia that exist specifically for this purpose?

Not an expert on this by any means, but that seems like a reasonable solution. Something like a SharedArray that is visible to all threads. Looking forward to some other solutions that may be posted here.

I try to avoid shared-memory solutions these days. I would have the threads write a failure message to a Channel instead.
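
A minimal sketch of what I mean, assuming a placeholder process_one that returns nothing when an entry fails:

process_one(x) = x < 0.001 ? nothing : sqrt(x)   # placeholder function

data = rand(1_000)
results = Vector{Union{Float64,Nothing}}(undef, length(data))
failures = Channel{Int}(Inf)   # unbounded, so put! never blocks

Threads.@threads for i in eachindex(data)
    r = process_one(data[i])
    results[i] = r
    r === nothing && put!(failures, i)   # record which index failed
end
close(failures)

failed_indices = collect(failures)
isempty(failed_indices) || @warn "processing failed for $(length(failed_indices)) entries"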

2 Likes

I agree with Jeff_Emanuel that a Channel-based solution would be cleaner and easier to get right. If you don’t need to preserve the order of the input data, I recommend this as the first solution.

But, to directly answer your question, you can create failed = Threads.Atomic{Bool}(false) and then set it via failed[] = true in any thread without data races. This is much faster than using a lock.
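
In context, that might look like the following sketch (data and process_one are just for illustration):

process_one(x) = x < 0.001 ? nothing : sqrt(x)   # placeholder function

data = rand(1_000)
results = Vector{Union{Float64,Nothing}}(undef, length(data))
failed = Threads.Atomic{Bool}(false)   # lock-free flag

Threads.@threads for i in eachindex(data)
    r = process_one(data[i])
    results[i] = r
    r === nothing && (failed[] = true)   # atomic store, no lock needed
end

failed[] && @warn "some entries failed to process"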

If you want to terminate the processing as soon as there is a failure, you can use the TakeWhile transducer:

julia> using Transducers

julia> [1, 2, 3, 10, 4, 10] |>
           Map(x -> x == 10 ? nothing : x) |>
           TakeWhile(!isnothing) |>
           tcollect
3-element Vector{Int64}:
 1
 2
 3

FLoops.jl supports break in parallel loops if you need more control.

1 Like

By the way, what is the difference between locks and atomic variables?

Typically, atomics are directly mapped to hardware instructions and can only deal with a small amount of data (a few bytes). Locks are built on atomics to support arbitrary objects and code.
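
As a rough illustration (made-up counter and histogram examples): an atomic holds a single machine-word value, while a lock can guard an arbitrary data structure and an arbitrary block of code.

# Atomic: one machine-word counter, updated with a hardware instruction.
counter = Threads.Atomic{Int}(0)
Threads.@threads for i in 1:1000
    Threads.atomic_add!(counter, 1)
end
counter[]   # 1000

# Lock: protects arbitrary data and arbitrary code.
lk = ReentrantLock()
hist = Dict{Int,Int}()   # not something an Atomic could hold
Threads.@threads for i in 1:1000
    lock(lk) do
        hist[i % 10] = get(hist, i % 10, 0) + 1
    end
end
sum(values(hist))   # 1000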

Thanks for the information!

Out of curiosity: if we have a data race, but the outcome does not depend on the order of the operations, what will happen if no atomics or locks are used?

The race condition is benign: multiple threads may try to change the flag value from false to true, but your threads don’t care about the flag value, you don’t care which thread set it, and you don’t want to read the flag until the execution flow has joined again.

For performance reasons, I’d use something like flag = Ref(false); failure_condition && !flag[] && (flag[] = true).

Again, these data races are all benign; but if the failure condition is triggered often, then you don’t want all threads writing into the same cache line over and over again (because that is slow).
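
In context (placeholder names again), that pattern would look like:

process_one(x) = x < 0.001 ? nothing : sqrt(x)   # placeholder function

data = rand(1_000)
results = Vector{Union{Float64,Nothing}}(undef, length(data))
flag = Ref(false)   # plain, non-atomic flag

Threads.@threads for i in eachindex(data)
    r = process_one(data[i])
    results[i] = r
    # Read before write: once the flag is set, no thread writes to that
    # cache line again, even if more failures occur.
    r === nothing && !flag[] && (flag[] = true)
end

flag[] && @warn "some entries failed to process"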

It’s impossible to answer what will happen. My guess is “nothing interesting, probably equivalent to the code using atomics most of the time.” But the point is that we can’t answer this question. Data races imply undefined behavior. You can’t trust the output of such a program.

This is not what the C/C++ memory model says. Since Julia is based on LLVM, which was designed for C/C++, that model is the best one for reasoning about this code. It says that a program that mutates a Ref(false) concurrently is broken. I recommend using a relaxed memory order for what you are doing. On x86 it probably generates the same instructions as the version using a plain Ref anyway.
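
For example (a sketch, with illustrative names): since Julia 1.7 the @atomic macro supports per-field atomics with an explicit ordering, so a relaxed ("monotonic") flag could look like this:

# Julia 1.7+: a mutable struct with an atomic field.
mutable struct FailFlag
    @atomic failed::Bool
end

process_one(x) = x < 0.001 ? nothing : sqrt(x)   # placeholder function

data = rand(1_000)
results = Vector{Union{Float64,Nothing}}(undef, length(data))
flag = FailFlag(false)

Threads.@threads for i in eachindex(data)
    r = process_one(data[i])
    results[i] = r
    if r === nothing
        # Relaxed (monotonic) atomic store: no data race,
        # but only minimal ordering guarantees.
        @atomic :monotonic flag.failed = true
    end
end

# Relaxed atomic load after the threads have joined.
(@atomic :monotonic flag.failed) && @warn "some entries failed to process"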

However, using relaxed ordering in general is tricky due to problems such as out-of-thin-air values. I find https://www.youtube.com/watch?v=cWkUqK71DZ0 and the accompanying paper highly informative for understanding how to use relaxed atomics reasonably safely.

2 Likes