Is `@async` memory safe?

The multi-threading page of the manual warns about memory-unsafe operations under data races, and advises using locks or atomic operations on Threads.Atomic objects to work around the problem.

I get that this applies e.g. when you @spawn different tasks that may read and write the same data on different threads. But does this also happen for “green threading” tasks scheduled with @async? The section on asynchronous programming, which comes right before multi-threading, makes no mention of such problems, so my impression is that this problem is specific to “real” multi-threading, but I have a gut feeling that that’s not the case.

@async only has 1 thread running so it’s thread-safe by construction I believe.

2 Likes

So, if you have multiple available threads, and schedule two tasks with @async, are they both guaranteed to run in the same thread? Which one? The one in which they were scheduled, maybe?

julia> a = Int[]
Int64[]

julia> @sync for _ in 1:10
           @async for _ in 1:10
               push!(a, Threads.threadid())
               sleep(0.01)
           end
           @async for _ in 1:10
               push!(a, Threads.threadid())
           end
       end

julia> all(==(1), a)
true
1 Like

Xref

1 Like

Yes, this is correct if you’re worried about data races, which are only a concern for true parallelism.

You may, however, still need locks if there’s any chance that a Task (green thread) might yield to another task which manipulates the same data. In Julia it’s not easy to audit possible yield points, so I’d generally suggest locking around any shared data structures which aren’t explicitly documented as thread-safe. (This will also help you later, if you decide to use Threads.@spawn rather than @async.)
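A minimal sketch of that advice (variable names here are mine, not from the thread): serialize access to a shared vector with a ReentrantLock, so that no other task can observe or mutate it mid-update, even when a yield point sits inside the critical section.

```julia
# Sketch: protect a shared Vector with a ReentrantLock. Even if a task
# hits a yield point (sleep, I/O, @debug, ...) inside the critical
# section, no other task can touch `results` until the lock is released.
results = Int[]
results_lock = ReentrantLock()

@sync for i in 1:10
    @async lock(results_lock) do
        push!(results, i)
        sleep(0.001)   # a deliberate yield point, held safely under the lock
    end
end

@assert sort(results) == collect(1:10)
```

The `lock(f, l)` do-block form releases the lock even if the body throws, which is why it is usually preferred over manual `lock`/`unlock` pairs.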

2 Likes

I cannot grasp this, could you please elaborate a bit? (e.g. What counts as “manipulating the same data”?)

Consider the following program:

using StructArrays

function f(a, b)
    v = StructVector((a, b))
    @sync begin
        @async push!(v, (1, 2))

        x = @inbounds v[end]
        @show x
    end
end

Is this program well-defined for f(::Vector, ::Vector)? Likely yes. (But there is no public API guaranteeing this is definitely the case.)

Is this program well-defined for f(::AbstractVector, ::AbstractVector)? No. This is because push!(::AbstractVector, 1) may contain I/O; e.g., reading a file, accessing the internet, or even just a @debug statement. In such a case, the scheduler may decide to switch to the parent task while executing push!(a, 1) or push!(b, 2) in the child task. Now, in the parent task, the invariants of StructVector (e.g., that the underlying arrays have consistent sizes) can be completely violated, and no API can be expected to work.
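To make the hazard concrete without depending on StructArrays, here is a hand-rolled version of the same invariant (two parallel vectors that must stay the same length); names are mine, not from the post. A lock prevents any task from ever observing the pair between the two push!es:

```julia
# Sketch: two parallel vectors with the invariant length(xs) == length(ys),
# i.e. a "struct of arrays" by hand. The lock guarantees no task ever sees
# the pair between the two push!es, even if push! itself were to yield.
xs = Int[]; ys = Int[]
pair_lock = ReentrantLock()

function push_pair!(xs, ys, lk, x, y)
    lock(lk) do
        push!(xs, x)
        # Without the lock, a yield here would expose length(xs) != length(ys)
        # to any other task that runs before the second push! completes.
        push!(ys, y)
    end
end

@sync for i in 1:5
    @async push_pair!(xs, ys, pair_lock, i, -i)
end

@assert length(xs) == length(ys) == 5
```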

The semantics of @async and @spawn tasks allow arbitrary interleaving of “pieces of code.” The only difference is the granularity of those “pieces of code.” With @async, they are the parts of the code delimited by yield points (“I/O” operations). With @spawn, they are individual machine instructions. This is why people tend to worry about the correctness of concurrent programs only when @spawn is used: more interleavings are possible, which increases the possibility that bad things happen. However, since there is nothing in the Julia language that ensures a given function/method call does not contain a yield point [1], it is not correct to assume that manipulating a data structure from multiple @async tasks is safe, unless you read all the code reachable from the functions you are calling. (I guess this sounds rather like FUD, but I feel it is necessary for clarifying the current situation of concurrent programming in Julia.)
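A small sketch of that granularity difference (variable names hypothetical): under @async, an observer task can only run when the writer reaches a yield point, so placing an explicit yield() in the middle of a two-step update is enough to expose a half-finished state.

```julia
# Sketch: @async tasks only switch at yield points. An explicit yield()
# mid-update stands in for a hidden one (file I/O, @debug, ...) and lets
# the observer task record the state while the invariant is broken.
state = [0, 0]                 # intended invariant: state[1] == state[2]
snapshots = Tuple{Int,Int}[]

@sync begin
    @async begin
        state[1] += 1
        yield()                # a hidden yield point would go here
        state[2] += 1
    end
    @async push!(snapshots, (state[1], state[2]))
end

@show snapshots                # the observer may record an inconsistent pair
@assert state == [1, 1]        # the writer always finishes eventually
```

With exact scheduling order unspecified, the snapshot may or may not catch the torn state; the point is only that nothing in the language rules it out.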

[1] In contrast, many other languages such as Python, JavaScript, C++, Rust, … that added coroutines after the language was designed tend to have async/await keywords which clarify when the coroutine may yield to the scheduler. (They have other issues with that approach, though.)

4 Likes

Thanks for this great explanation!

This seems like a source of rare but hard-to-find bugs. A quick mental search through my recent programs did not find any possible instance of this, but it seems entirely possible, especially with @debug. Shared memory is evil.

This is a nice picture to have in mind for @spawn, though, to connect to the original question about data races: you need the program to be free of data races in order to interpret the execution as a sequentially consistent “interleaving”.

We shake hands with the devil, and get the performance :smiling_imp:

Yeah, good point. I avoided mentioning memory ordering since I realized my comment was getting long. But I agree that it’s worth mentioning that using @spawn is hard, since even mere interleaving is not automatic.

1 Like

As someone who was a BeOS developer, I strongly disagree. The important thing is to understand what you are doing. Message passing through channels and using synchronization primitives is really not that hard. In many ways multithreaded programming is easier; it just isn’t what people are used to. A good threading tutorial would be helpful, I think.

1 Like

Haha! There are even some algorithms which intentionally accept data races when some randomness is acceptable. It seems to work well in machine learning with Hogwild! and Hogbatching: https://arxiv.org/pdf/1106.5730.pdf https://people.ece.ubc.ca/matei/papers/ipdps16.pdf
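For flavor, a lock-free update in the Hogwild! spirit might look like this sketch (the function name and setup are mine, a deliberately racy pattern rather than the papers’ actual code):

```julia
using Base.Threads

# Sketch of the Hogwild! idea: threads apply gradient updates to a shared
# parameter vector with no locking at all. The read-modify-write on w[i]
# is intentionally unsynchronized; occasional lost updates are accepted
# as extra stochastic noise in the optimization.
function hogwild_step!(w, grads; lr = 0.1)
    @threads for k in eachindex(grads)
        g = grads[k]
        for i in eachindex(w)
            w[i] -= lr * g[i]   # racy on purpose when nthreads() > 1
        end
    end
    return w
end

w = zeros(3)
grads = [ones(3) for _ in 1:8]
hogwild_step!(w, grads)
@show w   # roughly -0.8 per slot, modulo lost updates across threads
```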

Yeah, a similar idea is called chaotic relaxation, and apparently you can trace the idea back to the ’60s (!). Though the C++ committee says, even with the C++20 memory model:

In theory, these algorithms are subject to OOTA and RFUB behaviors, however, in practice, current implementations avoid such behaviors.


P2055R0: A Relaxed Guide to memory_order_relaxed
http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2020/p2055r0.pdf

OOTA: out-of-thin-air; RFUB: read-from-untaken-branch