Coming in late here, but I think this is a textbook case of what the docs for GC.safepoint are talking about. The solution is to insert a safepoint within the loop in f_wait (not within test), as follows:
using Base.Threads
using ProgressBars

function f_wait(a, b)
    while !a[]
        GC.safepoint()
    end
    return a[] && b[]
end

function test(n, x = Atomic{Bool}(false), y = Atomic{Bool}(false))
    for i in ProgressBar(1:n)
        x[] = y[] = false
        t_wx = @spawn f_wait(x, y)
        t_wy = @spawn f_wait(y, x)
        x[] = y[] = true
        wait.([t_wx, t_wy])
    end
    return true
end

test(100000)
In more detail: The problem is that f_wait has a potentially infinite loop with no allocations, IO, or task switches, and hence no implicit GC safepoints, so it can block GC for a potentially unbounded time. Meanwhile, your test function performs an allocation when it creates the t_wy task, and this happens after t_wx has been scheduled but before its termination condition x[] is set to true. Whenever this particular allocation triggers a GC run, you have a deadlock: the main thread is waiting for every other thread to reach a safepoint so the GC can run, while t_wx is waiting for the main thread to set x[] to true, and never encounters a safepoint during the wait. The solution is to introduce a safepoint explicitly, as shown above.
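
If you want to see the mechanism in isolation, here is a minimal sketch that forces the problematic interleaving rather than waiting for an allocation to trigger it by chance. The names spin_until and demo are my own, made up for illustration; the only real APIs used are Base.Threads.Atomic, @spawn, fetch, GC.gc, and GC.safepoint. I am assuming Julia is started with more than one thread (e.g. julia -t 2); with a single thread the spinner only runs once the main task blocks in fetch, so the example still finishes but no longer exercises the race.

```julia
using Base.Threads

# Hypothetical helper (not from the original post): spin until `flag`
# becomes true, offering the GC a safepoint on every iteration so that a
# collection requested by another thread is not held up by this loop.
function spin_until(flag::Atomic{Bool})
    spins = 0
    while !flag[]
        GC.safepoint()
        spins += 1
    end
    return spins
end

# Hypothetical driver: start the spinner, force a full collection while
# the spinner may still be inside its loop, then release it.
function demo()
    flag = Atomic{Bool}(false)
    t = @spawn spin_until(flag)
    GC.gc()            # explicit collection; needs all threads at a safepoint
    flag[] = true      # termination condition for the spinner
    return fetch(t) >= 0
end

demo()
```

With the GC.safepoint() call in place, the GC.gc() call can pause the spinner and complete. If you remove that call, I would expect the multi-threaded run to be at risk of hanging inside GC.gc(), for the same reason as described above, so try that only in a session you are willing to kill.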