However, using Threads.@spawn crashes Pluto/Julia the moment the button is pressed.
It runs OK when using @async instead.
Running the exact same code with Threads.@spawn in VSCode works.
For Threads.@spawn I also tried starting Julia with 4 threads and Pluto with 2 threads (confirmed with Threads.nthreads() == 2), with the same crash.
Am I doing something wrong or is this a Pluto thing?
Is the use of @async acceptable?
MWE: A simple button with counter
using Base.Threads
using GLMakie

Threads.nthreads()  # == 2; not required for @async

GLMakie.activate!(inline=false, title="Pluto Graph")
# inline=false is not really needed, but it guarantees a standalone window

fig2 = Figure(backgroundcolor=:maroon, size=(400,200))
display(fig2)

o_isrunning = Observable(false)
o_i = Observable(0)
buttonlabels = ("Pause", " Run ")
o_buttonlabel = @lift string(buttonlabels[$o_isrunning+1], "\n", $o_i)
button = Button(fig2[1,1], label = o_buttonlabel)

GLMakie.on(button.clicks) do clicks
    o_isrunning[] = !o_isrunning[]
    if o_isrunning[]
        #Threads.@spawn begin  # Crashes
        @async begin           # Works
            while o_isrunning[] && isopen(fig2.scene)
                o_i[] = o_i[]+1
                notify(o_i)
                sleep(0.1)
                #yield()
            end
        end
    end
end
Start of (long) crash message when using Threads.@spawn (appears in Julia shell, userid redacted)
Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x7ff900b00dc6 -- RegisterProcTableCallback at C:\WINDOWS\System32\DriverStore\FileRepository\iigd_dch.inf_amd64_b260c545909302e9\ig9icd64.dll (unknown line)
in expression starting at none:1
RegisterProcTableCallback at C:\WINDOWS\System32\DriverStore\FileRepository\iigd_dch.inf_amd64_b260c545909302e9\ig9icd64.dll (unknown line)
glBindBuffer at C:\Users\*****\.julia\packages\ModernGL\BUvna\src\functionloading.jl:73 [inlined]
bind at C:\Users\*****\.julia\packages\GLMakie\fj8mE\src\GLAbstraction\GLBuffer.jl:31
g
....
@async and @spawn both create tasks, but tasks created with @spawn are allowed to migrate between threads, while tasks created with @async are not. So @async gives you what are called coroutines, where multiple tasks run interleaved on the same thread. Makie is not thread-safe: its render loop runs via @async, and you will get crashes if you trigger anything that updates OpenGL state from a thread other than the render thread.
But in general Makie’s concurrency model is not very principled right now. I couldn’t tell you what the “correct” way is to change plot objects asynchronously that’s guaranteed to work; empirically, it usually works fine to do things in @async blocks because of the threading behavior I mentioned above. Maybe @sdanisch or @ffreyer could chime in on this, as they have much more insight into this part of Makie than I do.
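The stickiness difference described above can be checked with Base alone. This is only a minimal sketch: the helper names async_thread_ids and spawn_thread_ids are made up for illustration, and the @spawn result depends on how many threads Julia was started with and on where the scheduler happens to place the tasks.

```julia
using Base.Threads

# Record the thread each @async task runs on.
function async_thread_ids(n)
    ids = Vector{Int}(undef, n)
    @sync for i in 1:n
        @async ids[i] = threadid()
    end
    return ids
end

# Same, but with Threads.@spawn: tasks may land on any available thread.
function spawn_thread_ids(n)
    ids = Vector{Int}(undef, n)
    @sync for i in 1:n
        Threads.@spawn ids[i] = threadid()
    end
    return ids
end

# @async tasks stay on the thread that created them.
println(all(==(threadid()), async_thread_ids(8)))
# With more than one thread, this will typically show several different ids.
println(unique(spawn_thread_ids(8)))
```

If the render loop and every update to plot objects live in @async tasks created from the same thread, they never touch OpenGL state from a second thread, which is why the @async version of the MWE does not hit the access violation.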
I’ve actually been puzzling about this, since @async definitely doesn’t need the same kind of thread safety: green tasks cannot actually run at the same time.
A simple example shows pretty clearly that you can’t just swap @async with @spawn:
function count(n)
    x = 0
    @sync for i in 1:n
        @async x += 1
    end
    return x
end

function count2(n)
    x = 0
    Threads.@sync for i in 1:n
        Threads.@spawn x += 1
    end
    return x
end

count(10000) == 10000
count2(10000)  # == some random number (data race on x)
I really have no idea why the warning is there without giving any such context.
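For comparison, here is a sketch of how the @spawn version can be made to give the expected count. It uses Threads.Atomic from Base; the function name count_atomic is just for illustration, and a lock would work equally well.

```julia
using Base.Threads

function count_atomic(n)
    x = Atomic{Int}(0)            # the counter lives in an atomic cell
    @sync for i in 1:n
        # atomic_add! performs the read-modify-write as one indivisible step
        Threads.@spawn atomic_add!(x, 1)
    end
    return x[]                    # read the final value
end

count_atomic(10000) == 10000
```

This only fixes the counter itself. It says nothing about calling into a library such as Makie from several threads, which needs the library to be thread-safe.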
As I mentioned the problem seems to be with Pluto + GLMakie + Threads, as the code using Threads.@spawn runs fine under VSCode.
I also realised that (apparently?) the only non-crashing way to bring any ‘global’ variables (i.e. from another cell) into the @async task under Pluto is via an Observable. That makes sense, as Pluto’s core feature is to guarantee state consistency over the entire notebook.
This is true, but it is not specified anywhere that a task runs until an explicit wait, sleep, yield or some such call. In principle, a task switch can happen between the fetch and the store in x = x + 1. It doesn’t do that right now (at least not for standard types), but if you do something like x[i] = x[i] + 1, it can happen now, depending on the type of x.
That is, even in @async tasks one should take concurrency precautions.
Do you happen to know whether accessing an observable’s value with obs[] counts as a standard type (and should be safe), or whether it is at risk?
For posterity: I suspect the original problem with Threads.@spawn occurring in Pluto (only?) has something to do with Pluto issue 2779: “In Pluto’s source code we have lots of @async”
The @async code used here is still very bad. Just because you happened to not hit the racy condition in the call you wrote does not mean that it can’t happen.
julia> function count(n)
           x = 0
           @sync for i ∈ 1:n
               @async begin
                   y = x + 1
                   yield()
                   x = y
               end
           end
           x
       end;

julia> count(10000)
1
Yield-points can be highly unpredictable, e.g. they can be hit when one of the tasks does IO, or if you hit a dynamic dispatch or any number of other non-visible things occur. It can also depend on stuff like the optimization level that julia was run with.
@async does not have significantly less thread safety needs than @spawn. If anything, it’s kinda worse in this regard because the problems can be harder to detect, so bugs can be hidden for much longer!
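One way to take that precaution is to put the read-modify-write under a lock, so that a yield in the middle can no longer let another task see the stale value. A minimal sketch with Base's ReentrantLock (count_locked is a made-up name; the yield() is kept on purpose to show that it no longer matters):

```julia
function count_locked(n)
    x = 0
    lk = ReentrantLock()
    @sync for i in 1:n
        @async begin
            # Only one task at a time may run this block; the others wait at lock().
            lock(lk) do
                y = x + 1
                yield()   # a task switch here is now harmless
                x = y
            end
        end
    end
    return x
end

count_locked(10000) == 10000
```

The same function also stays correct if @async is replaced by Threads.@spawn, since the lock protects against both interleaving and true parallelism.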
You’re right, but my point was that in practice there are very few yield points in critical code (e.g. +(::Int, ::Int)), so it’s not a surprise to get segfaults when going from @async to @spawn,
even though this might be an indication that the @async code was never fully safe.
And my point is that you can’t know if your code has yield points or not, so any code that’s written assuming there’s no yield points is at best a ticking time bomb.
Where yield points are and when they occur is not something you can reason about at a library level in the overwhelming majority of cases. It can depend on things like whether the user set -O0, whether debug warnings are enabled, and all sorts of other stuff.
I don’t mean to attack you or whatever here, I just find it really concerning to see false and dangerous claims like this being made publicly on Discourse where people might get wrong ideas.
Well, I, on the other hand, need to explain every couple of weeks why people can’t just swap out @async with @spawn, just because the docs indicate you should always
Thanks guys, point taken. But… for a relative newbie (especially compared to you fellas): taking the MWE above, where and how should one secure that code to avoid conflicts?
(BTW: so far my full code “seems” to run OK, but the instability risk is taken on board!)
Maybe a good solution to avoid races is to use the ticks observable, which fires when the scene is about to be rendered. Whatever you do synchronously in a callback there should not be able to mess with data in the render loop, as the ticks callback is invoked synchronously inside the render loop anyway.
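Applied to the MWE, that could look roughly like the sketch below. It assumes a recent Makie version in which the scene's events expose a tick observable (accessed here as events(fig.scene).tick); the throttling with time() and the last_update Ref are my own additions to mimic the original sleep(0.1), and none of this has been checked under Pluto.

```julia
using GLMakie

fig = Figure(size = (400, 200))
display(fig)

o_isrunning = Observable(false)
o_i = Observable(0)
buttonlabels = ("Pause", " Run ")
o_buttonlabel = @lift string(buttonlabels[$o_isrunning + 1], "\n", $o_i)
button = Button(fig[1, 1], label = o_buttonlabel)

# The click handler only flips the flag; no task is started here.
on(button.clicks) do _
    o_isrunning[] = !o_isrunning[]
end

# Assumption: the tick observable fires from inside the render loop,
# so this callback runs synchronously with rendering.
last_update = Ref(time())
on(events(fig.scene).tick) do tick
    if o_isrunning[] && time() - last_update[] >= 0.1
        last_update[] = time()
        o_i[] += 1   # update roughly every 0.1 s, as in the original loop
    end
end
```

With this layout there is no separate @async or @spawn task at all, so the question of where yield points fall inside the counting loop goes away.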