Suppose I have a reference that’s shared across two threads.
One of the threads only sets the reference, and the other only dereferences it. Does this need a lock around the reference? On setting, on dereferencing, or both?
What about if either thread can set it and dereference it? Same questions.
In fact, what are the exact rules for when a lock is needed?
I have a fair amount of experience with this topic.
First of all, you’ll have to distinguish between “mutation” (e.g. pop!) and “re-binding” (e.g. setfield!). The latter is a bit more tricky.
If only one thread calls setfield! on a reference while multiple other threads read it, you may get the desirable result where each read sees either the old or the new value, never an intermediate invalid one. In this situation you may get away with not adding locks, but you’ll have to test the feasibility on your specific machine. Here is an example I devised:
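To make the distinction concrete, here is a minimal single-threaded sketch (the variable names `r` and `v` are just for illustration). It only shows what each operation does to the object and to the reference; it says nothing about thread safety.

```julia
r = Ref([1, 2, 3])
v = r[]                      # v and r[] are the same array object

# Mutation: the array itself changes, so every holder of it sees the change.
pop!(r[])
println(v)                   # prints [1, 2]

# Re-binding: the Ref now points at a different array; v still holds the old one.
setfield!(r, :x, [10, 20])   # equivalent to r[] = [10, 20]
println(v)                   # still [1, 2]
println(r[])                 # prints [10, 20]
```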
```julia
import Random.seed! as seed!
N::Int = 999
function gen(cnt)
    if cnt % 2 == 1
        seed!(1)
        return rand(Int128, N, N)
    else
        seed!(3)
        return rand(Int128, N, N)
    end
end;
function check(r)
    s = getfield(r, :x)
    if s == gen(0)
        println(0)
    elseif s == gen(1)
        println(1)
    else # I cannot see this in practice
        println("\tEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE")
    end
end;
function test(r)
    for c = 1:10000000
        for j = 1:14
            Threads.@spawn check(r)
        end
        println("c=$c")
        setfield!(r, :x, gen(c))
    end
end
test(Ref(gen(0)))
```
As a rule of thumb, try it on your machine. If you find that the resulting behavior is acceptable or as expected even if you didn’t add locks, you can persevere (without locks).
Base collection types require manual locking if used simultaneously by multiple threads where at least one thread modifies the collection (common examples include push! on arrays, or inserting items into a Dict).
So you do need a lock in this case.
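As a minimal sketch of that manual locking, the function below has many tasks push! into one shared array while holding a ReentrantLock. The names `collect_squares`, `results`, and `results_lock` are made up for this example.

```julia
function collect_squares(n)
    results = Int[]
    results_lock = ReentrantLock()
    @sync for i in 1:n
        Threads.@spawn begin
            value = i^2                 # do the actual work outside the lock
            lock(results_lock) do
                push!(results, value)   # only one task mutates the array at a time
            end
        end
    end
    return results                      # order depends on task scheduling
end

squares = collect_squares(1000)
```

Keeping the critical section down to the push! itself lets the rest of the work run in parallel.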
In general, the Julia compiler will try to optimize your code assuming there is only one thread, unless you use locks, channels, or atomics. I found https://assets.bitbashing.io/papers/concurrency-primer.pdf to be helpful for understanding atomics.
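For the channel option, here is a small sketch in which one task produces values and another consumes them, so no variable is read and written concurrently by hand. The helper names `produce!` and `consume` and the buffer size of 32 are arbitrary choices for this example.

```julia
# Producer: sends n values, then closes the channel to signal completion.
function produce!(ch, n)
    for k in 1:n
        put!(ch, k)
    end
    close(ch)
end

# Consumer: iterating a Channel takes values until it is closed and drained.
function consume(ch)
    total = 0
    for x in ch
        total += x
    end
    return total
end

ch = Channel{Int}(32)
producer = Threads.@spawn produce!(ch, 100)
total = consume(ch)
wait(producer)
```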
The first response to your question is not correct, especially this advice:
“As a rule of thumb, try it on your machine. If you find that the resulting behavior is acceptable or as expected even if you didn’t add locks, you can persevere (without locks).”
Race conditions often don’t appear during testing, or appear only very sporadically due to the interactions with the environment. It’s not sufficient to check a few times and move on.
“In fact, what are the exact rules for when a lock is needed?”
In all the cases you’ve described, I would recommend a lock.
In the simplest case, where one thread writes and the other reads, the reader may read the memory mid-write, in which case you risk reading an intermediate, garbage value. With multiple writers, the problem is compounded.
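One way to apply this is to keep the lock next to the reference and take it on both the write and the read. The following is only a sketch: `LockedRef`, `load`, and `store!` are hypothetical names defined here, not part of Base.

```julia
# A reference bundled with the lock that guards it (hypothetical helper type).
struct LockedRef{T}
    lk::ReentrantLock
    ref::Base.RefValue{T}
end
LockedRef(x::T) where {T} = LockedRef{T}(ReentrantLock(), Ref{T}(x))

# Both reading and re-binding go through the same lock.
load(lr::LockedRef) = lock(() -> lr.ref[], lr.lk)
store!(lr::LockedRef, x) = lock(() -> (lr.ref[] = x), lr.lk)

shared = LockedRef([0, 0])

# One task only sets the reference...
writer = Threads.@spawn for k in 1:1000
    store!(shared, [k, k])
end

# ...and the other only dereferences it, checking each value it sees.
reader = Threads.@spawn begin
    consistent = true
    for _ in 1:1000
        v = load(shared)
        consistent &= v[1] == v[2]
    end
    consistent
end

wait(writer)
all_consistent = fetch(reader)
```

The writer builds each new array before taking the lock and the reader only uses the array after releasing it, which is safe here because the arrays are never mutated once published.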
The main alternative to a lock is atomics, which are an option when writing to a limited set of supported primitive types; see the Multi-Threading page of the Julia manual.
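As a rough sketch of the atomics route: `Threads.Atomic{T}` covers the primitive-type case, and Julia 1.7 and later also have per-field `@atomic` declarations, which as far as I know are not limited to primitive types. The `Slot` type below is invented for this example.

```julia
using Base.Threads: Atomic, atomic_add!, @spawn

# Primitive-type case: a shared counter updated without a lock.
counter = Atomic{Int}(0)
@sync for _ in 1:1000
    @spawn atomic_add!(counter, 1)
end

# Per-field atomics (assumes Julia 1.7 or later): re-binding a field atomically.
mutable struct Slot
    @atomic value::Vector{Int}
end

slot = Slot([1, 2])
@atomic slot.value = [3, 4]      # atomic store of the new binding
current = @atomic slot.value     # atomic load of the current binding
```

Note that this only makes the re-binding atomic; mutating the array that the field points to from several threads would still need a lock.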