I’ve got the following in one of my packages. I put a print before and after it, but in about one run out of 100 it just stops there and never prints the second time. I’m kind of clueless about how to debug this further and fix it. It looks like correct Julia code to me. (PS: I don’t get the issue with the example code in the REPL.)
julia> l = ReentrantLock()
ReentrantLock(nothing, Base.GenericCondition{Base.Threads.SpinLock}(Base.InvasiveLinkedList{Task}(nothing, nothing), Base.Threads.SpinLock(0)), 0)
julia> d = Dict()
Dict{Any, Any}()
julia> lock(l) do
           d[:key] = :value
       end
:value
Here is what I would try, assuming the problem is in my own code (and it would be a pain in the …). First, I’d try to make the problem more reproducible, mainly by artificially increasing the CPU workload (eliminating file I/O and network I/O, increasing the number of threads, etc.). Then I’d try to produce a trace of the multithreading activity (spawns, joins, locks, channel reads/writes). With a bit of luck, one of these traces could reveal a deadlock…
There is a chance that you are encountering a Julia problem, but even in that case the steps above could help in finding a reproducer…
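The two steps above (stress the lock, then trace it) could be sketched roughly as follows. This is only an illustration: `TRACE`, `record`, `traced_lock` and `stress` are hypothetical names, not part of Julia or of the package in question, and the recorded thread id is only a hint because tasks may migrate between threads.

```julia
using Base.Threads: @spawn, threadid

# Hypothetical in-memory trace: (event, lock name, thread id at that moment).
# It has its own lock so that recording is itself thread-safe.
const TRACE = Tuple{Symbol,String,Int}[]
const TRACE_LOCK = ReentrantLock()

function record(event::Symbol, name::String)
    lock(TRACE_LOCK) do
        push!(TRACE, (event, name, threadid()))
    end
    return nothing
end

# Hypothetical wrapper around lock/unlock that records each step.
# An :acquiring entry without a matching :acquired entry shows where a task is stuck.
function traced_lock(f, l::ReentrantLock, name::String)
    record(:acquiring, name)
    lock(l)
    try
        record(:acquired, name)
        return f()
    finally
        unlock(l)
        record(:released, name)
    end
end

# Stress the lock with many concurrent tasks to make a rare hang more likely.
function stress(n::Int)
    l = ReentrantLock()
    d = Dict{Int,Int}()
    @sync for i in 1:n
        @spawn begin
            traced_lock(l, "dict") do
                d[i] = i
            end
        end
    end
    return d
end

d = stress(1_000)
```

Starting Julia with several threads (for example `julia -t 8`) and raising `n` increases contention. If a run hangs, the tail of `TRACE` shows which acquisition never completed.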
This is the strange part. Maybe I did something wrong.
if Base.trylock(active_bars.lock)
    active_bars.dict[id] = bar_manager
    unlock(active_bars.lock)
else
    error("Not able to lock")
end
# lock(active_bars.lock) do
#     active_bars.dict[id] = bar_manager
# end
l = ReentrantLock()
retries = 5
while retries > 0
    if Base.trylock(l)
        unlock(l)
        break
    end
    sleep(1)
    @info "retrying"
    retries -= 1
end
if retries == 0
    @error "deadlock?"
end
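The retry loop above probes a fresh lock, so to be useful it would have to be applied to the real lock and wrap the actual critical section. A possible sketch, assuming a hypothetical helper `lock_with_retries` (not a Base function) that returns `Some(result)` on success and `nothing` if the lock never became free:

```julia
# Hypothetical helper: try to take `l` up to `retries` times, pausing `delay` seconds
# between attempts, and run `f` while holding the lock.
function lock_with_retries(f, l::ReentrantLock; retries::Int=5, delay::Real=1.0)
    for attempt in 1:retries
        if trylock(l)
            try
                return Some(f())
            finally
                unlock(l)
            end
        end
        @info "lock busy, retrying" attempt
        sleep(delay)
    end
    @error "could not acquire lock, possible deadlock?" retries
    return nothing
end

l = ReentrantLock()
d = Dict{Symbol,Symbol}()
result = lock_with_retries(l; retries=3, delay=0.1) do
    d[:key] = :value
end
```

Unlike a plain `lock`, this never blocks indefinitely, so the program can log the failure and continue or abort with a useful message instead of hanging silently.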
A couple more questions:
Do you see any Julia activity in your system monitor when the problem occurs?